Read before any use of available Java classes
You can send any idea or comment concerning SAVOT to question@astro.unistra.fr
The main goals of this work are :
These different parsers are based on Pull and SAX like parsing methods :
The data model has been created to be able to load a VOTable document into memory.
The data model is independent from the parsers.
This model is based on the following VOTable schema (Documentation) :
(packages : cds.savot.pull, cds.savot.model and cds.savot.common)
SAVOT pull parsing can be implemented in two ways : FULL or SEQUENTIAL
Useful information about the work being done around the pull parsing method.
(package : cds.savot.sax and cds.savot.common, cds.savot.model is optional)
In some use cases the SAX parsing mode is preferable, because actions can be executed at each step of the parsing.
In this mode SAVOT does not keep the data in memory; the developer has to manage part of the process.
This mode is also a good solution if the available memory is limited or if the VOTable files are very large.
Compared to the Pull mode, it often requires more work on the developer side.
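The streaming idea can be tried without SAVOT. The sketch below is only a stand-in under stated assumptions: it uses the SAX parser shipped with the JDK (not cds.savot.sax), and the class name, the `streamCells` helper and the inline VOTable fragment are hypothetical. Each <TD> content is handed to a callback as soon as it is read, and nothing is stored afterwards.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in: uses the JDK SAX parser, not cds.savot.sax.
class SaxStreamingSketch {

    // Tiny hand-written VOTable fragment used only for this illustration.
    static final String SAMPLE =
        "<VOTABLE><RESOURCE name=\"demo\"><TABLE><DATA><TABLEDATA>"
        + "<TR><TD>M31</TD><TD>10.68</TD></TR>"
        + "<TR><TD>3C273</TD><TD>187.28</TD></TR>"
        + "</TABLEDATA></DATA></TABLE></RESOURCE></VOTABLE>";

    // Hands each <TD> content to onCell as soon as the cell is closed;
    // nothing is kept once the callback returns.
    static void streamCells(String votable, Consumer<String> onCell) {
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inCell = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if ("TD".equals(qName)) {
                    inCell = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inCell) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("TD".equals(qName)) {
                    inCell = false;
                    onCell.accept(text.toString());
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(votable)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample document", e);
        }
    }

    public static void main(String[] args) {
        // Print every cell between angle brackets, one per line.
        streamCells(SAMPLE, cell -> System.out.println("<" + cell + ">"));
    }
}
```

The callback plays the role that the consumer plays in SAVOT's SAX mode: the parser drives, and the application decides what to do with each piece of data.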
How to start with the Pull Parser ?
* The usual questions…
Q - Which packages ?
You can download these packages in the Download corner
Q - And the CLASSPATH ?
Put the four above packages in the CLASSPATH
Q - Does it work ?
Download one of the samples and execute it
If it works, cheers, if not goto *
Before writing any code, you must choose the mode, FULL or SEQUENTIAL, in which you want to parse the VOTable file.
In this example we show how to create an object which contains the whole VOTable document (FULL mode).
    // the whole VOTable file is put into memory
    SavotPullParser sb = new SavotPullParser(source, SavotPullEngine.FULL); // !!! parsing of the whole source

    System.out.println("Resource name : " + ((SavotResource)sb.getVOTable().getResources().getItemAt(0)).getName());

    // get the VOTable object
    SavotVOTable sv = sb.getVOTable(); // !!! sv is now a reference to a VOTable object

    try {
        BufferedWriter bw = null;
        if (target != null) {
            bw = new BufferedWriter(new FileWriter(target));
        }

        // for each resource
        for (int l = 0; l < sb.getResourceCount(); l++) {
            SavotResource currentResource = (SavotResource)(sv.getResources().getItemAt(l));

            // for each table of the current resource
            for (int m = 0; m < currentResource.getTableCount(); m++) {

                // get all the rows of the table
                TRSet tr = currentResource.getTRSet(m);
                System.out.println("Number of items in TRset (= number of <TR></TR>) : " + tr.getItemCount());

                // for each row
                for (int i = 0; i < tr.getItemCount(); i++) {

                    // get all the data of the row
                    TDSet theTDs = tr.getTDSet(i);
                    String currentLine = "";
                    System.out.println("Number of items in TDSet for the index " + (i+1) + " tr (= number of <TD></TD>) : " + theTDs.getItemCount());

                    // for each data of the row
                    for (int j = 0; j < theTDs.getItemCount(); j++) {
                        currentLine = currentLine + theTDs.getContent(j);
                        System.out.println("<" + theTDs.getContent(j) + ">");
                    }

                    if (target != null) {
                        if (target.compareTo("") != 0) {
                            bw.write(currentLine);
                            bw.newLine();
                        }
                    }
                    else System.out.println(currentLine);
                }
            }
        }

        // close the writer only once all the resources have been written
        if (target != null) {
            bw.flush();
            bw.close();
        }
    }
    ...
In this example we show how to use the SEQUENTIAL mode
    // begin the parsing
    SavotPullParser sb = new SavotPullParser(source, SavotPullEngine.SEQUENTIAL); // !!! parsing starting

    // get the next resource of the VOTable file
    SavotResource currentResource = sb.getNextResource(); // !!! get the next resource

    // while a resource is available
    while (currentResource != null) {

        // for each table of this resource
        for (int i = 0; i < currentResource.getTableCount(); i++) {
            TRSet tr = currentResource.getTRSet(i);

            if (tr != null) {
                System.out.println("Number of items in TRset (= number of <TR></TR>) : " + tr.getItemCount());

                // for each row of the table
                for (int j = 0; j < tr.getItemCount(); j++) {

                    // get all the data of the row
                    TDSet theTDs = tr.getTDSet(j);
                    String currentLine = "";
                    System.out.println("Number of items in TDSet for the index " + (j+1) + " tr (= number of <TD></TD>) : " + theTDs.getItemCount());

                    // for each data of the row
                    for (int k = 0; k < theTDs.getItemCount(); k++) {
                        currentLine = currentLine + theTDs.getContent(k);
                        System.out.println("<" + theTDs.getContent(k) + ">");
                    }
                }
            }
        }

        // get the next resource
        currentResource = sb.getNextResource();
    }
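The idea behind the SEQUENTIAL mode, pulling one resource at a time instead of holding the whole document, can also be tried with the JDK alone. This is a minimal sketch under stated assumptions: StAX (`javax.xml.stream`) stands in for SAVOT's pull engine, and the class name, the `nextResource` helper, its "name:cellCount" return format and the inline document are all hypothetical. It also assumes that resources are not nested.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical stand-in built on StAX; it is not the SAVOT pull engine.
class SequentialPullSketch {

    // Tiny hand-written VOTable fragment with two resources.
    static final String SAMPLE =
        "<VOTABLE>"
        + "<RESOURCE name=\"r1\"><TABLE><DATA><TABLEDATA>"
        + "<TR><TD>1</TD><TD>2</TD></TR></TABLEDATA></DATA></TABLE></RESOURCE>"
        + "<RESOURCE name=\"r2\"><TABLE><DATA><TABLEDATA>"
        + "<TR><TD>3</TD></TR></TABLEDATA></DATA></TABLE></RESOURCE>"
        + "</VOTABLE>";

    private final XMLStreamReader reader;

    SequentialPullSketch(String votable) {
        try {
            reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(votable));
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads up to the end of the next <RESOURCE> and returns "name:cellCount",
    // or null when the document has no more resources.
    // Assumption: resources are not nested.
    String nextResource() {
        try {
            String name = null;
            int cells = 0;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String tag = reader.getLocalName();
                    if ("RESOURCE".equals(tag)) {
                        name = reader.getAttributeValue(null, "name");
                    } else if ("TD".equals(tag)) {
                        cells++;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "RESOURCE".equals(reader.getLocalName())) {
                    return name + ":" + cells;
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SequentialPullSketch parser = new SequentialPullSketch(SAMPLE);
        // Same loop shape as the SEQUENTIAL example: ask for the next
        // resource until none is left.
        String current = parser.nextResource();
        while (current != null) {
            System.out.println(current);
            current = parser.nextResource();
        }
    }
}
```

Only the resource currently being read is visited on each call, which is what makes this style suitable for large files.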
* The usual questions…
Which packages ?
You can download these packages in the Download corner
And the CLASSPATH ?
Put the above packages in the CLASSPATH
In this mode the developer must implement the SavotSAXConsumer interface, which declares all the methods that are called during the parsing.
The developer decides what is done when :
…
See the following example (SavotSAXSample).
In this trivial example, the <VOTABLE …> attributes, the <RESOURCE …> attributes and the <TD>…</TD> content are printed on the standard output.
    import java.util.Vector;
    import cds.savot.sax.*;

    public class SavotSAXSample implements SavotSAXConsumer {

        public SavotSAXSample() {
        }

        // attributes is a Vector containing couples of (attribute name, attribute value)
        // example : (attributes.elementAt(0), attributes.elementAt(1)), (attributes.elementAt(2), attributes.elementAt(3)), ...

        /**
         * Prints each (attribute name, attribute value) couple on the standard output.
         *
         * @param attributes Vector
         */
        public void showAttributes(Vector attributes) {
            for (int i = 0; i < attributes.size(); i = i + 2) {
                System.out.println("Attribute name : " + attributes.elementAt(i) + " Attribute value : " + attributes.elementAt(i + 1));
            }
        }

        // start elements
        public void startVotable(Vector attributes) {
            showAttributes(attributes);
        }
        public void startDescription() {}
        public void startResource(Vector attributes) {
            showAttributes(attributes);
        }
        public void startTable(Vector attributes) {}
        ...

        // end elements
        public void endVotable() {}
        public void endDescription() {}
        public void endResource() {}
        public void endTable() {}
        ...

        // TEXT
        public void textTD(String text) {
            System.out.println(text);
        }
        public void textMin(String text) {}
        public void textMax(String text) {}
        ...

        // document
        public void startDocument() {}
        public void endDocument() {}
    }
The following lines must be included in your application :
...
SavotSAXSample consumer = new SavotSAXSample();
SavotSAXParser sb = new SavotSAXParser(consumer, file);
...
The SavotSAXSample consumer is then called during the parsing process.
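The sample's comment describes the attributes as a flat Vector of alternating names and values. When a consumer needs one attribute by name, converting that Vector into a map may be more convenient than walking it by index. The sketch below assumes exactly that flat layout; the class and the `toMap` helper are hypothetical and are not part of SAVOT.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Vector;

// Hypothetical helper, not part of SAVOT.
class AttributePairsSketch {

    // Converts [name0, value0, name1, value1, ...] into a map that keeps
    // the attributes in document order. A trailing name without a value
    // is ignored.
    static Map<String, String> toMap(Vector<?> attributes) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (int i = 0; i + 1 < attributes.size(); i += 2) {
            byName.put(String.valueOf(attributes.elementAt(i)),
                       String.valueOf(attributes.elementAt(i + 1)));
        }
        return byName;
    }

    public static void main(String[] args) {
        Vector<String> attributes = new Vector<>();
        attributes.add("ID");
        attributes.add("v1");
        attributes.add("version");
        attributes.add("1.1");
        System.out.println(toMap(attributes)); // prints {ID=v1, version=1.1}
    }
}
```

Inside a consumer method such as startVotable, the same helper could be applied to the Vector received as argument.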
Q : Why not DOM ?
Parsers which implement DOM often need a very large amount of memory (20 times the XML document size, and it is not a joke !!), so we decided to use a pull parser to load the document into our own data model.
Q : Why different parsers ?
We need parsers for different use cases. Some applications are very small, so we cannot use a 2 MB parser which would be 10 times bigger than the application itself !!! Sometimes it is very useful to load the whole document into memory, and sometimes it is better to use a SAX parser. It depends on what you really want to do.
Test hardware configuration :
Test software configuration :
These tests have been done with the pull parser kXML.
The whole VOTable document is loaded into the SAVOT internal data model and is available in memory for access through the API.
| File | Size (KBytes) | Resources | Tables | Data Cells | Parsing time (seconds) |
|---|---|---|---|---|---|
| simbad1.xml | 9 | 2 | 8 | 64 | 0.32 |
| simbad2.xml | 70 | 20 | 109 | 1009 | 0.37 |
| simbad3.xml | 398 | 200 | 747 | 6831 | 0.5 |
| simbad4.xml | 2854 | 2000 | 5821 | 54515 | 1.3 |
| simbad5.xml | 29360 | 20000 | 61747 | 557944 | 10.45 |
| File | Size (KBytes) | Resources | Tables | Data Cells | Parsing time (seconds) |
|---|---|---|---|---|---|
| m31.xml | 3260 | 135 | 166 | 189020 | 1.68 |
| 3c273.xml | 9634 | 1 | 1 | 639991 | 3.6 |
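Dividing each file size by its parsing time gives a rough throughput, which makes the rows easier to compare. The sketch below only reuses the figures from the two tables above; the class and the `kbPerSecond` helper are hypothetical. With these figures, simbad1.xml comes out at roughly 28 KBytes per second and simbad5.xml at roughly 2800, which suggests that a fixed start-up cost dominates for the smallest files.

```java
// Hypothetical helper that reuses the figures from the tables above.
class ThroughputSketch {

    // Parsing throughput in KBytes per second.
    static double kbPerSecond(double sizeKb, double seconds) {
        return sizeKb / seconds;
    }

    public static void main(String[] args) {
        String[] files = {"simbad1.xml", "simbad2.xml", "simbad3.xml",
                          "simbad4.xml", "simbad5.xml", "m31.xml", "3c273.xml"};
        double[] sizesKb = {9, 70, 398, 2854, 29360, 3260, 9634};
        double[] seconds = {0.32, 0.37, 0.5, 1.3, 10.45, 1.68, 3.6};

        // One line per file: name, then throughput with one decimal.
        for (int i = 0; i < files.length; i++) {
            System.out.printf("%-12s %8.1f KBytes/s%n",
                    files[i], kbPerSecond(sizesKb[i], seconds[i]));
        }
    }
}
```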
Here we have put some links pointing to XML parsers which have been tested
© UDS/CNRS