only after it has parsed the whole document. However, once the document has
been created in memory, it can be navigated and changed. A DOM parser is an
example of a tree-based parser.
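
The tree-based approach can be sketched as follows. This is a minimal illustration that assumes Python's standard-library xml.dom.minidom as a stand-in DOM implementation; it is not one of the parsers discussed in this chapter, and the catalog document and its element names are invented for the example.

```python
from xml.dom import minidom

# Hypothetical sample document, invented for this sketch.
CATALOG = (
    "<catalog>"
    "<book id='b1'><title>XML Basics</title></book>"
    "<book id='b2'><title>Parsing</title></book>"
    "</catalog>"
)

# The whole document is parsed and held in memory before we get it back.
doc = minidom.parseString(CATALOG)

# Navigate the tree: collect the text of every <title> element.
books = doc.getElementsByTagName("book")
titles = [b.getElementsByTagName("title")[0].firstChild.data for b in books]

# Move back up the tree, which an event-based parser cannot do.
parent_name = books[0].parentNode.tagName

# Change the tree: set an attribute and append a new element.
books[0].setAttribute("status", "read")
new_book = doc.createElement("book")
new_book.setAttribute("id", "b3")
new_title = doc.createElement("title")
new_title.appendChild(doc.createTextNode("Schemas"))
new_book.appendChild(new_title)
doc.documentElement.appendChild(new_book)

# Re-query after the change; minidom node lists are not updated in place.
book_count = len(doc.getElementsByTagName("book"))
print(titles, parent_name, book_count)
```

Because the complete tree is in memory, the code can revisit and modify any node in any order, at the cost of holding the entire document.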

Event-based parsing

An event-based parser processes the document as it encounters each tag. This
is a data-centric view of the XML. Whenever an element or tag is encountered,
it (or its contents) can be processed. However, the parser cannot backtrack
once a tag has been passed. The parser returns the element, its attributes,
and its contents. Because an event-based parser never attempts to build a
structure of the data, its memory requirements are lower. It is useful when
one is looking in the document only for certain elements. A SAX parser is an
example of an event-based parser.
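
The event-based approach can be sketched in the same way. This example assumes Python's standard-library xml.sax module as a stand-in SAX implementation; the TitleCollector class and the catalog document are hypothetical names invented for the illustration.

```python
import xml.sax

# Hypothetical sample document, invented for this sketch.
CATALOG = (
    "<catalog>"
    "<book id='b1'><title>XML Basics</title></book>"
    "<book id='b2'><title>Parsing</title></book>"
    "</catalog>"
)


class TitleCollector(xml.sax.ContentHandler):
    """Collects book ids and title text as the parser reports events."""

    def __init__(self):
        super().__init__()
        self.book_ids = []
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        # Called once per opening tag, with the element's attributes.
        if name == "book":
            self.book_ids.append(attrs.getValue("id"))
        elif name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        # Once the closing tag is seen, the element cannot be revisited.
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleCollector()
xml.sax.parseString(CATALOG.encode("utf-8"), handler)
print(handler.book_ids, handler.titles)
```

The handler keeps only the values it is interested in; no tree is built, so anything not captured when its event fires is gone.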

One of the most popular XML parsers on the market is Xerces, from the Apache
XML Project. The Xerces parsers provide XML parsing and generation, and are
fully validating parsers available for both Java and C++. They implement the
W3C XML and DOM (Level 1 and 2) standards, as well as the SAX 2 standard. The
parsers also support XML Schema. Xerces has been incorporated into the IBM
set of products (WebSphere, Application Studio, and DB2).

Another parser is IBM's XML Parser for Java and C++ (XML4J and XML4C). XML4J
is a validating XML parser written in 100% pure Java, whereas XML4C is a
validating XML parser written for C++. Each provides classes for parsing,
generating, manipulating, and validating XML documents. Both parsers support
the XML 1.0 Recommendation and associated standards (DOM 1.0, SAX 1.0, DOM
2.0). XML4J contains implementations of DOM Level 2, SAX Level 2, and parts of
W3C XML Schema, but these are experimental at this stage. XML4C is supported
on most operating systems, including AIX and Linux. Both parsers are open
source and share a code base with Xerces: XML4J has the latest code
enhancements, while Xerces has been through production-level testing.

2.2 DTD and XML Schema

DTDs and XML Schema are both used to describe structured information.
However, in the last two years acceptance of XML Schema has gained momentum.
Both DTDs and schemas are building blocks for XML documents and consist of
elements, tags, attributes, and entities.

XML Schemas evolved to overcome limitations in DTDs. The W3C has published
three documents, the latest update being in May 2001: