
Woodstox

High-performance XML processor.


Configuring Woodstox XMLStreamReaders

(to be edited!)

Woodstox readers have two kinds of configurable properties: those defined by the

StAX specification, and those added by Woodstox itself.

StAX 1.0 specified properties

All property ids in this property group refer to constants defined in javax.xml.stream.XMLInputFactory.

The StAX specification explains these properties to some degree; the following table lists details of how Woodstox implements them.

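<table border="1" frame="border" rules="all">

<tr valign="top">

<th align="left">Property id

<br /> Value space

<br />  Default value

</th>

<th width="*">Effects</th>

</tr>

<tr>

<td><a name="IS_COALESCING"></a><span class="propertyId">IS_COALESCING</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  False

</td>

<td>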

Whether the reader should combine all adjacent text

events (events of type CHARACTERS, CDATA and SPACE) into a single event.

If set to true, the reader will combine all such adjacent text events

into a single event of type CHARACTERS.

If set to false, Woodstox will not combine events of different

type (and may in fact – depending on setting

<a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>, see below – split

individual physical events into multiple returned events).

<p>

Turning this option on may slightly reduce performance.

</td>

</tr>

<tr>

<td><a name="IS_NAMESPACE_AWARE"></a><span class="propertyId">IS_NAMESPACE_AWARE</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  False

</td>

<td>

Whether parser will do namespace handling (as specified by the XML Namespaces

specification) or not.

<p>

If true, the reader will do namespace handling: proper URI

resolution is done using prefixes and namespace declarations, and local names

cannot contain colons.

<p>

If false, namespace declarations get no special

handling but are included as normal attributes, full element names

(prefix and local name) are accessible via "local name" accessor methods.

Prefix accessors will then always return null, and namespace URI accessors

will return an empty String to indicate the default namespace.

</td>

</tr>

<tr>

<td><a name="IS_REPLACING_ENTITY_REFERENCES"></a><span

class="propertyId">IS_REPLACING_ENTITY_REFERENCES</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  False

</td>

<td>

Whether the Reader will automatically expand general parsed entities or not.

<p>

If set to true, the reader will automatically resolve reference if

necessary (external entities), and then expand entity value.

<p>

If set to false, the reader will return all general entities

(except for the 5 pre-defined entities: &amp;, &apos;, &lt;,

&gt; and &quot;) as events of type ENTITY_REFERENCE.

<p>

Note: whether external (parsed) entities are handled at all depends

on value of property <a href="#IS_SUPPORTING_EXTERNAL_ENTITIES"

>IS_SUPPORTING_EXTERNAL_ENTITIES</a>.

<br />

Note: this does NOT affect the way <b>character entities</b> (like

<code>&#38;</code>) are handled – they are always automatically expanded.

</td>

</tr>

<tr>

<td><a name="IS_SUPPORTING_EXTERNAL_ENTITIES"></a><span

class="propertyId">IS_SUPPORTING_EXTERNAL_ENTITIES</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  True

</td>

<td>

Whether the reader will support references to general external (parsed)

entities or not.

<p>

If true, the reader will support such entities normally, either

automatically resolving and replacing such entities (if enabled by

property <a href="#IS_REPLACING_ENTITY_REFERENCES"

>IS_REPLACING_ENTITY_REFERENCES</a>), or, returning entity reference

event.

<p>

If false, reader will not support such references, and will throw an

exception if one is encountered. It is legal to define,

but not use (refer to), such entities.

</td>

</tr>

<tr>

<td><a name="SUPPORT_DTD"></a><span class="propertyId">SUPPORT_DTD</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  true

</td>

<td>

Whether the reader will do any handling of internal and external DTD

subsets.

<p>

If true, the reader will parse both the internal and external DTD subsets,

reading all constructs. General entities declared can then be used by

the document; external DTD subsets read may also be cached (if property

<a href="#CACHE_DTDS">P_CACHE_DTDS</a> is set to true; internal subsets

are never cached as there is no way to reliably identify reuse).

Also, if property

<a href="#IS_VALIDATING">IS_VALIDATING</a> is set to true, document

will be validated.

<p>

Note: Turning this feature off will also prevent

<a href="#IS_VALIDATING">IS_VALIDATING</a> from having any effect as

DTD subsets will not be read.

</td>

</tr>

<tr>

<td><a name="IS_VALIDATING"></a><span class="propertyId">IS_VALIDATING</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  False

</td>

<td>

Whether the reader will validate the XML document against the DTD specified

by the document or not.

<p>

If true (and property <a href="#SUPPORT_DTD">SUPPORT_DTD</a>

is true), and the document contains the DTD declaration (DOCTYPE directive

that refers to a DTD that can be read and/or has embedded internal DTD subset),

the reader will try to do the following things:

<ul>

<li>Validate element structure using content declarations of ELEMENT entries

found in DTD

</li>

<li>Resolve attribute types and default values for attributes, accessible

via StAX accessors.

</li>

<li>Validate attribute values against definitions, including id uniqueness

checks, if any.

</li>

<li>Recognize type of whitespace, so that <b>ignorable whitespace</b>

can be detected.

</li>

</ul>

<p>

If false, the reader may still process DTD declaration (internal and external

subsets), but will not do validating, nor access or use attribute type or default

value information. Ignorable white space detection may be done by the

reader if that is feasible [note: need to clarify exact rules]

</td>

</tr>

<tr>

<td><span class="propertyId">REPORTER</span>

<br /> <span class="className">javax.xml.stream.XMLReporter</span>

<br />  null

</td>

<td>

Object to use for notifying calling application about recoverable problems

the document has. These include things like multiple ENTITY and ATTLIST

declarations in DTDs. If null, no problem notifications are sent.

</td>

</tr>

<tr>

<td><span class="propertyId">RESOLVER</span>

<br /> <span class="className">javax.xml.stream.XMLResolver</span>

<br />  null

</td>

<td>

Object that will be called to try to resolve external references to

the external DTD subset, general entities and parameter entities.

If set to non-null value, this resolver will be called first, before

the default resolution mechanism. If resolver returns a valid return

value (see below), it will be used as the source for the entity

value.

<p>

Note: Currently Woodstox only supports return values of type

<code>java.io.InputStream</code>; other types that StAX API suggests

are not accepted, and will cause an Exception to be thrown.

<br />

Note: if using Woodstox reader, it is recommended that the specific

<a href="#P_ENTITY_RESOLVER">P_ENTITY_RESOLVER</a>

(for general entity references)

and

<a href="#P_DTD_RESOLVER">P_DTD_RESOLVER</a>

(for external DTD subset and parameter entities)

properties are used instead: <code>XMLResolver</code> interface

unfortunately lacks some of context handling features it should have

(problem with StAX specification). In fact, internally Woodstox

will just wrap this Object and set it as the <a href="#P_ENTITY_RESOLVER">P_ENTITY_RESOLVER</a>

value.

</td>

</tr>

<tr>

<td><span class="propertyId">ALLOCATOR</span>

<br /> <span class="className">javax.xml.stream.XMLEventAllocator</span>

<br />  null

</td>

<td>

Defines the factory object used to create the event objects created by

the Event API (<code>XMLEventReader</code>). If left to null, Woodstox

will use the default implementation,

<code>com.ctc.wstx.stax.event.DefaultEventAllocator</code>.

<p>

Note: although it is possible to implement instances from scratch, it

is strongly encouraged that instances used with Woodstox are created by

extending

<code>com.ctc.wstx.stax.event.DefaultEventAllocator</code>;

mostly because it takes care of some of the problems with the specification.

Specifically, some data needed for some events is not available via

basic StAX API: and as a result, default implementation accesses some

information via extended Woodstox API, when used with Woodstox event

reader.

</td>

</tr>

</table>
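<p>
The standard properties above are set through <code>XMLInputFactory.setProperty()</code>
before a reader is created. The following is a minimal sketch that uses only the
<code>javax.xml.stream</code> API, so it should work with any StAX implementation on the
classpath, not just Woodstox. The class name, method names and sample document are
illustrative assumptions, not part of Woodstox.
</p>

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of Woodstox or the StAX API.
public class StandardPropertiesSketch {

    /** Creates a factory and sets a few of the standard properties explicitly. */
    static XMLInputFactory newFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Property ids are the constants defined in javax.xml.stream.XMLInputFactory.
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
        return factory;
    }

    /** Returns "prefix|localName|namespaceURI" for the root element of the document. */
    static String describeRoot(String xml) throws XMLStreamException {
        XMLStreamReader reader = newFactory().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    // With namespace handling enabled, prefix and local name are reported separately.
                    return reader.getPrefix() + "|" + reader.getLocalName()
                            + "|" + reader.getNamespaceURI();
                }
            }
            return null; // no element found
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(describeRoot("<p:root xmlns:p=\"urn:example\"/>"));
    }
}
```

<p>
Properties are set on the factory rather than on individual readers, so one configured
factory can be reused for many documents.
</p>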

StAX2 (v1.0) specified properties

All property ids in this property group refer to constants defined in org.codehaus.stax2.XMLInputFactory2

  • P_REPORT_ALL_TEXT_AS_CHARACTERS
    • Short desc:
    • Type: java.lang.Boolean
    • Default value: **
  • P_INTERN_NAMES
    • Short desc:
    • Type: java.lang.Boolean
    • Default value: **
  • P_INTERN_NS_URIS
    • Short desc:
    • Type: java.lang.Boolean
    • Default value: **
  • P_PRESERVE_LOCATION
    • Short desc:
    • Type: java.lang.Boolean
    • Default value: **
  • P_REPORT_PROLOG_WHITESPACE
    • Short desc:
    • Type: java.lang.Boolean
    • Default value: **

Woodstox custom properties

<p>

All property ids in this property group refer to constants defined in

interface <code>com.ctc.wstx.stax.WstxInputProperties</code>.

</p>

<p>

Default values are current as of version 0.8.8.

</p><p>

Note, also, that in some cases there may be more detailed information

available about specific properties in

<a href="../curr/javadocs/index.html">Javadocs</a> for

classes:

</p>

<ul>

<li><code>com.ctc.wstx.stax.WstxInputFactory</code></li>

<li><code>com.ctc.wstx.stax.WstxInputProperties</code></li>

<li><code>com.ctc.wstx.stax.ReaderConfig</code></li>

</ul>

<table border="1" frame="border" rules="all">

<tr valign="top">

<th align="left">Property id

<br /> Value space

<br />  Default value

</th>

<th width="*">Effects</th>

</tr>

<tr>

<td><a name="P_NORMALIZE_LFS"></a><span

class="propertyId">P_NORMALIZE_LFS</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  True

</td><td>

Whether the reader should normalize linefeeds in textual content (text, CDATA sections,

processing instruction data, comments, CDATA attribute values) according to XML

specifications.

<p>

If set to true, the reader will normalize such linefeeds; if false,

will leave the linefeeds as they are in the input.

<p>

The main reason for setting this to false is to preserve native linefeeds on platforms

that do not use the 'standard' XML linefeed (Windows, MacOS). It may also slightly

improve performance.

</td>

</tr>

<tr>

<td><a name="P_NORMALIZE_ATTR_VALUES"></a><span

class="propertyId">P_NORMALIZE_ATTR_VALUES</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  True

</td><td>

Whether the reader should normalize white space in attribute values

according to XML specifications. This is in addition to (optional) linefeed conversion,

which may or may not be done depending on value of

<a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>.

<br />

If set to true, the reader will normalize such white space; if set to false, it will not

normalize (except for the optional linefeed conversion, if enabled).

<p>

The main reason for setting this to false is to minimize any changes to the input document

format. It may also slightly improve performance.

</td>

</tr>

<tr>

<td><a name="P_REPORT_PROLOG_WHITESPACE"></a><span

class="propertyId">P_REPORT_PROLOG_WHITESPACE</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  False

</td>

<td>

Whether the reader should report (ignorable) white space events in XML document prolog

and epilogs, ie. outside the actual XML Tree.

<p>

If set to true, will return SPACE events to indicate the ignorable white space; if set

to false will quietly just skip the white space.

<p>

The main reason to turn this property on is to minimize changes to the input document

formatting. Setting it to true may add slight performance overhead.

</td>

</tr>

<tr>

<td><a name="P_REPORT_ALL_TEXT_AS_CHARACTERS"></a><span

class="propertyId">P_REPORT_ALL_TEXT_AS_CHARACTERS</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  False

</td>

<td>

Whether the reader should report all text events inside the document content as being of

type CHARACTERS or not (note that prolog/epilog white space will always be reported as

SPACE).

<p>

If true, all text (including ignorable white space) is to be reported

as type CHARACTERS; if false, the real type is returned for all cases but

for coalesced CDATA (which is always reported as CHARACTERS). [note: CDATA can only be

coalesced when <a href="#IS_COALESCING">IS_COALESCING</a> is set to True].

</td>

</tr>

<tr>

<td><a name="P_INTERN_URIS"></a><span

class="propertyId">P_INTERN_NS_URIS</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  True

</td><td>

Whether the reader should intern the values of namespace URIs or not.

<p>

If set to true, the reader will call String.intern() on all parsed namespace URIs.

If not, URI Strings are left as is, and since they are constructed from parsed data

are generally never intern()ed.

<p>

Usually there is no need to set this feature to false, since intern()ing overhead should

not be significant. Having this option set to true is good for performance especially

when accessing namespace prefixed attribute values.

<p>

Note: this option only matters if namespaces are supported, i.e.

<a href="#IS_NAMESPACE_AWARE">IS_NAMESPACE_AWARE</a> is set to True.

</td>

</tr>

<tr>

<td><span class="propertyId">P_VALIDATE_TEXT_CHARS</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  False

</td>

<td>

Whether the Reader should validate text content (text segments, CDATA sections,

processing instruction data, comment contents, attribute values of attribute type CDATA)

according to XML 1.1 rules or not.

<p>

If set to True, should verify that all characters included are in valid XML character

range (not just valid Unicode characters); if False will only do basic null character

checks but otherwise assume content is ok.

<p>

Note: Turning this option on will impose some processing overhead on parsing.

<br />

<b>NOTE</b>: Not yet fully implemented.

</td>

</tr>

<tr>

<td><a name="CACHE_DTDS"></a><span

class="propertyId">P_CACHE_DTDS</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  True

</td>

<td>

Whether the Reader should cache external DTD subsets read (which is done for

documents that have such subsets when <a href="#SUPPORT_DTD">SUPPORT_DTD</a>

is set to True) or not.

<p>

If set to True, will cache limited set of external DTD subsets (in the order

of 20 - 50 subsets max., depending on whether J2ME implementation or

'normal' one is used) in hopes of being able to reuse them;

if false, will do no caching.

<p>

Setting this option to false will prevent DTD subset reuse; setting it to True

will add some memory overhead for cached DTDs.

<p>

Note: as DTDs are cached on per-factory basis, it is important to try to reuse input

factory instances for parsing.

</td>

</tr>

<tr>

<td><a name="P_LAZY_PARSING"></a>

<span class="propertyId">P_LAZY_PARSING</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  True

</td>

<td>

Whether the reader is allowed to do so-called "lazy" parsing, i.e. only parse

parts of contents that the calling application has requested.

Benefit of this approach is most significant for long textual content

that is skipped (often the case for comments, ignorable white space,

sometimes for CDATA and text segments); if so there is no need to

allocate memory for reading the textual content. The (only?) downside is

that this may also lead to "lazy exceptions"; since full parsing may not

be done on call to <code>reader.next()</code>, but later on, exceptions

are also only thrown when the problem is encountered later on.

<p>

If set to True, will allow Reader to only parse/load data as needed;

if set to False will force Reader to always read in all the data.

</td>

</tr>

<tr>

<td><a name="P_PRESERVE_LOCATION"></a><span

class="propertyId">P_PRESERVE_LOCATION</span>

<br /> <span class="className">java.lang.Boolean</span>

<br />  True

</td>

<td>

Whether the Event objects (created when using Event API, using

<code>javax.xml.stream.XMLEventReader</code>) will store actual

accurate Location information (private

<code>javax.xml.stream.Location</code> Object) or not.

<p>

Turning this feature off reduces memory usage somewhat, as well

as increases performance due to lessened garbage collection

time (when reclaiming discarded Event objects). Performance

improvement may be up to 25% in some cases.

</td>

</tr>

<tr>

<td><a name="P_INPUT_BUFFER_LENGTH"></a><span

class="propertyId">P_INPUT_BUFFER_LENGTH</span>

<br /> <span class="className">java.lang.Integer</span>

<br />  4000/2000 (J2SE/J2ME)

</td>

<td>

Determines the size of the input buffers the readers use for reading XML content

(for input streams, size in bytes; for stream readers in characters).

<p>

Setting this property to a low value helps in saving some memory, but negatively

impacts performance. Setting this property to reasonably high value may help

in improving performance, but the benefit decreases for bigger buffer sizes.

</td>

</tr>

<tr>

<td><a name="P_TEXT_BUFFER_LENGTH"></a><span class="propertyId">P_TEXT_BUFFER_LENGTH</span>

<br /> <span class="className">java.lang.Integer</span>

<br />  2000/1000 (J2SE/J2ME)

</td>

<td>

Determines the initial text buffer segment size used internally to hold (processed)

text segment (text, CDATA, comment, proc. instr) contents. As with

<a href="#P_INPUT_BUFFER_LENGTH">P_INPUT_BUFFER_LENGTH</a>, this has some effect on

performance. However, this property is less critical since the segment size

will be incrementally increased on as-needed basis (since it has to, in order to

be able to store the whole text segment in question).

</td>

</tr>

<tr>

<td><a name="P_MIN_TEXT_SEGMENT"></a><span

class="propertyId">P_MIN_TEXT_SEGMENT</span>

<br /> <span class="className">java.lang.Integer</span>

<br />  64

</td>

<td>

Determines the shortest text segment (text, CDATA) length that the reader is

allowed to return to caller (but only if <a href="#IS_COALESCING">IS_COALESCING</a>

is set to False!).

<p>

Setting this property to a high value prevents splitting of physical text segments

into multiple events, but may slightly decrease parser performance. Leaving the

value to reasonably low value will let the reader optimize segmentation.

<p>

Note: setting this to a low value does not guarantee that the reader will only return

short segments; it just allows it to do so. Actual length of segments depends on

the reader's internal state and size of the text buffer (see <a href="#P_TEXT_BUFFER_LENGTH"

>P_TEXT_BUFFER_LENGTH</a>).

</td>

</tr>

<tr>

<td><span class="propertyId">P_CUSTOM_INTERNAL_ENTITIES</span>

<br /> <span class="className">Map</span>

<br />  null

</td>

<td>

This property allows the calling application to specify further pre-defined general

internal entity values, in addition to the standard ones (amp, lt, gt, apos, quot). Note that

the values need to be normally encoded, as if they were actually declared in

a DTD subset; meaning they do get re-parsed properly and thus can refer to other

entities (character and general entities).

<p>

Note: These entities are not used as parameter entities in DTD subsets.

</td>

</tr>

<tr>

<td><a name="P_DTD_RESOLVER"></a><span class="propertyId">P_DTD_RESOLVER</span>

<br /> <span class="className">com.ctc.wstx.stax.WstxInputResolver</span>

<br />  null

</td>

<td>

Resolver object that will be used as the primary DTD reference resolver, instead

of the default one. This will thus get called when resolving reference to the

external DTD subset, AND when resolving external parsed parameter entities (entities

declared in DTD subsets that have '%' prefix).

<p>

Note that the default entity resolver will always be used after calling this resolver,

if this resolver returns null.

</td>

</tr>

<tr>

<td><a name="P_ENTITY_RESOLVER"></a><span class="propertyId">P_ENTITY_RESOLVER</span>

<br /> <span class="className">com.ctc.wstx.stax.WstxInputResolver</span>

<br />  null

</td>

<td>

Resolver object that will be used as the primary entity reference resolver, instead

of the default one. This will thus get called when resolving declared external parsed

entity references (but not external DTD subset reference or parameter entities – see

<a href="#P_DTD_RESOLVER">P_DTD_RESOLVER</a>).

<p>

Note that the default entity resolver will always be used after calling this resolver,

if this resolver returns null.

</td>

</tr>

<tr>

<td><a name="P_BASE_URL"></a><span class="propertyId">P_BASE_URL</span>

<br /> <span class="className">java.net.URL</span>

<br />  null

</td>

<td>

Basic reference location that can be used by the resolvers (DTD, ENTITY) when resolving

relative references. If set to non-null, will be used as the main context for resolution;

if left as null, system id (if any passed) will be used instead, assuming it is either

a valid URL, or reference from the current directory at the server.

<p>

It is often good idea to set this property when the application will be run as

a managed service (in an application server etc.), to ensure that the 'root' location

is well-known.

</td>

</tr>

</table>
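<p>
Because a non-coalescing reader may report one physical text segment as several events
(see <a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>), calling code should not assume
that one element's text arrives in a single event. The sketch below shows one way to
accumulate text defensively, using only the standard <code>javax.xml.stream</code> API.
The class name, method name and sample document are hypothetical and chosen for
illustration only.
</p>

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; not part of Woodstox or the StAX API.
public class TextSegmentsSketch {

    /**
     * Collects all text found inside elements with the given local name,
     * regardless of how many CHARACTERS, CDATA or SPACE events the reader returns.
     */
    static String collectText(String xml, String elementName) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Coalescing is off, so the reader is free to split or separate text events.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        StringBuilder text = new StringBuilder();
        boolean inside = false;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    inside = true;
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals(elementName)) {
                    inside = false;
                } else if (inside && (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA
                        || event == XMLStreamConstants.SPACE)) {
                    // Append every fragment; never assume a single event holds the whole text.
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<doc><title>Hello <![CDATA[big]]> world</title><other>x</other></doc>";
        System.out.println(collectText(xml, "title"));
    }
}
```

<p>
Accumulating in this way gives the same result whether or not the reader coalesces, so
the code keeps working if the segment-size or coalescing settings are changed later.
</p>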

<h2>Profiles (property groups)</h2>

In addition to setting individual values separately, Woodstox also

allows using "profiles": pre-set values for a group of properties that

optimize readers for a specific goal.

<p>

To use profiles, you need to use Woodstox-specific method calls, since

StAX API does not have similar concept.

</p><p>

As with Woodstox-specific properties, in some cases there may be more

detailed information available about specific profiles in

<a href="../curr/javadocs/index.html">Javadocs</a> for

classes:

</p>

<ul>

<li><code>com.ctc.wstx.stax.WstxInputFactory</code></li>

<li><code>com.ctc.wstx.stax.ReaderConfig</code></li>

</ul>

<table border="1" frame="border" rules="all">

<tr valign="top">

<th>Method call</th>

<th width="*">Effects</th>

</tr>

<tr>

<td>configureForMaxConformance()</td>

<td>

Profile that will try to make processing as close to the one defined

by the XML specification as possible. This may have some (usually slight)

performance overhead, but no increased memory usage.

<p>

Will set the following property values:

<ul>

<li><a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>: True

</li>

<li><a href="#P_NORMALIZE_ATTR_VALUES">P_NORMALIZE_ATTR_VALUES</a>: True

</li>

</ul>

</td>

</tr>

<tr>

<td>configureForMaxConvenience()</td>

<td>

Profile that will try to use the settings that will make parsing and

exception handling "as easy as possible". This means trying to ensure

that some of the things that might be done to optimize performance

are disabled; things like splitting of text segments (disabled with

this profile), as well as to suppress reporting things that are usually

not very useful (like ignorable white space in prolog/epilog).

<p>

Will set the following property values:

<ul>

<li><a href="#IS_COALESCING">IS_COALESCING</a>: True

</li>

<li><a href="#P_REPORT_ALL_TEXT_AS_CHARACTERS">P_REPORT_ALL_TEXT_AS_CHARACTERS</a>: True

</li>

<li><a href="#IS_REPLACING_ENTITY_REFERENCES">IS_REPLACING_ENTITY_REFERENCES</a>: True

</li>

<li><a href="#P_REPORT_PROLOG_WHITESPACE">P_REPORT_PROLOG_WHITESPACE</a>: False (seldom interesting; only useful for round-tripping)

</li>

<li><a href="#P_LAZY_PARSING">P_LAZY_PARSING</a>: False (to prevent "lazy exceptions")

</li>

<li><a href="#P_PRESERVE_LOCATION">P_PRESERVE_LOCATION</a>: True (to make sure Event objects if used will have accurate Location information if needed)

</li>

</ul>

</td>

</tr>

<tr>

<td>configureForMaxSpeed()</td>

<td>

Profile that will try to optimize performance of the reader (ie. make

parsing as fast as possible), possibly by increasing memory usage

somewhat.

<p>

Will set the following property values:

<ul>

<li><a href="#IS_COALESCING">IS_COALESCING</a>: False

</li>

<li><a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>: 8 (characters)

to allow reader versatility in reporting short segments

</li>

<li><a href="#P_INPUT_BUFFER_LENGTH">P_INPUT_BUFFER_LENGTH</a>: 8000

(twice the default; can use even bigger values, but values above 64k

are unlikely to yield additional improvements)

</li>

<li><a href="#P_TEXT_BUFFER_LENGTH">P_TEXT_BUFFER_LENGTH</a>: 4000

(twice the default; increasing this amount is unlikely to yield

significant performance improvements)

</li>

<li><a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>: False (most likely

to affect performance on Windows platform)

</li>

<li><a href="#P_NORMALIZE_ATTR_VALUES">P_NORMALIZE_ATTR_VALUES</a>: False

</li>

<li><a href="#P_INTERN_URIS">P_INTERN_NS_URIS</a>: True

</li>

<li><a href="#CACHE_DTDS">P_CACHE_DTDS</a>: True

</li>

<li><a href="#P_LAZY_PARSING">P_LAZY_PARSING</a>: True

</li>

<li><a href="#P_PRESERVE_LOCATION">P_PRESERVE_LOCATION</a>: False

(to minimize memory usage, and thereby increasing speed directly and via

reduced GC activity).

</li>

</ul>

</td>

</tr>

<tr>

<td>configureForMinMemUsage()</td>

<td>

Profile that will try to minimize memory usage of the reader, possibly

at some expense of performance

<p>

Will set the following property values:

<ul>

<li><a href="#IS_COALESCING">IS_COALESCING</a>: False (coalescing may

require use of longer text buffers)

</li>

<li><a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>: 16 (default)

</li>

<li><a href="#P_INPUT_BUFFER_LENGTH">P_INPUT_BUFFER_LENGTH</a>: 512

</li>

<li><a href="#P_TEXT_BUFFER_LENGTH">P_TEXT_BUFFER_LENGTH</a>: 512

</li>

<li><a href="#CACHE_DTDS">P_CACHE_DTDS</a>: False

</li>

<li><a href="#P_LAZY_PARSING">P_LAZY_PARSING</a>: True (in addition to

possible performance improvement, lazy parsing also prevents having to

store unneeded data in memory before needed)

</li>

<li><a href="#P_PRESERVE_LOCATION">P_PRESERVE_LOCATION</a>: False;

fewer Location objects used by Events, less memory usage.

</li>

</ul>

</td>

</tr>

<tr>

<td>configureForRoundTripping()</td>

<td>

Profile that will try to minimize changes to formatting during

input processing and parsing, so that output can resemble

the input as closely as possible (where structure is not changed). This

means suppressing all mandated character conversions, for one thing.

<p>

Will set the following property values:

<ul>

<li><a href="#P_REPORT_PROLOG_WHITESPACE">P_REPORT_PROLOG_WHITESPACE</a>: True

</li>

<li><a href="#P_NORMALIZE_LFS">P_NORMALIZE_LFS</a>: False

</li>

<li><a href="#P_NORMALIZE_ATTR_VALUES">P_NORMALIZE_ATTR_VALUES</a>: False

</li>

<li><a href="#IS_COALESCING">IS_COALESCING</a>: False (to prevent reader from

combining adjacent CDATA/text sections)

</li>

<li><a href="#P_MIN_TEXT_SEGMENT">P_MIN_TEXT_SEGMENT</a>:

<code>Integer.MAX_VALUE</code> (read "unlimited", to prevent reader from

chopping text/CDATA sections into smaller chunks)

</li>

<li><a href="#IS_REPLACING_ENTITY_REFERENCES">IS_REPLACING_ENTITY_REFERENCES</a>: False (to allow writer to output unexpanded entities)

</li>

<li><a href="#P_REPORT_ALL_TEXT_AS_CHARACTERS">P_REPORT_ALL_TEXT_AS_CHARACTERS</a>: False (to preserve CDATA type when output)

</li>

</ul>

</td>

</tr>

</table>
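<p>
The idea behind profiles can be approximated with the standard API alone. The sketch
below is only an illustration: it does not call the Woodstox <code>configureForXxx()</code>
methods, and it sets just the two standard StAX properties that the "max convenience" and
"round-tripping" profiles list above, leaving out all Woodstox-specific properties.
The class and method names are hypothetical.
</p>

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; a rough stand-in for profile methods, not the Woodstox API.
public class ProfileSketch {

    /** Standard-property subset of the "max convenience" profile. */
    static XMLInputFactory convenienceFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.TRUE);
        return factory;
    }

    /** Standard-property subset of the "round-tripping" profile. */
    static XMLInputFactory roundTrippingFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.FALSE);
        factory.setProperty(XMLInputFactory.IS_REPLACING_ENTITY_REFERENCES, Boolean.FALSE);
        return factory;
    }

    /** Counts the text events (CHARACTERS, CDATA, SPACE) the reader reports for a document. */
    static int countTextEvents(XMLInputFactory factory, String xml) throws XMLStreamException {
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA
                        || event == XMLStreamConstants.SPACE) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<r>a<![CDATA[b]]>c</r>";
        System.out.println("convenience:    " + countTextEvents(convenienceFactory(), xml));
        System.out.println("round-tripping: " + countTextEvents(roundTrippingFactory(), xml));
    }
}
```

<p>
With coalescing enabled, the adjacent text and CDATA in the sample are expected to arrive
as a single event; the count without coalescing depends on how the implementation chooses
to segment the text.
</p>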