public interface LSParser
LSParser
provides an API for parsing XML and building the
corresponding DOM document structure. A LSParser
instance
can be obtained by invoking the
DOMImplementationLS.createLSParser()
method.
As specified in [DOM Level 3 Core] , when a document is first made available via the LSParser:
- there will never be two adjacent nodes of type NODE_TEXT, and there will never be empty text nodes.
- it is expected that the
value
andnodeValue
attributes of anAttr
node initially return the XML 1.0 normalized value. However, if the parameters "validate-if-schema" and "datatype-normalization" are set totrue
, depending on the attribute normalization used, the attribute values may differ from the ones obtained by the XML 1.0 attribute normalization. If the parameters "datatype-normalization" is set tofalse
, the XML 1.0 attribute normalization is guaranteed to occur, and if the attributes list does not contain namespace declarations, theattributes
attribute onElement
node represents the property [attributes] defined in [XML Information Set].
Asynchronous LSParser
objects are expected to also
implement the events::EventTarget
interface so that event
listeners can be registered on asynchronous LSParser
objects.
Events supported by asynchronous LSParser
objects are:
- load
-
The
LSParser
finishes to load the document. See also the definition of theLSLoadEvent
interface. - progress
- The
LSParser
signals progress as data is parsed. This specification does not attempt to define exactly when progress events should be dispatched. That is intentionally left as implementation-dependent. Here is one example of how an application might dispatch progress events: Once the parser starts receiving data, a progress event is dispatched to indicate that the parsing starts. From there on, a progress event is dispatched for every 4096 bytes of data that is received and processed. This is only one example, though, and implementations can choose to dispatch progress events at any time while parsing, or not dispatch them at all. See also the definition of theLSProgressEvent
interface.
Note: All events defined in this specification use the
namespace URI "http://www.w3.org/2002/DOMLS"
.
While parsing an input source, errors are reported to the application
through the error handler (LSParser.domConfig
's
"error-handler"
parameter). This specification does in no way try to define all possible
errors that can occur while parsing XML, or any other markup, but some
common error cases are defined. The types (DOMError.type
) of
errors and warnings defined by this specification are:
-
"check-character-normalization-failure" [error]
- Raised if the parameter "check-character-normalization" is set to true and a string is encountered that fails normalization checking.
"doctype-not-allowed" [fatal]
- Raised if the
configuration parameter "disallow-doctype" is set to
true
and a doctype is encountered. "no-input-specified" [fatal]
-
Raised when loading a document and no input is specified in the
LSInput
object. -
"pi-base-uri-not-preserved" [warning]
- Raised if a processing
instruction is encountered in a location where the base URI of the
processing instruction can not be preserved. One example of a case where
this warning will be raised is if the configuration parameter
"entities"
is set to
false
and the following XML file is parsed:<!DOCTYPE root [ <!ENTITY e SYSTEM 'subdir/myentity.ent' ]> <root> &e; </root>
Andsubdir/myentity.ent
contains:<one> <two/> </one> <?pi 3.14159?> <more/>
"unbound-prefix-in-entity" [warning]
- An
implementation dependent warning that may be raised if the configuration parameter
"namespaces"
is set to
true
and an unbound namespace prefix is encountered in an entity's replacement text. Raising this warning is not enforced since some existing parsers may not recognize unbound namespace prefixes in the replacement text of entities. -
"unknown-character-denormalization" [fatal]
- Raised if the
configuration parameter "ignore-unknown-character-denormalizations" is
set to
false
and a character is encountered for which the processor cannot determine the normalization properties. -
"unsupported-encoding" [fatal]
- Raised if an unsupported encoding is encountered.
"unsupported-media-type" [fatal]
-
Raised if the configuration parameter "supported-media-types-only" is set
to
true
and an unsupported media type is encountered.
In addition to raising the defined errors and warnings, implementations are expected to raise implementation specific errors and warnings for any other error and warning cases such as IO errors (file not found, permission denied,...), XML well-formedness errors, and so on.
See also the Document Object Model (DOM) Level 3 Load and Save Specification.
- Since:
- 1.5
-
Field Summary
Fields Modifier and Type Field Description static short
ACTION_APPEND_AS_CHILDREN
Append the result of the parse operation as children of the context node.static short
ACTION_INSERT_AFTER
Insert the result of the parse operation as the immediately following sibling of the context node.static short
ACTION_INSERT_BEFORE
Insert the result of the parse operation as the immediately preceding sibling of the context node.static short
ACTION_REPLACE
Replace the context node with the result of the parse operation.static short
ACTION_REPLACE_CHILDREN
Replace all the children of the context node with the result of the parse operation. -
Method Summary
Modifier and Type Method Description void
abort()
Abort the loading of the document that is currently being loaded by theLSParser
.boolean
getAsync()
true
if theLSParser
is asynchronous,false
if it is synchronous.boolean
getBusy()
true
if theLSParser
is currently busy loading a document, otherwisefalse
.DOMConfiguration
getDomConfig()
TheDOMConfiguration
object used when parsing an input source.LSParserFilter
getFilter()
When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure.Document
parse(LSInput input)
Parse an XML document from a resource identified by aLSInput
.Document
parseURI(String uri)
Parse an XML document from a location identified by a URI reference [IETF RFC 2396].Node
parseWithContext(LSInput input, Node contextArg, short action)
Parse an XML fragment from a resource identified by aLSInput
and insert the content into an existing document at the position specified with thecontext
andaction
arguments.void
setFilter(LSParserFilter filter)
When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure.
-
Field Details
-
ACTION_APPEND_AS_CHILDREN
static final short ACTION_APPEND_AS_CHILDRENAppend the result of the parse operation as children of the context node. For this action to work, the context node must be anElement
or aDocumentFragment
.- See Also:
- Constant Field Values
-
ACTION_REPLACE_CHILDREN
static final short ACTION_REPLACE_CHILDRENReplace all the children of the context node with the result of the parse operation. For this action to work, the context node must be anElement
, aDocument
, or aDocumentFragment
.- See Also:
- Constant Field Values
-
ACTION_INSERT_BEFORE
static final short ACTION_INSERT_BEFOREInsert the result of the parse operation as the immediately preceding sibling of the context node. For this action to work the context node's parent must be anElement
or aDocumentFragment
.- See Also:
- Constant Field Values
-
ACTION_INSERT_AFTER
static final short ACTION_INSERT_AFTERInsert the result of the parse operation as the immediately following sibling of the context node. For this action to work the context node's parent must be anElement
or aDocumentFragment
.- See Also:
- Constant Field Values
-
ACTION_REPLACE
static final short ACTION_REPLACEReplace the context node with the result of the parse operation. For this action to work, the context node must have a parent, and the parent must be anElement
or aDocumentFragment
.- See Also:
- Constant Field Values
-
-
Method Details
-
getDomConfig
DOMConfiguration getDomConfig()TheDOMConfiguration
object used when parsing an input source. ThisDOMConfiguration
is specific to the parse operation. No parameter values from thisDOMConfiguration
object are passed automatically to theDOMConfiguration
object on theDocument
that is created, or used, by the parse operation. The DOM application is responsible for passing any needed parameter values from thisDOMConfiguration
object to theDOMConfiguration
object referenced by theDocument
object.
In addition to the parameters recognized in on the DOMConfiguration interface defined in [DOM Level 3 Core] , theDOMConfiguration
objects forLSParser
add or modify the following parameters:-
"charset-overrides-xml-encoding"
-
true
- [optional] (default) If a higher level protocol such as HTTP
[IETF RFC 2616] provides an
indication of the character encoding of the input stream being
processed, that will override any encoding specified in the XML
declaration or the Text declaration (see also section 4.3.3,
"Character Encoding in Entities", in [XML 1.0]).
Explicitly setting an encoding in the
LSInput
overrides any encoding from the protocol. false
- [required] The parser ignores any character set encoding information from higher-level protocols.
"disallow-doctype"
-
-
true
- [optional] Throw a fatal "doctype-not-allowed" error if a doctype node is found while parsing the document. This is useful when dealing with things like SOAP envelopes where doctype nodes are not allowed.
false
- [required] (default) Allow doctype nodes in the document.
-
-
"ignore-unknown-character-denormalizations"
-
-
true
- [required] (default) If, while verifying full normalization when [XML 1.1] is supported, a processor encounters characters for which it cannot determine the normalization properties, then the processor will ignore any possible denormalizations caused by these characters. This parameter is ignored for [XML 1.0].
-
false
- [optional] Report an fatal "unknown-character-denormalization" error if a character is encountered for which the processor cannot determine the normalization properties.
-
"infoset"
- See
the definition of
DOMConfiguration
for a description of this parameter. Unlike in [DOM Level 3 Core] , this parameter will default totrue
forLSParser
. "namespaces"
-
true
- [required] (default) Perform the namespace processing as defined in [XML Namespaces] and [XML Namespaces 1.1] .
false
- [optional] Do not perform the namespace processing.
-
"resource-resolver"
- [required] A reference to a
LSResourceResolver
object, or null. If the value of this parameter is not null when an external resource (such as an external XML entity or an XML schema location) is encountered, the implementation will request that theLSResourceResolver
referenced in this parameter resolves the resource. "supported-media-types-only"
-
-
true
- [optional] Check that the media type of the parsed resource is a supported media type. If an unsupported media type is encountered, a fatal error of type "unsupported-media-type" will be raised. The media types defined in [IETF RFC 3023] must always be accepted.
false
- [required] (default) Accept any media type.
-
"validate"
- See the definition of
DOMConfiguration
for a description of this parameter. Unlike in [DOM Level 3 Core] , the processing of the internal subset is always accomplished, even if this parameter is set tofalse
. -
"validate-if-schema"
- See the definition of
DOMConfiguration
for a description of this parameter. Unlike in [DOM Level 3 Core] , the processing of the internal subset is always accomplished, even if this parameter is set tofalse
. -
"well-formed"
- See the definition of
DOMConfiguration
for a description of this parameter. Unlike in [DOM Level 3 Core] , this parameter cannot be set tofalse
.
-
-
getFilter
LSParserFilter getFilter()When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure. The filter can choose to remove elements from the document being constructed, or to terminate the parsing early.
The filter is invoked after the operations requested by theDOMConfiguration
parameters have been applied. For example, if "validate" is set totrue
, the validation is done before invoking the filter. -
setFilter
When a filter is provided, the implementation will call out to the filter as it is constructing the DOM tree structure. The filter can choose to remove elements from the document being constructed, or to terminate the parsing early.
The filter is invoked after the operations requested by theDOMConfiguration
parameters have been applied. For example, if "validate" is set totrue
, the validation is done before invoking the filter. -
getAsync
boolean getAsync()true
if theLSParser
is asynchronous,false
if it is synchronous. -
getBusy
boolean getBusy()true
if theLSParser
is currently busy loading a document, otherwisefalse
. -
parse
Parse an XML document from a resource identified by aLSInput
.- Parameters:
input
- TheLSInput
from which the source of the document is to be read.- Returns:
- If the
LSParser
is a synchronousLSParser
, the newly created and populatedDocument
is returned. If theLSParser
is asynchronous,null
is returned since the document object may not yet be constructed when this method returns. - Throws:
DOMException
- INVALID_STATE_ERR: Raised if theLSParser
'sLSParser.busy
attribute istrue
.LSException
- PARSE_ERR: Raised if theLSParser
was unable to load the XML document. DOM applications should attach aDOMErrorHandler
using the parameter "error-handler" if they wish to get details on the error.
-
parseURI
Parse an XML document from a location identified by a URI reference [IETF RFC 2396]. If the URI contains a fragment identifier (see section 4.1 in [IETF RFC 2396]), the behavior is not defined by this specification, future versions of this specification may define the behavior.- Parameters:
uri
- The location of the XML document to be read.- Returns:
- If the
LSParser
is a synchronousLSParser
, the newly created and populatedDocument
is returned, ornull
if an error occured. If theLSParser
is asynchronous,null
is returned since the document object may not yet be constructed when this method returns. - Throws:
DOMException
- INVALID_STATE_ERR: Raised if theLSParser.busy
attribute istrue
.LSException
- PARSE_ERR: Raised if theLSParser
was unable to load the XML document. DOM applications should attach aDOMErrorHandler
using the parameter "error-handler" if they wish to get details on the error.
-
parseWithContext
Node parseWithContext(LSInput input, Node contextArg, short action) throws DOMException, LSExceptionParse an XML fragment from a resource identified by aLSInput
and insert the content into an existing document at the position specified with thecontext
andaction
arguments. When parsing the input stream, the context node (or its parent, depending on where the result will be inserted) is used for resolving unbound namespace prefixes. The context node'sownerDocument
node (or the node itself if the node of typeDOCUMENT_NODE
) is used to resolve default attributes and entity references.
As the new data is inserted into the document, at least one mutation event is fired per new immediate child or sibling of the context node.
If the context node is aDocument
node and the action isACTION_REPLACE_CHILDREN
, then the document that is passed as the context node will be changed such that itsxmlEncoding
,documentURI
,xmlVersion
,inputEncoding
,xmlStandalone
, and all other such attributes are set to what they would be set to if the input source was parsed usingLSParser.parse()
.
This method is always synchronous, even if theLSParser
is asynchronous (LSParser.async
istrue
).
If an error occurs while parsing, the caller is notified through theErrorHandler
instance associated with the "error-handler" parameter of theDOMConfiguration
.
When callingparseWithContext
, the values of the following configuration parameters will be ignored and their default values will always be used instead: "validate", "validate-if-schema", and "element-content-whitespace". Other parameters will be treated normally, and the parser is expected to call theLSParserFilter
just as if a whole document was parsed.- Parameters:
input
- TheLSInput
from which the source document is to be read. The source document must be an XML fragment, i.e. anything except a complete XML document (except in the case where the context node of typeDOCUMENT_NODE
, and the action isACTION_REPLACE_CHILDREN
), a DOCTYPE (internal subset), entity declaration(s), notation declaration(s), or XML or text declaration(s).contextArg
- The node that is used as the context for the data that is being parsed. This node must be aDocument
node, aDocumentFragment
node, or a node of a type that is allowed as a child of anElement
node, e.g. it cannot be anAttribute
node.action
- This parameter describes which action should be taken between the new set of nodes being inserted and the existing children of the context node. The set of possible actions is defined inACTION_TYPES
above.- Returns:
- Return the node that is the result of the parse operation. If the result is more than one top-level node, the first one is returned.
- Throws:
DOMException
- HIERARCHY_REQUEST_ERR: Raised if the content cannot replace, be inserted before, after, or as a child of the context node (see alsoNode.insertBefore
orNode.replaceChild
in [DOM Level 3 Core] ).
NOT_SUPPORTED_ERR: Raised if theLSParser
doesn't support this method, or if the context node is of typeDocument
and the DOM implementation doesn't support the replacement of theDocumentType
child orElement
child.
NO_MODIFICATION_ALLOWED_ERR: Raised if the context node is a read only node and the content is being appended to its child list, or if the parent node of the context node is read only node and the content is being inserted in its child list.
INVALID_STATE_ERR: Raised if theLSParser.busy
attribute istrue
.LSException
- PARSE_ERR: Raised if theLSParser
was unable to load the XML fragment. DOM applications should attach aDOMErrorHandler
using the parameter "error-handler" if they wish to get details on the error.
-
abort
void abort()Abort the loading of the document that is currently being loaded by theLSParser
. If theLSParser
is currently not busy, a call to this method does nothing.
-