|
|
|
|
|
doc(self)
Get the document tree from a parser context. |
source code
|
|
|
isValid(self)
Get the validity information from a parser context. |
source code
|
|
|
lineNumbers(self,
linenumbers)
Switch on the generation of line numbers for element nodes. |
source code
|
|
|
loadSubset(self,
loadsubset)
Switch the parser to load the DTD without validating. |
source code
|
|
|
pedantic(self,
pedantic)
Switch the parser to be pedantic. |
source code
|
|
|
replaceEntities(self,
replaceEntities)
Switch the parser to replace entities. |
source code
|
|
|
validate(self,
validate)
Switch the parser to validation mode. |
source code
|
|
|
wellFormed(self)
Get the well formed information from a parser context. |
source code
|
|
|
|
|
|
|
|
|
|
|
|
|
htmlCtxtUseOptions(self,
options)
Applies the options to the parser context |
source code
|
|
|
|
|
htmlParseCharRef(self)
parse Reference declarations [66] CharRef ::= '&#' [0-9]+ ';'
| '&#x' [0-9a-fA-F]+ ';' |
source code
|
|
|
htmlParseChunk(self,
chunk,
size,
terminate)
Parse a Chunk of memory |
source code
|
|
|
htmlParseDocument(self)
parse an HTML document (and build a tree if using the standard SAX
interface). |
source code
|
|
|
htmlParseElement(self)
parse an HTML element, this is highly recursive [39] element ::=
EmptyElemTag | STag content ETag [41] Attribute ::= Name Eq
AttValue |
source code
|
|
|
byteConsumed(self)
This function provides the current index of the parser relative to
the start of the current entity. |
source code
|
|
|
clearParserCtxt(self)
Clear (release owned resources) and reinitialize a parser
context |
source code
|
|
|
|
|
ctxtReadFd(self,
fd,
URL,
encoding,
options)
parse an XML document from a file descriptor and build a tree. |
source code
|
|
|
|
|
|
|
|
|
ctxtResetPush(self,
chunk,
size,
filename,
encoding)
Reset a push parser context |
source code
|
|
|
ctxtUseOptions(self,
options)
Applies the options to the parser context |
source code
|
|
|
initParserCtxt(self)
Initialize a parser context |
source code
|
|
|
parseChunk(self,
chunk,
size,
terminate)
Parse a Chunk of memory |
source code
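A minimal sketch of incremental (push) parsing with parseChunk, assuming the libxml2 Python bindings are installed. The module-level libxml2.createPushParser call and the getRootElement/freeDoc document methods are not listed in this section; the function name parse_in_chunks and the URI "in-memory.xml" are made up for illustration.

```python
try:
    import libxml2  # libxml2 Python bindings; may not be installed
except ImportError:
    libxml2 = None


def parse_in_chunks(chunks):
    """Feed ASCII chunks to a push parser.

    Returns (well_formed, root_element_name), or None when the
    libxml2 bindings are unavailable.
    """
    if libxml2 is None:
        return None
    first, rest = chunks[0], chunks[1:]
    # No SAX handler (None): the default handler builds a document tree.
    ctxt = libxml2.createPushParser(None, first, len(first), "in-memory.xml")
    for chunk in rest:
        ctxt.parseChunk(chunk, len(chunk), 0)  # terminate=0: more data follows
    ctxt.parseChunk("", 0, 1)                  # terminate=1: end of input
    well_formed = bool(ctxt.wellFormed())
    doc = ctxt.doc()
    root_name = doc.getRootElement().name
    doc.freeDoc()                              # the tree is not garbage-collected
    return well_formed, root_name


print(parse_in_chunks(["<a>", "<b/>", "</a>"]))
```

The terminate flag is only set on the final call, so the parser can tell a truncated document from one that is still being delivered.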
|
|
|
|
|
parseExtParsedEnt(self)
parse a general parsed entity. An external general parsed entity is
well-formed if it matches the production labeled extParsedEnt. |
source code
|
|
|
setupParserForBuffer(self,
buffer,
filename)
Set up the parser context to parse a new buffer; clears any prior
contents from the parser context. |
source code
|
|
|
stopParser(self)
Blocks further parser processing |
source code
|
|
|
decodeEntities(self,
len,
what,
end,
end2,
end3)
This function is deprecated; entity content is now always processed
through xmlStringDecodeEntities. TODO: remove it in the next
major release. |
source code
|
|
|
handleEntity(self,
entity)
Default handling of defined entities: when should we define a new
input stream? When do we just handle that as a set of chars?
OBSOLETE: to be removed at some point. |
source code
|
|
|
|
|
namespaceParseNSDef(self)
parse a namespace prefix declaration. TODO: this no longer seems to be
in use; the namespace handling is done on top of the SAX interfaces,
i.e. |
source code
|
|
|
nextChar(self)
Skip to the next input char. |
source code
|
|
|
parseAttValue(self)
parse a value for an attribute
Note: the parser won't do substitution of entities here, this will be
handled later in xmlStringGetNodeList
[10] AttValue ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"
3.3.3 Attribute-Value Normalization: Before the value of an
attribute is passed to the application or checked for
validity, the XML processor must normalize it as follows:
- a character reference is processed by appending the
  referenced character to the attribute value
- an entity reference is processed by recursively processing the
  replacement text of the entity
- a whitespace character (#x20, #xD, #xA, #x9) is processed by
  appending #x20 to the normalized value, except that only a single
  #x20 is appended for a "#xD#xA" sequence that is part of an
  external parsed entity or the literal entity value of an
  internal parsed entity
- other characters are processed by appending them to the
  normalized value
If the declared value is not CDATA, then the XML processor must further
process the normalized attribute value by discarding any
leading and trailing space (#x20) characters, and by
replacing sequences of space (#x20) characters by a single
space (#x20) character. |
source code
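The normalization rules above can be sketched in pure Python. This is an illustration of the quoted rules, not libxml2's implementation: the function name normalize_att_value, the entities mapping, the cdata flag, and the nesting limit of 16 are all assumptions, and the predefined entities are mapped directly to their literal characters for brevity.

```python
import re

# Predefined entities, mapped straight to their characters (a simplification).
PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "apos": "'", "quot": '"'}

# One token per alternative: hex char ref, decimal char ref,
# entity reference, or a whitespace unit ("\r\n" counts as one).
TOKEN = re.compile(
    r"&#x([0-9a-fA-F]+);|&#([0-9]+);|&([^;&\s]+);|(\r\n|[ \t\r\n])"
)


def normalize_att_value(raw, entities=None, cdata=True):
    """Normalize an attribute value following the rules quoted above."""
    table = dict(PREDEFINED)
    table.update(entities or {})

    def expand(text, depth=0):
        if depth > 16:  # arbitrary guard against runaway recursion
            raise ValueError("entity nesting too deep")
        out, pos = [], 0
        for m in TOKEN.finditer(text):
            out.append(text[pos:m.start()])  # other characters: copy as-is
            pos = m.end()
            hex_ref, dec_ref, name, _space = m.groups()
            if hex_ref is not None:
                out.append(chr(int(hex_ref, 16)))  # character reference
            elif dec_ref is not None:
                out.append(chr(int(dec_ref)))
            elif name is not None:
                # entity reference: recursively process the replacement text
                out.append(expand(table[name], depth + 1))
            else:
                out.append(" ")  # whitespace becomes a single #x20
        out.append(text[pos:])
        return "".join(out)

    value = expand(raw)
    if not cdata:
        # non-CDATA: trim and collapse runs of #x20
        value = re.sub(" +", " ", value.strip(" "))
    return value


print(repr(normalize_att_value("a\tb&#x20;&#65;")))
print(repr(normalize_att_value("  x \n y  ", cdata=False)))
```

Note that a character reference such as `&#x20;` is appended as the referenced character, while a literal tab or newline is replaced by a space.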
|
|
|
parseAttributeListDecl(self)
parse the attribute list definition for an element [52] AttlistDecl
::= '<!ATTLIST' S Name AttDef* S? '>' [53] AttDef ::= S Name S
AttType S DefaultDecl |
source code
|
|
|
|
|
|
|
parseCharRef(self)
parse Reference declarations [66] CharRef ::= '&#' [0-9]+ ';'
| '&#x' [0-9a-fA-F]+ ';' [ WFC: Legal Character ] Characters
referred to using character references must match the production for
Char. |
source code
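Production [66] and the Legal Character constraint can be illustrated with a short pure-Python sketch. It does not call libxml2; the names CHAR_REF, is_xml_char, and decode_char_ref are made up for this example, and the character ranges are those of the XML 1.0 Char production.

```python
import re

# [66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")


def is_xml_char(cp):
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def decode_char_ref(text):
    """Decode one character reference, enforcing WFC: Legal Character."""
    m = CHAR_REF.fullmatch(text)
    if m is None:
        raise ValueError("not a character reference: %r" % text)
    hex_digits, dec_digits = m.groups()
    cp = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
    if not is_xml_char(cp):
        raise ValueError("character reference to illegal code point %d" % cp)
    return chr(cp)


print(decode_char_ref("&#65;"), decode_char_ref("&#x41;"))
```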
|
|
|
|
|
parseContent(self)
Parse a content: [43] content ::= (element | CharData | Reference
| CDSect | PI | Comment)* |
source code
|
|
|
parseDocTypeDecl(self)
parse a DOCTYPE declaration [28] doctypedecl ::= '<!DOCTYPE' S
Name (S ExternalID)? S? ('[' (markupdecl | PEReference | S)* ']' S?)?
'>' [ VC: Root Element Type ] The Name in the document type
declaration must match the element type of the root element. |
source code
|
|
|
parseElement(self)
parse an XML element, this is highly recursive [39] element ::=
EmptyElemTag | STag content ETag [ WFC: Element Type Match ] The
Name in an element's end-tag must match the element type in the
start-tag. |
source code
|
|
|
|
|
parseEncName(self)
parse the XML encoding name [81] EncName ::= [A-Za-z]
([A-Za-z0-9._] | '-')* |
source code
|
|
|
parseEncodingDecl(self)
parse the XML encoding declaration [80] EncodingDecl ::= S
'encoding' Eq ('"' EncName '"' | "'" EncName
"'"). This sets up the conversion filters. |
source code
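As an illustration of productions [80] and [81], the following pure-Python sketch matches an encoding declaration and extracts the encoding name. It is independent of libxml2 and does not set up any conversion filters; the names ENC_NAME, ENCODING_DECL, and parse_encoding_decl are hypothetical.

```python
import re

# [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
ENC_NAME = r"[A-Za-z][A-Za-z0-9._-]*"

# [80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'")
# where Eq ::= S? '=' S?
ENCODING_DECL = re.compile(
    r"""[ \t\r\n]+encoding[ \t\r\n]*=[ \t\r\n]*(?:"(%s)"|'(%s)')"""
    % (ENC_NAME, ENC_NAME)
)


def parse_encoding_decl(text):
    """Return the encoding name, or None if text is not an EncodingDecl."""
    m = ENCODING_DECL.match(text)
    if m is None:
        return None
    return m.group(1) or m.group(2)


print(parse_encoding_decl(' encoding="UTF-8"'))
```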
|
|
|
parseEndTag(self)
parse an end of tag [42] ETag ::= '</' Name S? '>' With
namespace [NS 9] ETag ::= '</' QName S? '>' |
source code
|
|
|
parseEntityDecl(self)
parse <!ENTITY declarations [70] EntityDecl ::= GEDecl |
PEDecl [71] GEDecl ::= '<!ENTITY' S Name S EntityDef S? '>'
[72] PEDecl ::= '<!ENTITY' S '%' S Name S PEDef S? '>' [73]
EntityDef ::= EntityValue | (ExternalID NDataDecl?) [74] PEDef ::=
EntityValue | ExternalID [76] NDataDecl ::= S 'NDATA' S Name [ VC:
Notation Declared ] The Name must match the declared name of a
notation. |
source code
|
|
|
parseEntityRef(self)
parse entity reference declarations [68] EntityRef ::= '&'
Name ';' [ WFC: Entity Declared ] In a document without any DTD, a
document with only an internal DTD subset which contains no parameter
entity references, or a document with "standalone='yes'",
the Name given in the entity reference must match that in an entity
declaration, except that well-formed documents need not declare any
of the following entities: amp, lt, gt, apos, quot. |
source code
|
|
|
parseExternalSubset(self,
ExternalID,
SystemID)
parse Markup declarations from an external subset [30] extSubset
::= textDecl? extSubsetDecl [31] extSubsetDecl ::= (markupdecl |
conditionalSect | PEReference | S) * |
source code
|
|
|
parseMarkupDecl(self)
parse Markup declarations [29] markupdecl ::= elementdecl |
AttlistDecl | EntityDecl | NotationDecl | PI | Comment [ VC: Proper
Declaration/PE Nesting ] Parameter-entity replacement text must be
properly nested with markup declarations. |
source code
|
|
|
|
|
|
|
|
|
|
|
parseNotationDecl(self)
parse a notation declaration [82] NotationDecl ::=
'<!NOTATION' S Name S (ExternalID | PublicID) S? '>' Hence
there are actually 3 choices: 'PUBLIC' S PubidLiteral, 'PUBLIC' S
PubidLiteral S SystemLiteral, and 'SYSTEM' S SystemLiteral. See the
NOTE on xmlParseExternalID(). |
source code
|
|
|
parsePEReference(self)
parse PEReference declarations. The entity content is handled
directly by pushing its content as a new input stream. |
source code
|
|
|
|
|
parsePITarget(self)
parse the name of a PI [17] PITarget ::= Name - (('X' | 'x') ('M'
| 'm') ('L' | 'l')) |
source code
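Production [17] excludes any case variant of "xml" from the set of valid PI targets. The sketch below shows that check in pure Python; it uses an ASCII-only approximation of the Name production (an assumption made for brevity, the real production admits many non-ASCII characters), and the names NAME and parse_pi_target are hypothetical.

```python
import re

# ASCII-only approximation of the XML Name production (assumption).
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def parse_pi_target(text):
    """Return the PI target at the start of text, or None if there is no Name.

    [17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))
    """
    m = NAME.match(text)
    if m is None:
        return None
    name = m.group(0)
    if name.lower() == "xml":
        raise ValueError("PI target %r is reserved" % name)
    return name


print(parse_pi_target("xml-stylesheet href='style.css'"))
```

Only the exact three-letter name is excluded, so a target such as xml-stylesheet is accepted.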
|
|
|
parsePubidLiteral(self)
parse an XML public literal [12] PubidLiteral ::= '"'
PubidChar* '"' | "'" (PubidChar - "'")*
"'" |
source code
|
|
|
parseQuotedString(self)
Parse and return a string between quotes or double quotes. TODO:
Deprecated, to be removed at next drop of binary compatibility |
source code
|
|
|
parseReference(self)
parse and handle entity references in content. Depending on the
SAX interface, this may end up in a call to character() if this is a
CharRef or a predefined entity, or if there is no reference()
callback. |
source code
|
|
|
parseSDDecl(self)
parse the XML standalone declaration
[32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'") | ('"' ('yes' | 'no')'"'))
[ VC: Standalone Document Declaration ] TODO
The standalone document declaration must have the value "no" if any
external markup declarations contain declarations of:
- attributes with default values, if elements to which these
  attributes apply appear in the document without specifications of
  values for these attributes, or
- entities (other than amp, lt, gt, apos, quot), if references to
  those entities appear in the document, or
- attributes with values subject to normalization, where the
  attribute appears in the document with a value which will change as
  a result of normalization, or
- element types with element content, if white space occurs directly
  within any instance of those types. |
source code
|
|
|
|
|
parseSystemLiteral(self)
parse an XML Literal [11] SystemLiteral ::= ('"' [^"]*
'"') | ("'" [^']* "'") |
source code
|
|
|
parseTextDecl(self)
parse an XML declaration header for external entities [77]
TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
Question: it seems that EncodingDecl is mandatory. Is that a typo? |
source code
|
|
|
|
|
|
|
parseXMLDecl(self)
parse an XML declaration header [23] XMLDecl ::= '<?xml'
VersionInfo EncodingDecl? SDDecl? S? '?>' |
source code
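Production [23] can be approximated with a regular expression, as in this pure-Python sketch. It is illustrative only and does not reflect how libxml2 parses the declaration; VersionNum is assumed to be '1.' [0-9]+ as in the XML 1.0 specification, and the names XML_DECL and parse_xml_decl are hypothetical.

```python
import re

S = r"[ \t\r\n]"            # one whitespace character
EQ = S + "*=" + S + "*"     # Eq ::= S? '=' S?

# [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
# Each value must be closed by the same quote that opened it.
XML_DECL = re.compile(
    r"<\?xml"
    + S + "+version" + EQ
    + r"(?P<q1>[\"'])(?P<version>1\.[0-9]+)(?P=q1)"
    + "(?:" + S + "+encoding" + EQ
    + r"(?P<q2>[\"'])(?P<encoding>[A-Za-z][A-Za-z0-9._-]*)(?P=q2))?"
    + "(?:" + S + "+standalone" + EQ
    + r"(?P<q3>[\"'])(?P<standalone>yes|no)(?P=q3))?"
    + S + r"*\?>"
)


def parse_xml_decl(text):
    """Return (version, encoding, standalone), or None if there is no XMLDecl."""
    m = XML_DECL.match(text)
    if m is None:
        return None
    return m.group("version"), m.group("encoding"), m.group("standalone")


print(parse_xml_decl('<?xml version="1.0" encoding="UTF-8" standalone="no"?>'))
```

Unlike TextDecl [77], VersionInfo is required here while EncodingDecl is optional, which is why a declaration carrying only an encoding is rejected.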
|
|
|
parserHandlePEReference(self)
[69] PEReference ::= '%' Name ';' [ WFC: No Recursion ] A parsed
entity must not contain a recursive reference to itself, either
directly or indirectly. |
source code
|
|
|
|
|
popInput(self)
xmlPopInput: the current input pointed to by ctxt->input came to
an end; pop it and return the next char. |
source code
|
|
|
scanName(self)
Trickery: parse an XML name without consuming the input flow.
Needed for rollback cases. |
source code
|
|
|
|
|
|
|
|
Inherited from parserCtxtCore: addLocalCatalog, getErrorHandler, setErrorHandler
|