REBOL [
Title: "A more XML 1.0 compliant set of XML parsing tools."
File: %xml-parse.r
Date: 1-jul-2009
Version: 0.7.6
Author: "Gavin F. McKenzie"
Email: brianwisti@yahoo.com
Purpose: {
REBOL's built-in parse-xml function lacks a number of
XML 1.0 compliant features, including:
- support for CDATA sections
- support for XML Namespaces
- exposure of the internal DTD subset
The intent of this script is to create an XML parser
that can operate either via an event/callback mechanism,
or produce a block structure similar to REBOL's built-in
parse-xml function.
This XML parser is designed to call-back into a
'parse-handler' object that has been designed to be similar
to the well-known XML parsing interface known as "SAX"
(Simple API for XML) by David Megginson.
For more information on SAX, see David's website at:
http://www.megginson.com/SAX/index.html
Several parse-handlers are included here:
- the base 'class' xml-parse-handler that contains only empty
callback stubs
- the debugging aid echo-handler that prints out the callback
event stream
- the block-handler that produces a superset of the block structure
created by REBOL's built-in parse-xml function
Alternatively, you may choose to build your own parse
event handler rather than use the functionality provided
here.
A new function is defined "parse-xml+" that represents
the enhanced counterpart to the built-in REBOL parse-xml.
Additional features provided by this parser:
1. Document prolog information
The built-in REBOL parse-xml function returns a set of
nested blocks where the first two items in the outermost
block are the words document and none, such as:
[document none [...]]
The parse-xml+ function provided herein can produce a
set of nested blocks where the second item of the outermost
block is used to represent prolog and document type
information.
An example of this block is:
[ version "1.0" encoding "utf-8" standalone "yes"
doctype none pubid none sysid none subset none
]
2. CDATA Section Processing
XML provides for enclosing data content within CDATA
sections for the convenience of avoiding the need to
escape certain XML sensitive characters in the data
such as the ampersand (&) and less-than-sign (<).
An example of a CDATA section:
abc xyz
A compliant XML parser would report that the content
of element 'foo' is "abc Jack & Jill xyz".
CDATA sections are also useful when putting text samples
of XML within the content of an XML document.
bar]]>
Here the value of element 'example' is the text
"bar"
3. Comments
This parser provides the opportunity to process
comments embedded within the XML.
4. Processing Instructions
This parser provides the opportunity to process
processing instructions embedded within the XML.
5. Automatic Character Entity Expansion
In XML document it is common to encounter "character
entities" within the content of the document. These
entities are the means for escaping sensitive XML
characters so that the character will be processed as
data rather than markup. The most common characters
that are subjected to this treatment are the
ampersand (&) and less-than-sign (<).
This parser recognizes these common entities and
automatically converts them to their character
equivalents.
For example:
Jack & Jill
This parser will automatically replace the &
character entity reference to the ampersand (&)
character; hence, the value of element 'foo' is
"Jack & Jill".
Character entities can also be encoded with their
Unicode numeric equivalent rather than the symbolic
name in either decimal or hex form, such as:
Ampersands: &&&
The value of element 'foo' is "Ampersands: &&&".
6. Namespace Processing
Namespace processing is vital to handling real-world
XML.
@@TBD: say more here
}
History: [
0.7.4 { Fixed a defect to allow optional space around
the '=' on an attribute.
Thanks to Brett Handley for reporting the defect.}
0.7.3 { Fixed bug where attr-ns-prefix wasn't getting cleared
when processing an un-prefixed attribute.}
0.7.2 { Changed the start-document in the block-handler
to perform a copy/deep, fixing a bug that occurred
on successive invocations of parse-xml+.}
0.7.1 { First public release. }
]
Acknowledgements: {
Gavin F. MacKenzie wrote the original releases of this file,
so it just plain wouldn't have happened without him. I hope he
knows we're grateful and we hope he's doing well - wherever he's
disappeared to!
}
]
; TO DO
; - ** WARNING ** Namespace processing is not ready for primetime!
; - anything that uses xmlQuote is wrong; there is the potential for
; uncaught mismatched quotes
; - do some start/end-tag matching and error-checking
; - process entities defined in the internal DTD subset
; - add comments, comments, comments!!
xml-parse: make object! [
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; XML PARSE CONTENT HANDLERS
;
; This XML parser is designed to call-back into a 'parse-handler' object
; that has been designed to be similar to the well-known XML parsing
; interface known as "SAX" (Simple API for XML) by David Megginson.
;
; For more information on SAX, see David's website at:
; http://www.megginson.com/SAX/index.html
;
; Several parse-handlers are included here:
; - the base 'class' xml-parse-handler that contains only empty
; callback stubs
; - the debugging aid echo-handler that prints out the callback
; event stream
; - the block-handler that produces a superset of the block structure
; created by REBOL's built-in parse-xml function
;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; This is an 'empty' xml-parse-handler that is provided as a base-object
; for extending into custom xml-parse-handlers.
;
xml-parse-handler: make object! [
start-document: func [
][
]
xml-decl: func [
version-info [string! none!]
encoding [string! none!]
standalone [string! none!]
][
]
document-type: func [
document-type [string!]
public-id [string! none!]
system-id [string! none!]
internal-subset [string! none!]
][
]
start-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
attr-list [block!]
][
]
end-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
][
]
characters: func [
characters [string! none!]
][
]
pi: func [
pi-target [string! none!]
pi [string! none!]
][
]
comment: func [
comment [string! none!]
][
]
end-document: func [] [
]
start-prefix-mapping: func [
ns-prefix-uri-pairs [block!]
][
]
end-prefix-mapping: func [
ns-prefix-uri-pairs [block!]
][
]
get-parse-result: func [{
This function can be used to return a specific result from
the parse operation, such as returning the parse XML as a
series of blocks similar to REBOL's built-in parse-xml.
By default, returning none from this function will cause the
return code from the REBOL parse function to be passed back
to the caller of the parse.}
][
none
]
]
;
; This xml-parse-handler simply echoes the parsing to the console.
; This was primarily useful as a debugging aid during the development
; of the XML parse production rules.
;
echo-handler: make xml-parse-handler [
start-document: func [
][
print remold ['start-doc]
]
xml-decl: func [
version-info [string! none!]
encoding [string! none!]
standalone [string! none!]
][
print remold ['xml-decl 'version-info version-info
'encoding encoding 'standalone standalone
]
]
document-type: func [
document-type [string!]
public-id [string! none!]
system-id [string! none!]
internal-subset [string! none!]
][
print remold ['doc-type document-type
'public-id public-id
'system-id system-id
'internal-subset internal-subset
]
]
start-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
attr-list [block!]
][
print remold ['start-elem ns-uri local-name q-name
'attr-list attr-list
]
]
end-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
][
print remold ['end-elem ns-uri local-name q-name]
]
characters: func [
characters [string! none!]
][
print remold ['characters characters]
]
pi: func [
pi-target [string! none!]
pi [string! none!]
][
print remold ['pi pi-target pi]
]
comment: func [
comment [string! none!]
][
print remold ['comment comment]
]
end-document: func [
][
print remold ['end-doc]
]
start-prefix-mapping: func [
ns-prefix-uri-pairs [block!]
][
print remold ['start-prefix ns-prefix-uri-pairs]
]
end-prefix-mapping: func [
ns-prefix-uri-pairs [block!]
][
print remold ['end-prefix ns-prefix-uri-pairs]
]
]
;
; This xml-parse-handler produces a set of nested blocks representing
; the parsed XML content. The blocks structure is a compatible superset
; of the block structure produced by REBOL's built-in parse-xml function.
; Extensions to the structure are appended to any applicable block;
; hence, existing code intended for use with REBOL's existing post-parse
; block structure should continue to work.
;
block-handler: make xml-parse-handler [
xml-doc: copy []
xml-block: copy []
xml-content: copy ""
start-document: func [
][
;
; Seed the document
;
xml-block: reduce copy/deep ['document [version none
encoding none
standalone none
doctype none
pubid none
sysid none
subset none
]
none
]
]
xml-decl: func [
version-info [string! none!]
encoding [string! none!]
standalone [string! none!]
][
change next (find xml-block/2 'version) version-info
change next (find xml-block/2 'encoding) encoding
change next (find xml-block/2 'standalone) standalone
]
document-type: func [
document-type [string!]
public-id [string! none!]
system-id [string! none!]
internal-subset [string! none!]
][
change next (find xml-block/2 'doctype) document-type
change next (find xml-block/2 'pubid) public-id
change next (find xml-block/2 'sysid) system-id
change next (find xml-block/2 'subset) internal-subset
]
start-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
attr-list [block!]
][
;
; Is there any pending content to add before
; we start a new element?
;
if not empty? xml-content [
add-child copy xml-content
clear head xml-content
]
insert/only tail xml-doc xml-block
xml-block: add-child copy reduce [local-name none none]
;
; Add the attribute list
;
if not empty? attr-list [
xml-block/2: copy attr-list
]
]
characters: func [
characters [string! none!]
][
;
; Accumulate more character data
;
if not none? characters [
append xml-content characters
]
]
end-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
][
;
; Is there any pending content to add before
; we terminate this element?
;
if not empty? xml-content [
add-child copy xml-content
clear head xml-content
]
;
; Basic well-formedness check
;
; while [q-name <> first xml-block] [
; if empty? xml-doc [
; print ["End tag error:" q-name]
; halt
; ]
; pop-xml-block
; ]
pop-xml-block
]
add-child: func [child] [
if none? third xml-block [xml-block/3: make block! 1]
insert/only tail third xml-block child
child
]
pop-xml-block: func [] [
xml-block: last xml-doc
remove back tail xml-doc
]
get-parse-result: func [] [
xml-block
]
]
;
; This xml-parse-handler enhances the block-handler with namespace
; processing. It should only be used with a parser that has been
; set to namespace-aware true.
;
ns-block-handler: make xml-parse-handler [
xml-doc: copy []
xml-block: copy []
xml-content: copy ""
nsinfo-stack: copy []
start-document: func [
][
;
; Seed the document
;
xml-block: reduce copy/deep ['document [version none
encoding none
standalone none
doctype none
pubid none
sysid none
subset none
]
none
]
]
xml-decl: func [
version-info [string! none!]
encoding [string! none!]
standalone [string! none!]
][
change next (find xml-block/2 'version) version-info
change next (find xml-block/2 'encoding) encoding
change next (find xml-block/2 'standalone) standalone
]
document-type: func [
document-type [string!]
public-id [string! none!]
system-id [string! none!]
internal-subset [string! none!]
][
change next (find xml-block/2 'doctype) document-type
change next (find xml-block/2 'pubid) public-id
change next (find xml-block/2 'sysid) system-id
change next (find xml-block/2 'subset) internal-subset
]
start-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
attr-list [block!]
][
;
; Is there any pending content to add before
; we start a new element?
;
if not empty? xml-content [
add-child copy xml-content
clear head xml-content
]
insert/only tail xml-doc xml-block
xml-block: add-child copy reduce [local-name
none
none
ns-uri
]
;
; Add the attribute list
;
if not empty? attr-list [
xml-block/2: copy attr-list
]
]
characters: func [
characters [string! none!]
][
;
; Accumulate more character data
;
if not none? characters [
append xml-content characters
]
]
end-element: func [
ns-uri [string! none!]
local-name [string! none!]
q-name [string!]
][
;
; Is there any pending content to add before
; we terminate this element?
;
if not empty? xml-content [
add-child copy xml-content
clear head xml-content
]
;
; Basic well-formedness check
;
; while [q-name <> first xml-block] [
; if empty? xml-doc [
; print ["End tag error:" q-name]
; halt
; ]
; pop-xml-block
; ]
pop-xml-block
]
start-prefix-mapping: func [
ns-prefix-uri-pairs [block!]
][
insert/only nsinfo-stack ns-prefix-uri-pairs
]
end-prefix-mapping: func [
ns-prefix-uri-pairs [block!]
][
remove nsinfo-stack
]
add-child: func [child] [
if none? third xml-block [xml-block/3: make block! 1]
insert/only tail third xml-block child
child
]
pop-xml-block: func [] [
xml-block: last xml-doc
remove back tail xml-doc
]
get-parse-result: func [] [
xml-block
]
]
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; ACTUAL XML PARSER OBJECT
;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
parser: make object! [
element-q-name: none
element-local-name: none
document-type: none
system-id: none
public-id: none
internal-subset: none
encoding: none
characters: none
entity-ref: none
char-ref-value: none
comment: none
pi-target: none
pi: none
version-info: none
encoding: none
standalone: none
attr-name: none
attr-value: none
attr-list: copy []
attr-ns-prefix: none
element-ns-prefix: none
ns-uri: none
nsinfo-stack: copy []
nsinfo: copy []
;
; Set a default xml-parse-handler
;
handler: block-handler
;
; Namespace processing, set true to process namespaces
;
namespace-aware: no
;
; XML 1.0 Production Rules
;
xmlLetter: charset [#"A" - #"Z" #"a" - #"z"]
xmlDigit: charset [#"0" - #"9"]
xmlHexDigit: charset [#"0" - #"9" #"a" - #"f" #"A" - #"F"]
xmlAlpha: charset [#"A" - #"Z" #"a" - #"z"]
xmlAlphaNum: charset [#"0" - #"9" #"A" - #"Z" #"a" - #"z"]
xmlQuote: charset [#"^"" #"^'"]
xmlSpace: charset [#"^(20)" #"^(09)" #"^(0D)" #"^(0A)"]
xmlNotMarkupChar: complement charset [#"<" #"&"] ;fix this
xmlS: [some xmlSpace]
xmlEq: [[any xmlSpace] "=" [any xmlSpace]]
xmlChar: [any [xmlAlphaNum | xmlSpace]]
xmlCharData: [ copy characters
some xmlNotMarkupChar
(handler/characters characters)
]
xmlNameProd: [[xmlLetter | #"_" | #":"] any xmlNameChar]
xml10Name: xmlNameProd
xmlName: xml10Name
xmlNameChar: [ xmlLetter | xmlDigit |
#"." | #"-" | #"_" | #":"
]
xmlNames: [xmlName any [xmlS xmlName]]
xmlNMToken: [some xmlNameChar]
xmlNMTokens: [xmlNMToken any [xmlS xmlNMToken]]
xmlMisc: [xmlComment | xmlPI | xmlS]
xmlPERef: [#"%" xmlNameProd #"%"]
xmlEntIntro: charset [#"^^" #"^%" #"^&" #"^""]
xmlEntityVal: [#"^"" any [xmlEntIntro | xmlPERef] #"^""]
xmlContent: [any [ xmlElement | xmlComment | xmlPI |
xmlCDSect | xmlCharData | xmlReference
]
] ;fix this
xmlAttValueStrict: [ [#"^"" copy attr-value to #"^"" #"^""] |
[#"'" copy attr-value to #"'" #"'"]
] ; fix this
xmlAttValue: xmlAttValueStrict
xmlAttType: [ xmlStringType |
xmlTokenizedType |
xmlEnumeratedType
]
xml10AttrStrict: [ copy attr-name xmlName
xmlEq
xmlAttValue
(append append
attr-list attr-name attr-value
)
]
xmlAttribute: xml10AttrStrict
xmlSTag: [ #"<"
copy element-q-name xmlName
[ (clear head attr-list clear head nsinfo)
any [xmlS xmlAttribute]
]
[any xmlSpace]
#">"
(either namespace-aware [
handler/start-prefix-mapping nsinfo
insert/only nsinfo-stack copy nsinfo
handler/start-element
ns-uri
element-local-name
element-q-name
attr-list
][
handler/start-element
none
element-q-name
element-q-name
attr-list
]
)
]
xmlETag: [ ""
copy element-q-name xmlName
[any xmlSpace]
#">"
(either namespace-aware [
handler/end-element
ns-uri
element-local-name
element-q-name
handler/end-prefix-mapping
first nsinfo-stack
remove nsinfo-stack
][
handler/end-element
none
element-q-name
element-q-name
]
)
]
xmlEmptyElemTag: [ #"<"
copy element-q-name xmlName
[ (clear head attr-list clear head nsinfo)
any [xmlS xmlAttribute]
]
[any xmlSpace]
"/>"
(either namespace-aware [
handler/start-prefix-mapping nsinfo
insert/only nsinfo-stack copy nsinfo
handler/start-element
ns-uri
element-local-name
element-q-name
attr-list
][
handler/start-element
none
element-q-name
element-q-name
attr-list
]
characters: none
handler/characters characters
either namespace-aware [
handler/end-element
ns-uri
element-local-name
element-q-name
handler/end-prefix-mapping
first nsinfo-stack
remove nsinfo-stack
][
handler/end-element
none
element-q-name
element-q-name
]
)
]
xmlEmptyElem: [ #"<"
copy element-q-name xmlName
[ (clear head attr-list clear head nsinfo)
any [xmlS xmlAttribute]
]
[any xmlSpace]
">"
xmlName
[any xmlSpace]
#">"
(either namespace-aware [
handler/start-prefix-mapping nsinfo
insert/only nsinfo-stack copy nsinfo
handler/start-element
ns-uri
element-local-name
element-q-name
attr-list
][
handler/start-element
none
element-q-name
element-q-name
attr-list
]
characters: none
handler/characters characters
either namespace-aware [
handler/end-element
ns-uri
element-local-name
element-q-name
handler/end-prefix-mapping
first nsinfo-stack
remove nsinfo-stack
][
handler/end-element
none
element-q-name
element-q-name
]
)
]
xmlElementStrict: [ xmlEmptyElemTag |
xmlEmptyElem |
[xmlSTag opt xmlContent xmlETag]
]
xmlElement: xmlElementStrict
xmlPI: [ ""
copy pi-target xmlPITarget
copy pi to "?>" "?>"
(handler/pi pi-target pi)
]
xmlPITarget: [xmlNameProd]
xmlComment: [ "" "-->"
(handler/comment comment)
]
xmlDecl: [ (version-info: encoding: standalone: none)
""
(handler/xml-decl
version-info
encoding
standalone
)
]
xmlVersionInfo: [ "version"
xmlEq
[ #"^"" copy version-info xmlVersionNum "^"" |
#"^'" copy version-info xmlVersionNum "^'"
]
]
xmlVersionNum: [some [xmlVersionNumChars | "-"]]
xmlVersionNumChars: charset [ #"0" - #"9"
#"A" - #"Z"
#"a" - #"z"
"_.:"
]
xmlEncodingDecl: [ (encoding: none)
"encoding"
xmlEq
[ #"^"" copy encoding xmlEncName "^"" |
#"^'" copy encoding xmlEncName "^'"
]
]
xmlEncName: [xmlAlpha any [xmlEncNameChars]]
xmlEncNameChars: charset [ #"0" - #"9"
#"A" - #"Z"
#"a" - #"z"
"_.-"
]
xmlDocument: [ (handler/start-document)
xmlProlog
xmlElement
any xmlMisc
(handler/end-document)
]
xmlProlog: [ opt xmlDecl
any xmlMisc
opt [xmlDocTypeDecl any xmlMisc]
]
xmlDocTypeDecl: [ ""
(handler/document-type
document-type
public-id
system-id
internal-subset
)
]
xmlSDDecl: [ (standalone: none)
"standalone"
xmlEq
[ xmlQuote
copy standalone ["yes" | "no"]
xmlQuote
]
]
xmlStringType: "CDATA"
xmlTokenizedType: [ "ID" | "IDREF" | "IDREFS" |
"ENTITY" | "ENTITIES" |
"NMTOKEN" | "NMTOKENS"
]
xmlEnumeratedType: [] ; fix this
xmlReference: [xmlCharRef | xmlEntityRef]
xmlEntityRef: [ ["&" copy entity-ref xmlNameProd ";"]
( char-ref-value:
convert-character-entity entity-ref
either none? char-ref-value [
;
; couldn't convert the
; chararacter-entity, so pass
; it through as character data,
; unconverted
;
handler/characters
rejoin ["&" entity-ref ";"]
][
;
; converted the chararacter-entity
; to a character
;
handler/characters char-ref-value
]
)
]
xmlCharRef: [ [ [ "&"
[copy entity-ref
["#" some xmlDigit]
]
";"
] |
[ "&"
[copy entity-ref
["#x" some xmlHexDigit]
]
";"
]
]
( char-ref-value:
convert-character-entity entity-ref
either none? char-ref-value [
;
; couldn't convert the
; chararacter-entity, so pass
; it through as character data,
; unconverted
;
handler/characters
rejoin ["&" entity-ref ";"]
][
;
; converted the chararacter-entity
; to a character
;
handler/characters char-ref-value
]
)
]
xmlExternalID: [ ["SYSTEM" xmlSpace xmlSystemLiteral] |
["PUBLIC" xmlSpace xmlPubIDLiteral
xmlSpace xmlSystemLiteral
]
]
xmlSystemLiteral: [ [#"^"" copy system-id to #"^"" #"^""] |
[#"'" copy system-id to #"'" #"'"]
]
xmlPubIDLiteral: [ [#"^"" copy public-id to #"^"" #"^""] |
[#"'" copy public-id to #"'" #"'"]
]
xmlNDataDecl: [xmlS "NDATA" xmlS xmlNameProd]
xmlCDSect: [ ""
"]]>"
(handler/characters characters)
]
; XML Namespace-Specific Production Rules
;
xmlNSAttribute: [ [ copy attr-name xmlPrefixedAttName
xmlEq xmlAttValue
(ns-uri: copy attr-value
append nsinfo
reduce [attr-ns-prefix
attr-value
]
)
] |
[ copy attr-name xmlDefaultAttName
xmlEq
xmlAttValue
(ns-uri: copy attr-value
append nsinfo
reduce [attr-ns-prefix
attr-value
]
)
] |
[ xmlAQName
xmlEq
xmlAttValue
(append attr-list
reduce [attr-name
attr-value
attr-ns-prefix
]
)
]
]
xmlPrefixedAttName: ["xmlns:" copy attr-ns-prefix xmlNCName]
xmlDefaultAttName: ["xmlns" (attr-ns-prefix: none)]
xmlNCName: [[xmlLetter | #"_"] any xmlNCNameChar]
xmlNCNameChar: [xmlLetter | xmlDigit | #"." | #"-" | #"_"]
xmlAQName: [ [ copy attr-ns-prefix xmlNCName
#":"
copy attr-name xmlNCName
] |
[ copy attr-name xml10Name
(attr-ns-prefix: none)
]
]
xmlQName: [ [ copy element-ns-prefix xmlNCName
#":"
copy element-local-name xmlNCName
(element-q-name: copy rejoin
[element-ns-prefix
":"
element-local-name
]
)
] |
[ copy element-local-name xml10Name
(element-q-name: element-ns-prefix: "")
]
]
;
;
; Private XML Parser Methods
;
convert-character-entity: func [{
Accepts the name reference portion of an entity
reference and attempts to return a string containing
the actual character referenced by the entity.
If the conversion is not successful, the value of
none is returned.
For example, for the ampersand character this function
could accept a entity-ref parameter of either "amp",
"#38" or "#x26".
}
entity-ref [string!]
][
switch/default entity-ref [
"lt" [ return "<" ]
"gt" [ return ">" ]
"amp" [ return "&" ]
"quot" [ return "^"" ]
"apos" [ return "'" ]
][
either (first entity-ref) = #"#" [
either (second entity-ref) = #"x" [
to-string to-char to-integer to-issue
skip entity-ref 2
][
to-string to-char to-integer
skip entity-ref 1
]
][
none
]
]
]
;
;
; Public XML Parser Methods
;
parse-xml: func [{
Parses XML code and executes an associated event handler
during processing.
This is a more XML 1.0 compliant parse than the built-in
REBOL parse-xml function.
}
xml-string [string!]
/local parse-result
][
;
; Parse the document and capture the return code from the REBOL
; parse.
;
parse-result: parse/case/all xml-string xmlDocument
;
; If the handler doesn't return a specific parse result, then
; return the parse-result we obtained from the REBOL parse.
;
either handler/get-parse-result = none [
parse-result
][
handler/get-parse-result
]
]
set-parse-handler: func [
arg-handler [object!]
][
handler: arg-handler
]
get-parse-handler: does [
handler
]
set-namespace-aware: func [{
This function enables the namespace processing
of the parser. As a result, the parser will
process xmlns attributes and namespace prefixes.
The parse-handlers will receive additional
namespace specific information.
}
arg-namespace-aware [logic!]
][
namespace-aware: arg-namespace-aware
either arg-namespace-aware [
xmlName: xmlQName
xmlAttribute: xmlNSAttribute
][
xmlName: xml10Name
xmlAttribute: xml10AttrStrict
]
namespace-aware
]
get-namespace-aware: does [
namespace-aware
]
]
]
parse-xml+: func [{
Parses XML code and returns a tree of blocks.
This is a more XML 1.0 compliant parse than the built-in
REBOL parse-xml function.
}
code [string!] "XML code to parse"
][
xml-parse/parser/parse-xml code
]
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;
; Some samples...
;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
comment {
TBD
}