KXML
Overview
This project tracks the development of KXML, a DOM XML parsing library written in D. It was born of the need for an XML parser at a time when none existed that relied on Phobos, supported unparsed CDATA nodes, and worked properly with D1.0 (it also supports D2.0 from the exact same codebase). The parser operates on strings and attempts to allocate as little memory as possible by using slice references into the original string provided. It is loosely based on the Yage XML parser, also written in D: KXML has a completely new parsing engine but attempts to remain mostly API-compatible with the Yage parser. The parser can currently handle XML processing instructions, comments, unparsed character data, standard nodes, and self-closing nodes. The code is on my SVN server under kxml. There is also a D2/Phobos2 range-oriented version available in libdxml2.
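To illustrate what "slice references into the original string" means in D, here is a minimal sketch written for D2. The helper firstTagName is hypothetical and is not part of KXML; it only shows that a slice returned from a function points into the caller's buffer instead of copying it.

```d
import std.stdio : writeln;

// Hypothetical helper for illustration only; not part of KXML.
// Returns the name of the first tag in src as a slice of src.
string firstTagName(string src)
{
    size_t start = 0;
    while (start < src.length && src[start] != '<')
        ++start;
    if (start == src.length)
        return null; // no tag found
    ++start; // skip the '<'
    size_t end = start;
    while (end < src.length && src[end] != '>' && src[end] != ' ' && src[end] != '/')
        ++end;
    return src[start .. end]; // a slice: no new allocation, no copy
}

void main()
{
    string doc = `<foo bar="foobar"><toast/></foo>`;
    string name = firstTagName(doc);
    writeln(name);
    // The slice starts one character into the original buffer.
    assert(name.ptr == doc.ptr + 1);
}
```

One consequence of this design is that slices keep the original string alive for as long as any of them is referenced.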
Usage Examples
Create an XML element with the name "foo".
XmlNode foo = new XmlNode("foo");
Add an attribute with the name "bar" and value "foobar"
foo.setAttribute("bar","foobar");
Get the first child node of foo
XmlNode firstchild = foo.getChildren()[0];
Get the attribute named bar
string barval = foo.getAttribute("bar");
Parse a string for XML
XmlNode foo = readDocument(xmlstring);
Search a node's children for a match to the XPath String
XmlNode[] xpathresults = foo.parseXPath("bar/toast");
Do an XPath search with attribute matching
XmlNode[] xpathresults = foo.parseXPath(`bar[@type="left" and @lol="cats"]/toast`);
Do a search for a toast element arbitrarily deep in the tree whose parent is a bar element
XmlNode[] xpathresults = foo.parseXPath("//bar/toast");
Match on inner xml
XmlNode[] xpathresults = xml.parseXPath(`//td[.="Text 2.3"]`);
Subnode text matching and inequalities
XmlNode[] xpathresults = xml.parseXPath(`//tr[@ab>=9 and th="Head"]/td`);
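XPath predicates that compare against string values contain double quotes, so the choice of D string literal matters when writing them. The sketch below, written for D2, uses no KXML calls; it only shows that a backtick-delimited (WYSIWYG) literal and an escaped double-quoted literal produce the same string.

```d
import std.stdio : writeln;

void main()
{
    // Backtick (WYSIWYG) literal: inner double quotes need no escaping.
    string raw = `bar[@type="left" and @lol="cats"]/toast`;

    // Equivalent double-quoted literal: each inner quote must be escaped.
    string escaped = "bar[@type=\"left\" and @lol=\"cats\"]/toast";

    assert(raw == escaped);
    writeln(raw);
}
```

The backtick form is usually easier to read for XPath strings, since the expression appears exactly as it will be parsed.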
Quirks
- The output string may not always be the same as the input string, even if nothing is modified
- Attributes are always output with double quotes
- <![CDATA[]]> nodes will be escaped and left as regular, parsed character data
To Do
- Refactor inheritance as a set of shared interfaces
  - THIS WILL PROBABLY BREAK BACKWARD COMPATIBILITY
  - in Xml(everything in kxml): toString, reset, more?
  - in XmlChild(XmlDoc,XmlNode): parseXPath, addCData, addChild, getChildren, removeChild, getCData, getInnerXML, addChildren
  - in XmlAttribute(XmlNode,XmlPI): removeAttribute, setAttribute, getAttributes, getAttribute, hasAttribute, getName, setName
  - can use casting tests to check types, need to deprecate isXXXXX functions
- XPath improvements
  - Implement [2] type constraints
- Improve parsing of malformed xml (<? // blah ?>)
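The "casting tests" mentioned in the refactoring item rely on a D language feature: casting a class reference to a type it does not have yields null. A minimal sketch follows, written for D2. The classes Node, CDataNode, and CommentNode are hypothetical stand-ins and do not reflect KXML's actual class hierarchy.

```d
import std.stdio : writeln;

// Hypothetical stand-ins for illustration; not the real KXML classes.
class Node { }
class CDataNode : Node { }
class CommentNode : Node { }

// Classifies a node using cast tests instead of isXXXXX-style methods.
string describe(Node n)
{
    // A class cast evaluates to null when n is not of the target type.
    if ((cast(CDataNode) n) !is null)
        return "cdata";
    if ((cast(CommentNode) n) !is null)
        return "comment";
    return "node";
}

void main()
{
    Node[] nodes = [new Node, new CDataNode, new CommentNode];
    foreach (n; nodes)
        writeln(describe(n));
}
```

With this approach the type check lives in the language rather than in per-class boolean methods, which is presumably why the isXXXXX functions could be deprecated.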