KXML
Latest revision as of 09:10, 26 May 2012
Creator: Opticron
Status: Late Implementation
Born On: 00:49, 22 June 2008 (CDT)
Overview
This project tracks the development of KXML, a DOM XML parsing library written in D. It was born of the need for an XML parser that relied on Phobos, supported unparsed CDATA nodes, and worked properly with D1.0; no such parser existed at the time. The same codebase also supports D2.0. The parser operates on strings and attempts to allocate as little memory as possible by using slice references into the original string provided. It is loosely based on the Yage XML parser, also written in D: KXML has a completely new parsing engine, but attempts to remain mostly API-compatible with the Yage parser. The parser can currently deal with XML processing instructions, comments, unparsed character data, and both standard and self-closing nodes. The code is on my SVN server under kxml (https://pianoben.ch/svn/branches/kxml). There is also a D2/Phobos2 range-oriented version available in libdxml2 (https://pianoben.ch/svn/branches/libdxml2).
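The slice-reference idea from the overview can be illustrated in plain D2 without KXML. This is only a minimal sketch: firstTagName is a hypothetical helper invented for the illustration, not a KXML function, and it ignores attributes, whitespace, and malformed input.

```d
import std.stdio;

// Hypothetical helper, not part of KXML: returns the text between the first
// '<' and the next '>' as a slice of the input, without copying it.
string firstTagName(string src)
{
    size_t open = 0;
    while (open < src.length && src[open] != '<') open++;
    if (open == src.length) return null; // no tag found

    size_t close = open + 1;
    while (close < src.length && src[close] != '>') close++;

    // Slicing creates a new (pointer, length) pair over the same memory.
    return src[open + 1 .. close];
}

void main()
{
    string doc = "<foo>bar</foo>";
    string name = firstTagName(doc);
    writeln(name); // prints: foo

    // The slice points into the original buffer, one character past its start.
    assert(name.ptr == doc.ptr + 1);
}
```

Because the returned string shares memory with the input, the input must stay alive and unmodified for as long as the slice is in use.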
Usage Examples
Create an XML element with the name "foo".
XmlNode foo = new XmlNode("foo");
Add an attribute with the name "bar" and value "foobar"
foo.setAttribute("bar","foobar");
Get the first child node of foo
XmlNode firstchild = foo.getChildren()[0];
Get the attribute named bar
string barval = foo.getAttribute("bar");
Parse a string for XML
XmlNode foo = readDocument(xmlstring);
Search a node's children for a match to the XPath string
XmlNode[] xpathresults = foo.parseXPath("bar/toast");
Do an XPath search with attribute matching (a backtick string avoids escaping the inner double quotes)
XmlNode[] xpathresults = foo.parseXPath(`bar[@type="left" and @lol="cats"]/toast`);
Do a search for a toast element arbitrarily deep in the tree whose parent is a bar element
XmlNode[] xpathresults = foo.parseXPath("//bar/toast");
Match on inner XML
XmlNode[] xpathresults = xml.parseXPath(`//td[.="Text 2.3"]`);
Subnode text matching and inequalities
XmlNode[] xpathresults = xml.parseXPath(`//tr[@ab>=9 and th="Head"]/td`);
Quirks
- The input string may not always be the same as the output string, even if nothing is modified
  - Attributes are always output with double quotes
  - <![CDATA[]]> nodes are escaped and output as regular, parsed character data
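To make the CDATA quirk concrete, here is a rough D2 sketch of what escaping a CDATA body into parsed character data involves. The escapeText function is a hypothetical stand-in written for this illustration; it is not KXML's actual escaping code, and the set of characters KXML escapes may differ.

```d
import std.array : replace;
import std.stdio;

// Hypothetical helper, not KXML's actual code: escape the characters that
// are special in parsed character data. '&' is replaced first so that the
// entities produced by the later replacements are not escaped again.
string escapeText(string raw)
{
    return raw.replace("&", "&amp;")
              .replace("<", "&lt;")
              .replace(">", "&gt;");
}

void main()
{
    // Text that appeared in the input as <![CDATA[a < b && c]]>
    string cdataBody = "a < b && c";
    writeln(escapeText(cdataBody)); // prints: a &lt; b &amp;&amp; c
}
```

The character data is equivalent after a round trip, but the output string no longer contains the CDATA wrapper, which is why input and output can differ.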
To Do
- Refactor inheritance as a set of shared interfaces
  - THIS WILL PROBABLY BREAK BACKWARD COMPATIBILITY
  - in Xml (everything in kxml): toString, reset, more?
  - in XmlChild (XmlDoc, XmlNode): parseXPath, addCData, addChild, getChildren, removeChild, getCData, getInnerXML, addChildren
  - in XmlAttribute (XmlNode, XmlPI): removeAttribute, setAttribute, getAttributes, getAttribute, hasAttribute, getName, setName
  - can use casting tests to check types, need to deprecate isXXXXX functions
- XPath improvements
  - Implement [2] type constraints
- Improve parsing of malformed XML (<? // blah ?>)
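The "casting tests" item in the interface refactoring above relies on a standard D feature: casting an object to an interface it does not implement yields null. The following D2 sketch shows the idea under loud assumptions: the interface names come from the list above, but the method sets are heavily trimmed, the class bodies are placeholders, and canHaveChildren is a hypothetical helper. None of this is the planned KXML code.

```d
import std.stdio;

// Hypothetical, trimmed-down versions of the interfaces named in the to-do list.
interface Xml { void reset(); }
interface XmlChild : Xml { Xml[] getChildren(); }
interface XmlAttribute : Xml { string getName(); }

// Placeholder node type: has both children and a name.
class XmlNode : XmlChild, XmlAttribute
{
    private string name;
    private Xml[] children;
    this(string name) { this.name = name; }
    void reset() { children = null; }
    Xml[] getChildren() { return children; }
    string getName() { return name; }
}

// Placeholder processing instruction: has a name but no children.
class XmlPI : XmlAttribute
{
    private string name;
    this(string name) { this.name = name; }
    void reset() {}
    string getName() { return name; }
}

// A cast to an interface the object does not implement returns null,
// so the cast can replace an isXXXXX-style type check.
bool canHaveChildren(Xml item)
{
    return (cast(XmlChild) item) !is null;
}

void main()
{
    Xml node = new XmlNode("foo");
    Xml pi = new XmlPI("xml");
    writeln(canHaveChildren(node)); // prints: true
    writeln(canHaveChildren(pi));   // prints: false
}
```

With this approach each capability is expressed once as an interface, and callers test for the capability they need instead of asking which concrete class they were handed.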