Difference between revisions of "KXML"

Latest revision as of 09:10, 26 May 2012

Creator: Opticron
Status: Late Implementation
Born On: 00:49, 22 June 2008 (CDT)
Last Updated: 09:10, 26 May 2012 (CDT)

Overview

This project tracks the development of KXML, a DOM XML parsing library written in D. It was born of the need for an XML parser that relied on Phobos, supported unparsed CDATA nodes, and worked properly with D1.0, at a time when no such parser existed. It also supports D2.0 via the exact same codebase. The parser operates on strings and attempts to allocate as little memory as possible by using slice references into the original string provided. It is loosely based on the Yage XML parser, also written in D: KXML has a completely new parsing engine but attempts to remain mostly API-compatible with the Yage parser. The parser can currently deal with XML processing instructions, comments, unparsed character data, standard nodes, and self-closing nodes. The code is on my SVN server under kxml (https://pianoben.ch/svn/branches/kxml). There is also a D2/Phobos2 range-oriented version available in libdxml2 (https://pianoben.ch/svn/branches/libdxml2).
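The slice-reference idea can be illustrated with a minimal D2 sketch. This is not KXML's parsing code: `tagName` is a hypothetical helper written only for this example. It shows that a slice of a D string is a view into the original buffer, so extracting a name this way allocates nothing.

```d
import std.stdio : writeln;

// Hypothetical helper, not part of KXML: returns the tag name of a
// leading "<name ...>" as a slice of the input, without copying.
string tagName(string src)
{
    assert(src.length > 1 && src[0] == '<');
    size_t end = 1;
    // Advance until the first character that cannot be part of the name.
    while (end < src.length && src[end] != ' ' && src[end] != '>' && src[end] != '/')
        ++end;
    return src[1 .. end];
}

void main()
{
    string doc = `<foo bar="foobar"/>`;
    string name = tagName(doc);
    assert(name == "foo");
    // The slice points into the original buffer, one character in.
    assert(name.ptr == doc.ptr + 1);
    writeln(name);
}
```

One consequence of this design is that the slices keep the original input string alive for as long as any of them are in use.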

Usage Examples

Create an XML element with the name "foo".

XmlNode foo = new XmlNode("foo");

Add an attribute with the name "bar" and value "foobar"

foo.setAttribute("bar","foobar");

Get the first child node of foo

XmlNode firstchild = foo.getChildren()[0];

Get the attribute named bar

string barval = foo.getAttribute("bar");

Parse a string for XML

XmlNode foo = readDocument(xmlstring);

Search a node's children for a match to the XPath string

XmlNode[] xpathresults = foo.parseXPath("bar/toast");

Do an XPath search with attribute matching (the expression is written as a backtick-quoted string so the inner double quotes need no escaping)

XmlNode[] xpathresults = foo.parseXPath(`bar[@type="left" and @lol="cats"]/toast`);

Do a search for a toast element arbitrarily deep in the tree whose parent is a bar element

XmlNode[] xpathresults = foo.parseXPath("//bar/toast");

Match on inner XML

XmlNode[] xpathresults = xml.parseXPath(`//td[.="Text 2.3"]`);

Subnode text matching and inequalities

XmlNode[] xpathresults = xml.parseXPath(`//tr[@ab>=9 and th="Head"]/td`);

Quirks

The output string may differ from the input string, even if nothing is modified:
- Attributes are always output with double quotes
- <![CDATA[]]> nodes are escaped and left as regular, parsed character data
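To show why escaped CDATA no longer matches the input text, here is a small D2 sketch of what escaping character data typically looks like. It is an illustration only: `escapeText` is a hypothetical helper, and the exact set of entities KXML emits is an assumption, not something stated on this page.

```d
import std.array : replace;
import std.stdio : writeln;

// Hypothetical helper, not KXML's implementation: escapes the characters
// that are significant in parsed character data.
string escapeText(string raw)
{
    // '&' is handled first so the entities added afterwards are not re-escaped.
    return raw.replace("&", "&amp;")
              .replace("<", "&lt;")
              .replace(">", "&gt;");
}

void main()
{
    // Text that would sit inside <![CDATA[ ... ]]> in the input document.
    string cdata = `a < b && c > d`;
    assert(escapeText(cdata) == "a &lt; b &amp;&amp; c &gt; d");
    writeln(escapeText(cdata));
}
```

The escaped form carries the same character data, but it is not byte-for-byte identical to the original CDATA section.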

To Do

Refactor inheritance as a set of shared interfaces
- THIS WILL PROBABLY BREAK BACKWARD COMPATIBILITY
- in Xml(everything in kxml): toString, reset, more?
- in XmlChild(XmlDoc,XmlNode): parseXPath, addCData, addChild, getChildren, removeChild, getCData, getInnerXML, addChildren
- in XmlAttribute(XmlNode,XmlPI): removeAttribute, setAttribute, getAttributes, getAttribute, hasAttribute, getName, setName
- casting tests can be used to check types, so the isXXXXX functions need to be deprecated
XPath improvements
- Implement [2]-style (positional) constraints
Improve parsing of malformed XML (<? // blah ?>)
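The "casting tests" item in the refactoring notes above can be sketched in D2 as follows. The `Node`, `Element`, and `Comment` types are hypothetical stand-ins, not KXML's classes or the planned interfaces. The sketch relies only on the language rule that a class or interface cast yields null when the dynamic type does not match.

```d
import std.stdio : writeln;

// Hypothetical stand-ins, not KXML's actual types.
interface Node { string describe(); }
class Element : Node { string describe() { return "element"; } }
class Comment : Node { string describe() { return "comment"; } }

// Type check done with a cast test instead of a per-type isXXXXX method.
bool isElement(Node n)
{
    // The cast result is null unless n really is an Element.
    return cast(Element) n !is null;
}

void main()
{
    Node a = new Element;
    Node b = new Comment;
    assert(isElement(a));
    assert(!isElement(b));
    writeln(a.describe(), ": ", isElement(a));
    writeln(b.describe(), ": ", isElement(b));
}
```

With this pattern a new node type needs no extra predicate on the base interface, which is presumably what makes the isXXXXX functions redundant.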

@@ Line 1: / Line 1: @@
 {{Project|Creator=Opticron
-|Status=<onlyinclude>Late Development</onlyinclude>                                <!--LEAVE ONLYINCLUDES FOR STATUS HACK-->
+|Status=<onlyinclude>Late Implementation</onlyinclude>                                <!--LEAVE ONLYINCLUDES FOR STATUS HACK-->
 |Born On=00:49, 22 June 2008 (CDT)                                                                  <!--DO NOT EDIT -->
 |Last Updated={{#time: H:i, d F Y| {{REVISIONTIMESTAMP}} }} (CDT)              <!--DO NOT EDIT -->
@@ Line 6: / Line 6: @@
 ==Overview==
-This project tracks the development of the DOM XML parsing library written in D called KXML.  This project was born of the need for an XML parser where none existed that relied on Phobos, supported unparsed cdata nodes, and worked properly with D1.0.  The parser operates on strings and attempts to allocate as little memory as possible by using slice references into the original string provided.  The parser is loosely based on the Yage XML parser, also written in D.  KXML has a completely new parsing engine, but attempts to remain mostly api-compatible with the Yage parser.  The parser can currently deal with XML processing instructions, comments, unparsed character data, standard, and self-closing nodes.  It will ignore node types it doesn't understand.  The code is on the Narrows SVN server under [https://opticron.no-ip.org/svn/branches/kxml kxml].
+This project tracks the development of the DOM XML parsing library written in D called KXML.  This project was born of the need for an XML parser where none existed that relied on Phobos, supported unparsed cdata nodes, and worked properly with D1.0 (it supports D2.0 as well via the exact same codebase).  The parser operates on strings and attempts to allocate as little memory as possible by using slice references into the original string provided.  The parser is loosely based on the Yage XML parser, also written in D.  KXML has a completely new parsing engine, but attempts to remain mostly api-compatible with the Yage parser.  The parser can currently deal with XML processing instructions, comments, unparsed character data, standard, and self-closing nodes.  The code is on my SVN server under [https://pianoben.ch/svn/branches/kxml kxml].  There is also a D2/Phobos2 range-oriented version available in [https://pianoben.ch/svn/branches/libdxml2 libdxml2].
-This library supports basic xpath searches including multiple attribute matching and arbitrarily deep searches.
 ==Usage Examples==
@@ Line 18: / Line 16: @@
   XmlNode firstchild = foo.getChildren()[0];
 Get the attribute named bar
-  char[]barval = foo.getAttribute("bar");
+  string barval = foo.getAttribute("bar");
 Parse a string for XML
-  foo = readDocument(xmlstring);
+  XmlNode foo = readDocument(xmlstring);
 Search a node's children for a match to the XPath String
   XmlNode[]xpathresults = foo.parseXPath("bar/toast");
@@ Line 27: / Line 25: @@
 Do a search for a toast element arbitrarily deep in the tree whose parent is a bar element
   XmlNode[]xpathresults = foo.parseXPath("//bar/toast");
+Match on inner xml
+ XmlNode[]xpathresults = xml.parseXPath(`//td[.="Text 2.3"]`);
+Subnode text matching and inequalities
+ XmlNode[]xpathresults = xml.parseXPath(`//tr[@ab>=9 and th="Head"]/td`);
 ==Quirks==
-* Always outputs XML with double quoted attributes
+* The input string may not always be the same as the output string, even if nothing is modified
-* The input string may not always be the same as the output string
+** Always outputs XML with double quoted attributes
-** Unparsed CData nodes will be escaped and left as regular CData
+** <![CDATA[]]> nodes will be escaped and left as regular, parsed character data
 ==To Do==
-* Implement XPath subnode matching
+* Refactor inheritance as a set of shared interfaces
-* Add an opIndex for child node access (integer)
+** THIS WILL PROBABLY BREAK BACKWARD COMPATIBILITY
-* Add an opIndex for attribute access (string)
+** in Xml(everything in kxml): toString, reset, more?
+** in XmlChild(XmlDoc,XmlNode): parseXPath, addCData, addChild, getChildren, removeChild, getCData, getInnerXML, addChildren
+** in XmlAttribute(XmlNode,XmlPI): removeAttribute, setAttribute, getAttributes, getAttribute, hasAttribute, getName, setName
+** can use casting tests to check types, need to deprecate isXXXXX functions
+* XPath improvements
+** Implement [2] type constraints
+* Improve parsing of malformed xml (<? // blah ?>)
 [[Category:Software]]                                                  <!--MAKE AS MANY CATEGORIES AS YOU NEED-->
