KXML

From Makers Local 256

Latest revision as of 09:10, 26 May 2012

Creator: Opticron
Status: Late Implementation
Born On: 00:49, 22 June 2008 (CDT)
Last Updated: 09:10, 26 May 2012 (CDT)

Overview

This project tracks the development of KXML, a DOM XML parsing library written in D. It was born of the need for an XML parser that relied on Phobos, supported unparsed CDATA nodes, and worked properly with D1.0, since none existed at the time (it supports D2.0 as well via the exact same codebase). The parser operates on strings and attempts to allocate as little memory as possible by using slice references into the original string provided. The parser is loosely based on the Yage XML parser, also written in D. KXML has a completely new parsing engine, but attempts to remain mostly API-compatible with the Yage parser. The parser can currently deal with XML processing instructions, comments, unparsed character data, standard nodes, and self-closing nodes. The code is on my SVN server under kxml (https://pianoben.ch/svn/branches/kxml). There is also a D2/Phobos2 range-oriented version available in libdxml2 (https://pianoben.ch/svn/branches/libdxml2).
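To illustrate what "slice references into the original string" means in D, here is a minimal sketch written for D2. The helper innerText is hypothetical and is not part of KXML; it only shows that a slice returned from a function can point into the caller's buffer instead of copying it.

```d
import std.stdio;

// Hypothetical helper, not part of KXML: returns the text between the first
// '>' and the next '<' as a slice of the input, without copying anything.
string innerText(string xml)
{
    size_t start = 0;
    while (start < xml.length && xml[start] != '>')
        start++;
    if (start == xml.length)
        return xml[0 .. 0]; // no '>' found: return an empty slice
    start++; // skip the '>' itself

    size_t end = start;
    while (end < xml.length && xml[end] != '<')
        end++;
    return xml[start .. end];
}

void main()
{
    string doc = "<foo>hello</foo>";
    string text = innerText(doc);
    writeln(text);

    // The slice starts five characters into the original buffer ("<foo>"),
    // so it shares memory with doc rather than owning a copy.
    assert(text.ptr == doc.ptr + 5);
}
```

The trade-off of this approach is that the original string has to stay alive for as long as any slice into it is in use.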

Usage Examples

Create an XML element with the name "foo".

XmlNode foo = new XmlNode("foo");

Add an attribute with the name "bar" and value "foobar"

foo.setAttribute("bar","foobar");

Get the first child node of foo

XmlNode firstchild = foo.getChildren()[0];

Get the attribute named bar

string barval = foo.getAttribute("bar");

Parse a string for XML

XmlNode foo = readDocument(xmlstring);

Search a node's children for a match to the XPath String

XmlNode[] xpathresults = foo.parseXPath("bar/toast");

Do an XPath search with attribute matching

XmlNode[] xpathresults = foo.parseXPath(`bar[@type="left" and @lol="cats"]/toast`);

Do a search for a toast element arbitrarily deep in the tree whose parent is a bar element

XmlNode[] xpathresults = foo.parseXPath("//bar/toast");

Match on inner XML

XmlNode[] xpathresults = xml.parseXPath(`//td[.="Text 2.3"]`);

Subnode text matching and inequalities

XmlNode[] xpathresults = xml.parseXPath(`//tr[@ab>=9 and th="Head"]/td`);

Quirks

  • The input string may not always be the same as the output string, even if nothing is modified
    • Always outputs XML with double quoted attributes
    • <![CDATA[]]> nodes will be escaped and left as regular, parsed character data

To Do

  • Refactor inheritance as a set of shared interfaces
    • THIS WILL PROBABLY BREAK BACKWARD COMPATIBILITY
    • in Xml(everything in kxml): toString, reset, more?
    • in XmlChild(XmlDoc,XmlNode): parseXPath, addCData, addChild, getChildren, removeChild, getCData, getInnerXML, addChildren
    • in XmlAttribute(XmlNode,XmlPI): removeAttribute, setAttribute, getAttributes, getAttribute, hasAttribute, getName, setName
    • can use casting tests to check types, need to deprecate isXXXXX functions
  • XPath improvements
    • Implement [2]-type (positional) constraints
  • Improve parsing of malformed XML (<? // blah ?>)
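The interface refactor in the list above could look roughly like the following D2 sketch. This is only an illustration of the idea, not the planned design: the interface names come from the to-do list, but the classes DemoNode and DemoPI, the helper canHoldChildren, and the choice of which methods to show are all hypothetical. It shows how a cast to an interface can replace an isXXXXX-style check, since such a cast yields null when the object does not implement the interface.

```d
import std.stdio;

// Hypothetical base interface shared by everything in the library.
interface Xml
{
    void reset();
}

// Hypothetical interface for types that can hold child nodes.
interface XmlChild : Xml
{
    Xml[] getChildren();
}

// Hypothetical interface for types that carry a name and attributes.
interface XmlAttribute : Xml
{
    string getName();
    bool hasAttribute(string key);
}

// Stand-in for a regular node: has both children and attributes.
class DemoNode : XmlChild, XmlAttribute
{
    private string name;
    private string[string] attrs;
    private Xml[] children;

    this(string name) { this.name = name; }
    void reset() { attrs = null; children = null; }
    Xml[] getChildren() { return children; }
    string getName() { return name; }
    bool hasAttribute(string key) { return (key in attrs) !is null; }
}

// Stand-in for a processing instruction: attributes only, no children.
class DemoPI : XmlAttribute
{
    private string name;

    this(string name) { this.name = name; }
    void reset() {}
    string getName() { return name; }
    bool hasAttribute(string key) { return false; }
}

// A casting test: the cast is null unless the object implements XmlChild.
bool canHoldChildren(Object item)
{
    return (cast(XmlChild) item) !is null;
}

void main()
{
    Object node = new DemoNode("foo");
    Object pi = new DemoPI("xml");
    writeln(canHoldChildren(node));
    writeln(canHoldChildren(pi));
}
```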