[ALOY-28] Consider using jsdom for HTML parsing and DOM creation
GitHub Issue | n/a |
---|---|
Type | Improvement |
Priority | High |
Status | Closed |
Resolution | Fixed |
Resolution Date | 2012-07-19T11:52:03.000+0000 |
Affected Version/s | 2012 Sprint 14 |
Fix Version/s | 2012 Sprint 14 |
Components | XML |
Labels | n/a |
Reporter | Tony Lukasavage |
Assignee | Tony Lukasavage |
Created | 2012-05-11T08:36:41.000+0000 |
Updated | 2014-06-22T13:21:18.000+0000 |
Description
Right now our HTML parsing is based on grammars written against [peg.js](http://pegjs.majda.cz/), which generates a JavaScript HTML parser. We then run an additional parser on top of that one to actually start creating our DOM and Titanium code. In addition to all that, we also have a custom selector for traversing and modifying the DOM. We wrote the grammars ourselves, and we currently have no test suite to check their validity.
An alternative might be to take advantage of [jsdom](https://github.com/tmpvar/jsdom), a Node.js package that is a full, relatively mature, one-step DOM creator/parser with integrated selectors (it also allows jQuery integration) and a full test suite. It's basically all of the above steps rolled into one, with the test suite already done. It also covers the ins and outs of the DOM far more comprehensively, with specification, benchmarking, and testing for DOM levels 1, 2, and 3.
My limited research so far has not shown exactly how we could export this created DOM from our Node.js build environment to the Titanium environment. Even if that turns out not to be possible, jsdom would still give a much-needed boost in functionality and stability to the parsing currently handled by `grammar/html.pegjs` and `lib/parser/html.js`. We could still run our own custom parser on top of it, like `lib/parser/parser.js`, without the need for creating our own selectors... and doing so in a more standardized format.
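To make the "custom parser on top of a DOM" idea concrete, here is a minimal sketch of a walker that turns a W3C-style element tree into Titanium creation code. It is not the code in `lib/parser/parser.js`. The tag names (`Window`, `Label`, `Button`), the `FACTORIES` table, and the generated variable names are assumptions made up for illustration, and it assumes a case-preserving XML DOM. The walker only reads `nodeType`, `nodeName`, `attributes`, and `childNodes`, so in principle any compliant DOM could feed it.

```javascript
// Hypothetical mapping from markup tag names to Titanium factory calls.
// The tag names are illustrative assumptions, not the project's real schema.
const FACTORIES = {
  Window: 'Ti.UI.createWindow',
  Label: 'Ti.UI.createLabel',
  Button: 'Ti.UI.createButton'
};

const ELEMENT_NODE = 1; // W3C DOM nodeType value for element nodes

// Walk a W3C-style element and return lines of Titanium source code.
// `node` only needs nodeType, nodeName, attributes and childNodes.
function generate(node, parentVar, state) {
  state = state || { count: 0, lines: [] };
  if (node.nodeType !== ELEMENT_NODE) return state.lines; // skip text, comments

  const factory = FACTORIES[node.nodeName];
  if (!factory) throw new Error('Unsupported tag: ' + node.nodeName);

  // Copy attributes into a plain object (attributes is array-like in a DOM).
  const props = {};
  for (let i = 0; i < node.attributes.length; i++) {
    props[node.attributes[i].name] = node.attributes[i].value;
  }

  const varName = 'v' + state.count++;
  state.lines.push(
    'var ' + varName + ' = ' + factory + '(' + JSON.stringify(props) + ');'
  );

  // Children are emitted first, then this view is attached to its parent.
  for (let i = 0; i < node.childNodes.length; i++) {
    generate(node.childNodes[i], varName, state);
  }
  if (parentVar) state.lines.push(parentVar + '.add(' + varName + ');');
  return state.lines;
}

// Usage: plain objects standing in for DOM nodes.
const sample = {
  nodeType: 1,
  nodeName: 'Window',
  attributes: [{ name: 'title', value: 'Home' }],
  childNodes: [
    { nodeType: 3, nodeValue: '\n  ' }, // whitespace text node, ignored
    {
      nodeType: 1,
      nodeName: 'Label',
      attributes: [{ name: 'text', value: 'Hi' }],
      childNodes: []
    }
  ]
};

console.log(generate(sample).join('\n'));
// var v0 = Ti.UI.createWindow({"title":"Home"});
// var v1 = Ti.UI.createLabel({"text":"Hi"});
// v0.add(v1);
```

Because the walker depends only on standard DOM properties, swapping the DOM implementation underneath it should not require changing the code-generation step.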
The goal here is to try and remove the complexity of DOM creation/parsing from our development so that we can focus on how it can be used to create Titanium code. Thoughts?
The selector is not custom to us; the extensions to it were done by the jQuery/Sizzle folks. The DOM we are using worked easily with Sizzle, so it's a true W3C-compliant DOM. Node is good as a script runner and for other tasks, but I would like to keep core functionality independent of it. We can revisit this if the current solution becomes unworkable.
Concerns going forward:
* Without jsdom we don't have a means of using jQuery and other browser DOM reliant libraries
* jsdom offers testing that would take us a long time to create for our solution
* We will likely never reach a level of DOM implementation that jsdom has done and tested already if we develop from scratch.
At least for the foreseeable future, we will continue using the current XML DOM scripts, as they were successfully integrated with Sizzle. The associated ticket links will still need to be resolved, though.