Opened 12 years ago
Last modified 8 years ago
#139 assigned new-feature
xml parser -- revised implementation idea
Reported by: | Fred T. Hamster | Owned by: | bugdock |
---|---|---|---|
Priority: | minor | Milestone: | |
Component: | feistymeow-nucleus | Version: | |
Keywords: | | Cc: | |
Description
make this a non-tree parser that just lives in textual.
it should understand how to read a tag, how to read a content chunk,
how to strip spaces and other junk from both ends,
and how to turn attributes into a string table.
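a minimal sketch of those primitives, in python purely for illustration. the names eat_tag, eat_content and parse_attributes are hypothetical (not existing feistymeow code), a plain dict stands in for the string table, and the sketch ignores comments, CDATA and '>' inside attribute values.

```python
import re

# name = "value" or name = 'value'; group 3 or 4 holds the unquoted value.
_ATTR = re.compile(r'([^\s=]+)\s*=\s*("([^"]*)"|\'([^\']*)\')')

def parse_attributes(text):
    """Turn the attribute part of a tag into a dict (the 'string table')."""
    table = {}
    for m in _ATTR.finditer(text):
        value = m.group(3) if m.group(3) is not None else m.group(4)
        table[m.group(1)] = value
    return table

def eat_tag(text, pos):
    """Read one tag starting at text[pos] == '<'.

    Returns (kind, name, attributes, new_pos), where kind is
    'open', 'close' or 'empty'.
    """
    if pos >= len(text) or text[pos] != '<':
        raise ValueError("expected '<' at position %d" % pos)
    end = text.find('>', pos)
    if end < 0:
        raise ValueError("unterminated tag at position %d" % pos)
    body = text[pos + 1:end].strip()
    if body.startswith('/'):
        return 'close', body[1:].strip(), {}, end + 1
    kind = 'open'
    if body.endswith('/'):
        kind = 'empty'
        body = body[:-1].strip()
    parts = body.split(None, 1)
    name = parts[0] if parts else ''
    attrs = parse_attributes(parts[1]) if len(parts) > 1 else {}
    return kind, name, attrs, end + 1

def eat_content(text, pos):
    """Read text up to the next '<' (or end of input), stripped at both ends."""
    end = text.find('<', pos)
    if end < 0:
        end = len(text)
    return text[pos:end].strip(), end
```

each function takes a position and hands back the new position, so the caller decides what to eat next.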
then we can do it like this:
provide the methods that do the actual parsing, such as "eat tag" and
"eat content".
drive them either via a default method that walks the input and invokes
a callback for each syntax element,
OR
allow the user to drive them as they see fit.
the latter gives us goal-directed parsing that each lightlink object
can use to pull in its own xml.
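a rough sketch of the two ways of driving the parser, in python for illustration only. TextualParser, parse_all, expect_tag and read_simple are hypothetical names, and the tokenizer is deliberately crude (no comments, CDATA or attribute handling).

```python
import re

# One tag, or one run of text up to the next tag.
_TOKEN = re.compile(r'<[^>]*>|[^<]+')

class TextualParser:
    """Cursor over the xml text; each call eats one syntax element."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def at_end(self):
        return self.pos >= len(self.text)

    def next_element(self):
        """Return ('tag', inner) or ('content', text), or None at end of input."""
        if self.at_end():
            return None
        m = _TOKEN.match(self.text, self.pos)
        if m is None:
            raise ValueError("unterminated tag at position %d" % self.pos)
        self.pos = m.end()
        chunk = m.group(0)
        if chunk.startswith('<'):
            return ('tag', chunk[1:-1].strip())
        return ('content', chunk.strip())

def parse_all(text, callback):
    """Default driver: invoke callback(kind, value) for each syntax element."""
    parser = TextualParser(text)
    while True:
        element = parser.next_element()
        if element is None:
            break
        if element == ('content', ''):
            continue  # whitespace-only gap between tags
        callback(*element)

def expect_tag(parser, name):
    """User-driven helper: skip blank content, then insist on a given tag."""
    element = parser.next_element()
    while element == ('content', ''):
        element = parser.next_element()
    if element != ('tag', name):
        raise ValueError("expected <%s>, got %r" % (name, element))

def read_simple(parser, name):
    """Goal-directed read of <name>text</name>; returns the text."""
    expect_tag(parser, name)
    element = parser.next_element()
    if element == ('tag', '/' + name):
        return ''
    if element is None or element[0] != 'content':
        raise ValueError("expected content inside <%s>" % name)
    expect_tag(parser, '/' + name)
    return element[1]
```

with parse_all the parser owns the loop; with read_simple the calling object owns it and asks only for what it expects next.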
instead of having to build a tree and figure it all out, we could invoke
the sub-parsers in the same order we output things. that's not so great
(it makes the input order-dependent), but it's simple to do and get working.
or we could use a more sensible scheme, where we tell the parser to
look for specific subtags in any order and chow them in.
is xml supposed to be order-invariant? or can people expect things in
a particular order without breaking the standard or being ugly/clumsy?
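for what it's worth, the xml 1.0 spec only says that attribute order is not significant; child element order is preserved in the document, so order-invariance is something the reading code has to choose to allow. a small sketch of the "any order" scheme follows, in python for illustration. chow_subtags is a hypothetical name, and the sketch assumes flat children without attributes and without same-named nesting.

```python
import re

# <name>text</name>, where the closing name must match the opening one.
_CHILD = re.compile(r'<([A-Za-z_][\w.-]*)\s*>(.*?)</\1\s*>', re.DOTALL)

def chow_subtags(body, wanted):
    """Collect the text of each wanted child tag, in whatever order they appear.

    Raises ValueError if any wanted tag is missing; unknown tags are skipped.
    """
    found = {}
    for m in _CHILD.finditer(body):
        name, text = m.group(1), m.group(2).strip()
        if name in wanted and name not in found:
            found[name] = text
    missing = sorted(set(wanted) - set(found))
    if missing:
        raise ValueError("missing subtags: " + ", ".join(missing))
    return found
```

both '<x>1</x><y>2</y>' and '<y>2</y><x>1</x>' give the same result, so the writer is free to reorder its output later.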
the former method (the default callback driver) would also be useful,
because a tree-based xml parser could hook into it. that parser would
simply read the whole thing and turn it into a tree. it has to live in
nodes or above though, since it needs to use the tree class.
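a sketch of how a tree builder could sit on top of the callback driver, in python for illustration. Node and TreeBuilder are hypothetical stand-ins (not the real tree class), parse_all here is a stripped-down stand-in for the default driver, and attributes are ignored.

```python
import re

_TOKEN = re.compile(r'<[^>]*>|[^<]+')

def parse_all(text, callback):
    """Stand-in for the callback driver: one call per tag or content chunk."""
    for chunk in _TOKEN.findall(text):
        if chunk.startswith('<'):
            callback('tag', chunk[1:-1].strip())
        elif chunk.strip():
            callback('content', chunk.strip())

class Node:
    """Hypothetical tree node: a name, accumulated text, and child nodes."""
    def __init__(self, name):
        self.name = name
        self.text = ''
        self.children = []

class TreeBuilder:
    """Callback object that turns the stream of elements into a tree."""

    def __init__(self):
        self.root = Node('#root')
        self.stack = [self.root]  # innermost open element is last

    def __call__(self, kind, value):
        if kind == 'content':
            self.stack[-1].text += value
        elif value.startswith('/'):
            name = value[1:].strip()
            if len(self.stack) < 2 or self.stack[-1].name != name:
                raise ValueError('unexpected closing tag: ' + name)
            self.stack.pop()
        else:
            empty = value.endswith('/')
            words = value.rstrip('/').split()
            if not words:
                raise ValueError('tag with no name')
            node = Node(words[0])  # attributes are ignored in this sketch
            self.stack[-1].children.append(node)
            if not empty:
                self.stack.append(node)
```

the textual layer never sees Node, so only the builder would need to live alongside the tree class.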
---
the special character sequences we injected during the output phase should
be turned back into their original characters on input. this is just the
inverse of clean_reserved.
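a sketch of that inverse step, in python for illustration. the exact sequences clean_reserved emits aren't listed in this ticket, so this assumes the five predefined xml entities; restore_reserved is a hypothetical name.

```python
import re

# Assumption: clean_reserved wrote the five predefined xml entities.
_ENTITIES = {
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&apos;': "'",
    '&amp;': '&',
}
_ENTITY = re.compile(r'&(?:lt|gt|quot|apos|amp);')

def restore_reserved(text):
    """Replace each escaped sequence with its original character.

    Done in a single pass, so '&amp;lt;' becomes '&lt;' rather than '<'.
    """
    return _ENTITY.sub(lambda m: _ENTITIES[m.group(0)], text)
```

doing it in one pass matters: replacing '&amp;' first and the others afterwards would decode some text twice.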
---
when accumulating "content" productions during xml parsing,
we want to collapse any run of white space into a single space,
and strip the leading and trailing white space entirely.
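a sketch of that normalization, in python for illustration. normalize_content is a hypothetical name, and "white space" is taken here to mean the four xml white space characters (space, tab, carriage return, line feed).

```python
import re

# The xml white space characters: space, tab, carriage return, line feed.
_XML_SPACE = re.compile(r'[ \t\r\n]+')

def normalize_content(text):
    """Collapse each run of white space to one space and trim both ends."""
    return _XML_SPACE.sub(' ', text).strip(' ')
```

a chunk that is nothing but white space comes back as the empty string, which the caller can use to skip it.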