id	summary	reporter	owner	description	type	status	priority	milestone	component	version	resolution	keywords	cc
125	parsers -- html prettyprint	Fred T. Hamster	bugdock	"idea from web:

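python pretty printing using the lxml and BeautifulSoup libraries:

import lxml.html as lh
from bs4 import BeautifulSoup as bs
# sliderRoot is an lxml element built elsewhere (not shown in the original snippet)
root=lh.tostring(sliderRoot) #convert the generated HTML to a string
soup=bs(root, 'html.parser') #make BeautifulSoup
prettyHTML=soup.prettify()   #prettify the html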


==========

how about a recursive descent parser?
it is actually kind of similar to a state machine, but each routine
can return to its caller rather than having to know the next state.
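
a rough sketch of what returning buys over naming the next state: one name-gathering routine serves both the tag name and the attribute words, because it hands its result back to whoever called it. the function names here are hypothetical, and this only handles a single opener tag, not a whole document.

```python
# Hypothetical helper names; a sketch of return-based parsing, not a full parser.
def gather_name(src, pos):
    # Skip leading spaces, then collect non-space chars up to a space or '>'.
    while pos < len(src) and src[pos].isspace():
        pos += 1
    start = pos
    while pos < len(src) and not src[pos].isspace() and src[pos] != '>':
        pos += 1
    return src[start:pos], pos  # hand the word back to whichever caller asked

def parse_open_tag(src, pos):
    # The caller has just consumed '<'; pos points at the first char after it.
    tag, pos = gather_name(src, pos)
    attribs = []
    while pos < len(src) and src[pos] != '>':
        word, pos = gather_name(src, pos)  # same routine, different caller
        if word:
            attribs.append(word)
    return tag, attribs, pos + 1  # step past the '>'
```

calling parse_open_tag('<a href=x id=y>rest', 1) gives the tag name, the raw attribute words, and the position where plain text resumes. quoted attribute values containing spaces or > are not handled in this sketch.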


plain text state
  < seen, go to gather tag state.
  anything else seen, emit char.
    split long blocks of text to avoid overly long lines?

gather tag forking state
  / seen, go to open closure tag state
  all chars up to space or > go into tag name buffer
  space seen, go to gather opener attribs state
  > seen, go to completed opener tag state

gather opener attribs state
  space seen, ignore
  chars seen, go to gather tag name
  > seen, go to completed opener tag state

gather tag name
(could be used by other states too)
  take all non-space chars into tag name accum
  space ignore
  > seen, return

completed opener tag state
  record tag by push on stack
  emit gathered tag and attributes at appropriate indent level
  indent level ++
  go to plain text state
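
the states described so far can be sketched as a small character-driven machine. this is only an illustration under assumptions: the state and function names are made up, and the closure tag handling (pop back to the matching opener, then emit) is a guess, since those states are not filled in yet. comments, doctypes, void tags like br, and quoted attribute values containing > are not handled.

```python
# Hypothetical sketch of the states above; all names are illustrative.
TEXT, TAG_NAME, ATTRIBS = range(3)

def pretty_print(html, indent='  '):
    out, stack = [], []            # emitted lines; stack of open tag names
    text, name, attribs = [], [], []
    closing = False
    state = TEXT

    def emit(line):
        # The indent level is simply the depth of the tag stack.
        out.append(indent * len(stack) + line)

    def flush_text():
        chunk = ''.join(text).strip()
        if chunk:
            emit(chunk)
        del text[:]

    def complete_tag():
        tag = ''.join(name)
        extra = ''.join(attribs).strip()
        full = tag + (' ' + extra if extra else '')
        if closing:
            # Assumed closure behavior: pop back to the matching opener.
            if tag in stack:
                while stack.pop() != tag:
                    pass
            emit('</' + tag + '>')
        else:
            emit('<' + full + '>')
            if not full.endswith('/'):   # self-closing tags are not pushed
                stack.append(tag)        # record tag; indent level ++
        del name[:]
        del attribs[:]

    for ch in html:
        if state == TEXT:                # plain text state
            if ch == '<':
                flush_text()
                closing = False
                state = TAG_NAME
            else:
                text.append(ch)
        elif state == TAG_NAME:          # gather tag forking state
            if ch == '/' and not name:
                closing = True
            elif ch == '>':
                complete_tag()
                state = TEXT
            elif ch.isspace():
                if name:
                    state = ATTRIBS
            else:
                name.append(ch)
        else:                            # gather opener attribs state
            if ch == '>':
                complete_tag()
                state = TEXT
            else:
                attribs.append(ch)
    flush_text()
    return '\n'.join(out)
```

for example, print(pretty_print('<div><p>hi</p></div>')) puts each tag and each text run on its own line, indented two spaces per level of nesting.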

open closure tag state

completed closure tag state"	new-feature	assigned	minor		feistymeow-nucleus