Prioritized

  1. Determine the strategy for non-ASCII text: use UTF-8 or some other Unicode encoding (e.g. UTF-16) internally. libxml2 uses UTF-8 _only_ internally, but provides means to convert to/from other encodings such as UTF-16, UTF-16BE, and UTF-16LE. I'm leaning towards always using UTF-8 internally and for generation (output), while providing conversion routines the way libxml2 does. Since internal and output would both be UTF-8, the only conversion needed would be on input. Not decided yet.
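
     A rough sketch of what "convert on input only" could look like, assuming the input arrives as UTF-16LE bytes and everything downstream expects UTF-8. The function name `utf16le_to_utf8` and its return convention are made up for illustration; this is not libxml2's API and not a decided design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a libxml2 function): convert UTF-16LE bytes
 * to UTF-8. Returns the number of UTF-8 bytes written, or -1 if the
 * input is malformed (odd length, unpaired surrogate) or the output
 * buffer is too small. */
long utf16le_to_utf8(const unsigned char *in, size_t in_len,
                     unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL);
    if (in_len % 2 != 0)
        return -1;                      /* UTF-16 needs whole 16-bit units */

    while (i < in_len) {
        /* Read one little-endian 16-bit code unit. */
        uint32_t cp = (uint32_t)in[i] | ((uint32_t)in[i + 1] << 8);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            uint32_t lo;
            if (i >= in_len)
                return -1;
            lo = (uint32_t)in[i] | ((uint32_t)in[i + 1] << 8);
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1;                  /* low surrogate with no high one */
        }

        /* Encode the code point as 1 to 4 UTF-8 bytes. */
        if (cp < 0x80) {
            if (o + 1 > out_cap) return -1;
            out[o++] = (unsigned char)cp;
        } else if (cp < 0x800) {
            if (o + 2 > out_cap) return -1;
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            if (o + 3 > out_cap) return -1;
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            if (o + 4 > out_cap) return -1;
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return (long)o;
}
```

     With something like this at the input boundary, the rest of the code never has to care what encoding the caller started with. UTF-16BE would only differ in how each 16-bit unit is read.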

Not prioritized