XCG is a command line utility for transforming the structure of an XML instance
into some other kind of structure, typically source code. It was created to help
me generate wrappers for several dozen different XML formats that I was working
with. Because of that need, the first version generated C++ wrapper code around
libxml2, which, after a lot of research, appeared to be the best solution for my
XML reading and writing needs.
XCG is now much more powerful than what I just described. Instead of only
generating C++ wrapper code, it generates output based on the contents of its
configuration, which is itself written in XML. In fact, it uses itself to
generate the code that reads the XML configuration file (well, not directly, but
it's the same end result). It could be self-hosting simply by embedding a copy
of the default XML configuration in the executable; I chose to keep it external
to make it easier for others to modify the default code generation.
I suppose I should explain why I didn't use a DTD or schema as the basis of the
generation. The problem is, I don't really have what I consider solid reasons.
For most of the XML data I was involved with, there either wasn't a DTD or
schema, or if there was, it was out of date. I suppose I could have made
creating and/or updating the DTDs or schemas a requirement, but in the end, I
found myself creating a lot of last-minute simple XML instances with all kinds
of short little formats for things that would never be used again (look at
example3 for a real-world example). Yes, I could have used XSL to do some or
most of these translations. However, in the end, I think what made me choose
this method was a library named TinyXML, which has a pretty faithful following
because it is so easy to use (alas, it has a few issues that make it unusable
for me). I wanted something that was just as easy to use, but much more
flexible. This is where I ended up, and I am actually very happy with the
result! If one is familiar with the STL, then the default code generation is
pretty straightforward to use. Because all elements are wrapped in objects, and
those objects don't allow direct access to any data, it is actually highly
maintainable as well. As always, there are a few things I still want from it,
but it's a great start.
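
The idea of wrapping every element in an object that hides its data can be sketched roughly as follows. This is only an illustration in Python using the standard `xml.etree.ElementTree` module, not the C++/libxml2 code that XCG generates by default; the `Library` and `Book` classes, their method names, and the `title` attribute are all hypothetical.

```python
import xml.etree.ElementTree as ET


class Book:
    """Hypothetical wrapper for a <book> element."""

    def __init__(self, node):
        # The parser node is kept private; callers only use the accessors.
        self._node = node

    def get_title(self):
        return self._node.get("title", "")

    def set_title(self, value):
        self._node.set("title", value)


class Library:
    """Hypothetical wrapper for the <library> root element."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)

    def books(self):
        # Child elements are handed out as wrapper objects, never as raw nodes.
        return [Book(node) for node in self._root.findall("book")]

    def to_xml(self):
        return ET.tostring(self._root, encoding="unicode")
```

Since callers only ever see `get_title`, `set_title`, `books`, and `to_xml`, the parser underneath could in principle be swapped without touching calling code.
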
Because it uses a configuration file (called a "template"),
there are very few limitations on what XCG can generate. So far, the
limitations I have encountered are more annoyances than limitations: issues I
got around by making significant modifications to my template, but which a few
extra features will eventually make much easier to deal with.
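
To make the idea of template-driven generation a little more concrete, here is a minimal sketch in Python. It does not use XCG's actual template format (which is XML and is not reproduced here); `ELEMENT_TEMPLATE`, `collect_structure`, and `generate` are hypothetical names, and the sketch assumes the structure is derived by walking a sample XML instance and recording the attribute and child names seen for each element.

```python
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical per-element template. Changing this text changes the output,
# without touching the code that walks the XML instance.
ELEMENT_TEMPLATE = Template("$name: attributes=[$attrs] children=[$children]\n")


def collect_structure(xml_text):
    """Map each element name to the attribute and child names seen in the instance."""
    structure = {}
    for elem in ET.fromstring(xml_text).iter():
        info = structure.setdefault(elem.tag, {"attrs": [], "children": []})
        for attr in elem.attrib:
            if attr not in info["attrs"]:
                info["attrs"].append(attr)
        for child in elem:
            if child.tag not in info["children"]:
                info["children"].append(child.tag)
    return structure


def generate(xml_text, template=ELEMENT_TEMPLATE):
    """Expand the template once per distinct element, in document order."""
    structure = collect_structure(xml_text)
    return "".join(
        template.substitute(
            name=name,
            attrs=",".join(info["attrs"]),
            children=",".join(info["children"]),
        )
        for name, info in structure.items()
    )
```

Passing a different `Template` to `generate` would produce class declarations, documentation, or any other text from the same instance, which is the point of keeping the output definition outside the tool.
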
The following is a list of the major features (or functionality) provided by
XCG in its current form:
-
Output is completely configuration-defined: there is no requirement to output
source code; it can literally be anything desired!
-
Any number of files can be generated, as all generated files are defined by the
template. For example, if one class per element is created, each class can go
in a separate file, or all of them can go in a single file (like the default
generation).
-
Unlimited iterations through elements, both at file and sub-file scope (i.e.
one could iterate over elements, and within the generated text of that
iteration, iterate over them again, recursively). Depth is limited by machine
hardware, not XCG itself (just don't forget to make it end somewhere!)
-
Unlimited iterations of children of an element within an element iteration
(without an element, how can there be child elements?)
-
Unlimited iteration of attributes of an element within an element iteration
(without an element, what are attributes?)
-
One can reference the name of the current file, element, attribute, child
element, or the root element within the generated text.
-
In addition to the previous item, there are some built-in transformations that
can be applied to the file, element, attribute, child element, and root element
names:
-
Capitalize first letter
-
Capitalize entire name
-
Perform name replacements (for example, replacing words reserved within the
target language)
-
Pluralize
-
Replace invalid characters with valid characters (or strings)
-
Remove invalid characters and capitalize the next character (for example, turn
"foo-bar" into "fooBar")
-
For pluralization, the template can provide verbatim pluralized forms of words.
For example, pluralizing "child" isn't just a matter of adding an "s", so the
template could provide "children" as its plural form.
-
A list of invalid characters and their replacements can be specified in the
template.
-
A list of invalid words and their replacements can be specified in the
template.
-
Eventually, all transformation lists will be able to specify their context. For
example, maybe a word replacement is only valid while iterating through
elements.
-
Isolates your XML file I/O from the underlying technology. If you later decide
to change the technology behind your XML support, it's simply a matter of
updating the template to generate the exact same external interface, but
internally use the new technology. For example, the default generation
currently targets libxml2. However, if I decided to switch to the Apache
XML OpenSource library, I could modify the template to use a different back end
(the Apache XML library) while keeping the external interface the same. It's then
simply a matter of a recompile to change the back-end technology from libxml2
to Apache XML!
-
Abilities available in 1.1.0 or later:
-
<element> allows user-defined metadata.
-
Template-specific and element-specific information are now contained in separate
files. This makes it easy to use the same template code for multiple projects, as
the data that is specific to a project is in a separate file. For example, all
<template> elements are contained within a template file, but the new
<element> elements are only in the configuration file. Additionally, the
<replacement>, <invalid>, and <plural> elements can exist
in either or both files.
-
<condition> elements allow custom conditionals. All standard conditions can be
used, and the values of the new metadata can be queried as well.
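
The name transformations in the feature list above can be sketched as plain functions. This is a minimal Python illustration, not XCG's implementation: the function names and the `PLURALS`, `REPLACEMENTS`, and `INVALID_CHARS` tables are hypothetical stand-ins for the lists a template would supply, and falling back to appending "s" when no verbatim plural is given is an assumption.

```python
# Hypothetical lookup tables; in XCG these lists come from the template
# or configuration file.
PLURALS = {"child": "children"}
REPLACEMENTS = {"class": "class_"}      # e.g. words reserved in the target language
INVALID_CHARS = {"-": "_", ":": "_"}    # invalid character -> replacement string


def capitalize_first(name):
    """Capitalize only the first letter, leaving the rest untouched."""
    return name[:1].upper() + name[1:]


def capitalize_all(name):
    """Capitalize the entire name."""
    return name.upper()


def replace_word(name, replacements=REPLACEMENTS):
    """Swap a whole name for its replacement, if one is listed."""
    return replacements.get(name, name)


def pluralize(name, plurals=PLURALS):
    """Use the verbatim plural if provided, otherwise append "s" (assumed default)."""
    return plurals.get(name, name + "s")


def replace_invalid(name, invalid=INVALID_CHARS):
    """Replace each invalid character with its listed replacement."""
    return "".join(invalid.get(ch, ch) for ch in name)


def remove_invalid_and_capitalize(name, invalid=INVALID_CHARS):
    """Drop invalid characters and capitalize the character that follows each one."""
    out = []
    upper_next = False
    for ch in name:
        if ch in invalid:
            upper_next = True
        elif upper_next:
            out.append(ch.upper())
            upper_next = False
        else:
            out.append(ch)
    return "".join(out)
```

With these tables, `remove_invalid_and_capitalize("foo-bar")` gives `"fooBar"` and `pluralize("child")` gives `"children"`, matching the examples in the list.
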