Discussion:
[odb-users] Can one use ODB with Classes generated by XSD?
B Hart
2011-07-22 17:50:24 UTC
Hello Boris,

I have a large Schema that I compiled with XSD (tree). This allows me to
nicely read in corresponding XML data. Now I'm faced with the task of
populating an existing DB with the data. However, there are difficulties:
1) There are new elements that are not currently stored in the DB, so tables
and columns will have to be manually added, 2) there is not a nice mapping
between all of the Schema elements and the corresponding DB
table/column, i.e., some of the element data may have to be modified or
combined to go into a DB field, and the document that specifies the mappings
is incomplete (this means I have to look through hundreds of tables to
"figure out" where data goes, for hundreds of elements). 3) Once I figure
out the mapping I have to then add the code manually to populate the DB with
the data.

The DB is a MS SQL DB and right now ODB doesn't support MS SQL. However, it
might be worth switching to MySQL if it were possible and reasonable to run
ODB against the classes that XSD creates in order to create a corresponding
MySQL DB Schema and the code to populate it. Then I could read in the XML
dataset with the XSD generated code and populate and work with the DB with
ODB generated code.

What are your thoughts?

Best,

Brian Hart
Boris Kolpackov
2011-07-26 13:37:47 UTC
Hi Brian,
Post by B Hart
I have a large Schema that I compiled with XSD (tree). This allows me to
nicely read in corresponding XML data. Now I'm faced with the task of
populating an existing DB with the data. However, there are difficulties:
1) There are new elements that are not currently stored in the DB, so tables
and columns will have to be manually added, 2) there is not a nice mapping
between all of the Schema elements and the corresponding DB
table/column, i.e., some of the element data may have to be modified or
combined to go into a DB field, and the document that specifies the mappings
is incomplete (this means I have to look through hundreds of tables to
"figure out" where data goes, for hundreds of elements). 3) Once I figure
out the mapping I have to then add the code manually to populate the DB with
the data.
Yes, I think this is a fairly common problem when trying to import data
from an XML vocabulary to a relational database, unless the XML vocabulary
was specifically designed with that conversion in mind.
Post by B Hart
The DB is a MS SQL DB and right now ODB doesn't support MS SQL. However, it
might be worth switching to MySQL if it were possible and reasonable to run
ODB against the classes that XSD creates in order to create a corresponding
MySQL DB Schema and the code to populate it. Then I could read in the XML
dataset with the XSD generated code and populate and work with the DB with
ODB generated code.
The problem with automatically storing XSD-generated object model in
a relational database using ODB is that the conversion is not well
defined and in fact is not always possible. It is not clear whether,
say, a nested element should be mapped to a column (value type) or
a reference to another table (object) with the contents of this
element stored in that table. Some elements can be stored as either
composite value types or as objects. Those that have more than two
levels of sequence containment can only be stored as objects.
Post by B Hart
What are your thoughts?
I see two possible approaches here, depending on how closely the
XML vocabulary models the database representation.

1. If XML and database models are very different (as in the case you
described above), then the best approach would probably be to have two
object models (sets of C++ classes): the first is for XML (generated
by the XSD compiler) and the second is for the database (hand-written
or auto-generated by ODB SQL-to-C++ compiler from the existing schema,
something that is on our TODO list). Once you have the two models, you
manually write the code that converts between the two, such as
performing merging and splitting of members, etc.

2. If XML models the database pretty closely, then you can take the
XSD-generated model and map it (using ODB pragmas) to the database
tables (those can be placed into a separate file and included into
the ODB compilation with the --odb-epilogue option). Once that is
done, you can just load the classes from XML and store them into
the RDBMS.
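
As a rough illustration of approach 1, the hand-written conversion code might
look like the sketch below. All names here (xml_model::person,
db_model::person_row, to_db) are hypothetical stand-ins and not actual XSD or
ODB output. The sketch only shows one merge (two name elements into one
column) and one split (a timestamp into separate date and time columns).

```cpp
#include <string>

// Hypothetical stand-in for a class generated by the XSD compiler.
namespace xml_model
{
  struct person
  {
    std::string first;
    std::string last;
    std::string timestamp; // e.g. "2011-07-26T13:37:47"
  };
}

// Hypothetical stand-in for a hand-written class that mirrors a DB table.
namespace db_model
{
  struct person_row
  {
    std::string full_name; // merged from first + last
    std::string date;      // split from timestamp
    std::string time;      // split from timestamp
  };
}

// Convert one XML-side object into its database-side representation.
db_model::person_row
to_db (const xml_model::person& p)
{
  db_model::person_row r;

  // Merge: two XML elements become a single column value.
  r.full_name = p.first + " " + p.last;

  // Split: one XML element becomes two column values. If there is no
  // 'T' separator, the whole string is treated as the date.
  std::string::size_type t = p.timestamp.find ('T');
  r.date = p.timestamp.substr (0, t);
  r.time = t == std::string::npos ? std::string () : p.timestamp.substr (t + 1);

  return r;
}
```

In a real project the xml_model side would come from the XSD compiler and the
db_model side would carry the ODB pragmas, so only functions like to_db would
need to be written and maintained by hand.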

Boris
B Hart
2011-07-26 15:57:56 UTC
Thanks for your explanation. Where you say "(those can be placed into a
separate file and included into the ODB compilation with the --odb-epilogue
option)," do you mean the XSD-generated classes with pragmas?

So hypothetically, if I decided to create a whole new DB (based on the XML
Schema and using ODB), how would I best use the XSD-generated classes with
ODB to do this? Will ODB create a complete DB from the XSD-generated
output? I understand that ODB doesn't support MS SQL currently, so I am
assuming the use of MySQL in this case.

-Brian
Boris Kolpackov
2011-07-27 14:17:23 UTC
Hi Brian,
Post by B Hart
Where you say "(those can be placed into a separate file and included into
the ODB compilation with the --odb-epilogue option)." do you mean the
xsd-generated classes with pragmas?
XSD-generated classes do not have any pragmas (XSD doesn't know anything
about ODB). So you will need to add those pragmas yourself, which can be
placed into a separate file (we call it a "mapping" file) and "added"
to the ODB compilation process (when you compile the XSD-generated header)
using the --odb-epilogue option.
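
To make the mapping file idea concrete, here is a minimal sketch. The class
below is a simplified, hypothetical stand-in for an XSD-generated one (real
XSD output declares its data members protected and wraps them in templates),
and the file names library.hxx and library-mapping.hxx are assumptions made
for illustration. The named pragma forms refer to a class and its members
from outside the class definition, which is what allows them to live in a
separate file. A regular C++ compiler ignores these pragmas, possibly with a
warning.

```cpp
// ---- library.hxx: simplified stand-in for an XSD-generated header ----
#include <string>

class book
{
public:
  book () {}
  book (const std::string& id, const std::string& title)
    : id_ (id), title_ (title) {}

  const std::string& id () const {return id_;}
  const std::string& title () const {return title_;}

  // Assumption: members are public and plain here. Actual XSD output
  // makes them protected and uses wrapper templates.
  std::string id_;
  std::string title_;
};

// ---- library-mapping.hxx: hypothetical hand-written mapping file ----
// Nothing in the generated header is edited; the mapping decisions are
// stated here using named pragmas.
#pragma db object(book)                       // book is a persistent object
#pragma db member(book::id_) id               // id_ is the object id
#pragma db member(book::title_) type("TEXT")  // database type for title_
```

Assuming the file names above, an invocation along the lines of
`odb -d mysql --generate-schema --odb-epilogue '#include "library-mapping.hxx"' library.hxx`
would then append the mapping when the header is compiled. The exact set of
options needed will depend on the ODB version and project layout.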
Post by B Hart
So hypothetically, if I decided to create a whole new DB (based on the XML
Schema and using ODB), how would I best use the XSD-generated classes with
ODB to do this? Will ODB create a complete DB from the XSD-generated
output?
No, as I explained above, you will need to "tell" ODB how to map the
XSD-generated classes to the database, just as you would do for hand-
written code. ODB has no idea which XSD-generated classes should be
objects, which should be value types, which attribute/element is the
object id, etc. Only you can decide such aspects of the mapping.

There will be other difficulties as well. Here are a few off the
top of my head:

1. All data members in the XSD-generated classes are protected which
makes them inaccessible to ODB. To overcome this, you could post-
process the XSD-generated header with a script and replace
'protected:' with 'public:'. Alternatively, we can add an option
to XSD to generate data members public.

2. XSD uses wrapper templates for 'one' and 'optional' members. ODB
will not know how to "unwrap" them without some help from your
side (value_traits). We are currently working on the 'wrapper'
concept for ODB which will make handling this much easier.

3. You will need an id member for every object class. This may
or may not be a problem in your case. Support for objects
without an explicit object id is also on our TODO list.
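
The post-processing step from item 1 could be sketched as follows. This is
only an illustration under stated assumptions: the function name
expose_members is hypothetical, and the replacement is purely textual, so it
would also rewrite any 'protected:' that happens to appear inside a comment
or string literal in the header.

```cpp
#include <string>

// Return a copy of the header text with every 'protected:' access
// specifier replaced by 'public:'. The search string includes the colon,
// so 'protected' used for inheritance (class b: protected a) is untouched.
std::string
expose_members (std::string header)
{
  const std::string from ("protected:");
  const std::string to ("public:");

  for (std::string::size_type pos = header.find (from);
       pos != std::string::npos;
       pos = header.find (from, pos + to.size ()))
  {
    header.replace (pos, from.size (), to);
  }

  return header;
}
```

The result would be written to a copy of the header that is passed to the ODB
compiler, leaving the original XSD output as generated.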

Boris
B Hart
2012-05-01 19:04:33 UTC
Hello Boris,

I'm re-visiting this thread since ODB now supports MS SQL and I'm at a
place where I might be able to use ODB, and my requirements have changed.
In particular, I no longer need to map my schema into an existing DB
that doesn't match the schema well. That is, I'm going to create a DB that
matches the Schema.

I've used CodeSynthesis XSD to generate classes for a set of schemas that
happen to be nicely hierarchical. Essentially the XSDs define the elements
for patient care records. Each record has up to ~550 elements divided into
~23 subsections/categories. Elements in the subsections have all the
different cardinalities. For example, elements relating to administering a
medication are defined in a complex element with cardinality sequence
(since multiple medications might be administered to the same patient, or
the same medication given at different times).

As an evaluation exercise I generated a DB schema from the XSDs using
Altova's XMLSpy. It generated a set of tables very reflective of the
organization of the XML Schemas as well as the element constraints. I'm
wondering if I similarly relied on ODB to generate the tables if it would
produce a similar DB schema, as well as the constraints based on the
element types? Haven't tried it yet.

Also, I'm wondering if item #2 below has been implemented? I have written
a program that with excellent help from XSD generated classes reads in
patient records in an XML file, validates the XML, and checks various
business rules and generates a report. At the point after validation has
occurred and Business Rules are checked and pass, the data is ready to put
into the DB. It would be nice if I could use ODB to generate the Schema
and make it happen with just a few lines of code (similar to how easy it is
with XSD to read in a complex schema and serialize it out again.).

Thanks in advance for your comments.
Boris Kolpackov
2012-05-03 14:19:54 UTC
Hi Brian,
Post by B Hart
As an evaluation exercise I generated a DB schema from the XSDs using
Altova's XMLSpy. It generated a set of tables very reflective of the
organization of the XML Schemas as well as the element constraints. I'm
wondering if I similarly relied on ODB to generate the tables if it would
produce a similar DB schema, as well as the constraints based on the
element types? Haven't tried it yet.
ODB will generate a database schema according to how you map XSD-
generated classes to objects, values, relationships, containers, etc.
In fact, XML schemas that I normally see (hierarchical, deeply nested,
container-in-container-in-container-... kind) don't match the canonical
relational model (i.e., a model that an experienced DBA would design)
very well. So I am quite surprised you are happy with a database schema
generated by XMLSpy without any "mapping" input from your side. And
that's also why I am quite skeptical that we can support a fully-
automatic XSD->C++->DB mapping, without any user input.

To illustrate my point, consider this fairly typical XML and schema
(based on the library example from XSD):

XML:

<catalog>
  <book id="MM">
    <title>The Master and Margarita</title>
    <author recommends="WP">
      <name>
        <first>Mikhail</first>
        <last>Bulgakov</last>
      </name>
    </author>
  </book>

  <book id="WP">
    <title>War and Peace</title>
    <author recommends="MM">
      <name>
        <first>Leo</first>
        <last>Tolstoy</last>
      </name>
    </author>
  </book>
</catalog>


Schema:

<complexType name="name">
  <sequence>
    <element name="first" type="string"/>
    <element name="last" type="string"/>
  </sequence>
</complexType>

<complexType name="author">
  <sequence>
    <element name="name" type="lib:name"/>
  </sequence>
  <attribute name="recommends" type="IDREF"/>
</complexType>

<complexType name="book">
  <sequence>
    <element name="title" type="string"/>
    <element name="author" type="lib:author" maxOccurs="unbounded"/>
  </sequence>
  <attribute name="id" type="ID" use="required"/>
</complexType>

<complexType name="catalog">
  <sequence>
    <element name="book" type="lib:book" maxOccurs="unbounded"/>
  </sequence>
</complexType>

<element name="catalog" type="lib:catalog"/>

How would we map something like this to a database? Is 'name' an object
or a value (i.e., do names get their own table or are part of another
table)? In case of a name, it is probably a value type. Answering the
same question for 'author' is trickier (seeing that there could be
multiple books by the same author, it should probably be an object).
'book' is most definitely an object. And 'catalog' probably doesn't
have any representation in the database at all!
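
These decisions (name as a value, author and book as objects, no class for
catalog) could be written down as an ODB-annotated object model roughly like
the sketch below. It is hand-written for illustration rather than derived
from XSD output, and it assumes an ODB version and C++11 mode in which
composite values can serve as object ids and std::shared_ptr/std::weak_ptr
can be used as object pointers. The member names are hypothetical. A regular
C++ compiler ignores the pragmas, possibly with a warning.

```cpp
#include <memory>
#include <string>
#include <vector>

struct book;

// 'name' is a value: it has no table of its own and is stored as
// columns of whichever object contains it.
#pragma db value
struct name
{
  std::string first;
  std::string last;
};

// 'author' is an object: several books can refer to the same author.
#pragma db object
struct author
{
  #pragma db id
  name full_name; // composite value used as the object id

  // Optional reference to a recommended book. A weak pointer is used
  // here so that books and authors do not own each other in a cycle.
  std::weak_ptr<book> recommends;
};

// 'book' is an object identified by the XML 'id' attribute.
#pragma db object
struct book
{
  #pragma db id
  std::string id;

  std::string title;

  // One book can have many authors. A container of object pointers is
  // normally stored in a separate table.
  std::vector<std::shared_ptr<author>> authors;
};

// Note: there is deliberately no class for 'catalog'. It only groups
// books in the XML document and has no counterpart in the database.
```

Whether the generated column and table names match a particular hand-designed
schema depends on further pragmas (column and table names, for instance) that
are left out here.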

Here is the database schema that I would design for this object model:

-- book is created first because author.recommends references it.
CREATE TABLE book (
  id VARCHAR(255) NOT NULL PRIMARY KEY,
  title TEXT NOT NULL);

CREATE TABLE author (
  first_name VARCHAR(255) NOT NULL,
  last_name VARCHAR(255) NOT NULL,
  recommends VARCHAR(255) NULL,

  PRIMARY KEY (first_name, last_name),
  CONSTRAINT recommends_fk FOREIGN KEY (recommends) REFERENCES book (id));

CREATE TABLE book_author (
  book_id VARCHAR(255) NOT NULL,
  author_first_name VARCHAR(255) NOT NULL,
  author_last_name VARCHAR(255) NOT NULL,

  CONSTRAINT book_fk FOREIGN KEY (book_id) REFERENCES book (id),
  CONSTRAINT author_fk
    FOREIGN KEY (author_first_name, author_last_name)
    REFERENCES author (first_name, last_name));

Does it resemble the XML schema? Not really. In fact, XML and schema that
would resemble this database schema more closely would look along these
lines:

XML:

<catalog>

  <authors>
    <author id="MB" recommends="WP">
      <name>
        <first>Mikhail</first>
        <last>Bulgakov</last>
      </name>
    </author>

    <author id="LT" recommends="MM">
      <name>
        <first>Leo</first>
        <last>Tolstoy</last>
      </name>
    </author>
  </authors>

  <books>
    <book id="MM">
      <title>The Master and Margarita</title>
      <author>MB</author>
    </book>

    <book id="WP">
      <title>War and Peace</title>
      <author>LT</author>
    </book>
  </books>

</catalog>

Schema:

<complexType name="name">
  <sequence>
    <element name="first" type="string"/>
    <element name="last" type="string"/>
  </sequence>
</complexType>

<complexType name="author">
  <sequence>
    <element name="name" type="lib:name"/>
  </sequence>
  <attribute name="id" type="ID" use="required"/>
  <attribute name="recommends" type="IDREF"/>
</complexType>

<complexType name="book">
  <sequence>
    <element name="title" type="string"/>
    <element name="author" type="IDREF" maxOccurs="unbounded"/>
  </sequence>
  <attribute name="id" type="ID" use="required"/>
</complexType>

<complexType name="catalog">
  <sequence>

    <element name="authors">
      <complexType>
        <sequence>
          <element name="author" type="lib:author" maxOccurs="unbounded"/>
        </sequence>
      </complexType>
    </element>

    <element name="books">
      <complexType>
        <sequence>
          <element name="book" type="lib:book" maxOccurs="unbounded"/>
        </sequence>
      </complexType>
    </element>

  </sequence>
</complexType>

<element name="catalog" type="lib:catalog"/>

I see schemas like the first one all the time and like the second one --
not much.
Post by B Hart
I have written a program that with excellent help from XSD generated classes
reads in patient records in an XML file, validates the XML, and checks
various business rules and generates a report. At the point after
validation has occurred and Business Rules are checked and pass, the data is
ready to put into the DB. It would be nice if I could use ODB to generate
the Schema and make it happen with just a few lines of code (similar to how
easy it is with XSD to read in a complex schema and serialize it out
again.).
The point of the above exercise is to show that I don't think we can come
up with an auto-magical solution which will take an XML schema, generate
C++ classes, and map them to the database, all without your DBA swearing
at you in the end (for all the right reasons) ;-).

Instead, the generated C++ classes will have to manually and carefully
be mapped to the database.
Post by B Hart
Also, I'm wondering if item #2 below has been implemented?
Yes, wrappers and the NULL value semantics are supported.

Boris
B Hart
2012-05-09 23:49:39 UTC
Hi Boris,

Sorry about the late response, I was put on to some other tasks.

I looked over the two XSD/XML examples you provided. In the first example,
there is a catalog of books where (potentially) a book can have many authors
and an author many books (many to many, but in this case with attribute
"recommends" it is unidirectional 1:1). In the second example it is a
catalog of both books and authors (a little strange), with the same
attribute "recommends" expressing a unidirectional 1:1 relationship.
(BTW: In the first example, shouldn't "recommends" be an element (minOccurs
= 0 maxOccurs = unbounded), since this really isn't Metadata? )

The XML schema I'm working with seems closer to your second example. The
DB Schema auto-generated by XMLSpy is definitely not perfect, but with a
little minor cleanup should be in 3NF. There is a Patient Care record
(1 patient = 1 record), and each record is divided into a number of
sections, and the relationships between elements are hierarchical in one
direction (down the tree) with some 1 to many relationships. I've tried
to see where there might be relationships between elements in different
sections, but there seem to be few. Data inserted is immutable.

So maybe I just got lucky or I haven't explored deeply enough yet. I do
know that it was very quick for me to generate the tables and then use
Mapforce to create an extraction and load (not much transforming). I'd
like to use ODB, but it seems like it is going to take a lot of time to
create all the mappings.

I understand it might be difficult to auto-generate an acceptable DB
Schema in the majority of instances, but would even a poor DB Schema and
mapping (automatically generated) be a better starting point than none at
all (especially when there are going to be many tables)? What do you
think about the idea of having a pragma that could be used to remove the
mappings in sections of XSD-generated classes for which the auto-generated
portion of the DB schema turned out to be incorrect?
Post by Boris Kolpackov
Yes, wrappers and the NULL value semantics are supported.
Can you point me to any examples showing how wrappers are used?

Thanks.
Boris Kolpackov
2012-05-14 17:39:19 UTC
Hi Brian,
Post by B Hart
I understand it might be difficult to auto-generate an acceptable DB
Schema in the majority of instances, but would even a poor DB Schema and
mapping (automatically generated) be a better starting point than none at
all (especially when there are going to be many tables)?
It might seem so, but in the long run, I don't think it will (generally,
we try to avoid designing tools that have short-term benefits with long-
term problems, even if it is very tempting sometimes).

You mentioned above that the "DB Schema auto-generated by XMLSpy is
definitely not perfect, but with a little minor cleanup should be in
3NF." The problem with this approach is that you will have to keep
manually "fixing-up" the schema (plus the generated C++ code, if we
were to support this in XSD) every time the XML schema changes. This
is an example of the "short-term benefits with long-term problems"
approach I am talking about.

The approach that I have in mind would require you to provide an
initial mapping (e.g., which classes are objects, which ones are
values, which attributes/elements are object ids, etc). This might
require some upfront time investment. However, down the road,
provided the future schema changes are not too radical, this
mapping shouldn't take much effort to maintain. It should definitely
be easier than fixing-up generated database and C++ code.

Boris
