This is a discussion around proposals to enhance the Teiid Metadata Importer. (No, this hasn't been fully thought out, but that's where your suggestions come in.)
What problem are we discussing:
- The ability of the importer to support more than one text format
- The ability of the importer to accept all the metadata that a source model can manage
What is the importer: It provides the means to import, from a file, the metadata (i.e., tables and attributes, columns and attributes, etc.) to build a source model.
What text file formats are supported: currently, only one format and only limited metadata is supported.
(1) How can the importer be changed to support more than 1 text format?
- Abstract out the text parsing implementation, so that a different implementation could be provided for different file formats or a user's custom format. The core importer logic would talk to an interface to ask for: getTables(), getColumns(tableName), getIndexes(..), etc. This could be similar to how JDBC metadata is processed. However, for a custom implementation, the user would have to provide a jar that needs to be added to the Eclipse classpath and picked up by the importer. So, how does a user, in the GUI, get the option to select the different/custom formats? The default ones could be provided by the project, but for a custom one, the user might have to change a feature file to add it to the list?
- Create a JDBC driver that can be used to import metadata by parsing the text file and returning the metadata via the DatabaseMetaData interface. This would work just like the JDBC importer, but the implementation would be specified on the URL. This would require the user to set up a Connection Profile, specify the implementation on the URL, and add the custom jar (if needed).
I prefer the 1st option, but stated the 2nd in case others could make a better case for it. Or, please comment, if you have a different suggestion.
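To make option 1 a bit more concrete, here is a rough sketch of what the interface might look like. Everything in it is an assumption for illustration: the names MetadataSource and FlatFileMetadataSource, and the "table,column" line layout, are made up and are not the current importer's format. getIndexes(..) is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical contract the core importer would talk to (names are assumptions). */
interface MetadataSource {
    List<String> getTables();
    List<String> getColumns(String tableName);
}

/** Hypothetical parser for an assumed "table,column" line layout. */
class FlatFileMetadataSource implements MetadataSource {
    private final Map<String, List<String>> columnsByTable = new LinkedHashMap<>();

    FlatFileMetadataSource(List<String> lines) {
        for (String line : lines) {
            String[] parts = line.split(",");
            if (parts.length != 2) {
                continue; // skip lines that do not match the assumed layout
            }
            columnsByTable
                .computeIfAbsent(parts[0].trim(), k -> new ArrayList<>())
                .add(parts[1].trim());
        }
    }

    @Override
    public List<String> getTables() {
        return new ArrayList<>(columnsByTable.keySet());
    }

    @Override
    public List<String> getColumns(String tableName) {
        return columnsByTable.getOrDefault(tableName, List.of());
    }
}

class ImporterSketch {
    /** Core importer logic: depends only on the interface, not on the file format. */
    static String describe(MetadataSource source) {
        StringBuilder out = new StringBuilder();
        for (String table : source.getTables()) {
            out.append(table).append(source.getColumns(table)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        MetadataSource source = new FlatFileMetadataSource(
            List.of("CUSTOMER,ID", "CUSTOMER,NAME", "ORDERS,ID"));
        System.out.print(describe(source));
    }
}
```

A different file format would only need another class implementing the same interface; describe() would not change.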
(2) Enable the importer to accept all the metadata that a source model can manage.
The current logic only supports a limited amount of metadata. The new approach is to accept everything specified in the file that can be mapped to a metadata type in the model. Anything that doesn't map should be logged.
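The "accept what maps, log what doesn't" rule could be sketched roughly like this. The property names in SUPPORTED are placeholders I made up; the real list would come from whatever the source model can hold. The class names are hypothetical too.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

class ModelPropertyMapper {
    private static final Logger LOG = Logger.getLogger(ModelPropertyMapper.class.getName());

    // Assumption: a stand-in for the metadata types a source model can hold.
    static final Set<String> SUPPORTED = Set.of("name", "datatype", "length", "nullable");

    final Map<String, String> mapped = new LinkedHashMap<>();
    final List<String> unmapped = new ArrayList<>();

    /** Keeps every property that maps to the model and logs the rest. */
    void accept(Map<String, String> fileProperties) {
        for (Map.Entry<String, String> entry : fileProperties.entrySet()) {
            String key = entry.getKey().trim().toLowerCase(Locale.ROOT);
            if (SUPPORTED.contains(key)) {
                mapped.put(key, entry.getValue());
            } else {
                unmapped.add(entry.getKey());
                LOG.warning("No model mapping for '" + entry.getKey() + "'; value ignored");
            }
        }
    }
}

class MapperDemo {
    public static void main(String[] args) {
        Map<String, String> column = new LinkedHashMap<>();
        column.put("Name", "CUSTOMER_ID");
        column.put("Datatype", "integer");
        column.put("Color", "blue"); // not a model property, so it is logged
        ModelPropertyMapper mapper = new ModelPropertyMapper();
        mapper.accept(column);
        System.out.println("mapped=" + mapper.mapped + " unmapped=" + mapper.unmapped);
    }
}
```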
-1 on both options.
What is the use case you are trying to solve? Are we trying to write a generic importer that can work with any source? If that is the case, writing the logic you are describing is the whole crux of the importer implementation. Actually, this text-based parsing is more complicated and more work; let me explain the steps:
source system --> source meta data --> conversion to custom text form --> parsing from custom text form --> (JDBC Driver | Eclipse Importer)
Unless there is a need to persist the metadata in an externalized form and read it back (for transport), this is unnecessary.
Since, in order to use a *custom source*, the user needs to write a *translator*, and the translator already has facilities to expose the metadata for dynamic vdb purposes, why not write a generic importer that reads from the translator directly? If we can script it right, we can deploy a "dynamic vdb" that has this custom translator exposing the metadata, and we can read it through the Teiid JDBC driver in Eclipse.
Source System --> Translator --> Teiid (already there) --> Teiid JDBC (optional, already there) --> Eclipse Importer (JDBC importer already there; if we go directly to the translator we have more work, as we need to house the translator and its connection semantics)
If I understood your use case wrong, just ignore what I said above.
The use case is the ability for other modeling systems (e.g., Rose, Casewise, etc.) to export their models in a "form" that Designer can import.
The problem with "form" here is that Designer doesn't understand their form. So some form of conversion (translator) will be needed to parse the file and provide the information in a form that Designer can understand and use to create a model.
Could a Teiid translator be created to read through the Teiid JDBC driver when using the Designer JDBC importer? Sure, but we don't currently use the Teiid translators in this fashion, yet. If we go down that direction, I think it's more work for the user to create a custom translator just to parse a text file. The user now has to load up all the appropriate metadata objects that the translator will expose. In Option 1 above, they wouldn't have to go that far.
Also, running Teiid in embedded mode isn't currently supported. This would be a usability issue.
Ok, here the need is to convert UML models into relational models. I am afraid these will be in proprietary formats, and it will be hard to convert them into the Teiid-specific formats. However, these tools might already have ways to generate DDL from them. I would say that, as long as we can import DDL, that is what we should say we support.
The client / user has to write the one-off, and that means it has hardly a chance of being reused by others. And not every vendor is going to produce a format that we can accept, as we've encountered here. Having an integration point just makes it easier for users to meet their demands on a timely basis without having to work through another vendor. I think that's a plus for us.
DDL is not a proprietary format. This is not for the client/user. These are tools written by vendors that save information in a proprietary format. If anybody is going to write a converter, the vendor needs to write it, as they are the only ones who know how to interpret these files.
If you argue that the user knows the format, then have them generate the Teiid Text File Metadata format, for which we already have an importer. There is no reason to write another API, importer, and text format that do exactly the same thing.
Is it 'safe' to assume that the JDBC meta-data is enough to describe all the possible information that the query-analysis and optimisation engine requires to make informed decisions? Although that is one of the intents of the API, arguably a Data Virtualisation tool might be considered a special case insofar as, not only does it have to look at the capabilities of a source, it may also want to take into account something like selectivity statistics or uniqueness. And then, from this, take a wider view across multiple sources to identify the best way to approach the parts of the query.
Given that this (the ability of JDBC meta-data to provide all of this) seems unlikely to me, one might want to consider an alternative, which might include starting from the JDBC API and extending it.
The flat-file API seems like a lowest common denominator, i.e. it can work with just about anything. At the moment, the flat-file format is mostly quite straightforward. Writing bespoke translators doesn't appear too onerous. However, I've just started, so I might change my mind!
Implementing support for multiple formats means embedding (or configuring) that functionality into the product. Whether Teiid is going to define the format, or whether it's going to fit in with schemas defined by others and have to deal with their different nuances, would be a big input to the direction. If one were to support multiple file formats, it would seem likely that the latter option is what is intended (to get the most apparent benefit), and yet the development and ongoing effort could be quite considerable.
Although I often shy away from it, in this case an XML format may be a more flexible and future-proof way forward.
The benefit of option 1 is that Designer doesn't have to define the format. It only dictates the interface and the data structure of the information that the implementor has to provide. In its first release, Designer can provide support for the flat file format (which is what the current importer supports). But that doesn't preclude someone from implementing their own conversion for a differently formatted file (e.g., XML). In a future release, XML or other formats could be added as options. As always, with this approach, someone could extend an existing implementation or create one to meet their needs.
Also, if you plan on writing a converter to match the flat file format, you could write that as one of the implementations and deploy it in your Designer installs, so that for modelers this conversion is done as part of a simpler process, not as a one-off task.
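As a rough illustration of "ship a default, let others plug in their own", the selection of implementations could look something like the registry below. This is only a sketch under my own assumptions: TableNameParser, ParserRegistry, and both sample formats are invented names, and inside Designer an Eclipse extension point would probably be the more natural mechanism (not shown here).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, deliberately tiny parser contract: file contents in, table names out. */
interface TableNameParser {
    List<String> tableNames(String fileContents);
}

/** Hypothetical registry keyed by the format name a user would pick in the GUI. */
class ParserRegistry {
    private final Map<String, TableNameParser> parsers = new LinkedHashMap<>();

    void register(String formatName, TableNameParser parser) {
        parsers.put(formatName, parser);
    }

    List<String> formatNames() {
        return new ArrayList<>(parsers.keySet());
    }

    TableNameParser lookup(String formatName) {
        TableNameParser parser = parsers.get(formatName);
        if (parser == null) {
            throw new IllegalArgumentException("Unknown format: " + formatName);
        }
        return parser;
    }
}

class RegistryDemo {
    /** Assumed flat-file layout for this sketch: one table name per line. */
    static List<String> parseFlat(String contents) {
        List<String> names = new ArrayList<>();
        for (String line : contents.split("\n")) {
            if (!line.trim().isEmpty()) {
                names.add(line.trim());
            }
        }
        return names;
    }

    /** What the product might ship with: only the default flat-file parser. */
    static ParserRegistry defaultRegistry() {
        ParserRegistry registry = new ParserRegistry();
        registry.register("flat-file", RegistryDemo::parseFlat);
        return registry;
    }

    public static void main(String[] args) {
        ParserRegistry registry = defaultRegistry();
        // A user-supplied implementation for a made-up semicolon-separated format.
        registry.register("semicolon", contents -> List.of(contents.split(";")));
        System.out.println(registry.formatNames());
        System.out.println(registry.lookup("semicolon").tableNames("CUSTOMER;ORDERS"));
    }
}
```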