The concept of metadata is by no means new, but with the advent of XML we are being made more aware of it. A repository in the IT sense is a text-aware database that can store data about data, known as metadata. It is related to XML in that XML marks up text with information about the text, i.e. the metadata, and transmits both data and metadata together. This is far more flexible than the older technique of predefining the format of a message and requiring both the sending and receiving parties to obey the rules strictly.
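The difference between the two techniques can be illustrated with a minimal Python sketch. The field names, column widths and sample order below are assumptions invented for the illustration and do not come from any real message standard. The fixed-format parser only works if both parties share the layout table in advance, whereas the XML parser reads the field names from the message itself, so a sender can add a field without breaking the receiver.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed layout agreed in advance by sender and receiver:
# (field name, width in characters).
FIXED_LAYOUT = [("name", 10), ("date", 8), ("amount", 6)]


def parse_fixed(record: str) -> dict:
    """Slice a fixed-format record using the pre-agreed layout."""
    fields = {}
    pos = 0
    for field_name, width in FIXED_LAYOUT:
        fields[field_name] = record[pos:pos + width].strip()
        pos += width
    return fields


def parse_xml(message: str) -> dict:
    """Read field names (the metadata) and values from the message itself."""
    root = ET.fromstring(message)
    return {child.tag: (child.text or "").strip() for child in root}


# The same hypothetical order in both forms; the XML version also carries
# an extra field that the fixed layout has no room for.
fixed_record = "Smith     20010315001250"
xml_message = (
    "<order>"
    "<name>Smith</name>"
    "<date>20010315</date>"
    "<amount>001250</amount>"
    "<currency>GBP</currency>"
    "</order>"
)

fixed_fields = parse_fixed(fixed_record)
xml_fields = parse_xml(xml_message)
```

In this sketch the extra `currency` element reaches the receiver without any change to the parsing code, while the fixed-format record would need a new layout agreed and deployed on both sides.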
XML, however, is just one example of the growing awareness of the importance of metadata; it is by no means the only one. Almost every part of a corporate IT system needs to define details in a document format that is both machine and human readable. Unfortunately, over the years every separate entity has acquired its own form of documentation and as a result there are few, if any, standards. In fact the current situation can best be described as a shambles.
The old CASE life-cycle model illustrates the problem. To begin with, the business design should be modelled and tested. Few organisations do this at all, but those that do probably make an effort to document their models, because they will be using software tools which should maintain a repository of the design details. The business model is then converted into a system model. This should be automatic, but it seldom is. Inevitably a second repository is established (or none at all!). Then the database design is undertaken and the program specification generated, which the developers use as the basis for the code. When the application is implemented there are numerous operational tables to be maintained to define ownership, validity, database design, etc. This often involves setting up tables for TP monitors and communication networks. Then there is the nightmare of looking after all those vulnerable PCs on the desktops. And now we have CRM, which means another database for the data warehouse, and yet another repository.
In an ideal world the model details would be passed from each stage of the cycle to the next. Even more significantly, if a change is made lower down the cycle, say to a code module, then that change should be reflected back into the business model and the model rechecked for consistency. But this doesn't happen, and it is not too difficult to come to the conclusion that it probably never will.
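As a rough illustration of what reflecting a change back up the cycle would involve, the following Python sketch models a single shared repository in which each artefact records the artefacts it was derived from. The class names, stage labels and the "order" example are all hypothetical, chosen only to mirror the stages described above. It is a sketch of the idea, not a description of any real repository product.

```python
from dataclasses import dataclass, field


@dataclass
class Artefact:
    name: str
    stage: str
    derived_from: list = field(default_factory=list)  # names of upstream artefacts
    needs_recheck: bool = False


class Repository:
    """Hypothetical single repository shared by every life-cycle stage."""

    def __init__(self):
        self.artefacts = {}

    def add(self, name, stage, derived_from=()):
        self.artefacts[name] = Artefact(name, stage, list(derived_from))

    def record_change(self, name):
        """Flag every artefact upstream of `name` for a consistency recheck."""
        flagged = []
        pending = list(self.artefacts[name].derived_from)
        while pending:
            current = self.artefacts[pending.pop()]
            if not current.needs_recheck:
                current.needs_recheck = True
                flagged.append(current.name)
                pending.extend(current.derived_from)
        return flagged


repo = Repository()
repo.add("order-process", "business model")
repo.add("order-system", "system model", ["order-process"])
repo.add("order-schema", "database design", ["order-system"])
repo.add("order-entry", "code module", ["order-system", "order-schema"])

# A change to the code module should ripple back to every upstream model.
flagged = repo.record_change("order-entry")
```

The sketch only works because all four stages live in one repository with explicit links between them. With a separate repository per stage, as described above, there is nothing for `record_change` to follow.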
In effect each stage of the old life-cycle model has its own "repository", which may be anything from typed specifications to a fully populated XML repository. But there is so much already invested and so much going on that no one appears to have the time to invest in a single interrelated repository. No one is ever in the enviable situation of having a completely new IT systems environment. Years ago IBM tried to exploit the CASE life-cycle model idea with AD/Cycle, which failed for the above reasons. It was undoubtedly the right idea, but it showed just how difficult it is to put an idea into practice. Since those days there have been one or two specialist repositories introduced, Brochade for instance. One of the problems that undermined the AD/Cycle effort was the huge amount of storage needed for metadata. That is no longer a problem, given the cost effectiveness of Intel servers and modern disc arrays, but none of the current players has the size of IBM to make an enterprise-level impact.
There have been some success stories in defining standards to exchange metadata between independent repositories, both between CASE design subsystems and between databases, ERP systems and data warehouses. Now is the time to put some real effort behind the current XML-based repository standards, which should be applicable to the whole life-cycle. Nevertheless, each software product vendor still favours its own limited-function repository as part of its revenue stream, and while vendors are keen to exchange metadata, they aren't so keen on a single common repository from some other vendor. It is well worth watching the progress of XML-related repository standards and of the specialist repository vendors, but there is also an opportunity in the offing for an XML database vendor, such as Software AG with Tamino.
Martin Healey, pioneer in the development of Intel-based computers and client/server architecture. Director of a number of IT specialist companies and an Emeritus Professor of the University of Wales.