CSAIL Publications and Digital Archive

Research Abstracts - 2007

Morpheus: A Data Integration Toolkit

Michael Stonebraker, Tiffany Dohzen & Mujde Pamuk

Data warehouses have been prevalent in large enterprises for at least the last decade. Practitioners report difficulty in identifying and constructing the transformations required to move operational data into a common global schema. Moreover, whenever the global schema changes, additional transform effort is required. One CIO at a very large e-commerce company reported that his warehouse schema changes once a week and keeping up with the transforms is his biggest warehouse problem.

One of us had the same experience when trying to sell federated database technology in the late 1990s. Specifically, constructing the transforms required to support the global schema was the major impediment to adoption of this technology. As enterprise information integration (EII) and cross-enterprise web services become more prevalent, they will face the same issue: writing and modifying transformations is difficult, tedious and costly.

The authors have previously built the Morpheus data transformation system to address this problem. The basic idea is to assemble a large number of transforms in a repository (based on POSTGRES), to support sophisticated searching of the repository to find transforms of interest, and to make it easy to modify these transforms into new ones, which can then be added to the repository. Taken together with the substantial amount of information associated with them, the transforms can be thought of as an enterprise metadata repository.
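The repository idea described above can be illustrated with a minimal sketch. It is not the Morpheus implementation: the names Transform, Repository, search and derive, the metadata fields, and the example transforms are all hypothetical, and an in-memory list stands in for the POSTGRES-based store.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Transform:
    """One repository entry: a function plus searchable metadata (field names are hypothetical)."""
    name: str
    input_type: str
    output_type: str
    func: Callable
    description: str = ""
    derived_from: Optional[str] = None  # name of the transform this one was modified from


class Repository:
    """In-memory stand-in for the repository; the real system is described as POSTGRES-based."""

    def __init__(self) -> None:
        self._transforms: List[Transform] = []

    def add(self, transform: Transform) -> None:
        self._transforms.append(transform)

    def search(self, input_type=None, output_type=None, keyword=None) -> List[Transform]:
        """Return the transforms that match every criterion that was supplied."""
        hits = []
        for t in self._transforms:
            if input_type is not None and t.input_type != input_type:
                continue
            if output_type is not None and t.output_type != output_type:
                continue
            if keyword is not None and keyword.lower() not in t.description.lower():
                continue
            hits.append(t)
        return hits

    def derive(self, base_name, new_name, func, description="") -> Transform:
        """Register a modified version of an existing transform and record its origin."""
        base = next(t for t in self._transforms if t.name == base_name)
        new = Transform(new_name, base.input_type, base.output_type,
                        func, description, derived_from=base.name)
        self.add(new)
        return new


# Made-up example data: the conversion rate is illustrative only.
repo = Repository()
repo.add(Transform("usd_to_eur", "USD", "EUR",
                   lambda x: x * 0.9, "currency conversion at a made-up rate"))
derived = repo.derive("usd_to_eur", "usd_to_eur_rounded",
                      lambda x: round(x * 0.9, 2), "currency conversion, rounded to cents")
```

In this sketch a modified transform is stored as a new entry that points back to its source, so reuse accumulates in the repository instead of overwriting earlier work.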

Based on feedback from demonstrating the system at SIGMOD 2006, as well as to numerous CIOs and others, we have embarked on a complete redesign of the system. The redesign is intended to support a state-based browsing paradigm, the ability to filter transforms based on lineage and input-output properties, and best-fit search for composite transforms. It also includes a novel crawler that searches for transforms of interest either within an enterprise or across the web. The result will be called Morpheus 2.0, which we hope will be a substantial step forward in transform reuse.
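To make the lineage filter and the composite search more concrete, here is a small sketch under stated assumptions. The abstract does not define "best fit", so the sketch simply assumes it means the chain with the fewest steps and finds it with a breadth-first search over types. Entry, with_ancestor, shortest_chain and the example catalog are hypothetical and are not part of Morpheus 2.0.

```python
from collections import deque
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    """Hypothetical catalog row: a transform's name, types, and direct ancestor."""
    name: str
    input_type: str
    output_type: str
    derived_from: Optional[str]


def with_ancestor(entries: List[Entry], ancestor: str) -> List[Entry]:
    """Lineage filter: keep entries derived, directly or transitively, from `ancestor`."""
    parent = {e.name: e.derived_from for e in entries}
    kept = []
    for e in entries:
        seen = set()  # guards against accidental cycles in the lineage data
        cur = e.derived_from
        while cur is not None and cur not in seen:
            if cur == ancestor:
                kept.append(e)
                break
            seen.add(cur)
            cur = parent.get(cur)
    return kept


def shortest_chain(entries: List[Entry], source: str, target: str) -> Optional[List[Entry]]:
    """Breadth-first search over types; returns a fewest-step chain of entries, or None."""
    if source == target:
        return []
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        current, path = queue.popleft()
        for e in entries:
            if e.input_type != current or e.output_type in visited:
                continue
            new_path = path + [e]
            if e.output_type == target:
                return new_path
            visited.add(e.output_type)
            queue.append((e.output_type, new_path))
    return None


# Made-up catalog for illustration.
catalog = [
    Entry("fahrenheit_to_celsius", "fahrenheit", "celsius", None),
    Entry("celsius_to_kelvin", "celsius", "kelvin", None),
    Entry("fahrenheit_to_celsius_rounded", "fahrenheit", "celsius", "fahrenheit_to_celsius"),
    Entry("fahrenheit_to_rankine", "fahrenheit", "rankine", None),
]
chain = shortest_chain(catalog, "fahrenheit", "kelvin")
descendants = with_ancestor(catalog, "fahrenheit_to_celsius")
```

Here no single catalog entry maps fahrenheit to kelvin, so the search returns a two-step composite. A real best-fit ranking would presumably weigh more than chain length, for example how closely the input-output properties match.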


Computer Science and Artificial Intelligence Laboratory (CSAIL)
The Stata Center, Building 32 - 32 Vassar Street - Cambridge, MA 02139 - USA
tel:+1-617-253-0073 - publications@csail.mit.edu