Web-based applications and services publish their data using XML, the de facto standard for sharing data, since the use of XML as a common data representation format aids interoperability with other applications and services. However, because the same information can be published in XML in many different ways, in terms of both structure and terminology, the exchange of XML data is not yet fully automatic. This heterogeneity of XML data has recently led to research in areas such as schema matching, schema transformation and schema integration in the context of XML data, in an attempt to enhance data sharing between applications. The development of algorithms that automate these tasks, thereby reducing the time and effort spent on creating and maintaining data sharing applications, is highly beneficial for many domains: examples range from generic frameworks, such as those for XML messaging and component-based development, to applications and services in e-business, e-science and e-learning.
This thesis addresses the problem of sharing XML data between applications. In particular, we have developed an approach to the transformation and integration of heterogeneous XML data sources. Our approach is schema-based, meaning that its output is a set of mappings between a source and a target schema, in a data transformation scenario, or sets of mappings between several source schemas and one integrated target schema, in a data integration scenario. Our mappings specify the relationships between data sources not only at the schema level but also at the data level, and they can be used for querying or materializing the target schema using data from one or more data sources.