Implementing Semantic Web applications: reference architecture and challenges

Citation: Benjamin Heitmann, Sheila Kinsella, Conor Hayes, Stefan Decker (2008/10) Implementing Semantic Web applications: reference architecture and challenges. 5th International Workshop on Semantic Web-Enabled Software Engineering
Internet Archive Scholar (search for fulltext): Implementing Semantic Web applications: reference architecture and challenges
Download: http://ceur-ws.org/Vol-524/swese2009_2.pdf
Tagged: Computer Science, Semantic Web, software engineering, reference architecture

Summary

Overview

This paper presents a reference architecture for Semantic Web applications, based on an empirical survey of 98 Semantic Web applications.

The paper is motivated in part by the need for cost-benefit analyses of Semantic Web applications. While it does not provide a cost estimate, it identifies the relevant components of Semantic Web applications, for use in future cost-estimation studies.

The paper may also be useful for finding surveys of Semantic Web applications: its related work section mentions 4 other empirical surveys.

Reference architecture

The paper distinguishes seven main components and indicates whether all, most, or some of the surveyed applications implemented each one.

Almost always implemented (90%-100%)

  1. Data interface (100%)
  2. User interface (92%)

Often implemented (70%-80%)

  1. Search Service (81%)
  2. Integration Service (72%)

Sometimes implemented (30%-40%)

  1. Crawler (35%)
  2. Persistent storage (35%)
  3. Authoring Interface (32%)
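
As a rough illustration, the figures above can be held in a small data structure and grouped into frequency bands. This is only a sketch: the `Component` class, the `band` and `group_by_band` functions, and the cut-offs of 90 and 70 are assumptions made for this example, chosen to match the headings above rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    share: int  # percentage of surveyed applications implementing it


# Figures as listed in the summary above.
COMPONENTS = [
    Component("Data interface", 100),
    Component("User interface", 92),
    Component("Search service", 81),
    Component("Integration service", 72),
    Component("Crawler", 35),
    Component("Persistent storage", 35),
    Component("Authoring interface", 32),
]


def band(component: Component) -> str:
    """Assign a frequency band; the cut-offs are illustrative assumptions."""
    if component.share >= 90:
        return "almost always"
    if component.share >= 70:
        return "often"
    return "sometimes"


def group_by_band(components):
    """Return a dict mapping each band to the component names in it."""
    groups = {}
    for c in components:
        groups.setdefault(band(c), []).append(c.name)
    return groups
```

Calling `group_by_band(COMPONENTS)` reproduces the three groups listed above, which may be a convenient starting point for attaching cost figures to each component in a later study.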

The main challenges for implementing Semantic Web technologies

1. Integrating noisy and heterogeneous data

"The majority of applications rely on data integration, but in order to implement it, expensive human intervention is necessary and knowledge about reasoning and inferencing needs to be acquired by the software engineers." Three particular problems are discussed:

  1. Use of non-standard (undefined) terms
  2. Incorrect usage of vocabularies (contrary to their intended usage)
  3. Multiple URIs for the same objects (identifiers aren't unique)

Furthermore, data integration may require multiple components, such as a crawler and a search service, in addition to the integration service itself.
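
To make the third problem (multiple URIs for the same object) concrete, the sketch below merges URIs that are linked by `owl:sameAs` statements. It is not the paper's method: it assumes triples are plain `(subject, predicate, object)` string tuples, that identity links are already present in the data, and the function names `canonical_uris` and `consolidate` are hypothetical. The merging itself uses a standard union-find structure.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def canonical_uris(triples):
    """Map every URI linked by owl:sameAs to one representative URI."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                # Keep the lexicographically smaller URI as representative
                # so the result does not depend on input order.
                low, high = sorted((root_s, root_o))
                parent[high] = low
    return {uri: find(uri) for uri in parent}


def consolidate(triples):
    """Rewrite subjects and objects to their representative URI."""
    canon = canonical_uris(triples)
    merged = set()
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            continue  # identity links are folded into the rewrite
        merged.add((canon.get(s, s), p, canon.get(o, o)))
    return merged
```

In practice the identity links themselves are often missing or wrong, which is where the "expensive human intervention" quoted above comes in; the sketch only covers the mechanical step that follows.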

2. Mismatch of data models and APIs between components

The data models of the components need to be mapped to one another, e.g. relational databases, object-oriented data, and RDF (a graph data model). While conventional web applications benefit from existing mappings, the Semantic Web developer may be obliged to "provide an abstraction layer on top of the RDF data model himself."
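
A minimal sketch of such a hand-written abstraction layer is shown below. It assumes triples are plain `(subject, predicate, object)` string tuples and maps two FOAF properties (`foaf:name` and `foaf:knows`) onto a Python object; the `Person` class and the `load_people` function are hypothetical names introduced for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field

FOAF = "http://xmlns.com/foaf/0.1/"


@dataclass
class Person:
    uri: str
    name: str = ""
    knows: list = field(default_factory=list)  # URIs of known people


def load_people(triples):
    """Turn FOAF triples into Person objects keyed by URI."""
    by_subject = defaultdict(list)
    for s, p, o in triples:
        by_subject[s].append((p, o))

    people = {}
    for uri, properties in by_subject.items():
        person = Person(uri)
        for p, o in properties:
            if p == FOAF + "name":
                person.name = o
            elif p == FOAF + "knows":
                person.knows.append(o)
            # Other predicates are ignored by this sketch.
        people[uri] = person
    return people
```

Even this small mapping has to decide what to do with unknown predicates and multi-valued properties, which hints at why the mismatch between graph and object models is listed as a challenge.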

3. Missing or belated conventions and standards

"There are many different export and access mechanisms for RDF data, from putting an RDF dump on a web server, embedding links to RDF data in HTML or providing a SPARQL endpoint." and "Authoritative recommendations for making RDF accessible over the Web were not available until 2006, when Tim Berners-Lee published a design note (http://www.w3.org/DesignIssues/LinkedData.html) which established the Linked Data principles."
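
Two of these access mechanisms can be sketched with the Python standard library alone. The example below only builds the HTTP requests and does not send them; the endpoint and resource URLs in the usage note are hypothetical, and the function names are introduced for illustration. The first function follows the SPARQL Protocol convention of passing the query in a `query` parameter of a GET request; the second dereferences a resource URI and asks for RDF/XML through content negotiation.

```python
from urllib.parse import urlencode
from urllib.request import Request


def sparql_request(endpoint: str, query: str) -> Request:
    """Build (but do not send) a SPARQL Protocol query via HTTP GET."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def linked_data_request(resource_uri: str) -> Request:
    """Build a request that dereferences a URI and asks for RDF/XML."""
    return Request(resource_uri, headers={"Accept": "application/rdf+xml"})
```

For example, `sparql_request("http://example.org/sparql", "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")` yields a request that could be passed to `urllib.request.urlopen`. A client that has to support both mechanisms, plus RDF dumps and links embedded in HTML, needs a separate code path for each.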

4. Distribution of application logic across computers

Applications may use inferencing and reasoning, formal vocabularies, and an RDF query language, which "results in the application logic being distributed across the different components."
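
The toy example below illustrates how one piece of application logic ("who counts as a person?") can end up split across a vocabulary, a reasoner, and a query. It is a sketch under stated assumptions: triples are plain string tuples, the `http://example.org/` namespace and its classes are hypothetical, and the reasoner implements only the RDFS subclass rule.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace

# 1. Vocabulary: part of the logic lives in the schema.
VOCABULARY = {(EX + "Student", RDFS_SUBCLASS, EX + "Person")}

# 2. Instance data, as it might arrive from a data interface.
DATA = {(EX + "alice", RDF_TYPE, EX + "Student")}


def infer_types(triples):
    """Reasoner step: apply the RDFS subclass rule until nothing changes."""
    closure = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for s, p, o in closure if p == RDFS_SUBCLASS]
        for s, p, o in list(closure):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass:
                if o == sub and (s, RDF_TYPE, sup) not in closure:
                    closure.add((s, RDF_TYPE, sup))
                    changed = True
    return closure


def instances_of(triples, cls):
    """Query step: select all subjects typed as cls."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == cls}
```

Querying the raw data for `EX + "Person"` finds nothing, while querying the output of `infer_types` finds `alice`. The result therefore depends on the vocabulary, the reasoner, and the query together, and a change to any one of them changes the behaviour of the application.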

Theoretical and Practical Relevance

Cost-benefit analyses can use this reference architecture. The paper highlights the importance of reusable implementations for frequently used components. It suggests simplifying the architecture of Semantic Web applications, in particular by outsourcing data integration and using software frameworks.