Implementing Semantic Web applications: reference architecture and challenges

{{Summary
 * title=Implementing Semantic Web applications: reference architecture and challenges
 * authors=Benjamin Heitmann, Sheila Kinsella, Conor Hayes, Stefan Decker
 * url=http://ceur-ws.org/Vol-524/swese2009_2.pdf
 * tags=Semantic Web, software engineering, reference architecture
 * summary=
==Overview==

This paper presents a reference architecture for Semantic Web applications, based on an empirical survey of 98 Semantic Web applications.

The paper is motivated in part by the need for cost-benefit analyses of Semantic Web applications. While it does not provide a cost estimate, it identifies the relevant components of Semantic Web applications, for use in future cost estimation studies.

The paper may also be useful for finding surveys of Semantic Web applications: its related work section mentions four other empirical surveys.

==Reference architecture==
Seven main components were distinguished. For each component, the paper indicates whether all, most, or some of the surveyed applications implemented it.

Almost always implemented (90%-100%)

 * 1) Data interface (100%)
 * 2) User interface (92%)

Often implemented (70%-80%)

 * 1) Search Service (81%)
 * 2) Integration Service (72%)

Sometimes implemented (30%-40%)

 * 1) Crawler (35%)
 * 2) Persistent storage (35%)
 * 3) Authoring Interface (32%)

==The main challenges for implementing Semantic Web technologies==
1. Integrating noisy and heterogeneous data

"The majority of applications rely on data integration, but in order to implement it, expensive human intervention is necessary and knowledge about reasoning and inferencing needs to be acquired by the software engineers." Furthermore, data integration may require multiple components, such as a crawler and a search service, in addition to the integration service itself.

Three particular problems are discussed:
 * 1) Use of non-standard (undefined) terms
 * 2) Incorrect usage of vocabularies (contrary to their intended usage)
 * 3) Multiple URIs for the same objects (identifiers aren't unique)
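
To illustrate the third problem, the following is a minimal sketch of one common way to consolidate multiple URIs for the same object. It is not taken from the paper. It assumes that equivalences between URIs are already known as pairs (for example from `owl:sameAs` statements) and uses a union-find structure to pick one representative per group. The function names and the example URIs are hypothetical.

```python
def canonical_uris(same_as_pairs):
    """Return a dict mapping each URI to one representative of its equivalence class."""
    parent = {}

    def find(uri):
        # Register unseen URIs as their own root, then walk up to the root.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps chains short
            uri = parent[uri]
        return uri

    for left, right in same_as_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # Keep the lexicographically smaller root so the result is deterministic.
            keep, drop = sorted((root_left, root_right))
            parent[drop] = keep

    return {uri: find(uri) for uri in parent}


def rewrite_triples(triples, mapping):
    """Replace known aliases in subject and object position with their representative."""
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in triples}


# Hypothetical example data: three URIs that identify the same person.
pairs = [
    ("http://a.example/tim", "http://b.example/tbl"),
    ("http://b.example/tbl", "http://c.example/timbl"),
]
triples = {
    ("http://b.example/tbl", "http://xmlns.com/foaf/0.1/name", "Tim"),
    ("http://c.example/timbl", "http://xmlns.com/foaf/0.1/name", "Tim"),
}
mapping = canonical_uris(pairs)
merged = rewrite_triples(triples, mapping)
```

After rewriting, the two triples about different aliases collapse into a single triple about the representative URI. In practice, deciding which URIs are equivalent is the expensive part that the paper attributes to human intervention; the sketch only covers the mechanical merge.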

2. Mismatch of data models and APIs between components

The data models used by different components need to be mapped onto one another, e.g. relational databases, object-oriented data, and RDF (a graph data model). While web applications benefit from existing mappings, the Semantic Web developer may be obliged to "provide an abstraction layer on top of the RDF data model himself."
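
As a rough illustration of what such an abstraction layer might involve, the sketch below wraps a set of (subject, predicate, object) triples in an object with attribute access. It is an assumption-laden toy, not the paper's design or any library's API: the `Resource` class and the example URIs are hypothetical, and only the FOAF namespace URI is real.

```python
from collections import defaultdict

FOAF = "http://xmlns.com/foaf/0.1/"


class Resource:
    """Read-only object view of one subject in a collection of RDF-style triples."""

    def __init__(self, uri, triples, namespace):
        self.uri = uri
        self._values = defaultdict(list)
        for s, p, o in triples:
            # Only predicates from the given namespace become attributes.
            if s == uri and p.startswith(namespace):
                self._values[p[len(namespace):]].append(o)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        values = self.__dict__.get("_values", {})
        if name in values:
            return values[name]
        raise AttributeError(name)


# Hypothetical example data.
alice = "http://example.org/alice"
triples = [
    (alice, FOAF + "name", "Alice"),
    (alice, FOAF + "knows", "http://example.org/bob"),
    (alice, FOAF + "knows", "http://example.org/carol"),
]
person = Resource(alice, triples, FOAF)
names = person.name    # list of values, since RDF properties can be multi-valued
friends = person.knows
```

Every attribute returns a list because a graph model allows any number of values per property, which is one concrete place where the graph and object models do not line up.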

3. Missing or belated conventions and standards

"There are many different export and access mechanisms for RDF data, from putting an RDF dump on a web server, embedding links to RDF data in HTML or providing a SPARQL endpoint." and "Authoritative recommendations for making RDF accessible over the Web were not available until 2006, when Tim Berners-Lee published a design note ([http://www.w3.org/DesignIssues/LinkedData.html http://www.w3.org/DesignIssues/LinkedData.html]) which established the Linked Data principles."

4. Distribution of application logic across computers

Inferencing and reasoning, formal vocabularies, and an RDF query language may all be used, which "results in the application logic being distributed across the different components."

==Data Sets==

 * Architectural analysis
 * Questionnaire results

==Conference Details==
}}
 * Presentation Slides
 * Workshop Description (5th International Workshop on SW-Enabled Software Engineering); won best paper award
 * relevance=Cost-benefit analyses can use this reference architecture. The paper highlights the importance of reusable implementations for frequently used components, and suggests simplifying the architecture of Semantic Web applications, in particular by outsourcing data integration and using software frameworks.
 * journal=5th International Workshop on Semantic Web-Enabled Software Engineering
 * pub_date=2009/10
 * subject=Computer Science