The system was evaluated in the Ontology Alignment Evaluation Initiative (OAEI). DSSim participated in 2006, 2007, 2008 and 2009, achieving gradually improving results. The following sections present the results for two of the eight tracks of OAEI 2008.
=== Library track at OAEI 2008 ===
According to the original task definition provided by the organizers of OAEI 2008, the library track involved the alignment of two Dutch thesauri. These thesauri are used to index books from two collections held by the National Library of the Netherlands (KB). KB maintains two large collections: the Deposit Collection, containing all Dutch printed publications (one million items), and the Scientific Collection, with about 1.4 million books, mainly about the history, language and culture of the Netherlands. Each collection is described according to its own indexing system and conceptual vocabulary. The Scientific Collection was described using the GTT, a large vocabulary containing 35,000 general concepts ranging from Wolkenkrabbers (Skyscrapers) to Verzorging (Care). The books in the Deposit Collection are mainly indexed against the Brinkman thesaurus, which contains a large set of headings (more than 5,000) that were expected to serve as global subjects of books. For each concept, the thesauri provided the usual lexical and semantic information: preferred labels, synonyms and notes, broader and related concepts, etc. The language of both thesauri was Dutch, but a substantial part of the Brinkman concepts (around 60%) come with English labels.

The library track was difficult partly because of its relatively large size and partly because of its multilingual representation. Nevertheless, DSSim performed best of the three participating systems in this track. However, because these ontologies contain related and broader terms, the mapping can be carried out without consulting multilingual background knowledge.
=== Directory track at OAEI 2008 ===
As stated in the original task definition provided by the organizers of OAEI 2008, this track is designed to evaluate mapping quality in a real-world taxonomy integration scenario. The main objective is to measure whether ontology alignment tools can be applied effectively to the integration of "shallow ontologies". The evaluation dataset was extracted from the Google, Yahoo! and Looksmart web directories. The ontology pairs were created by relying on a reference interpretation for nodes, constructed by looking at their use. The assumption was that the semantics of nodes could be derived from their pragmatics, namely from analysing which documents were classified under which nodes. The basic idea was therefore to compute the relationship hypotheses based on the co-occurrence of documents. The specific characteristics of the dataset were:
• More than 4500 node matching tasks, where each node matching task is composed of the paths to root of the nodes in the web directories.
• Expert mappings for all the matching tasks.
• Simple relationships: web directories contain only one type of relationship, the so-called "classification relation".
• Vague terminology and modeling principles: the matching tasks incorporate the typical "real world" modeling and terminological errors.
Only six systems participated in the directory track in 2008. In terms of F-value, DSSim performed best; however, the difference was marginal compared to the CIDER and Lily systems.
== References ==