Preliminary proposal for a metadata scheme for the description of LRs
Authors: Maria Gavrilidou; Penny Labropoulou; Elina Desipri; Stelios Piperidis
Book title: Language Resources and Evaluation Conference (LREC 2010) - Workshop on Language Resource and Language Technology Standards – state of the art, emerging needs, and future developments
This paper presents a preliminary version of a metadata scheme for describing language resources and technologies (LRTs). The scheme is currently under development for the needs of META-SHARE, an open distributed facility for the exchange and sharing of LRTs. An essential ingredient in the setup of META-SHARE is the existence of formal, standardised LRT descriptions, a cornerstone of the interoperability layer of any such initiative. In this effort, we build upon previous major metadata standardisation and related activities, in particular the ISOcat DCR (Data Category Registry). We present the main principles and features of the metadata schema, with an emphasis on the proposed ontology and LRT taxonomy and on the strategies adopted for the descriptive mechanism. A stratified approach is suggested: the description of LRTs will be granular and abstractive, combining the taxonomy with an inventory of a maximal set of descriptive elements (catering for broad coverage and completeness), of which only a minimal subset is obligatory (catering for user-friendliness). The schema will be implemented using the "profile/component" model, providing different profiles for different LRT types, with metadata elements pointing to ISOcat Data Categories, thus ensuring semantic interoperability. Moreover, the schema will include a set of relations that link resources appropriately (e.g. LRs to other LRs, or LRs to reference documents).
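To make the "profile/component" idea more concrete, the following is a minimal Python sketch, not the META-SHARE implementation. It assumes that a profile for one LRT type is a collection of reusable components, that each component holds metadata elements, and that each element carries a pointer to a data category and a flag marking it as part of the obligatory minimal subset. All class names, element names (`resourceName`, `language`, `sizeInTokens`), and the `DC-PLACEHOLDER-n` references are hypothetical and introduced only for illustration; they are not real ISOcat identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """One metadata element (hypothetical model, for illustration only)."""
    name: str
    data_category: str        # placeholder pointer, NOT a real ISOcat identifier
    obligatory: bool = False  # True if the element is in the minimal subset


@dataclass(frozen=True)
class Component:
    """A reusable group of elements."""
    name: str
    elements: tuple


@dataclass(frozen=True)
class Profile:
    """The set of components that describes one LRT type."""
    resource_type: str
    components: tuple

    def elements(self):
        # Flatten all elements of all components (the "maximal set").
        return [e for c in self.components for e in c.elements]

    def missing_obligatory(self, description):
        # Names of obligatory elements absent from a description (a dict).
        return [
            e.name
            for e in self.elements()
            if e.obligatory and e.name not in description
        ]


# Hypothetical components and a hypothetical profile for corpora.
identification = Component("identification", (
    Element("resourceName", "DC-PLACEHOLDER-1", obligatory=True),
    Element("description", "DC-PLACEHOLDER-2"),
))
corpus_info = Component("corpusInfo", (
    Element("language", "DC-PLACEHOLDER-3", obligatory=True),
    Element("sizeInTokens", "DC-PLACEHOLDER-4"),
))
corpus_profile = Profile("corpus", (identification, corpus_info))

# A description that omits one obligatory element.
record = {"resourceName": "Example corpus", "sizeInTokens": 1000}
print(corpus_profile.missing_obligatory(record))  # prints ['language']
```

In this sketch, a profile for another LRT type could reuse the `identification` component alongside its own type-specific components, which is one way to read the combination of a shared maximal inventory with type-specific profiles.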