Schema-aware extended Annotation Graphs
Abstract: Multistructured (M-S) documents were introduced to answer the need for ever more expressive data models for scholarly annotation, as experienced in the Digital Humanities. Many proposals go beyond XML, which is the gold standard for annotation, and allow the expression of multilevel, concurrent annotation. However, most of them lack support for algorithmic tasks such as validation and querying, even though these tasks are central in most of their application contexts. In this paper, we focus on two aspects of annotation: data model expressiveness and validation. We introduce extended Annotation Graphs (eAG), a highly expressive graph-based data model suited to the enrichment of multimedia resources. Regarding the validation of M-S documents, we identify algorithmic complexity as a limiting factor. We argue that this limitation can be bypassed provided validity is ensured by construction, that is, by constraining the shape of the data as it is being created. As far as we know, no existing validation mechanism for graph-structured data meets this goal. We define such a mechanism here, based on the simulation relation, loosely following a track initiated in Dataguides. We prove that, thanks to this mechanism, the validity of M-S data with respect to a given schema can be guaranteed without any algorithmic check.
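For readers unfamiliar with the simulation relation mentioned above, the following is a minimal sketch of the textbook notion on edge-labeled graphs: an instance graph is simulated by a schema graph when every labeled edge leaving an instance node can be matched by an edge with the same label leaving a related schema node. It is not the eAG validation mechanism itself, which avoids such a check by construction. The function names (`greatest_simulation`, `conforms`), the edge-triple encoding, and the toy labels (`Title`, `Para`) are assumptions made for illustration only.

```python
def greatest_simulation(inst_edges, schema_edges):
    """Return the largest simulation of the instance graph by the schema graph.

    Both graphs are given as sets of (source, label, target) triples
    (a hypothetical encoding chosen for this sketch).
    """
    inst_nodes = {n for (u, _, v) in inst_edges for n in (u, v)}
    schema_nodes = {n for (u, _, v) in schema_edges for n in (u, v)}

    # Start from all pairs and remove those that violate the simulation condition.
    rel = {(u, s) for u in inst_nodes for s in schema_nodes}
    changed = True
    while changed:
        changed = False
        for (u, s) in list(rel):
            for (src, label, tgt) in inst_edges:
                if src != u:
                    continue
                # The schema node s must offer an edge with the same label
                # whose target is still related to the instance target.
                matched = any(
                    s2 == s and l2 == label and (tgt, t2) in rel
                    for (s2, l2, t2) in schema_edges
                )
                if not matched:
                    rel.discard((u, s))
                    changed = True
                    break
    return rel


def conforms(inst_edges, inst_root, schema_edges, schema_root):
    """An instance conforms if its root is simulated by the schema root."""
    return (inst_root, schema_root) in greatest_simulation(inst_edges, schema_edges)


# Toy schema: a Title edge followed by any number of Para edges.
schema = {("S0", "Title", "S1"), ("S1", "Para", "S1")}
good_instance = {("n0", "Title", "n1"), ("n1", "Para", "n2"), ("n2", "Para", "n3")}
bad_instance = {("n0", "Para", "n1")}

good_ok = conforms(good_instance, "n0", schema, "S0")
bad_ok = conforms(bad_instance, "n0", schema, "S0")
```

This naive fixed-point refinement is kept simple for readability rather than speed. Running such a check on every document is exactly the kind of algorithmic cost that validation by construction is meant to avoid.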