written by
Declerck, Thierry, McCrae, John, Hartung, Matthias, Gracia, Jorge, Chiarcos, Christian, Montiel, Elena, Cimiano, Philipp, Revenko, Artem, Sauri, Roser, Lee, Deirdre, Racioppa, Stefania, Nasir, Jamal, Orlikowski, Matthias, Lanau-Coronas, Marta, Fäth, Christian, Rico, Mariano, Elahi, Mohammad Fazleh, Khvalchik, Maria, Gonzalez, Meritxell, Cooney, Katharine
on 2020-04-01
In this paper we describe the contributions made by the European H2020 project “Prêt-à-LLOD” (‘Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors’) to the further development of the Linguistic Linked Open Data (LLOD) infrastructure…
written by
Hartung, Matthias, Orlikowski, Matthias, Veríssimo, Susana
on 2020-03-12
Rolling out text analytics applications or individual components thereof to multiple input languages of interest requires scalable workflows and architectures that do not rely on manual annotation efforts or language-specific re-engineering per target language. These scalability challenges aggravate…
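One common way to sidestep per-language re-engineering is to train a single model in a shared multilingual embedding space and apply it zero-shot to further languages. A minimal sketch of that pattern follows; the toy pre-aligned vectors and labels are illustrative assumptions, not data from the paper:

```python
# Zero-shot cross-lingual classification sketch: train on one language
# in a shared embedding space, predict on another without new labels.
# Toy pre-aligned vectors stand in for real multilingual embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

emb = {
    "good": [0.9, 0.1, 0.0],   "bad": [-0.8, 0.0, 0.1],         # English
    "gut":  [0.85, 0.12, 0.0], "schlecht": [-0.82, 0.05, 0.1],  # German
}

def featurize(tokens):
    """Average the embeddings of known tokens into one document vector."""
    return np.mean([emb[t] for t in tokens if t in emb], axis=0)

# Train only on English labelled data ...
X_train = np.array([featurize(["good"]), featurize(["bad"])])
clf = LogisticRegression().fit(X_train, np.array([1, 0]))

# ... then apply directly to German input: no German labels needed.
print(clf.predict([featurize(["gut"]), featurize(["schlecht"])]))  # [1 0]
```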
written by
John P. McCrae, Thierry Declerck
on 2020-01-14
In this paper we briefly describe the European H2020 project "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors'). This project aims to increase the uptake of language technologies by exploiting the combination of linked data…
written by
Patricia Martín-Chozas
on 2020-01-02
This Doctoral Consortium paper presents a methodology to automate the creation of rich terminologies from plain text documents, by establishing links to external resources and by adopting the W3C standards for the Semantic Web. The proposed method comprises six tasks: refinement, disambiguation, …
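As an illustration of the kind of output such a pipeline could produce, the sketch below publishes one extracted term as a SKOS concept linked to an external resource, using rdflib. All URIs, including the external concept identifier, are placeholders:

```python
# Minimal sketch: expose an extracted term as a SKOS concept with a
# link to an external resource. All URIs here are illustrative.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, SKOS

EX = Namespace("http://example.org/terms/")

g = Graph()
g.bind("skos", SKOS)

term = EX["contract-law"]
g.add((term, RDF.type, SKOS.Concept))
g.add((term, SKOS.prefLabel, Literal("contract law", lang="en")))
# Link to an (assumed) matching concept in an external resource:
g.add((term, SKOS.exactMatch, URIRef("http://example.org/external/100142")))

print(g.serialize(format="turtle"))
```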
written by
Abromeit, Frank, Chiarcos, Christian
on 2019-11-27
We introduce AnnoHub, an ongoing effort to automatically complement existing language resources with metadata about the languages they cover and the annotation schemes (tagsets) they apply, to provide a web interface for their curation and evaluation by domain experts, and to publish…
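A simple baseline for the tagset-detection part of such a pipeline is to compare the tags observed in a resource against known tag inventories. The sketch below assumes that approach with small illustrative subsets of two real tagsets; it is one plausible heuristic, not AnnoHub's actual implementation:

```python
# Tagset identification by inventory overlap: guess which annotation
# scheme a corpus uses from the tags observed in it.
# The inventories below are small illustrative subsets, not complete.
KNOWN_TAGSETS = {
    "UD-UPOS": {"NOUN", "VERB", "ADJ", "ADP", "DET", "PRON", "AUX"},
    "PennTreebank": {"NN", "NNS", "VB", "VBD", "JJ", "IN", "DT"},
}

def guess_tagset(observed_tags):
    """Return the known tagset with the highest overlap ratio."""
    scores = {
        name: len(observed_tags & inventory) / len(observed_tags)
        for name, inventory in KNOWN_TAGSETS.items()
    }
    return max(scores, key=scores.get), scores

name, scores = guess_tagset({"NN", "VBD", "DT", "JJ"})
print(name, scores)  # PennTreebank scores highest here
```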
written by
Jorge Gracia, Besim Kabashi, Ilan Kernerman, Marta Lanau-Coronas, Dorielle Lonke
on 2019-11-27
The objective of the Translation Inference Across Dictionaries (TIAD) shared task is to explore and compare methods and techniques that infer translations indirectly between language pairs, based on other bilingual/multilingual lexicographic resources. In its second (2019) edition, the participating…
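The core idea behind TIAD is pivot-based inference: if a source word translates to a pivot word, and the pivot word translates onward, the endpoints are translation candidates. Below is a toy sketch with a reverse-path filter, a common noise-reduction heuristic; actual task systems are considerably more sophisticated:

```python
# Pivot-based translation inference sketch (toy data): infer EN->PT
# candidates through ES as pivot, then filter by a reverse path.
en_es = {"dog": {"perro"}, "bank": {"banco", "orilla"}}
es_pt = {"perro": {"cão"}, "banco": {"banco"}, "orilla": {"margem"}}
pt_es = {"cão": {"perro"}, "banco": {"banco"}, "margem": {"orilla"}}

def infer(src_pivot, pivot_tgt, tgt_pivot):
    inferred = {}
    for src, pivots in src_pivot.items():
        for p in pivots:
            for tgt in pivot_tgt.get(p, ()):
                # Keep a candidate only if some reverse path leads back
                # to the same pivot (cuts pivot-ambiguity noise).
                if p in tgt_pivot.get(tgt, ()):
                    inferred.setdefault(src, set()).add(tgt)
    return inferred

print(infer(en_es, es_pt, pt_es))
# {'dog': {'cão'}, 'bank': {'banco', 'margem'}}
```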
written by
Donandt, Kathrin, Chiarcos, Christian
on 2019-11-27
This paper describes our contribution to the Shared Task on Translation Inference across Dictionaries (TIAD-2019). In our approach, we construct a multilingual word embedding space by projecting new languages into the feature space of a language for which a pretrained embedding model exists…
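One standard way to realize such a projection is to learn a linear map from seed translation pairs by least squares. The sketch below uses synthetic vectors and illustrates the general technique only; it is an assumption, not the authors' exact setup:

```python
# Least-squares projection of a "new" language's vectors into the
# embedding space of a pivot language, learned from seed pairs.
import numpy as np

rng = np.random.default_rng(0)
d = 50
# Toy data: pretend these are embeddings for 100 seed translation pairs.
X_new = rng.normal(size=(100, d))   # new-language vectors
true_W = rng.normal(size=(d, d))
Y_pivot = X_new @ true_W            # pivot-language counterparts

# Learn the projection W minimising ||X_new @ W - Y_pivot||^2.
W, *_ = np.linalg.lstsq(X_new, Y_pivot, rcond=None)

# Any further new-language vector can now be mapped into pivot space
# and compared there (e.g. by cosine similarity) to find translations.
x = rng.normal(size=(1, d))
print((x @ W).shape)  # (1, 50)
```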
written by
Julia Bosque-Gil, Dorielle Lonke, Jorge Gracia, Ilan Kernerman
on 2019-11-27
The OntoLex-lemon model has gradually acquired the status of de-facto standard for the representation of lexical information according to the principles of Linked Data (LD). Exposing the content of lexicographic resources as LD brings both benefits for their easier sharing, discovery, reusability and…
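For readers unfamiliar with the model, the sketch below builds a minimal OntoLex-lemon lexical entry with rdflib; real lexicographic conversions cover far more of the model (senses, morphology, the lexicog module), and the example.org URIs are placeholders:

```python
# Minimal OntoLex-lemon lexical entry built with rdflib.
# URIs under the "ex" namespaces are illustrative.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

ONTOLEX = Namespace("http://www.w3.org/ns/lemon/ontolex#")
EX = Namespace("http://example.org/lexicon/")

g = Graph()
g.bind("ontolex", ONTOLEX)

entry, form = EX["cat-n"], EX["cat-n-form"]
g.add((entry, RDF.type, ONTOLEX.LexicalEntry))
g.add((entry, ONTOLEX.canonicalForm, form))
g.add((form, RDF.type, ONTOLEX.Form))
g.add((form, ONTOLEX.writtenRep, Literal("cat", lang="en")))
# Link the entry's meaning to an ontology/external concept:
g.add((entry, ONTOLEX.denotes,
       Namespace("http://example.org/ontology/")["Cat"]))

print(g.serialize(format="turtle"))
```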
written by
Bettina Klimek, John P. McCrae, Maxim Ionov, James K. Tauber, Christian Chiarcos, Julia Bosque-Gil, Paul Buitelaar
on 2019-10-25
Recent years have seen a growing trend in the publication of language resources as Linguistic Linked Data (LLD) to enhance their discovery, reuse and the interoperability of tools that consume language data. To this end, the OntoLex-lemon model has emerged as a de-facto standard to represent…
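The interoperability payoff is that one query pattern works across any resource published with the shared model. A sketch with SPARQLWrapper, where the endpoint URL is a placeholder to be replaced by a real LLOD endpoint:

```python
# Sketch: a single SPARQL pattern works across any OntoLex-lemon
# resource. The endpoint URL below is a placeholder.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/sparql")  # placeholder
sparql.setQuery("""
    PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
    SELECT ?entry ?rep WHERE {
        ?entry a ontolex:LexicalEntry ;
               ontolex:canonicalForm/ontolex:writtenRep ?rep .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["entry"]["value"], row["rep"]["value"])
```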
written by
Adrian Doyle, John P. McCrae, Clodagh Downey
on 2019-08-28
This paper examines difficulties inherent in the tokenization of Early Irish texts and demonstrates that a neural-network-based approach may provide a viable solution for historical texts which contain unconventional spacing and spelling anomalies. Guidelines for tokenizing Old Irish text are presented…
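One way to frame such a tokenizer is per-character boundary classification with a small recurrent network. The PyTorch sketch below shows that framing with an illustrative architecture; it is an assumption about the general approach, not the model actually evaluated in the paper:

```python
# Character-level boundary tagger sketch: predict, for each character,
# whether a token boundary follows it. Illustrative architecture only.
import torch
import torch.nn as nn

class CharTokenizer(nn.Module):
    def __init__(self, n_chars, emb_dim=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True,
                           bidirectional=True)
        self.out = nn.Linear(2 * hidden, 1)  # boundary logit per char

    def forward(self, char_ids):            # (batch, seq_len)
        h, _ = self.rnn(self.emb(char_ids))
        return self.out(h).squeeze(-1)      # (batch, seq_len) logits

# Toy usage: 30-symbol alphabet, one sequence of 12 characters.
model = CharTokenizer(n_chars=30)
x = torch.randint(0, 30, (1, 12))
boundaries = torch.sigmoid(model(x)) > 0.5   # True = insert a boundary
print(boundaries)
# Training would use BCEWithLogitsLoss against gold boundary labels.
```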