written by
John P. McCrae, Adrian Doyle
on 2019-08-28
Automatic Term Recognition (ATR) is an important method for the summarization and analysis of large corpora, and normally requires a significant amount of linguistic input, in particular the use of part-of-speech taggers. For an under-resourced language such as Irish, the resources necessary for thi
written by
John P. McCrae, Alexandre Rademaker, Francis Bond, Ewa Rudnicka, Christiane Fellbaum
on 2019-08-28
We describe the release of a new wordnet for English based on the Princeton WordNet, but now developed under an open-source model. In particular, this version of WordNet, which we call English WordNet 2019 and which has been developed by multiple people around the world through GitHub, fixes many error
written by
John P. McCrae
on 2019-08-28
Neologism detection is a key task in the construction of lexical resources and has wider implications for NLP; however, the identification of multiword neologisms has received little attention. In this paper, we show that we can effectively identify the distinction between compositional and non-compo
written by
Bharathi Raja Chakravarthi, Ruba Priyadharshini, Bernardo Stearns, Arun Jayapal, S Srivedy, Mihael Arcan, Manel Zarrouk, John P. McCrae
on 2019-08-28
Multimodal machine translation is the task of translating from a source language to a target language using information from other modalities. Existing multimodal datasets have been restricted to only highly resourced languages. These datasets were collected by manual translation of English descriptions
written by
Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae
on 2019-08-28
In this paper, we translate the glosses in the English WordNet based on the expand approach for improving and generating wordnets with the help of multilingual neural machine translation. Neural Machine Translation (NMT) has recently been applied to many tasks in natural language processing, leading
written by
Fabian Hommel, Matthias Orlikowski, Philipp Cimiano, Matthias Hartung
on 2019-08-21
Considerable progress in neural question answering has been made on competitive general domain datasets. In order to explore methods to aid the generalization potential of question answering models, we reimplement a state-of-the-art architecture, perform a parameter search on an open-domain dataset
written by
Thierry Declerck, Melanie Siegel, Stefania Racioppa
on 2019-08-20
We describe work on porting two large German lexical resources into the OntoLex-Lemon model in order to establish complementary interlinkings between them. One resource is OdeNet (Open German WordNet) and the other is a further development of the German version of the MMORPH morphological
written by
Thierry Declerck, Dagmar Gromann
on 2019-07-18
Semantic shifts caused by derivational morphemes are a common subject of investigation in language modeling, while inflectional morphemes are frequently portrayed as semantically more stable. This study is motivated by the previously established observation that inflectional morphemes can be just as
written by
Víctor Rodríguez Doncel, Mariano Rico
on 2019-07-16
This document is the initial version of the Prêt-à-LLOD Data Management Plan.
The Data Management Plan adheres to and complies with the “H2020 Data Management Plan – General Definition” given by the European Commission (EC) online. Prêt-à-LLOD adopts poli