Understanding the user's intention is crucial for many tasks that involve human-machine interaction. To that end, word sense disambiguation (WSD) techniques play an important role. WSD techniques typically require well-formed sentences as context to operate, as well as pre-defined catalogues of
# Lemmatized English Word2Vec data
This is a version of the original GoogleNews-vectors-negative300 Word2Vec embeddings for English.
In addition, we provide the following modified files:
- converted to conventional CSV format (and gzipped)
for the most frequent 1,000,000 wo
Bilingual lexicons are a vital tool for under-resourced languages, and recent state-of-the-art approaches to building them leverage pretrained monolingual word embeddings in supervised or semi-supervised settings. However, these approaches require cross-lingual information such as seed dictionaries to tr
We present the current state of the large "European network for Web-centred linguistic data science". In its first phase, the network has put in place several working groups to deal with specific topics. The network has also already implemented a first round of Short Term Scientific Missions (
The proliferation of World Wide Web and Semantic Web applications has led to an increase in distributed services and datasets. This increase has added to the infrastructural load in terms of availability, immutability, and security, and these challenges are not being met by the Linked Open Data (L
Sentiment analysis of Dravidian languages has received attention in recent years. However, most social media text is code-mixed and there is no research available on sentiment analysis of code-mixed Dravidian languages. The Dravidian-CodeMix-FIRE 2020, a track on Sentiment Analysis for Dravidian Lan
The CoNLL-RDF ontology provides machine-readable semantics for an inventory of CoNLL properties (and classes) for a growing collection of about two dozen CoNLL and related formats currently used in language technology.
We describe the use of linguistic linked data to support a cross-lingual transfer framework for sentiment analysis in the pharmaceutical domain. The proposed system dynamically gathers translations from the Linked Open Data (LOD) cloud, particularly from Apertium RDF, in order to project a deep lear
This zip file contains the results of the conversion of Mmorph morphologies into the OntoLex-Lemon model, using the Turtle syntax as the serialization method.
The content of the file is: 380,405 base forms and 2,534,735 full forms, covering English, German, French, Spanish, Italian and Du