Why do we need this project when we have tools from Google and others?

written by Mohammad Fazleh Elahi on 2020-05-06

Citizens and businesses of different nationalities often face barriers when using digital tools and services. In some cases consumers have restricted access, and governments and citizens cannot fully benefit from the digital transformation. The digital single market is meant to solve this: it removes key differences between national markets, breaks down the barriers to cross-border online activity and provides multilingual resources.

A coordinated, complementary, cross-border and multilingual single-solution approach can play an important role not only in the digital single market but also in disaster management. For example, the European Union developed a common approach for efficient contact tracing apps to support the gradual lifting of confinement measures in response to the COVID-19 pandemic. Europe's data protection watchdog has called for a single coronavirus app to be used across the multilingual EU nations, instead of every country making its own.

The existing tools from Google and other Natural Language Processing (NLP) tools have several disadvantages when it comes to creating a digital single market and well-coordinated multilingual solutions. Prêt-à-LLOD tools, on the other hand, use linked open data and language technologies to create multilingual and cross-border applications.

Why Prêt-à-LLOD tools?

Interlinked and interoperable information

Google and Facebook provide very comfortable environments, precise search results, autocomplete text boxes, etc., but these tools are built on traditional data stores (such as those behind BI tools, search engines, etc.), and these data stores are not interlinked even though their contents are highly related to each other.

For example, if a user wants to know about Barack Obama and searches with Google tools, the results are gathered from documents across the entire Internet. The results are merged, ranked and presented to the user. To find more detailed personal information about Barack Obama (such as birth date and place, education, family, etc.) and related information (such as his predecessor, African American leaders, the White House, etc.), the user has to browse through many documents.

Figure: In a Google search for Barack Obama, a user has to go through many documents to find related information (such as his predecessor, the White House, etc.).

On the other hand, if the data were stored or internally published as Linked Data (as in the Prêt-à-LLOD linked data approach), the result would expose all this personal information and also a list of related entities that are semantically interrelated with the user's search query. Exchangeability, interoperability, transformation and linking are the keys to developing linked data applications.

Prêt-à-LLOD provides several supporting tools to develop linked data applications. They are:

  • A management tool for transforming, manipulating, enriching or creating language resources. The tool converts data from any text format to a linked data format (i.e. RDF) and ensures exchangeability and interoperability among databases and services.
  • Linking components that link databases and resources by linking a term to another term (e.g. apple is linked to orange), a term to a concept (e.g. apple is linked to fruit), and a concept to another concept in a knowledge base (e.g. DBpedia).
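The conversion and linking steps described above can be illustrated with a small Python sketch. It is not the Prêt-à-LLOD implementation: the `http://example.org/term/` namespace, the tab-separated input format and the function names `to_uri` and `text_to_ntriples` are assumptions made for illustration. The sketch turns plain-text rows into RDF triples in N-Triples syntax and uses the SKOS properties `skos:related`, `skos:broader` and `skos:exactMatch` to express term-to-term, term-to-concept and concept-to-knowledge-base links.

```python
# Hypothetical namespace for the example terms; not a real Prêt-à-LLOD URI.
EX = "http://example.org/term/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DBPEDIA = "http://dbpedia.org/resource/"

# Map the relation names of the (assumed) input format to SKOS properties.
RELATIONS = {
    "related": SKOS + "related",   # term to term
    "broader": SKOS + "broader",   # term to concept
    "exact": SKOS + "exactMatch",  # concept to knowledge-base concept
}


def to_uri(name):
    """Return a full URI: 'dbpedia:X' maps to DBpedia, anything else to EX."""
    if name.startswith("dbpedia:"):
        return DBPEDIA + name[len("dbpedia:"):]
    return EX + name


def text_to_ntriples(text):
    """Convert 'subject<TAB>relation<TAB>object' rows into N-Triples lines."""
    lines = []
    for row in text.strip().splitlines():
        subject, relation, obj = row.split("\t")
        lines.append(
            f"<{to_uri(subject)}> <{RELATIONS[relation]}> <{to_uri(obj)}> ."
        )
    return "\n".join(lines)


# Three rows mirroring the examples in the list above.
sample = "apple\trelated\torange\napple\tbroader\tfruit\nfruit\texact\tdbpedia:Fruit"
print(text_to_ntriples(sample))
```

Because every term becomes a URI, any other dataset that uses the same URIs can be merged with this output without further mapping, which is the interoperability property the tools aim for.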

In an application built on linked data, the search result can provide a list of related entities that are interrelated with the user's search query. As can be seen from the figure, Barack Obama is linked with George W. Bush.
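How a linked data store can return both facts and related entities for one query can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Prêt-à-LLOD or any search engine is implemented: the in-memory `TRIPLES` list, the property names and the `describe` function are hypothetical, and a real application would query an RDF store instead.

```python
# A tiny in-memory triple store. Each entry is
# (subject, property, object, object_is_entity). The property names are
# illustrative and not taken from a specific vocabulary.
TRIPLES = [
    ("Barack_Obama", "birthDate", "1961-08-04", False),
    ("Barack_Obama", "birthPlace", "Honolulu", True),
    ("Barack_Obama", "predecessor", "George_W._Bush", True),
    ("George_W._Bush", "successor", "Barack_Obama", True),
]


def describe(entity, triples=TRIPLES):
    """Return (literal facts, related entities) for one entity."""
    facts = {}
    related = set()
    for subject, prop, obj, obj_is_entity in triples:
        if subject == entity:
            if obj_is_entity:
                related.add(obj)      # outgoing link to another entity
            else:
                facts[prop] = obj     # plain value such as a date
        elif obj_is_entity and obj == entity:
            related.add(subject)      # incoming link from another entity
    return facts, sorted(related)


facts, related = describe("Barack_Obama")
print(facts)
print(related)
```

A single lookup follows links in both directions, so George W. Bush appears as a related entity without the user opening any further document.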

Multilingual information and complex translation

Most of these NLP tools are developed without considering their use in multilingual settings. Google Translate can be a lifesaver when we need to translate simple text (such as a menu in a foreign language), but when translating financial reports, educational materials, health-related documents or pharmaceutical drug trials, it is unlikely to capture the original ideas and provide an accurate translation.

The Prêt-à-LLOD tools provide a multilingual cross-repository data search facility, covering major dataset sources across Europe and the world. Through these tools, terms (or pieces of text) are exposed as multilingual data (translations, definitions, synonyms and other terminological data). Users can therefore navigate from a term to its translation in a specific language simply by clicking on it.
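The idea of exposing terms as multilingual data can be sketched as follows. The `CONCEPTS` dictionary, its identifiers and the functions `find_concepts` and `translate` are hypothetical and only illustrate the principle: labels in different languages are attached to one shared concept, so moving from one translation to another is a lookup rather than a machine translation step.

```python
# Hypothetical concept store: each concept carries language-tagged labels.
CONCEPTS = {
    "concept/apple": {
        "en": ["apple"], "de": ["Apfel"], "fr": ["pomme"], "es": ["manzana"],
    },
    "concept/report": {
        "en": ["report"], "de": ["Bericht"], "fr": ["rapport"],
    },
}


def find_concepts(term):
    """Return the ids of all concepts that have `term` as a label in any language."""
    needle = term.casefold()
    return [
        concept_id
        for concept_id, labels in CONCEPTS.items()
        if any(
            needle == label.casefold()
            for names in labels.values()
            for label in names
        )
    ]


def translate(term, target_lang):
    """Return the labels in `target_lang` for every concept matching `term`."""
    results = []
    for concept_id in find_concepts(term):
        results.extend(CONCEPTS[concept_id].get(target_lang, []))
    return results


print(translate("Apfel", "fr"))
print(translate("rapport", "de"))
```

When no label exists in the requested language, the function returns an empty list instead of guessing a translation.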

Cross-border information access and licensing

Online translation platforms such as Google Translate are not ideal where confidential documents are involved. Some companies have found that their confidential data was published online after they had used online translation tools to translate company data.

Prêt-à-LLOD provides new models and mechanisms for ensuring the validity, maintainability and licensing of language resources, as well as tools to deduce the possible licenses of a resource.

Literature links: