What is Language Grid?

Overview

Several research groups, including NICT, universities, and NTT, have started developing a language infrastructure on the Internet called the Language Grid. The project is based on collaboration between industry, government, universities, and citizens. Practical systems will be released within three years from April 2006.

The Language Grid offers two main benefits:

  • Ability to combine language resources (e.g., bilingual dictionaries) and language processing functions (e.g., machine translators).
  • Ability for users to add their own language resources and create new language services for their own intercultural activities.

The Language Grid is an infrastructure built on top of the Internet. It enables a better understanding of Internet content written in different languages by people from different countries. In addition, the Language Grid allows users to easily develop new language services that satisfy their needs by combining existing ones.

Basic software for the Language Grid has been studied and developed at the National Institute of Information and Communications Technology (NICT). For trial operation, however, the Department of Social Informatics, Graduate School of Informatics, Kyoto University serves as the Language Grid Operator. During this trial, use of the Language Grid is limited to non-profit activities. To accumulate use cases and best practices, user groups including NPOs, NGOs, and universities form the Language Grid Association.

For more information, see the Web site of the Language Grid Operation Center, run by Kyoto University, and the Language Grid Association Web site, where various user groups gather.

Related materials:

  • Language Grid - Connecting World's Language Services to Support Intercultural Collaboration
  • Language Grid Services
  • Supporting Intercultural Collaboration through the Language Grid
  • Language Grid Pamphlet
  • Use and management of the Language Grid

Background of the Language Grid

The Internet links people together, but language remains the biggest barrier. There is no standard language on the Internet, and its users speak a variety of languages. For instance, English speakers make up only about 35% of the Internet population; the remainder is almost equally divided between speakers of other European languages and speakers of Asian languages. To get all possible information from the Internet, we would need to speak many languages. Although machine translation services can be applied to computer-mediated communication, the quality of translation is still not good enough. We have to overcome language barriers in order to enhance worldwide intercultural collaboration across the Internet.

The two main goals of our project are:

  • Combine the existing standard language services provided by linguistic professionals.
  • Help users create new language services for their own purposes by letting them add their own language resources to those made by professionals.

The Language Grid is a new infrastructure that allows not only professionals but also end users to conquer the language barriers by themselves.

Role of the Language Grid

Online language services, including bilingual dictionaries and machine translators, already exist. However, can people actually use those services for their intercultural activities? Difficulties often arise when communities try to use them.

Those difficulties are:

  • The cost of language services is very high. Machine translation costs around ten thousand USD per language pair per year. To cover ten to twenty languages, a large budget is required.
  • Complex contracts, intellectual property rights, and non-standard application interfaces make it difficult for users to customize language services in support of their activities.

The Language Grid is a new infrastructure on top of the Internet that aims to improve the accessibility and usability of existing language services, and so to encourage users to create new language services that suit their needs.

Functions of the Language Grid

The Language Grid enables users to easily create new language services by combining existing ones on the Internet. Generally, the word “grid” is defined as “a system or structure of distributed resources collaborating with each other, and using an open-standard protocol for creating a high quality service.” Our objective, applying the concept of “grid” to enable collaboration among language services, has not been described before.

The Language Grid has two main structures:

  • The horizontal Language Grid concerns the combination of existing bilingual dictionaries and machine translation systems for standard languages.
  • The vertical Language Grid concerns specific settings of intercultural collaboration that require new, specialized language services.

Through the horizontal Language Grid, bilingual dictionaries and machine translation services for ten or more Asian languages will be available in three years. Furthermore, we will collaborate with research institutes in Europe to cover more than twenty different languages. Through the vertical Language Grid, various intercultural collaboration activities will be supported, such as interpretation in hospitals, communication via pictograms among children around the world, and multilingual workshops to facilitate intercultural activities.

Technology of the Language Grid

Semantic Web technologies will be developed to enable the necessary collaboration among language resources and language processing functions. Language resources include bilingual dictionaries, thesauruses, and corpora, while language processing functions include machine translation, morphological analysis, and paraphrasing.

The technologies we will pursue are:

  • Language service ontology is a technology to define language service entries in a standardized way. We will collaborate with European research institutes to advance the standardization of language service ontology.
  • Semantic Web service is a technology to use Web services via standard methods. The scenario description language developed at Kyoto University will be used to describe interaction among composite services.
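
The idea of defining service entries in a standardized way can be sketched as follows. This is only a minimal illustration under stated assumptions, not the project's actual ontology: the `ServiceEntry` fields, the `find_services` helper, and the `example.org` endpoints are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServiceEntry:
    """Hypothetical standardized description of one language service."""
    name: str
    kind: str          # e.g. "translation" or "dictionary"
    source_lang: str   # language code of the input, e.g. "ja"
    target_lang: str   # language code of the output, e.g. "en"
    endpoint: str      # placeholder URL, not a real service


# A toy registry; every entry here is invented for illustration.
REGISTRY = [
    ServiceEntry("ja-en-mt", "translation", "ja", "en", "http://example.org/mt/ja-en"),
    ServiceEntry("en-de-mt", "translation", "en", "de", "http://example.org/mt/en-de"),
    ServiceEntry("ja-en-dict", "dictionary", "ja", "en", "http://example.org/dict/ja-en"),
]


def find_services(registry: List[ServiceEntry], kind: str,
                  source_lang: str, target_lang: str) -> List[ServiceEntry]:
    """Return all entries of the given kind for the given language pair."""
    return [
        entry for entry in registry
        if entry.kind == kind
        and entry.source_lang == source_lang
        and entry.target_lang == target_lang
    ]
```

Because every entry uses the same fields, a client can look up services by kind and language pair without knowing anything about the individual providers.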

Language service ontology and semantic Web services make it possible to combine bilingual dictionaries and machine translators from all over the world. For instance, to realize Japanese-German translation, we can combine atomic services such as Japanese-English and English-German translation.
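
The Japanese-German example can be sketched as a composition of two atomic services through English as a pivot language. The sketch below is an assumption-laden illustration, not the Language Grid's actual interface: the `compose` helper is hypothetical, and the two "translators" are toy word lookups standing in for real machine translation services.

```python
from typing import Callable

# A translator is modeled as a plain function from text to text (an assumption).
Translator = Callable[[str], str]


def compose(first: Translator, second: Translator) -> Translator:
    """Build a composite service that feeds the output of `first` into `second`."""
    def composite(text: str) -> str:
        return second(first(text))
    return composite


# Toy stand-ins for atomic services; real services would be remote calls.
JA_EN = {"こんにちは": "hello"}
EN_DE = {"hello": "hallo"}


def ja_to_en(text: str) -> str:
    # Return the input unchanged when the word is not in the toy table.
    return JA_EN.get(text, text)


def en_to_de(text: str) -> str:
    return EN_DE.get(text, text)


# Japanese-German translation realized by chaining the two atomic services.
ja_to_de = compose(ja_to_en, en_to_de)

print(ja_to_de("こんにちは"))  # prints "hallo"
```

One design consequence worth keeping in mind is that a composite service is only as good as its parts: any error introduced by the first step is passed on to the second.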

Project Structure of the Language Grid

The Language Grid project was started in 2005, mainly by researchers from universities and research institutes near Kyoto. In April 2006, the project was officially adopted as part of the government’s IT strategy at the NICT Knowledge Creating Communication Research Center.

A distinguishing feature of this project is the collaboration between industry, government, universities, and citizens. In other words, researchers from NICT, universities, and companies, together with members of NPOs, are cooperating to realize the Language Grid. The Language Grid Association has been established at COCON KARASUMA in the center of Kyoto. We use this place for usability testing and for creating business models for the Language Grid.

Within three years, we will apply the Language Grid in several ways, including the following:

  • Hospitals: Bilingual parallel texts made by medical professionals and those made by local volunteers will be combined to support medical translation services in hospitals.
  • Children’s communication: Pictograms made by children around the world, various machine translators, and morphological analyzers will be combined to create pictogram chat systems.
  • Multilingual radio production: Various machine translators and bilingual dictionaries will be combined to support multilingual workshops for radio program production.