Archived content

NOTE: this is an archived page and the content is likely to be out of date.

Fujitsu Laboratories Develops Technology for Automatically Linking with Open Data throughout the World

Assigns links to Linked Open Data to increase the usefulness of data

Fujitsu Laboratories Ltd., Fujitsu Research and Development Center Co., Ltd., Fujitsu Laboratories of Europe Ltd.

Kawasaki, Japan, Beijing, China, and Middlesex, United Kingdom, January 16, 2014 - Fujitsu Laboratories Ltd., Fujitsu Research and Development Center Co., Ltd. and Fujitsu Laboratories of Europe Limited today announced the development of technology that can discover and automatically link data representing the same underlying subject among Linked Open Data (LOD)(1) available throughout the world and individual data sets maintained by governments and companies.

LOD is starting to come into wider use as a mechanism for publishing data on the Internet. Each individual LOD record is intended to be linked to data published on other websites, and by following these links, users can traverse multiple websites to access the data they need. When publishing data under the LOD approach, however, it can be challenging to interpret published data and determine which data is related in order to link to data on other websites.

The new technology enables inferences as to when data records refer to the same thing based on similarities in their notation and data structures, thereby making it possible to assign links. For example, the technology is expected to help increase the value of open data by making it possible to use LOD published by governments in combination with data held by companies and other LOD throughout the world.

In January, Fujitsu Laboratories is planning to launch a publicly available LOD search service that can tie in with the new technology: http://lod4all.net/

Background

Open data has rapidly garnered attention, as demonstrated by the release of the "Open Data Charter" at the G8 Summit in June 2013. In Japan, the IT Strategic Headquarters of the Japanese government's Cabinet has promulgated an e-gov open data strategy since July 2012, and declared the release of public data to the private sector (open data) to be one of the three pillars of the Cabinet's "Declaration of Creating the World's Most Advanced IT Nation" announced in June 2013.

In collaboration with the Insight Centre for Data Analytics, an Irish research institute at the National University of Ireland, Galway (previously known as the Digital Enterprise Research Institute), Fujitsu Laboratories has developed an LOD utilization platform(2) that can collect and perform batch searches on LOD published throughout the world.

Technological Issues

With LOD, it is advantageous that interrelated data, even data stored on different websites, be linked. This lets data users traverse multiple websites to access the data they need. However, when data is published on different websites, even if it represents the same underlying subject, differences in how it is structured or denoted cannot be resolved through simple keyword searches. As a result, data creators have been forced to find data they want to link to ahead of time, understand how that data is structured and denoted, and match it up to their own data.

In addition, because there had been no means of traversing numerous websites to discover related data, data creators had been able to link only to data that they were already aware of. This meant that while it was possible to link to well-known data sets and publish the result in LOD format, it was difficult to link to data scattered across the web.

About the Technology

Fujitsu Laboratories has developed technology that leverages its LOD utilization platform to assign links based on similarities in notation and data structures. This makes it possible to automatically discover when multiple records refer to the same underlying subject. Features of the technology are as follows.

1. Technology for inferring when LOD data refers to the same person, organization, place, or other subject as that found in other data

Inferences are made by combining the following newly developed features:

  • Resolving differences in data structures: Uses similarity in notation to measure the similarity of data structures.
  • Resolving differences in notation: Uses the data structures in LOD to collect different notations about the same subject.
  • Resolving ambiguity: Places parameters on similar data structures and notations and leverages machine learning to judge subject identity.

Figure 1. Overview of the new algorithm

This technology achieved top-ranked inference accuracy in competitions in the US and China(3).
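The features listed under item 1 can be illustrated with a minimal sketch. It is not Fujitsu's algorithm: the records, property names, weights, and threshold are all hypothetical, the character-level similarity from Python's difflib stands in for whatever notation measure the technology uses, and fixed weights stand in for parameters that a machine-learning model would learn. The step that collects alternative notations of a subject is omitted here.

```python
from difflib import SequenceMatcher

# Hypothetical records describing the same organization in two datasets.
# The property names differ on purpose, as they would across websites.
record_a = {"rdfs:label": "Fujitsu Laboratories Ltd.",
            "ex:city": "Kawasaki",
            "ex:founded": "1968"}
record_b = {"foaf:name": "Fujitsu Laboratories Limited",
            "dbo:location": "Kawasaki, Japan",
            "dbo:foundingYear": "1968"}


def notation_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two literal values."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align_properties(rec_a: dict, rec_b: dict) -> list:
    """Pair each property of rec_a with the rec_b property whose value
    looks most similar, so differing schemas can still be compared."""
    pairs = []
    for prop_a, val_a in rec_a.items():
        prop_b, val_b = max(
            rec_b.items(),
            key=lambda kv: notation_similarity(val_a, kv[1]),
        )
        pairs.append((prop_a, prop_b, notation_similarity(val_a, val_b)))
    return pairs


def features(rec_a: dict, rec_b: dict) -> dict:
    """Summarize the aligned similarities as a small feature vector."""
    sims = [sim for _, _, sim in align_properties(rec_a, rec_b)]
    return {"best": max(sims), "mean": sum(sims) / len(sims)}


# Assumption: hand-picked values standing in for learned parameters.
WEIGHTS = {"best": 0.4, "mean": 0.6}
THRESHOLD = 0.7


def same_subject(rec_a: dict, rec_b: dict) -> bool:
    """Judge subject identity from the weighted feature score."""
    f = features(rec_a, rec_b)
    score = sum(WEIGHTS[name] * f[name] for name in WEIGHTS)
    return score >= THRESHOLD


if __name__ == "__main__":
    for prop_a, prop_b, sim in align_properties(record_a, record_b):
        print(f"{prop_a} ~ {prop_b}: {sim:.2f}")
    print("same subject:", same_subject(record_a, record_b))
```

In this sketch the property alignment is driven entirely by how similar the values look, which mirrors the idea of using notation similarity to compare data structures. A real system would need labeled examples to train the classifier instead of the fixed weights used here.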

2. Ties in with LOD utilization platform

By tying in with the LOD utilization platform, which collects and performs batch searches on LOD published throughout the world, the technology can discover globally dispersed data that represents the same subject in different LOD datasets. So, for example, it can link to information not only in English-language data sets, but in other language data sets as well.
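One way to picture how a cross-dataset search can surface candidates in more than one language is a label index that is queried twice: once with the original label, and once with every other label attached to the first-round hits. The sketch below is only an illustration under stated assumptions; the dataset names, URIs, and label rows are hypothetical, and the release does not describe how the LOD utilization platform indexes or searches data.

```python
from collections import defaultdict

# Hypothetical (dataset, uri, label, language) rows gathered from several
# LOD datasets. One English-language record also carries a Japanese label.
LABEL_ROWS = [
    ("dataset_en", "http://example.org/en/Kawasaki", "Kawasaki", "en"),
    ("dataset_en", "http://example.org/en/Kawasaki", "川崎市", "ja"),
    ("dataset_ja", "http://example.org/ja/kawasaki-shi", "川崎市", "ja"),
    ("dataset_gov", "http://example.org/gov/city/14130", "Kawasaki City", "en"),
]


def normalize(label: str) -> str:
    """Lowercase and trim a label before tokenizing it."""
    return label.strip().lower()


def build_index(rows) -> dict:
    """Map each whitespace-separated label token to the URIs that use it."""
    index = defaultdict(set)
    for _dataset, uri, label, _lang in rows:
        for token in normalize(label).split():
            index[token].add(uri)
    return index


def candidates(index: dict, label: str) -> set:
    """Return every URI that shares at least one token with the label."""
    found = set()
    for token in normalize(label).split():
        found |= index.get(token, set())
    return found


def discover(index: dict, rows, label: str) -> set:
    """Search once, then search again with all labels of the first hits,
    so that a label in another language can reach further datasets."""
    first_pass = candidates(index, label)
    result = set(first_pass)
    for _dataset, uri, other_label, _lang in rows:
        if uri in first_pass:
            result |= candidates(index, other_label)
    return result


if __name__ == "__main__":
    idx = build_index(LABEL_ROWS)
    for uri in sorted(discover(idx, LABEL_ROWS, "Kawasaki")):
        print(uri)
```

The URIs returned this way are only candidates. Deciding which of them really denote the same subject would still fall to the inference step described under item 1.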

Figure 2. Sample search interface display

Results

The newly developed technology makes it possible to discover and link data representing the same subject in multiple LOD datasets published around the world. This makes it simple to use a company's own data in combination with LOD data if, for instance, a national government publishes LOD data.
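As a rough illustration of what using linked data in combination can look like, the sketch below writes identity links in N-Triples syntax and uses them to merge a company record with a government record about the same district. The owl:sameAs predicate is the common Linked Data convention for identity links; the release does not say which predicate the technology emits, so treat it as an assumption. All URIs, field names, and values are invented for the example.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Invented company data and government LOD about the same district.
company_records = {
    "http://example.com/region/a": {"region": "District A", "monthly_sales": 100},
}
government_lod = {
    "http://example.org/gov/district/a": {"label": "District A", "population": 5000},
}

# Links that a linking step is assumed to have produced already.
links = [("http://example.com/region/a", "http://example.org/gov/district/a")]


def to_ntriples(link_pairs) -> str:
    """Serialize (subject, object) URI pairs as owl:sameAs N-Triples lines."""
    return "\n".join(f"<{s}> <{OWL_SAME_AS}> <{o}> ." for s, o in link_pairs)


def combine(company: dict, government: dict, link_pairs) -> dict:
    """Merge the properties of each linked pair under the company URI."""
    merged = {}
    for company_uri, gov_uri in link_pairs:
        if company_uri in company and gov_uri in government:
            merged[company_uri] = {**company[company_uri], **government[gov_uri]}
    return merged


if __name__ == "__main__":
    print(to_ntriples(links))
    print(combine(company_records, government_lod, links))
```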

From January, Fujitsu Laboratories is planning to launch an LOD search service, available at http://lod4all.net/, that can tie in with the new technology. The search service features a visual, interactive search interface that takes advantage of the LOD utilization platform. Users can search LOD datasets from around the world that meet the service's license and download requirements(4) and view the content of the data.

Future Plans

Fujitsu Laboratories is leveraging the newly developed LOD linking technology in a variety of field test projects with open data from national and local governments, with the aim of commercializing the technology in fiscal 2015.


  • [1] Linked Open Data (LOD)

    A dataset published in the Linked Data format, a new format for publishing data on the web. It uses the Resource Description Framework (RDF) format, which is intended to simplify machine processing without being dependent on any particular software, and is promoted by the World Wide Web Consortium (W3C), a standards body for web-related technology.

  • [2] LOD utilization platform

    Technology for storing large volumes of LOD data and quickly performing batch searches on it. Press release from April 3, 2013: "Fujitsu and DERI Revolutionize Access to Open Data by Jointly Developing Technology for Linked Open Data"

  • [3] Competitions in the US and China

    Scored first in accuracy in both the Chinese-language microblog entity linking evaluation at the NLP&CC2013 conference, sponsored by the China Computer Federation, and the cross-lingual entity linking evaluation at the Text Analysis Conference Knowledge Base Population 2013, sponsored by the National Institute of Standards and Technology in the United States.

  • [4] License and download requirements

    With the new search service, queries can be performed for datasets that can be downloaded via the web and have standard licenses that allow secondary data usage.

About Fujitsu

Fujitsu is the leading Japanese information and communication technology (ICT) company offering a full range of technology products, solutions and services. Approximately 170,000 Fujitsu people support customers in more than 100 countries. We use our experience and the power of ICT to shape the future of society with our customers. Fujitsu Limited (TSE:6702) reported consolidated revenues of 4.4 trillion yen (US$47 billion) for the fiscal year ended March 31, 2013. For more information, please see http://www.fujitsu.com.

About Fujitsu Laboratories

Founded in 1968 as a wholly owned subsidiary of Fujitsu Limited, Fujitsu Laboratories Limited is one of the premier research centers in the world. With a global network of laboratories in Japan, China, the United States and Europe, the organization conducts a wide range of basic and applied research in the areas of Next-generation Services, Computer Servers, Networks, Electronic Devices and Advanced Materials. For more information, please see: http://jp.fujitsu.com/labs/en.

About Fujitsu Research and Development Center

Established in 1998, Fujitsu Research and Development Center Co., Ltd. is a wholly owned R&D center of Fujitsu Limited, located in Beijing. The center's research areas cover the major business fields of the Fujitsu Group, including information processing, telecommunications, semiconductors, and software and services. For more information, please see: http://www.fujitsu.com/cn/frdc/en/.

About Fujitsu Laboratories of Europe Limited

Fujitsu Laboratories Limited has had an active presence in Europe since 1990, forming Fujitsu Laboratories of Europe Limited in 2001. The company's groundbreaking work is closely aligned to the future needs of the business community, focused on making future technologies a reality for today's businesses. Fujitsu Laboratories of Europe aims to shorten the R&D cycle to put cutting edge technologies into customers' hands as quickly as possible, enabling businesses to gain a tangible competitive advantage. Close collaboration with leading academics and experts Europe-wide forms a central element of Fujitsu Laboratories of Europe's approach, ensuring the effective pooling of expertise with other pioneers in any given field of research. Fujitsu Laboratories of Europe also participates in a number of EU research initiatives, bringing together the joint expertise of industry and academia to accelerate the development and use of new technologies on a pan-European basis.
For more information, please see: www.fujitsu.com/emea/about/fle/

Press Contacts

Public and Investor Relations Division
Inquiries

Company: Fujitsu Limited

Technical Contacts

Social Innovation Laboratories
Knowledge Platforms Lab.

E-mail: lod@ml.labs.fujitsu.com
Company: Fujitsu Laboratories Ltd.


All company or product names mentioned herein are trademarks or registered trademarks of their respective owners. Information provided in this press release is accurate at time of publication and is subject to change without advance notice.

Date: 16 January, 2014
City: Kawasaki, Japan, Beijing, China, and Middlesex, United Kingdom
Company: Fujitsu Laboratories Ltd., Fujitsu Research and Development Center Co., Ltd., Fujitsu Laboratories of Europe Ltd.