-
Note 1: Peripheral language information, such as the volume and tone of the voice, is transmitted from the speaker to the listener.
Aiming to realize "socially-intelligent conversational AI"! Digital Annealer Supports University Startup
Waseda University has launched the Waseda Open Innovation Valley Initiative, which promotes fully integrated industry-academia collaboration, human resource development, intellectual property creation, and venture development. As part of this, Fujitsu and Waseda University have established the "Fujitsu Co-Creation Research Laboratory at Waseda University" and are conducting joint industry-academia research to make use of Digital Annealer, a quantum-inspired technology that can process combinatorial optimization problems at high speed. Against this backdrop, Waseda University’s Conversational AI Media Research Group at the Perceptual Computing Laboratory, Green Computing System R&D Center, is developing natural language processing AI using this technology. Professor Yoichi Matsuyama and Professor Hiroaki Takatsu of Waseda University, together with Senior Researcher Yutaka Takita and Researcher Nasa Matsumoto of Fujitsu, shared their views on the group's efforts to date to establish a university-launched startup to commercialize conversational AI services.
Published on December 1, 2022

-
Yoichi Matsuyama
Guest Researcher (Guest Associate Professor)
Perceptual Computing Laboratory
Green Computing System R&D Center
WASEDA University
CEO
Equmenopolis, Inc.
-
Hiroaki Takatsu
Junior Researcher (Assistant Professor)
Perceptual Computing Laboratory
Green Computing System R&D Center
WASEDA University
Research Scientist
Equmenopolis, Inc.

-
Yutaka Takita
Research Manager
Optimization Technology Project
Quantum Laboratory
Research Unit
Fujitsu Limited
-
Nasa Matsumoto
Optimization Technology Project
Quantum Laboratory
Research Unit
Fujitsu Limited
AI that enables natural conversation
First of all, would you please introduce your individual research themes?
My research area is natural language processing, and the theme of my doctoral thesis is "A spoken dialogue system for enabling comfortable information acquisition and consumption". This research aims to develop a conversation system (Note 2) that lets users access information while frequently switching between a push mode, in which the listener passively consumes information as with a radio, and a pull mode, in which the listener acquires desired information according to his or her preferences, as with an AI speaker. As elemental technologies for this purpose, we have been developing summarization systems that generate personalized utterance plans from text, intention recognition systems, and expressive speech synthesis systems. Currently, we are focusing our research on an "English conversation proficiency assessment agent system" that applies these technologies.
-
Note 2: To solve the problem of obtaining necessary information from vast amounts of information, we will develop a conversation system that conveys personalized information to each person in a timely manner by taking into account the user's implicit intention.

Why did you apply for the joint research with Fujitsu?
We have been conducting research on a spoken dialogue system designed to deliver only the information we want to receive from the enormous amount of information we are exposed to daily, such as news articles. To generate an utterance plan for our dialogue system, we formulated dialogue scenario generation as a combinatorial optimization problem: sentences are extracted from multiple articles on different topics so as to maximize the sum of the user's degrees of interest, subject to constraints on the discourse structure (Note 3) of each article and on the total utterance time.
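As an illustrative sketch only, and not the research group's actual model, the formulation described above can be written as a tiny exhaustive search in Python. The sentence IDs, interest scores, durations, and the rule that a sentence may be selected only when its parent sentence is also selected (a stand-in for the discourse-structure constraint) are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical data: id -> (interest score, duration in seconds, parent id or None).
# The parent link is an assumed stand-in for the discourse-structure constraint.
SENTENCES = {
    "a1": (2, 5, None),
    "a2": (9, 6, "a1"),
    "a3": (4, 4, "a1"),
    "b1": (7, 5, None),
    "b2": (3, 7, "b1"),
}


def best_plan(sentences, time_budget):
    """Return the feasible subset of sentences with the highest total interest."""
    ids = list(sentences)
    best_ids, best_score = set(), 0
    for size in range(1, len(ids) + 1):
        for subset in combinations(ids, size):
            chosen = set(subset)
            # Constraint 1: total utterance time must fit the budget.
            if sum(sentences[i][1] for i in chosen) > time_budget:
                continue
            # Constraint 2 (assumed): a sentence needs its parent to be chosen too.
            if any(sentences[i][2] is not None and sentences[i][2] not in chosen
                   for i in chosen):
                continue
            score = sum(sentences[i][0] for i in chosen)
            if score > best_score:
                best_ids, best_score = chosen, score
    return best_ids, best_score


plan, score = best_plan(SENTENCES, time_budget=16)
print(sorted(plan), score)
```

The search visits every subset, so its cost doubles with each additional sentence. That growth is what makes the problem impractical to solve exactly at scale.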
Using this technique, we aim to analyze news articles overnight and prepare a personalized utterance plan for each user ready for their commute to work or school the next morning. However, as the number of sentences increases, it takes an enormous amount of time to solve a single problem. As the number of users increases, the number of problems to be solved increases proportionally and the overall processing time increases as well. For example, to prepare utterance plans for 80,000 people by the next morning, all processing must be completed within 6 hours. However, when this processing is handled on a typical 1-core processor, it takes an average of 900 seconds per person to obtain an optimal solution. So for 80,000 people, it would take approximately 20,000 hours (approximately 833 days), which is just not feasible.
Bearing this in mind, when we heard about the joint research activity using Digital Annealer, we could see an opportunity to solve our problem by using its high-speed processing. We applied to use Digital Annealer’s ability to solve combinatorial optimization problems.
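The processing-time estimate quoted above can be checked with a few lines of Python. The user count, per-user solve time, and six-hour window come from the interview; the core count at the end assumes perfectly parallel scaling, which is an idealization added here for illustration.

```python
import math

USERS = 80_000            # utterance plans needed by the next morning
SECONDS_PER_USER = 900    # average time to an optimal solution on one core
WINDOW_HOURS = 6          # overnight processing window

total_hours = USERS * SECONDS_PER_USER / 3600
total_days = total_hours / 24

# Assumption: work splits perfectly across cores with no overhead.
cores_needed = math.ceil(total_hours / WINDOW_HOURS)

print(f"{total_hours:,.0f} hours (about {total_days:.0f} days) on one core")
print(f"about {cores_needed:,} cores to finish within {WINDOW_HOURS} hours")
```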
-
Note 3: Structure such as the role of sentences and phrases in a document and the transition of topics.


Incorporate user personalization into optimization problems
What kind of experiments and evaluations have you conducted using Digital Annealer?
Using news article data annotated with the discourse structure and multiple users’ interests, we evaluated how well the proposed method was able to extract sentences that the user was interested in. With regard to processing time in particular, Digital Annealer obtained a solution to a problem of 4,096 bits or less (about 6 articles of 15 to 25 sentences each) in about 0.2 seconds, and we were able to generate utterance plans for 80,000 people in about 4 and a half hours.
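As a rough consistency check on these figures, the short Python calculation below multiplies the reported solve time by the number of users. It assumes one problem per user solved one after another, which the interview does not state explicitly.

```python
USERS = 80_000               # from the interview
SECONDS_PER_PROBLEM = 0.2    # approximate Digital Annealer solve time reported
WINDOW_HOURS = 6             # overnight processing window

# Assumption: one problem per user, solved sequentially.
total_seconds = USERS * SECONDS_PER_PROBLEM
total_hours = total_seconds / 3600
fits_window = total_hours <= WINDOW_HOURS

print(f"{total_hours:.2f} hours of solve time; fits the window: {fits_window}")
```

The result is a little under four and a half hours, roughly in line with the reported figure and within the six-hour window.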
We also applied this technology to the virtual museum guide to generate personalized utterance plans from the description texts of national treasures and important cultural properties, and verified the effectiveness of explaining the exhibits based on the generated scenarios. As a result, the personalized scenario-based guide system was able to select interesting exhibits and explain them in an easy-to-understand manner. Improvements were also seen in evaluation items such as enjoyment and willingness to revisit.
How did you collaborate with Fujitsu?
Due to the impact of the coronavirus, we held our monthly meetings with Fujitsu by online video conference. At these regular meetings, we focused mainly on the results of the Digital Annealer experiments and on setting the future direction. In particular, Fujitsu advised us on setting parameters for Digital Annealer.
In the joint study with Waseda University, we found that Digital Annealer, which handles combinatorial optimization problems at high speed, produces different results depending on the model. In this case, as Professor Takatsu mentioned earlier, the model, formulated as a linear programming problem (an optimization problem) in which the sentences of greatest interest to the user are extracted under the constraints of discourse structure and time, fits Digital Annealer perfectly. In response to more detailed questions about parameter settings, we provided tips and tricks such as "This will make it faster". Professor Takatsu was able to lead the joint experiment, thanks in part to Digital Annealer’s ease of use.
Next Steps – towards a commercial conversational AI agent platform
You started a university spin-out startup called "Equmenopolis, Inc." to commercialize the research results. Please tell us about the company's activities.
Originally, I specialized in dialogue systems, and after receiving a Ph.D. from the Graduate School of Fundamental Science and Engineering at Waseda University, I worked as a visiting researcher at the Italian Institute of Technology and as a postdoctoral research fellow at Carnegie Mellon University in the United States. In 2019, I returned to the Perceptual Computing Laboratory at Waseda as an Associate Research Professor. Since then, I have been preparing a university startup to commercialize the research results (Note 4). After an initial preparation period, Equmenopolis, Inc. was established in May 2022, envisioning a world where humans and AI can coexist and create value.
Currently, we are developing a platform for conversational AI agent services, and we are also conducting a field test of an English conversation proficiency assessment agent system utilizing this platform (Note 5). The purpose of this research is to evaluate language ability effectively by adjusting the conversation according to the learner's proficiency and comprehension level. R&D is currently underway to realize natural conversation between a person and an agent. We are promoting the use of Digital Annealer to implement dialogue system technology in society and to establish new business models.
-
Note 4: Initiatives by the research group were selected for the 2019 University New Industry Creation Program (START) by the Japan Science and Technology Agency (JST) and the 2022 Seed-stage Technology-based Startup’s (STS) commercialization support by the New Energy and Industrial Technology Development Organization (NEDO).
-
Note 5: The English-Speaking Ability Test Agent received the Bronze Award in the Learning Assessment Category of the QS-Wharton Reimagine Education Award 2021, the world's largest education contest recognizing innovative educational efforts.
Realizing the Benefits – for business and academia
Do you have any expectations for future joint research and Digital Annealer?
I believe Digital Annealer can be used for large-scale social simulations of processes in a wide range of areas where humans and AI evolve together. On the business side, I look forward to conducting joint research with Fujitsu in order to promote the full-scale implementation of cutting-edge technologies, including quantum-inspired technologies, and to contribute to the future digital society.
I hope that Digital Annealer can handle larger problems and solve problems faster. As Digital Annealer continues to evolve into the fourth and fifth generations, I would like Fujitsu to share the latest information as appropriate.
Working with Waseda University brings a wide range of insights and inspiration to us, the results of which are sometimes quite surprising! We look forward to receiving proposals for innovative ways of using Digital Annealer in the future.
From a company perspective, the goal is to release products and services that address customer needs and challenges. However, at the university, delivering value to society is the first priority. Researchers from a wide range of different specialties are constantly engaged in research, and I feel that this is a highly productive environment to nurture new concepts for social contribution. However, companies have perhaps more access to real-world data in society than universities, and so I anticipate that joint research between Waseda University and Fujitsu will deliver mutual benefits.


Finally, how can this research theme contribute to Sustainable Development Goals (SDGs)?
The English conversation proficiency assessment agent system, which assesses abilities and supports language learning, is an initiative that contributes to SDG Goal 4, "Quality Education". In addition, we believe that museum guides that use conversational AI agents will promote culture and the tourism industry, leading to the realization of Goal 8, "Decent Work and Economic Growth".
Towards the Beyond 5G era, we have been focusing on projects such as the "R&D of XR communication infrastructure for realizing a highly realistic interaction experience with conversational AI agents" in collaboration with industry, government, and academia (Note 6), and these efforts are truly in line with Goal 9, "Industry, Innovation and Infrastructure". An AI guide to information spaces, similar to a museum guide, contributes to Goal 11, "Sustainable Cities and Communities", by passing down cultural and natural heritage to future generations. In this way, we are confident that our research will contribute to the realization of a sustainable society in which people and AI coexist and create value.
-
Note 6: The initiative was selected by the Ministry of Internal Affairs and Communications and NICT’s Beyond 5G R&D Promotion Project and Beyond 5G R&D Seeds Creation Program FY2022.


Fujitsu’s Commitment to the Sustainable Development Goals (SDGs)
The Sustainable Development Goals (SDGs) adopted by the United Nations in 2015 represent a set of common goals to be achieved worldwide by 2030. Fujitsu’s purpose, “to make the world more sustainable by building trust in society through innovation”, is a promise to contribute to the vision of a better future empowered by the SDGs.

The Perceptual Computing Laboratory is investigating conversational robots and conversational protocols that can understand paralanguage (Note 1). In recent years, in order to realize "HANASHI-JOZU: fluent conversational AI," we have been developing a number of technologies. These include personalized dialogue scenario generation technology, which organizes and prepares information in advance; intention-understanding technology, which is sensitive enough to respond to content that is not stated explicitly; and speech synthesis technology, which enables key information to be conveyed with emphasis. These studies were kick-started by the initial concept proposed by Dr. Tetsunori Kobayashi, Director of the Perceptual Computing Laboratory, and other members of the team. Dr. Takatsu implemented and developed the technologies for his doctoral dissertation, and based on the seeds of these technologies, research into practical applications is being conducted at the university startup.