
Archived content

NOTE: this is an archived page and the content is likely to be out of date.

High-quality Chinese speech synthesis technology

【Abstract】

Speech synthesis technology is widely used in real-time information services such as call centers, car navigation, spoken web pages, teaching assistance, and services for visually impaired users. User experience and service quality depend largely on the correctness, fluency, and naturalness of the synthesized speech. By studying key technologies such as rhythm modeling, polyphone disambiguation, and digit and symbol processing, we have developed a Text-To-Speech (TTS) system that generates high-quality, natural-sounding speech.

【Solution】

1. Highly natural rhythm

A natural speech synthesis system should match human speech as closely as possible in pausing, syllable duration, and tone. To this end, we analyze several factors that affect rhythm and build a statistical model to predict the duration of each syllable. In addition, a diverse set of tone templates is used to characterize Chinese rhythm and ensure the naturalness of the synthetic speech.
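As a rough illustration of what a statistical duration model could look like, the sketch below estimates the mean syllable duration for each context and falls back to the global mean for unseen contexts. The choice of context factors (tone and phrase-final position), the function names, and all training values are assumptions made for this example; the page does not specify the factors or the model actually used.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical training data: (tone, is_phrase_final, duration_ms).
# These values are invented for illustration only.
TRAINING = [
    (1, False, 180.0),
    (1, True, 260.0),
    (4, False, 150.0),
    (4, True, 230.0),
    (4, False, 170.0),
]

def fit_duration_model(samples):
    """Estimate the mean duration for each (tone, is_phrase_final) context."""
    by_context = defaultdict(list)
    for tone, is_final, duration in samples:
        by_context[(tone, is_final)].append(duration)
    context_means = {ctx: mean(vals) for ctx, vals in by_context.items()}
    global_mean = mean(duration for _, _, duration in samples)
    return context_means, global_mean

def predict_duration(model, tone, is_final):
    """Return the context mean, or the global mean for an unseen context."""
    context_means, global_mean = model
    return context_means.get((tone, is_final), global_mean)

model = fit_duration_model(TRAINING)
```

A call such as `predict_duration(model, 4, False)` averages the two matching samples, while a tone that never appears in the training data receives the global mean. A production system would likely use many more factors and a richer model.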

2. Robust processing of polyphones, digits, and special symbols

Correct pronunciation of polyphones, digits, and special symbols is important for making synthetic speech easy to understand. To ensure that this content is pronounced correctly, we combine rules with statistical learning methods to analyze the contexts in which polyphones, digits, and special symbols occur in a large speech corpus, and establish a prediction model for each character to determine its pronunciation.
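The rule-based side of digit processing might be sketched as follows. This is only an illustrative example, not the system's actual rule set: the three rules (a four-digit number before 年 is read as a year, numbers with a leading zero or more than four digits are read digit by digit, everything else is read as a cardinal) and the function names are assumptions chosen for the example.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]  # place values for numbers up to 9999

def read_digit_by_digit(number: str) -> str:
    """Read each digit separately, e.g. '2008' -> '二零零八'."""
    return "".join(DIGITS[int(ch)] for ch in number)

def read_cardinal(n: int) -> str:
    """Read an integer in 0..9999 as a cardinal, e.g. 110 -> '一百一十'."""
    if n == 0:
        return DIGITS[0]
    text = str(n)
    out = []
    pending_zero = False
    for i, ch in enumerate(text):
        d = int(ch)
        if d == 0:
            pending_zero = True  # emit a single 零 only if a digit follows
            continue
        if pending_zero:
            out.append(DIGITS[0])
            pending_zero = False
        out.append(DIGITS[d] + UNITS[len(text) - 1 - i])
    result = "".join(out)
    if 10 <= n <= 19:
        result = result[1:]  # '一十五' is conventionally read '十五'
    return result

def normalize_digits(text: str) -> str:
    """Replace ASCII digit runs with a Chinese reading chosen by simple rules."""
    def replace(match):
        number, year_mark = match.group(1), match.group(2)
        is_year = bool(year_mark) and len(number) == 4
        has_leading_zero = len(number) > 1 and number[0] == "0"
        if is_year or has_leading_zero or len(number) > 4:
            return read_digit_by_digit(number) + year_mark
        return read_cardinal(int(number)) + year_mark
    return re.sub(r"([0-9]+)(年?)", replace, text)
```

For example, `normalize_digits("2008年有110个")` reads the year digit by digit and the count as a cardinal. Rules like these cover only the clear-cut cases; ambiguous contexts are where a statistical model trained on a corpus would take over.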

【Technical points】

Pronunciation prediction model based on decision trees
Pitch model based on multivariate analysis and clustering
Context-based statistical learning
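
To make the first technical point more concrete, here is a minimal sketch of a decision tree that predicts the pronunciation of one polyphone from its context. It uses a small ID3-style learner written for this example; the features (previous and next character), the seven training samples for 行, and the pinyin labels `xing2` and `hang2` are assumptions, not the features or data of the actual system.

```python
import math
from collections import Counter

# Hypothetical contexts for the polyphone 行: (context features, pronunciation).
# "<s>" and "<e>" mark the start and end of a word.
SAMPLES = [
    ({"prev": "银", "next": "<e>"}, "hang2"),  # 银行
    ({"prev": "<s>", "next": "走"}, "xing2"),  # 行走
    ({"prev": "<s>", "next": "业"}, "hang2"),  # 行业
    ({"prev": "旅", "next": "<e>"}, "xing2"),  # 旅行
    ({"prev": "<s>", "next": "动"}, "xing2"),  # 行动
    ({"prev": "<s>", "next": "列"}, "hang2"),  # 行列
    ({"prev": "<s>", "next": "为"}, "xing2"),  # 行为
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def build_tree(rows, features):
    """Recursively split on the feature with the lowest weighted entropy."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: a single pronunciation

    def split_entropy(feature):
        groups = {}
        for ctx, label in rows:
            groups.setdefault(ctx[feature], []).append(label)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    remaining = [f for f in features if f != best]
    branches = {}
    for ctx, label in rows:
        branches.setdefault(ctx[best], []).append((ctx, label))
    children = {value: build_tree(subset, remaining)
                for value, subset in branches.items()}
    return {"feature": best, "default": majority, "children": children}

def predict(tree, ctx):
    """Walk the tree; use the node's majority label for unseen feature values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(ctx[tree["feature"]], tree["default"])
    return tree

tree = build_tree(SAMPLES, ["prev", "next"])
```

With this toy data, `predict(tree, {"prev": "银", "next": "<e>"})` follows the learned branches, while an unseen context such as 行人 falls back to the majority pronunciation at the root. A real system would train one such model per polyphone on a large annotated corpus and use richer context features.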

【Synoptic diagram】

Contact Information

Yu Hao: yu@cn.fujitsu.com

Liu Rujie: rjliu@cn.fujitsu.com
