A virtual sign language interpretation system helped hearing-impaired people in China enjoy the 2022 Beijing Winter Olympic and Paralympic Games and follow sporting news.
Developed by AI companies in conjunction with the Beijing Media Group, the AI sign language anchor made her debut on Feb 5, providing real-time interpretation during news and sports broadcasts on one of BRTV's programs.
Based on the Wudao 2.0 pre-training model, the system applied cutting-edge technologies such as the synchronized acquisition of multimodal body and finger movements, as well as facial expressions, according to Zuo Jiaping, senior vice-president of Beijing Zhipu AI Technology, one of the system's developers. It also delivers accurate, natural and smooth sign language translation thanks to the ultraprecise movements of which digital humans are capable.
In tests, 90 percent of the sign language interpreted by the AI anchor was understandable, Zuo added.
"We have built the world's largest multimodal sign language corpus," she said, adding that more than 100,000 words and phrases have been coded into the system.
The interpreting system contains 8,214 expressions in Chinese National Sign Language, and its grammar is based on the habits of the hearing-impaired to ensure the accuracy and professionalism of the AI anchor.
With the support of the Beijing Disabled Persons' Federation, researchers invited 40 hearing-impaired people and sign language experts to oversee transcription work, and a wider range of tests was conducted on the target population.
Because sign language is delivered at a slower speed than speech, interpreters have to simplify meaning in order to keep up with spoken-language anchors.
Du Jizhong, from the Department of AI Digital Humans at Zhipu, gave the example of an anchor saying "Today, Beijing is fine and warm with clean air and a bright sky" being interpreted as "The weather in Beijing is good today".
As the word order of sign language is also different, the system's hyperscale pre-training model has to translate broadcasts into sign language with simplified meanings and in the correct order, he said.
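The two steps Du describes, simplifying the meaning and then reordering words for sign language, can be illustrated with a deliberately tiny sketch. The phrase table, the topic words, and the reordering rule below are all invented for illustration; the real system uses a hyperscale pre-trained model, not hand-written rules like these.

```python
# Toy illustration of the article's two-step pipeline:
# 1) simplify verbose spoken phrasing, 2) reorder into a sign-language
# gloss (topic first, function words dropped, glosses in caps).
# All rules and vocabulary here are hypothetical examples.

SIMPLIFY = {
    # collapse a flowery spoken phrase into its core meaning
    "fine and warm with clean air and a bright sky": "good",
}

FUNCTION_WORDS = {"is", "the", "a"}   # dropped in the gloss
TOPIC_WORDS = {"today", "beijing"}    # fronted in the gloss


def simplify(sentence: str) -> str:
    """Replace verbose spoken phrases with a simplified meaning."""
    for phrase, short in SIMPLIFY.items():
        sentence = sentence.replace(phrase, short)
    return sentence


def to_gloss(sentence: str) -> list[str]:
    """Rough topic-comment reordering: time/place topics come first,
    then the remaining content words, written as uppercase glosses."""
    words = [w.strip(",.") for w in sentence.split()]
    topics = [w for w in words if w.lower() in TOPIC_WORDS]
    comment = [w for w in words
               if w not in topics and w.lower() not in FUNCTION_WORDS]
    return [w.upper() for w in topics + comment]


spoken = "Today, Beijing is fine and warm with clean air and a bright sky"
print(to_gloss(simplify(spoken)))  # ['TODAY', 'BEIJING', 'GOOD']
```

The point of the sketch is only the shape of the problem: a faithful word-for-word rendering would be both too long and in the wrong order, so the system must compress and rearrange in one pass.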
Much like sign language teachers who mouth words to assist translation, AI anchors can also change the shape of their mouths to boost understanding for the hearing-impaired. The system was in development for nine months before it was able to match sign language with the mouth shape of an AI anchor.
"In the future, digital human facial expressions will be upgraded as research proceeds," Zuo said.
China is home to 27 million hearing-impaired people, and promoting this kind of technology will enable them to watch live events and special reports, she said.
Now that the Olympics and Paralympics have ended, digital humans will continue to provide sign language services to media outlets, allowing the hearing-impaired to watch the news.
Meanwhile, round-the-clock AI anchors will eventually help ease the shortage of sign language interpreters, she added.