VOA Standard English 2010 - Shelved Machine Translator Gets New Life

Medical dictionary springs from largely forgotten English/Creole database

Rosanne Skirble | Washington, DC 08 February 2010

A largely forgotten translator is getting new life in the aftermath of the devastating January 12 earthquake in Haiti, which left as many as 200,000 people dead and 1.5 million homeless.

Linguists and computer scientists are among the rapid responders to the disaster. One of them is former Carnegie Mellon University linguist Jeff Allen, who went to Haiti in the 1990s on a U.S. Army contract.

 Carnegie Mellon University
A sample sentence in Creole collected by Carnegie Mellon University's speech data collection project in Haiti led by linguist Jeff Allen.

Allen is fluent in Creole, the language widely spoken in Haiti. He says his mission on the project, dubbed Diplomat, was to develop an English/Creole speech and text translation system.

"I spent nine months collecting data from different people within the Haitian community. And then we in-house translated everything that we could for a period of two years," he said.

Computer scientist Robert Frederking, with Carnegie Mellon's Language Technologies Institute, was a lead investigator for Diplomat. He says Carnegie Mellon built a portable translator for a laptop computer and sent it to Haiti.

"It kind of sat on a shelf for four months and it came back [to the university]," he said. "Because it was kind of rare data, I made an effort to preserve it over the years after the project ended."

When Allen, now based in Paris with software giant SAP, watched news of the earthquake, he knew that Carnegie Mellon still had the English/Creole database. "So I called up Carnegie Mellon, and I said, 'We need to do something. What can we do?'"

On January 21, with Allen's help, Carnegie Mellon made the data public.

"We put out on the Internet site of Carnegie Mellon 13,000 parallel sentences and 35,000 parallel terms," he said.

This rich data set presented an opportunity for Microsoft Research. Its web-based translation service supports 23 languages, with more added every few months. Product manager Vikram Dendi says that, in response to the crisis in Haiti, his team put an English/Creole translator on the Internet within five days, adding disaster-specific words and phrases to the database.

"We have taken medical terminology. We have taken other emergency-type notification and helped translated them into Haitian-Creole," he said.

Microsoft regularly updates the translator, building a more robust system. Dendi says the more parallel sentences and phrases in the system, the more accurate the translation.

Translators without Borders
Translators without Borders, a virtual network that links translators worldwide with humanitarian causes, seeks bilingual Creole speakers for its database.

The Haitian earthquake brought Translators without Borders an explosion of interest. More than 1,000 Creole speakers from the Haitian diaspora volunteered their translation services to the Paris-based humanitarian group. Co-founder Lori Thicke says the non-profit is distributing an English/Creole triage dictionary based on the newly released data.

"It contains a lot of interesting questions that you might ask someone to ascertain how serious their injuries are," she said. "For example, 'Where does it hurt? How long have you had this wound?' That sort of thing."

Thicke says machine translators from Microsoft and, more recently, Google help volunteers increase their productivity, giving them a rapid first draft that can be revised later.

"They are helping us translate documents that might be instructions for building a water purification or for treatment protocols, for educational materials, all really important translations that there might not be a budget for," she said.

At Microsoft, Vikram Dendi adds that his company is working to integrate the translator into as many applications as possible on mobile devices such as cell phones.

 

Original article: http://www.tingroom.com/voastandard/2010/2/100329.html