How we taught Google Translate to recognize homonyms
Do you like bass?
Your answer to that question probably depends on whether you’re thinking about seafood or music. That’s because “bass” and “bass” are homonyms — two (or more) words with the same spelling or pronunciation that mean different things. When you encounter a homonym like “bass” in the wild, you likely use context clues to understand the question and figure out an appropriate response. And so does Google Translate. Thanks in part to advanced machine learning, Translate can parse context and differentiate between various homonyms. Getting to this point, though, required a lot of work.
In the early days of Google Translate, translations tended to be very literal and word-for-word. This was because Translate originally used a statistical approach to create its results, says Google Translate engineer Apu Shah. And that wasn’t ideal for handling ambiguous language like homonyms. For instance, say you wanted to translate the word “medium” from English into Spanish. Using the statistical approach, Translate would count how many times each Spanish word meaning “medium” showed up in publicly available translation data, such as online dictionaries. Then it would base your result on whichever option was most common. So even if you wanted to say “el médium” because you were talking about a psychic, Translate might have suggested “medio,” the word for something that’s of average size, if that word showed up more often. “Translate was really limited by the data that was available,” Apu says. “And it couldn’t read semantics or context very well.”
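To make the counting idea concrete, here is a minimal sketch in Python. It is not Translate's actual code: the data set, the counts and the function name `most_frequent_translation` are all hypothetical, made up purely to illustrate why a frequency-only approach picks the same word no matter what you mean.

```python
from collections import Counter

# Hypothetical toy data: each entry is one observed Spanish translation of
# an English word. The counts are invented for illustration only.
OBSERVED_TRANSLATIONS = {
    "medium": ["medio", "medio", "médium", "medio", "médium", "medio"],
}


def most_frequent_translation(word, observations=OBSERVED_TRANSLATIONS):
    """Return the translation seen most often, ignoring all context."""
    candidates = observations.get(word, [])
    if not candidates:
        return None  # no data for this word
    counts = Counter(candidates)
    # most_common(1) returns a list with one (translation, count) pair.
    return counts.most_common(1)[0][0]


# "medio" appears 4 times and "médium" only twice, so "medio" always wins,
# even when the speaker is talking about a psychic.
print(most_frequent_translation("medium"))  # medio
```

Because the function never looks at the surrounding sentence, the less common meaning can never win, which is the limitation Apu describes.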
Today, Google Translate supports 133 languages. When it first launched in 2006, that number was closer to 60. As the number of languages we support has grown, so has the translation quality, says Google Engineering Director Macduff Hughes, who’s been working in the role for nearly 11 years. He oversaw a major transition for the product in 2016 to a purely neural network-based machine translation system. This transition eventually led us to the more accurate and context-driven translations we get today (like our bass versus bass example).
But there was still room for improvement even after transitioning to the neural network-based system. “We found that Translate could generate these very impressive natural-sounding texts, but sometimes with mistakes,” Macduff says. “It might sound or look grammatically correct and use a high level of vocabulary and have correct capitalization and punctuation, creating this feeling of credibility, but it could still be wrong.”
So the team focused on teaching the neural network to become more and more accurate. “The models we run today are three or four times bigger than the ones we originally launched with, and they run faster,” Macduff says. The team trains the model by showing it examples of translated materials, which helps teach it how to represent language. This allows Translate to deliver more nuanced results. “We aren’t just going for word-by-word representation,” Apu says. “We’re looking for context. Did you run the race? Did your program run? Did you run it into the ground?”
Sometimes, there's just not enough context for the translation system to pick the right meaning — like in the aforementioned “bass” example. Starting today, Translate detects these cases and allows you to manually select your intended meaning. This is thanks to our latest generative AI experiment, through Search Labs. If you’re opted in to our Search Generative Experience (SGE) in the U.S., and you ask Search to translate a phrase from English to Spanish where certain words could have more than one possible meaning, you’ll see those terms underlined. Simply tap on those underlined words and you can indicate the specific meaning that reflects what you want to say. This option may also appear when you need to specify the gender for a particular word.
Outside of SGE, if you enter one of these kinds of words without context into Translate in a web browser or say one of them out loud when using the Translate app, for instance, the algorithm will assess all of the potential results, then give you options to clarify what you mean. For example, Translate options for the word “bat” include the animal, the equipment and the action.
If you’ve written or said an entire phrase that includes a homonym, the algorithm will analyze the phrase in context, leading it to a more accurate representation of how you’re using the homonym than if it were simply relying on statistics.
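The two behaviors described above (offering every option when there is no context, and narrowing down when there is) can be sketched with a toy Python example. This is only an illustration under loud assumptions: Translate uses a trained neural network, not keyword matching, and the `SENSES` table, its cue words and the function `candidate_senses` are all invented here.

```python
# Hypothetical sense inventory: for each ambiguous word, a few cue words
# that hint at each meaning. A real system learns this from data.
SENSES = {
    "bat": {
        "animal": {"cave", "wings", "flew", "night"},
        "equipment": {"baseball", "swing", "wooden", "hit"},
        "action": {"eyelashes", "eye", "away"},
    },
}


def candidate_senses(word, phrase=""):
    """Return the senses of `word` that best fit the phrase.

    With no helpful context, every known sense is returned so the user
    can choose; otherwise only the best-scoring sense(s) are returned.
    """
    senses = SENSES.get(word, {})
    # Lowercase the phrase and strip simple punctuation from each token.
    tokens = {t.strip(".,!?") for t in phrase.lower().split()}
    context = tokens - {word}
    # Score each sense by how many of its cue words appear in the phrase.
    scores = {sense: len(cues & context) for sense, cues in senses.items()}
    best = max(scores.values(), default=0)
    if best == 0:
        return sorted(senses)  # no usable context: offer all options
    return sorted(s for s, score in scores.items() if score == best)


print(candidate_senses("bat"))
# ['action', 'animal', 'equipment']
print(candidate_senses("bat", "The bat flew out of the cave."))
# ['animal']
```

Keyword overlap is far cruder than what a neural model does, but it shows the shape of the decision: the same word yields a list of choices on its own and a single meaning once the rest of the phrase is taken into account.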
“We’ve also done a lot of work on curating data,” Macduff says. Google partners with dictionary providers and third-party translators who gather words and phrases in different languages, and the team studies public databases to better understand how to build new features in Translate. “We also trained a language model to recognize the difference between high-quality translations and low-quality translations,” Macduff says. The “contribute” option also gives Google Translate users the chance to help with translations or offer corrections.
Translate will keep getting better over time at handling homonyms and other translations that require context, and the team thinks it’s important to stay nimble in order to make that happen. “AI is evolving, and computer power is evolving, but language is evolving, too,” Apu says. Words take on new meanings and usages all the time: take “slay” or “cancel.” The work keeps the team on their toes, but their core goal remains the same.
“Our vision for the future is to enable very fluid interactions for people,” says Apu. “We want to take away all the barriers for communication that we can, so everyone can just talk to another person, no matter what language they speak.” Or what kind of bass they’re talking about.