Our latest investments in information quality in Search and News

Sep 10, 2020

Google Fellow and Vice President, Search

Delivering high-quality results is what has always set Google apart from other search engines, even in our earliest days. Over the years as the product and user experience have evolved, our investments in quality have accelerated.

We conduct extensive testing to ensure that Search is as helpful as it can be—from the quality of information we deliver, to the overall experience. Since 2017, we’ve done more than 1 million search quality tests, and we now average more than 1,000 tests per day.

In addition to investing in the overall Search experience, we also focus on providing reliable information for people everywhere. We’ve highlighted our fundamental approach and ongoing investment in this area, but we also wanted to share some of the new improvements we’ve made to continue to deliver high-quality information.

In a year when access to reliable information is more critical than ever—from COVID-19 to natural disasters to important moments of civic participation around the world—our longstanding commitment to quality remains at the core of our mission to make the world’s information accessible and useful.

New insights from our Intelligence Desk

With new things happening around the world every day, the information landscape can change quickly. To understand how our systems are performing when news breaks, we’ve developed an Intelligence Desk, which actively monitors and identifies potential information threats.

This effort grew out of our Crisis Response team, which for years has done real-time tracking of events around the world, launching SOS Alerts in Search and Maps to help people get vital information quickly. Over the years, we’ve monitored thousands of events and launched hundreds of alerts to help keep people safe.

Crisis events monitored (green) and SOS Alerts launched (red), 2016 - 2020.

The Intelligence Desk is a global team of analysts monitoring news events 24/7, spanning natural disasters and crises, breaking news moments and the latest developments in ongoing topics like COVID. When events occur, our analysts collect data about how our systems are responding and compile reports about narratives that are emerging, like new claims about COVID treatments. Our product teams use these data sets and reports from the Intelligence Desk to run more robust quality tests and ensure that our systems are working as intended for the wide range of topics people search for.

Improving our systems for breaking news and crises

As news is developing, the freshest information published to the web isn’t always the most accurate or trustworthy, and people’s need for information can accelerate faster than facts can materialize.

Over the past few years, we’ve improved our systems to automatically recognize breaking news around crisis moments like natural disasters and ensure we’re returning the most authoritative information available. We’ve also made significant strides in our overall ability to accurately identify breaking news moments, and do so more quickly. We’ve improved our detection time from up to 40 minutes just a few years ago, to now within just a few minutes of news breaking.

Our improvements in detecting crisis events expand on our work in 2017 to improve the quality of results for topics that might be susceptible to hateful, offensive and misleading information. Those improvements remain fundamental to how we handle low-quality information in Search and News products, and since then, we’ve continuously updated our systems to be able to detect topic areas that may be at risk for misinformation. We’re continuing to train and test our systems to ensure that whatever people are searching for, they can find reliable information.

Providing accurate information from the Knowledge Graph

In Search, features like knowledge panels that display information from the Google Knowledge Graph help you get quick access to the facts from sources across the web. To deliver high-quality information in these features, we’ve deepened our partnerships with government agencies, health organizations and Wikipedia to ensure reliable, accurate information is available, and protect against potential vandalism.

For COVID-19, we worked with health organizations around the world to provide local guidance and information to keep people safe. To respond to emerging information needs, like the surge we saw in people searching for unemployment benefits, we provide easy access to information right from government agencies in the U.S. and other countries. For elections information, we work with non-partisan civic organizations that provide authoritative information about voting methods, candidates, election results and more.

Information in knowledge panels comes from hundreds of sources, and one of the most comprehensive knowledge bases is Wikipedia. Volunteer Wikipedia editors around the world have created robust systems to guard for neutrality and accuracy. They use machine learning tools paired with intricate human oversight to spot and address vandalism. Most vandalism on Wikipedia is reverted within a matter of minutes.

To complement Wikipedia’s systems, we’ve added additional protections and detection systems to prevent potentially inaccurate information from appearing in knowledge panels. On rare occasions, instances of vandalism on Wikipedia can slip through. Only a small proportion of edits from Wikipedia are potential vandalism, and we’ve improved our systems to now detect 99 percent of those cases. If these issues do appear, we have policies that allow us to take action quickly to address them.

To further support the Wikipedia community, we created the WikiLoop program last year that hosts several editor tools focused on content quality. This includes WikiLoop DoubleCheck, one of a number of tools Wikipedia editors and users can use to track changes on a page and flag potential issues. We contribute data from our own detection systems, which members of the community can use to uncover new insights.

Helpful context from fact checks and Full Coverage

We design Search and News to help you see the full picture, by helping you easily understand the context behind information you might find online. We make it easy to spot fact checks in Search, News and, most recently, Google Images by displaying fact check labels. These fact checks and labels come from publishers that use ClaimReview schema to mark up fact checks they have created. This year to date, people have seen fact checks on Search and News more than 4 billion times, which is more than all of 2019 combined.
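
For publishers wondering what ClaimReview markup looks like, here is a minimal sketch in Python that builds a schema.org ClaimReview object and prints it as JSON-LD. The function name build_claim_review, the example claim, the publisher name and the example.org URL are all hypothetical placeholders for illustration. Only a handful of schema.org properties are shown, and this is not a statement of which fields are required for a fact check label to appear.

```python
import json


def build_claim_review(claim, review_url, publisher_name, verdict, date_published):
    """Return a schema.org ClaimReview object as a plain dict.

    All argument values are supplied by the caller; nothing here is
    specific to any real publisher or fact check.
    """
    return {
        "@context": "https://schema.org",
        "@type": "ClaimReview",
        # Page where the full fact check article lives.
        "url": review_url,
        "datePublished": date_published,
        # Short text summary of the claim being checked.
        "claimReviewed": claim,
        # Organization that published the fact check.
        "author": {"@type": "Organization", "name": publisher_name},
        # Human-readable verdict, expressed as a schema.org Rating.
        "reviewRating": {"@type": "Rating", "alternateName": verdict},
    }


# Hypothetical example values, for illustration only.
markup = build_claim_review(
    claim="Example claim text that was checked",
    review_url="https://example.org/fact-checks/example-claim",
    publisher_name="Example Fact Check Desk",
    verdict="False",
    date_published="2020-09-10",
)

# JSON-LD like this is typically embedded in the article page inside a
# <script type="application/ld+json"> element.
print(json.dumps(markup, indent=2))
```

The verdict is carried in the rating's alternateName as free text, so each publisher can keep its own rating vocabulary.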

We understand the importance of the fact checking ecosystem in debunking misleading information, which is why we recently donated an additional $6.5 million to help fact checking organizations and nonprofits focus on misinformation about the pandemic.

We also just launched an update using our BERT language understanding models to improve the matching between news stories and available fact checks. These systems can better understand whether a fact check claim is related to the central topic of a story, and surface those fact checks more prominently in Full Coverage—a News feature that provides a complete picture of how a story is reported from a variety of sources. With just a tap, Full Coverage lets you see top headlines from different sources, videos, local news reports, FAQs, social commentary, and a timeline for stories that have played out over time.

Expanded protections for Search features

We have policies for what can appear in Search features like featured snippets, lists or video previews that uniquely highlight information on the search results page. One notable example is Autocomplete, which helps you complete your search more quickly.

We have long-standing policies to protect against hateful and inappropriate predictions from appearing in Autocomplete. We design our systems to approximate those policies automatically, and have improved our automated systems to not show predictions if we detect that the query may not lead to reliable content. These systems are not perfect or precise, so we enforce our policies if predictions slip through.

We expanded our Autocomplete policies related to elections, and we will remove predictions that could be interpreted as claims for or against any candidate or political party. We will also remove predictions that could be interpreted as a claim about participation in the election—like statements about voting methods, requirements, or the status of voting locations—or the integrity or legitimacy of electoral processes, such as the security of the election. What this means in practice is that predictions like “you can vote by phone” as well as “you can't vote by phone,” or a prediction that says “donate to” any party or candidate, should not appear in Autocomplete. Whether or not a prediction appears, you can still search for whatever you’d like and find results.

Information online is constantly changing—as are the things people search for—so continuing to deliver high-quality information is an area of ongoing investment. We’ve made great strides and built upon successful improvements to our systems, and we’ll continue to look for new ways to make Search and News as reliable and helpful as possible, no matter what you’re looking for.
