A decade in deep learning, and what's next
Twenty years ago, Google started using machine learning, and 10 years ago, it helped spur rapid progress in AI using deep learning. Jeff Dean and Marian Croak of Google Research take a look at how we’ve innovated on these techniques and applied them in helpful ways, and look ahead to a responsible and inclusive path forward.
From research demos to AI that really works
I was first introduced to neural networks, computer systems that roughly imitate how biological brains accomplish tasks, as an undergrad in 1990. I did my senior thesis on using parallel computation to train neural networks. In those early days, I thought that if we could get 32X more compute power (using 32 processors at the time!), we could get neural networks to do impressive things. I was way off. It turns out we would need about 1 million times as much computational power before neural networks could scale to real-world problems.
A decade later, as an early employee at Google, I became reacquainted with machine learning when the company was still just a startup. In 2001 we used a simpler version of machine learning, statistical ML, to detect spam and suggest better spellings for people’s web searches. But it would be another decade before we had enough computing power to revive a more computationally intensive machine learning approach called deep learning. Deep learning uses neural networks with multiple layers (thus the “deep”), so it can learn not just simple statistical patterns but also subtler patterns of patterns, such as what’s in an image or what word was spoken in some audio. One of our first publications in 2012 was on a system that could find patterns among millions of frames from YouTube videos. That meant, of course, that it learned to recognize cats.
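To make “patterns of patterns” concrete, here is a toy sketch rather than anything resembling a production system. It wires up a two-layer network that computes exclusive-or, something a single threshold unit cannot do. The weights are hand-picked for illustration instead of learned, and every name in it (`step`, `layer`, `xor_net`) is hypothetical.

```python
from typing import List


def step(value: float) -> int:
    """Threshold activation: fire (1) when the input is positive."""
    return 1 if value > 0 else 0


def layer(inputs: List[float], weights: List[List[float]],
          biases: List[float]) -> List[int]:
    """One layer: each unit thresholds a weighted sum of its inputs."""
    return [
        step(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


# Hand-picked (not learned) weights, chosen only for illustration.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]  # unit 0: "either on", unit 1: "both on"
HIDDEN_B = [-0.5, -1.5]
OUTPUT_W = [[1.0, -1.0]]             # "either on" but not "both on"
OUTPUT_B = [-0.5]


def xor_net(a: int, b: int) -> int:
    """First layer finds simple patterns; second layer combines them."""
    hidden = layer([a, b], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_net(a, b))
```

The first layer detects two simple patterns in the raw inputs, and the second layer detects a pattern in those patterns. Real deep networks stack many more layers and learn their weights from data.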
To get to the helpful features you use every day (searchable photo albums, suggestions on email replies, language translation, flood alerts, and so on), we needed to make years of breakthroughs on top of breakthroughs, tapping into the best of Google Research in collaboration with the broader research community. Let me give you just a couple of examples of how we’ve done this.
A big moment for image recognition
In 2012, a paper wowed the research world by making a huge jump in accuracy on image recognition using deep neural networks, leading to a series of rapid advances by researchers outside and within Google. Further advances led to applications like Google Photos in 2015, letting you search photos by what’s in them. We then developed other deep learning models to help you find addresses in Google Maps, make sense of videos on YouTube, and explore the world around you using Google Lens. Beyond our products, we applied these approaches to health-related problems, such as detecting diabetic retinopathy in 2016, cancerous cells in 2017, and breast cancer in 2020. Better understanding of aerial imagery through deep learning let us launch flood forecasting in 2018, which we expanded in 2021 to cover more than 360 million people. It’s been encouraging to see how helpful these advances in image recognition have been.
Similarly, we’ve used deep learning to accelerate language understanding. With sequence-to-sequence learning in 2014, we began looking at how to understand strings of text using deep learning. This led to neural machine translation in Google Translate in 2016, which was a massive leap in quality, particularly for less prevalent languages. We developed neural language models further for Smart Reply in Gmail in 2017, which made it easier and faster for you to knock through your email, especially on mobile. That same year, Google invented Transformers, leading to BERT in 2018, then T5, and in 2021 MUM, which lets you ask Google much more nuanced questions. And with “sparse” models like GShard, we can dramatically improve on tasks like translation while using less energy.
We’ve driven a similar arc in understanding speech. In 2012, Google used deep neural networks to make major improvements to speech recognition on Android. We kept advancing the state of the art with higher-quality, faster, more efficient speech recognition systems. By 2019, we were able to put the entire neural network on-device so you could get accurate speech recognition even without a connection. And in 2021, we launched Live Translate on the Pixel 6 phone, letting you speak and be translated in 48 languages, all on-device, even while you’re traveling with no Internet.
More invention ahead
As our research goes forward, we’re balancing more immediately applied research with more exploratory fundamental research. So we’re looking at how, for example, AI can aid scientific discovery, with a project like mapping the brain of a fly, which could one day help better understand and treat mental illness in people. We’re also pursuing quantum computing, which will likely take a decade or longer to reach wide-scale applications. This is why we publish nearly 1000 papers a year, including around 200 related to responsible AI, and we’ve given over 6500 grants to external researchers over the past decade and a half.
Looking ahead from 2021 to 2031, I'm excited about the next-generation AI systems we can build, and how much more helpful they’ll be. We’re planting the seeds today with new architectures like Pathways, with more to come.
Minding the gap(s)
As we develop these lines of research and turn them into useful technologies, we’re mindful of the broader societal impact of AI, and especially that technology has not always had an equitable impact. This is personal for me — I care deeply about ensuring that people from all different backgrounds and circumstances have a good experience.
So we’re increasing the depth and rigor of how we review and evaluate our research to ensure we’re developing it responsibly. We’re also scaling up what we learn by inventing new tools to understand and calibrate critical AI systems across Google's products. We’re growing our organization to 200 experts in Responsible AI and Human Centered Technology, and working with hundreds of partners in product, privacy, security, and other teams across Google.
As one example of our work on responsible AI, Google Research began exploring the nascent field of ML fairness in 2016. The teams realized that on top of publishing papers, they could have a greater impact by teaching ML practitioners how to build with fairness in mind, as with the course we launched in 2018. We also started building interactive tools that coders and researchers could use, from the What-If Tool in 2018 to the 2019 launch of our Fairness Indicators tool, all the way to Know Your Data in 2021. All of these are concrete ways that AI developers can test their datasets and models to see what kind of biases and gaps there are, and start to work on mitigations to prevent unfair outcomes.
A principled approach
In fact, fairness is one of the key tenets of our AI Principles. We developed these principles in 2017 and published them in 2018, announcing not only the Principles themselves but a set of responsible AI practices with practical organizational and technical advice from what we’ve learned along the way. I was proud to be involved in the AI Principles review process from early on, and I’ve seen firsthand how rigorous the teams at Google are in evaluating the technology we’re developing and deciding how best to deploy it in the real world.
Indeed, there are paths we’ve chosen not to go down: the AI Principles describe a number of areas we avoid. In line with our principles, we’ve taken a very cautious approach on face recognition. We recognize how fraught this area is, not only in terms of privacy and surveillance concerns but also in its potential for unfair bias and impacts on historically marginalized groups. I’m glad that we’re approaching this so thoughtfully and carefully.
We’re also developing technologies that help engineers apply the AI Principles directly — for example, incorporating privacy design principles. We invented Federated Learning in 2017 as a way to train ML models without your personal data leaving your phone. In 2018 we showed how well this works on Gboard, the free keyboard you can download for your phone — it learns to provide you more useful suggestions, while keeping what you type private on your device.
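For readers curious how a model can learn from data that never leaves a device, here is a toy sketch of the general idea, assuming a one-parameter model and simple weighted averaging of locally trained weights. It is not the implementation used in Gboard or any Google product, and real systems add protections this sketch omits. The function names (`local_update`, `federated_round`, `train`) and all numbers are hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (input x, target y)


def local_update(weight: float, data: List[Example],
                 lr: float = 0.05, steps: int = 5) -> float:
    """Run a few least-squares gradient steps on one device's own data."""
    for _ in range(steps):
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def federated_round(global_weight: float,
                    devices: List[List[Example]]) -> float:
    """Each device trains locally; the server sees only weights and counts."""
    updates = [(local_update(global_weight, d), len(d)) for d in devices]
    total = sum(n for _, n in updates)
    # Average the returned weights, weighted by each device's example count.
    return sum(w * n for w, n in updates) / total


def train(devices: List[List[Example]], rounds: int = 50) -> float:
    """Repeat rounds of local training followed by server-side averaging."""
    weight = 0.0
    for _ in range(rounds):
        weight = federated_round(weight, devices)
    return weight


if __name__ == "__main__":
    # Two hypothetical devices whose private data both follow y = 3x.
    devices = [
        [(1.0, 3.0), (2.0, 6.0)],
        [(0.5, 1.5), (1.5, 4.5), (1.0, 3.0)],
    ]
    print(round(train(devices), 4))
```

The raw examples stay inside `local_update`; only the updated weight and the example count reach the averaging step, which is the property the paragraph above describes.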
AI by everyone, for everyone
As we look to the decade ahead, it’s incredibly important that AI be built in a way that works well for everyone. That means building as inclusive a team as we can ourselves at Google. It also means ensuring the field as a whole increasingly represents the people whose lives it aims to improve.
I’m proud to lead the Black Leadership Advisory Group (BLAG) at Google. We helped craft and drive programs included in Google’s recent update on racial equity work. For example, we paired up new director-level hires with BLAG members, and the feedback has been really positive, with 80% of respondents saying they'd recommend the program. We’re looking at extending this to other groups, including Latinx+ and Asian+ Googlers. We’re holding ourselves accountable as leaders too: we now evaluate all VPs and above at Google on progress on diversity, equity, and inclusion. This is crucial if we’re going to have a more representative set of researchers and engineers building future technologies.
For the broader research and computer science communities, we’re providing a wide variety of grants, programs, and collaborations that we hope will welcome a more representative range of researchers. Our Research Scholar Program, begun in 2021, gave grants to more than 50 universities in 15+ countries — and 43% of the principal investigators identify as part of a group that’s been historically marginalized in tech. Similarly, our exploreCSR and CS Research Mentorship programs support thousands of undergrads from marginalized groups. And we’re partnering with groups like the National Science Foundation on their new Institute for Human-AI Collaborations.
We’re doing everything we can to make AI work well for all people. We’ll not only help ensure products across Google are using the latest practices in responsible AI — we’ll also encourage new products and features that serve those who’ve historically missed out on helpful new technologies. One example is Project Relate, which uses machine learning to help people with speech impairments communicate and use technology more easily. Another is Real Tone, which helps our imaging products like our Pixel phone camera and Google Photos more accurately and beautifully represent a diverse range of skin tones. These are just the start.
We’re excited for what’s ahead in AI, for everyone.