Google I/O 2023: Making AI more helpful for everyone
Seven years into our journey as an AI-first company, we’re at an exciting inflection point. We have an opportunity to make AI even more helpful for people, for businesses, for communities, for everyone.
We’ve been applying AI to make our products radically more helpful for a while. With generative AI, we’re taking the next step. With a bold and responsible approach, we’re reimagining all our core products, including Search.
“Help me write” in Gmail
There are some great examples of how generative AI is helping to evolve our products, starting with Gmail. In 2017, we launched Smart Reply, short responses you could select with just one click. Next came Smart Compose, which offered writing suggestions as you type. Smart Compose led to more advanced writing features powered by AI. They’ve been used in Workspace over 180 billion times in the past year alone. And now, with a much more powerful generative model, we’re taking the next step in Gmail with “Help me write.”
Let’s say you got an email that your flight was canceled. The airline has sent a voucher, but what you really want is a full refund. You could reply, and use “Help me write.”
Just type a prompt describing what you want (an email asking for a full refund), hit create, and a full draft appears. It conveniently pulls in flight details from the previous email. It looks pretty close to what you want to send, but maybe you want to refine it further. In this case, a more elaborate email might increase the chances of getting the refund. “Help me write” will start rolling out as part of our Workspace updates. And just like with Smart Compose, you’ll see it get better over time.
New Immersive View for routes in Maps
Since the early days of Street View, AI has stitched together billions of panoramic images, so people can explore the world from their device. At last year’s I/O we introduced Immersive View, which uses AI to create a high-fidelity representation of a place, so you can experience it before you visit.
Now, we’re expanding that same technology to do what Maps does best: help you get where you want to go. Google Maps provides directions covering 20 billion kilometers every day; that’s a lot of trips. Now imagine if you could see your whole trip in advance. With Immersive View for routes you can, whether you're walking, cycling or driving.
Say you’re in New York City and you want to go on a bike ride. Maps has given you a couple of options close to where you are. The one on the waterfront looks scenic, but you want to get a feel for it first, so you click on Immersive View for routes. It’s an entirely new way to look at your journey. You can zoom in to get an incredible bird’s eye view of the ride.
There’s more information available too. You can check air quality, traffic and weather, and see how they might change.
Immersive View for routes will begin to roll out over the summer, and launch in 15 cities by the end of the year, including London, New York, Tokyo and San Francisco.
A new Magic Editor experience in Photos
Another product made better by AI is Google Photos. We introduced it at I/O in 2015, and it was one of our first AI-native products. Breakthroughs in machine learning made it possible to search your photos for things like people, sunsets or waterfalls.
Of course, we want you to do more than just search photos — we also want to help you make them better. In fact, every month, 1.7 billion images are edited in Google Photos. AI advancements give us more powerful ways to do this. For example, Magic Eraser, launched first on Pixel, uses AI-powered computational photography to remove unwanted distractions. And later this year, using a combination of semantic understanding and generative AI, you’ll be able to do much more with a new experience called Magic Editor.
Here’s an example: This is a great photo, but as a parent, you probably want your kid at the center of it all. And it looks like the balloons got cut off in this one, so you can go ahead and reposition the birthday boy. Magic Editor automatically recreates parts of the bench and balloons that were not captured in the original shot. As a finishing touch, you can punch up the sky. This also changes the lighting in the rest of the photo so the edit feels consistent. It’s truly magical. We’re excited to roll out Magic Editor in Google Photos later this year.
Making AI more helpful for everyone
From Gmail and Photos to Maps, these are just a few examples of how AI can help you in moments that matter. And there's so much more we can do to deliver the full potential of AI across the products you know and love.
Today, we have 15 products that each serve more than half a billion people and businesses. And six of those products serve over 2 billion users each. This gives us so many opportunities to deliver on our mission — to organize the world's information and make it universally accessible and useful.
It's a timeless mission that feels more relevant with each passing year. And looking ahead, making AI helpful for everyone is the most profound way we’ll advance our mission. We’re doing this in four important ways:
- First, by improving your knowledge and learning, and deepening your understanding of the world.
- Second, by boosting creativity and productivity, so you can express yourself and get things done.
- Third, by enabling developers and businesses to build their own transformative products and services.
- And finally, by building and deploying AI responsibly, so that everyone can benefit equally.
PaLM 2 + Gemini
We are so excited by the opportunities ahead. Our ability to make AI helpful for everyone relies on continuously advancing our foundation models. So I want to take a moment to share how we're approaching them.
Last year you heard us talk about PaLM, which led to many improvements across our products. Today, we’re ready to announce our latest PaLM model in production: PaLM 2.
PaLM 2 builds on our fundamental research and our latest infrastructure. It’s highly capable at a wide range of tasks and easy to deploy. We are announcing more than 25 products and features powered by PaLM 2 today.
PaLM 2 models deliver excellent foundational capabilities across a wide range of sizes. We’ve affectionately named them Gecko, Otter, Bison and Unicorn. Gecko is so lightweight that it can work on mobile devices: fast enough for great interactive applications on-device, even when offline. PaLM 2 models are stronger in logic and reasoning thanks to broad training on scientific and mathematical topics. They’re also trained on multilingual text — spanning more than 100 languages — so they understand and generate nuanced results.
Combined with powerful coding capabilities, PaLM 2 can also help developers collaborate around the world. Let's say you’re working with a colleague in Seoul and you’re debugging code. You can ask it to fix a bug and help out your teammate by adding comments in Korean to the code. It first recognizes the code is recursive, then suggests a fix. It explains the reasoning behind the fix, and it adds Korean comments like you asked.
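To make this concrete, here is a hypothetical sketch of the kind of exchange described above: a buggy recursive function, followed by the corrected version the model might suggest, with Korean comments added for the teammate. The function names and the specific bug are invented for illustration; the post doesn't show the actual code.

```python
def fib_buggy(n):
    """A recursive Fibonacci with a subtle bug."""
    if n < 2:
        return n
    # Bug: both recursive calls use n - 1, so fib(n - 2) is never computed.
    return fib_buggy(n - 1) + fib_buggy(n - 1)


def fib_fixed(n):
    """The corrected version, with Korean comments as requested."""
    # 기저 사례: n이 2보다 작으면 n을 그대로 반환합니다.
    if n < 2:
        return n
    # 재귀 호출: 앞의 두 피보나치 수의 합을 반환합니다.
    return fib_fixed(n - 1) + fib_fixed(n - 2)
```

The buggy version silently returns powers of two instead of Fibonacci numbers, which is exactly the kind of off-by-one mistake in a recursive call that a code-capable model can spot and explain.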
While PaLM 2 is highly capable, it really shines when fine-tuned on domain-specific knowledge. We recently released Sec-PaLM, fine-tuned for security use cases. It uses AI to better detect malicious scripts, and it can help security experts understand and resolve threats.
Another example is Med-PaLM 2. In this case, it’s fine-tuned on medical knowledge. This fine-tuning achieved a 9X reduction in inaccurate reasoning when compared to the base model, approaching the performance of clinician experts who answered the same set of questions. In fact, Med-PaLM 2 was the first language model to perform at “expert” level on medical licensing exam-style questions, and is currently the state of the art.
We're also working to add capabilities to Med-PaLM 2, so that it can synthesize information from medical imaging like plain films and mammograms. You can imagine an AI collaborator that helps radiologists interpret images and communicate the results. These are some examples of PaLM 2 being used in specialized domains. We can’t wait to see it used in more, which is why I’m pleased to announce that PaLM 2 is now available in preview.
PaLM 2 is the latest step in our decade-long journey to bring AI in responsible ways to billions of people. It builds on progress made by two world-class research teams, the Brain Team and DeepMind.
Looking back at the defining AI breakthroughs over the last decade, these teams have contributed to a significant number of them: AlphaGo, Transformers, sequence-to-sequence models, and so on. All this helped set the stage for the inflection point we’re at today.
We recently brought these two teams together into a single unit, Google DeepMind. Using the computational resources of Google, they’re focused on building more capable systems, safely and responsibly.
This includes our next-generation foundation model, Gemini, which is still in training. Gemini was created from the ground up to be multimodal, highly efficient at tool and API integrations and built to enable future innovations, like memory and planning. While still early, we’re already seeing impressive multimodal capabilities not seen in prior models.
Once fine-tuned and rigorously tested for safety, Gemini will be available at various sizes and capabilities, just like PaLM 2.
AI responsibility: Tools to identify generated content
As we invest in more capable models, we are also deeply investing in AI responsibility. That includes having the tools to identify synthetically generated content whenever you encounter it.
Two important approaches are watermarking and metadata. Watermarking embeds information directly into content in ways that are maintained even through modest image editing. Moving forward, we're building our models to include watermarking and other techniques from the start.
If you look at a synthetic image, it's impressive how real it looks, so you can imagine how important this is going to be in the future. Metadata allows content creators to associate additional context with original files, giving you more information whenever you encounter an image. We'll ensure every one of our AI-generated images has that metadata. Read more about our bold and responsible approach.
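As a rough illustration of the watermarking idea, the toy sketch below hides a short identifier in the least-significant bits of pixel values. This is a conceptual example only: unlike the production techniques described above, a naive LSB watermark does not survive editing or compression, and the function names here are invented.

```python
def embed_watermark(pixels, message):
    """Embed each bit of `message` (bytes) into the LSBs of pixel values."""
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for message")
    out = list(pixels)
    for idx, bit in enumerate(bits):
        out[idx] = (out[idx] & ~1) | bit  # overwrite only the lowest bit
    return out


def extract_watermark(pixels, length):
    """Recover `length` bytes of the embedded message from pixel LSBs."""
    bits = [p & 1 for p in pixels[:length * 8]]
    return bytes(
        sum(bits[i * 8 + j] << j for j in range(8)) for i in range(length)
    )
```

Because each pixel value changes by at most 1, the marked image is visually indistinguishable from the original; robust schemes spread the signal across the image so it survives cropping, resizing and recompression.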
Updates to Bard + Workspace
As models get better and more capable, one of the most exciting opportunities is making them available for people to engage with directly.
That’s the opportunity we have with Bard, our experiment in conversational AI, which we launched in March. We’ve been rapidly evolving Bard. It now supports a wide range of programming capabilities, and it’s gotten much smarter at reasoning and math prompts. And, as of today, it is now fully running on PaLM 2. Read more about the latest Bard updates.
We’re also bringing new features to Google Workspace. In addition to “Help me write” in Docs and Gmail, Duet AI in Google Workspace provides tools to generate images from text descriptions in Slides and Meet, create custom plans in Sheets and more. Read more about the latest Workspace updates.
Introducing Labs and our new Search Generative Experience
As AI continues to improve rapidly, we’re focused on giving helpful features to our users. And starting today, we're giving you a new way to preview some of the experiences across Workspace and other products. It's called Labs. I say new, but Google has a long history of using Labs as a way to enable early access and get feedback, and you can start signing up later today.
Alongside the Workspace features you just saw, one of the first experiences you’ll be able to test in Labs involves our founding product, Google Search. The reason we began deeply investing in AI many years ago is because we saw the opportunity to make Search better. And with each breakthrough, we’ve made it more helpful and intuitive.
Improvements in language understanding let us ask questions more naturally and reach the most relevant content on the web. Advances in computer vision introduced new ways to search visually. Now, even if you don’t have the words to describe what you’re looking for, you can search anything you see with Google Lens. In fact, Lens is used for over 12 billion visual searches every single month — a 4X increase in just two years. Lens combined with multimodality led to multisearch, which allows you to search using both an image and text.
As we look ahead, Google's deep understanding of information combined with the unique capabilities of generative AI can transform how Search works yet again, unlocking entirely new questions that Search can answer, and creating increasingly helpful experiences that connect you to the richness of the web.
Of course, applying generative AI to Search is still in its early days. People around the world rely on Search in important moments, and we know how critical it is to get this right and continue to earn their trust. That’s always our North Star.
So we’re approaching innovation responsibly, striving for the highest bar for information quality as we always have from the very beginning. This is why we are bringing our new Search Generative Experience to you first in Labs.