A breakthrough to better represent human genetic diversity

May 10, 2023

A consortium of over 100 scientists — including engineers from Google — announced the world’s first human pangenome reference.

Andrew Carroll

Product Lead, Genomics

Today, a group of researchers reached a breakthrough in our understanding and representation of human genomics, helping create more inclusive and equitable genetic testing and treatment. A consortium of 119 scientists from 60 institutions — including engineers from Google Research — announced the first draft human pangenome reference in a Nature paper.

This new human pangenome ("pan" comes from the Greek word for "all") combines assembled genomes from 47 people of diverse ancestries around the world. Unlike the current human reference genome, which represents data from just one person at each point along the DNA, the pangenome reference includes data from many individuals at each position. This creates a new resource that better represents human genetic diversity, allowing scientists and doctors to more accurately diagnose and treat diseases and develop new therapeutics.
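To make the contrast concrete, here is a toy sketch of the difference between a single-sequence reference and a per-position view across many individuals. Everything in it is hypothetical: the sample names, the six-base sequences, and the helper functions are invented for illustration, and the real pangenome reference is a far richer structure than this per-position table.

```python
from collections import Counter

# Hypothetical toy data: four pre-aligned sequences of equal length.
# Real assembled genomes are billions of bases long and are not trivially aligned.
haplotypes = {
    "sample_A": "ACGTAC",
    "sample_B": "ACGTAC",
    "sample_C": "ACCTAC",
    "sample_D": "ACGTTC",
}


def linear_reference(haps, chosen):
    """A linear reference keeps one sequence, so one base per position."""
    return haps[chosen]


def alleles_per_position(haps):
    """Count every base observed at each position across all sequences."""
    length = len(next(iter(haps.values())))
    return [Counter(seq[i] for seq in haps.values()) for i in range(length)]


reference = linear_reference(haplotypes, "sample_A")
pangenome_view = alleles_per_position(haplotypes)

# Positions where the individuals disagree are invisible in the linear reference.
variable_sites = [i for i, counts in enumerate(pangenome_view) if len(counts) > 1]

print("linear reference:", reference)
print("variable positions:", variable_sites)
for i in variable_sites:
    print(f"position {i}: {dict(pangenome_view[i])}")
```

The linear reference reports a single base at every position, while the per-position view retains the alternatives carried by the other samples.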

To contribute to the consortium’s efforts, Google engineers helped develop and apply deep learning approaches to solve genomics challenges. Engineers adapted their open-source tool DeepVariant, which uses convolutional neural networks to identify genetic variants. The consortium then used the adapted methods to improve pangenome analysis techniques and eliminate sequencing errors from the long, particularly hard-to-decode stretches of the human genome.

Google’s DeepConsensus, which uses transformers to correct errors in sequencing instrument data, helped improve the accuracy of the data used to construct the pangenome. High accuracy is critical for a reference pangenome, so that the reference itself does not become a source of error in genome analysis. Using DeepConsensus data, the consortium developed a long-read assembler that achieved a final accuracy of more than 99.999%. You can learn more about these deep learning approaches on our Google Research blog.
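To put the 99.999% figure in perspective, the short sketch below converts an accuracy percentage into an error rate, a Phred-style quality score, and an expected number of errors. It assumes the figure is a per-base accuracy and uses a genome length of about 3.1 billion bases as a round illustrative number; neither assumption comes from the announcement, and the function names are made up for this example.

```python
import math

# Assumption for illustration: a human genome of roughly 3.1 billion bases.
GENOME_LENGTH_BASES = 3_100_000_000


def accuracy_to_error_rate(accuracy_percent):
    """Convert a per-base accuracy in percent to a per-base error rate."""
    return 1.0 - accuracy_percent / 100.0


def phred_quality(error_rate):
    """Phred-scaled quality: Q = -10 * log10(error rate)."""
    return -10.0 * math.log10(error_rate)


def expected_errors(error_rate, n_bases):
    """Expected number of erroneous bases over n_bases."""
    return error_rate * n_bases


error_rate = accuracy_to_error_rate(99.999)
quality = phred_quality(error_rate)
errors = expected_errors(error_rate, GENOME_LENGTH_BASES)

print(f"error rate: about 1 in {round(1 / error_rate):,} bases")
print(f"Phred quality: about Q{quality:.0f}")
print(f"expected errors over the assumed genome length: about {errors:,.0f}")
```

Under these assumptions, 99.999% accuracy corresponds to about one error per 100,000 bases (roughly Q50), or on the order of 31,000 errors across a whole genome. Because the reported accuracy is "more than" 99.999%, these values are upper bounds on the error.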

This breakthrough was made possible only through the collaboration of an international community of experts, including geneticists, engineers and ethicists. Like the pangenome itself, the project shows how much progress diverse contributions can deliver.
