Using AI to study demographic representation in Indian TV

16 Nov, 2023

We developed AI models to provide first-of-its-kind multilingual analysis to inform equitable content creation across mainstream media

Komal Singh

Senior Product Manager, Research, Google

Krishna Somandepalli

Software Engineer, Research, Google

Shachi Dave

Software Engineer, Research, Google

An illustration (not actual data) of computational signals that can be analyzed at scale to reveal representational patterns in media collections. — MUSE demo [Video Collection / Getty Images]

Storytelling is intrinsic to India’s rich cultural heritage, creating shared experiences across the country’s social plurality and linguistic diversity. Stories have also always had a central place in informing, educating, and entertaining a growing audience base across screens of all sizes, through new content genres, and on new platforms.

Storytelling that is not only relatable and relevant, but also equitable and representative of India’s vast demographic, has emerged as an important imperative for media producers and content creators. And with the largest viewership across segments, TV content is key to this, enjoying prominence, resonance and reach deep within people’s homes.

This is the driving force for “Reflecting India: An intersectional and longitudinal analysis of popular scripted television from 2018 to 2022,” a new five-year longitudinal study led by the Geena Davis Institute on Gender in Media (GDI). Google Research provided AI-powered research support, with the Signal Analysis and Interpretation Laboratory (SAIL) at the University of Southern California (USC) serving as the study’s academic advisor and the India Chapter of the International Advertising Association (IAA) as the media studies advisor.

This first-of-its-kind large-scale, multilingual study examined media content across five Indian languages – Bengali, Hindi, Kannada, Tamil, and Telugu – in the 10 most-watched scripted television shows between 2018 and 2022, according to the Broadcast Audience Research Council (BARC), India. The sample, organized by GDI and IAA, included a variety of genres such as soap operas, thrillers and mythological dramas.

Google’s machine learning innovations in computer vision and the natural-language understanding capabilities of its large language models (LLMs) powered the multimodal analysis. Specifically, AI-enabled technology developed by Google Research’s MUSE (Media Understanding for Social Exploration) project was used to infer the visual and intersectional attributes of perceived gender, perceived skin tone, and perceived age of the on-screen characters.¹

In addition, the dialogue was automatically transcribed using Google’s Universal Speech Model, a state-of-the-art automatic speech recognition model, and the language in the dialogue was analyzed with our LLMs, drawing on expertise from Project BINDI, which focuses on evaluating and mitigating undesirable biases in language models. The automatic language analysis complemented the visual analysis to draw multimodal insights.

This AI-backed analysis yielded a wealth of evidence that would otherwise have been impractical to collect manually. Automation also improved the accuracy and consistency of the analysis and reduced human error. The technology processed over 430 hours of footage in less than 48 hours, at over 100 frames per second. In total, machine learning models analyzed over 15 million frames, about 38 million face appearances, and nearly 2 million words, delivering several data-driven insights.

Female characters had more on-screen time than male characters: 55.8% compared to 44.2%. Bengali and Telugu shows gave female characters the highest proportion of screen time across all languages, approximately 59%.
While female names are mentioned more often in dialogue than male names, unique male names outnumbered unique female names. Perceived female names featured in 55.6% of all instances in which names were mentioned in dialogue, but these names were only 46.7% of the different names featured on the shows.
Young adults (18–33 years old) are seen on screen the most, accounting for 75.6% of all characters present on screen, with female characters over the age of 33 on screen for less time than their male counterparts.
Characters with lighter skin tones were shown 8x more on screen than characters with medium or dark skin tones. However, between 2018 and 2022, the screen time of characters with medium skin tones increased, with a proportionate decrease in the screen time of characters with lighter skin tones.
When shown on screen, female characters tend to be younger and to have lighter skin tones than male characters. 70% of female characters on screen were between the ages of 18 and 32 and had lighter skin tones, compared to 52.9% of male characters, who represented a wider range of ages and skin tones.
Tamil and Telugu language TV shows present a wider range of skin tones. Characters with darker skin tones occupy approximately 23% of screen time in these shows, more than in shows in other languages, where characters with medium or dark skin tones occupied between 13% and 18% of screen time.
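
To make the arithmetic behind findings like these concrete, here is a minimal sketch of how such shares could be computed. It is an illustration under stated assumptions, not the study's actual pipeline: the records, the names "Asha", "Ravi" and "Arjun", and the helper functions `share_by` and `mention_and_unique_share` are all hypothetical, and screen time is approximated here by counting face appearances. The second helper shows why a group can account for most name mentions while still contributing fewer distinct names.

```python
from collections import Counter

# Hypothetical per-face records: (perceived_gender, perceived_skin_tone).
# Each record stands for one face appearance in one analyzed frame.
face_appearances = [
    ("female", "light"),
    ("female", "light"),
    ("female", "medium"),
    ("male", "light"),
    ("male", "dark"),
]


def share_by(records, index):
    """Return each label's percentage share of all face appearances."""
    counts = Counter(record[index] for record in records)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical name mentions from transcribed dialogue:
# (name, perceived_gender_of_name). The names are made up for illustration.
name_mentions = [
    ("Asha", "female"), ("Asha", "female"), ("Asha", "female"),
    ("Ravi", "male"), ("Arjun", "male"),
]


def mention_and_unique_share(mentions, gender):
    """Return (share of all mentions, share of distinct names) for a gender."""
    mention_share = 100.0 * sum(g == gender for _, g in mentions) / len(mentions)
    unique = set(mentions)  # each distinct (name, gender) pair counted once
    unique_share = 100.0 * sum(g == gender for _, g in unique) / len(unique)
    return mention_share, unique_share


gender_share = share_by(face_appearances, 0)
female_mentions, female_unique = mention_and_unique_share(name_mentions, "female")
print(gender_share)
print(female_mentions, female_unique)
```

In this toy data, one frequently repeated female name dominates the mentions, yet only one of the three distinct names is female. That is the same kind of gap the study reports between the share of name mentions and the share of unique names.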

This study builds on our earlier joint studies on gender equity in Hollywood movies and 12 years of representation in US television shows, which were among the first to use AI to effectively study representation in media at scale. Expanding this work to the Indian context is a meaningful step towards fostering global understanding of media representation and cross-cultural patterns. It speaks directly to our Responsible AI approach of ensuring that foundational AI technologies, be they in vision, language, or audio, are not just English- and Western-centric, but work for a plethora of languages and visual mediums.

Madeline Di Nonno, President & CEO of GDI:

“If the past few years have taught us anything, it is that we are all dramatically impacted by how we connect, how we create and how we consume media and entertainment. And has profoundly reinforced the critical importance of diversity, equity and inclusion. The companies that invest in developing inclusive cultures and content will prevail in the ‘new normal’. This is why we are so excited to launch our new study and continue our deep partnership with Google to help us understand onscreen representation trends, and inspire systemic change so that the entertainment stories we see can be more inclusive of our diverse population. What happens in the world of make-believe can have real world impact. As our tagline says ‘If They Can See it, They Can Be it’.”

Shrikanth (Shri) Narayanan, University Professor and Nikias Chair in Engineering, University of Southern California (USC), Signal Analysis and Interpretation Laboratory (SAIL):

“Computational Media Intelligence enabled by multimodal signal analyses offer us rich and nuanced insights into representations and portrayals, including from the lens of human perception, at detail and scale not possible before. The continuing partnership of USC SAIL with GDI and Google addressing multilingual-multicultural media stories in this groundbreaking [study] is yet another milestone in this journey of helping create inclusive and equitable media experiences universally.”

We’re thrilled this study was released at the IAA’s ‘Voice Of Change’ summit, where the focus on propagating gender-sensitive advertising and positive gender norms across all forms of media content aligned with the study’s underlying objective and Google’s own commitment to equity, inclusion and representation.

Nina Elavia Jaipuria, Chairperson, IAA Women Empowerment Committee:

“In today’s world, we must recognise our role as co-creators of narratives that actively challenge stereotypes. With the latest edition of the ‘Voice of Change’ initiative, we aim to drive a collective cultural shift, dismantle biases, and empower all stakeholders to make way for a steady revolution of gender-equitable advertising and content creation. It’s time our content begins to reflect the progress and aspirations of an inclusive society.”

The MUSE project will continue to innovate on technologies that build ‘multimodal’ signals to study not just the ‘presence’ but also the ‘portrayals’ of characters across cross-cultural and multimedia content, in the service of encouraging more transparency and fairness for diverse communities. Our ambition is to help ensure that onscreen content across the world reflects the full, rich variety of society and all humanity.


¹ We recognize that skin tone does not equate to race or ethnicity, that gender is not a simple binary attribute, and that one's gender identity may not match one's gender expression.
