"Gender disparities have long existed in film industry since its naissance..."

starting point of our data journey

Gender disparities have existed in the film industry since its birth. Because movies reach so many people around the world, they are bound to influence society. The question of female representation in the movie industry is being debated more than ever. We want to analyse gender equality not because it is a buzzword or a trendy topic, but because we believe that on- and off-screen representation shapes how we see ourselves and others. Underrepresentation can contribute to negative stereotypes and biases. It could also mislead the public into believing that men are more valuable and superior, which has serious consequences for women's careers and opportunities.

In this project, we analyse on- and off-screen gender representation in the movie industry to shed light on these gender disparities. We compare data across genres, geographical areas and time periods, make predictions, and analyse possible stereotypes.

Are women underrepresented on and off screen?

Who dominates the screen?

Men have ALWAYS dominated the screen. We broke the time series from 1908 to 2013 down into four periods, each of which shows something interesting. In the industry's earliest years, only a few movies were made.

Around the Second World War, from 1935 to 1945, there is a noticeable decrease in the female cast ratio. War made the situation for women even worse.

The last period, from 1990 to 2013, is representative of the current situation. Even though the number of movies boomed, the female cast ratio is still LOW and NEVER reaches 50%.

Will tomorrow be better?

Let's look a bit further into the future. We made a prediction based on Period 4 above. It suggests that the female cast ratio will match the male ratio in 2050. To reach gender equality on screen, we need at least 15 years! Unsurprisingly, there is still a LONG way to go.

Below, we can see what the gender distribution of the cast looks like in different periods. In the plots of the next few chapters, you can explore the female cast percentage for the most popular genres, the genres with the highest female ratio, and the countries that currently produce the most movies. See if you notice anything interesting. We will take a deeper dive later.

"Stay Young, Ladies!"

It's a tale as old as time: "Men age like fine wine, while women are left to wilt on the vine." There is a stigma that older women are no longer needed.

Actresses are on average 8 years younger than their male counterparts. There is a dramatic decline once the "best age of 20" is reached. This could imply that looking young and fitting today's beauty standards matters more for actresses than for actors in the movie industry.

Where is the gender-equality paradise?

We are wondering, is there a true gender-equality paradise? Explore the plot below a bit.

Almost all countries are male-dominant. Although on-screen representation does not have an immediate and direct causal effect on society, since other factors such as economic development and culture play stronger roles, we believe it does subconsciously affect our thinking.

Besides quantitative features, it is worth checking qualitative features that show how the characters are portrayed. This is what we are going to do in later chapters.

Behind-the-Scenes Power

Feminist film critic Laura Mulvey argues that movie production is controlled by men, who make decisions that appeal to their own values and interests. She describes 'the male gaze', in which female characters are often depicted as attractive rather than as having complex characteristics, and she traces the root of the problem to men dominating behind-the-scenes roles. To see if this is true, we also look at the situation of women among writers and directors.

Things are even worse for female directors and writers. The overall growth rate of female employment in the film industry is much lower than the worldwide female employment rate. A second-degree (quadratic) least-squares model suggests that the female writer ratio will match the male ratio in 2127. For directors, that happens in 2133.
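
To make the idea of a second-degree least-squares extrapolation concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the function names `fit_quadratic` and `parity_year` are our own, the ratios are synthetic, and centring the years before fitting is a choice made here for numerical stability.

```python
import math


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    # Power sums used by the normal equations: s[k] = sum(x**k), t[k] = sum(y * x**k).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def parity_year(years, ratios, target=0.5):
    """First year after the observed data where the fitted curve reaches `target`, or None."""
    center = sum(years) / len(years)  # centre the years to keep the fit well conditioned
    a, b, c = fit_quadratic([y - center for y in years], ratios)
    if abs(a) < 1e-12:  # the fitted curve is effectively a straight line
        roots = [(target - c) / b] if b else []
    else:
        disc = b * b - 4 * a * (c - target)
        if disc < 0:
            return None  # the fitted curve never reaches the target
        roots = [(-b + sign * math.sqrt(disc)) / (2 * a) for sign in (1, -1)]
    future = [r + center for r in roots if r + center > max(years)]
    return min(future) if future else None


# Synthetic ratios for illustration only (NOT the project's data).
years = list(range(1990, 2014))
ratios = [0.10 + 0.001 * (y - 1990) + 0.00002 * (y - 1990) ** 2 for y in years]
print(parity_year(years, ratios))
```

An extrapolation this far beyond the observed years is very sensitive to the fitted curvature, so a parity year obtained this way is best read as an order of magnitude rather than a forecast.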

In those less exposed, darker corners, unequal competition is far more brutal. Fewer female voices and perspectives might in turn lead to a more stereotyped portrayal of female stories and experiences.

"Women's pain is built-in"

Some movie genres are occupied solely by male directors and writers. Those genres are more about men themselves, their adventures and their dreams. Some male-dominant genres, such as action, have long been criticized for objectifying women. For female directors and writers, by contrast, genres seem to be unavoidably limited to painful and emotional female roles: a mother, a lover or a sentimental woman.

This is not only a reflection of the situation of women behind the scenes, but also a mirror of what women have experienced in the real world. Women's pain is built-in. Before pursuing self-fulfillment, women have to deal with what society has imposed on them - a curse of inborn duty. Their real needs as independent human beings are somehow always neglected.

Female directors and writers might be more aware of this problem. With better gender representation off-screen, the "male gaze" might be avoided in movie storytelling.

Are there gender stereotypes?

"Boys get superhero comics and girls receive princess fairytales."

It is important to recognize and challenge stereotypes to promote equality, respect, and mutual understanding. Is there any difference between the portrayal of men and women? Do topics differ in movies with a higher share of female leads? By examining the gender distribution in different genres, we can gain a better understanding of the challenges and obstacles that women in these fields may face.

"A helpless princess is always waiting for her prince."

Audiences are tired of conventional stories in which a helpless princess is always waiting for her prince. However, movie plots easily fall flat. How are female and male characters portrayed? In particular, are women portrayed more positively or more negatively than men? The adjectives most commonly used to describe the characters are shown below.

These word clouds give us a good idea of the adjectives used to describe each gender. Words describing women are more romantic, fictional and superficial than those describing men. But this is still too vague to fully answer the question. To analyse it further, we assign every adjective a sentiment score on a scale from -1 to 1.
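
As a rough illustration of scoring adjectives on a scale from -1 to 1, the sketch below averages polarity scores per group. The lexicon `TOY_LEXICON`, its scores, the adjective lists and the function names are all hypothetical and hand-picked for this example; a real analysis would take the scores from a sentiment lexicon or library.

```python
from statistics import mean

# Hypothetical polarity scores in [-1, 1], hand-picked for illustration only.
TOY_LEXICON = {
    "beautiful": 0.8,
    "charming": 0.7,
    "young": 0.1,
    "brave": 0.6,
    "ruthless": -0.7,
    "evil": -1.0,
}


def score_adjectives(adjectives, lexicon=TOY_LEXICON):
    """Look up each adjective's polarity; adjectives missing from the lexicon are skipped."""
    return [lexicon[a.lower()] for a in adjectives if a.lower() in lexicon]


def mean_polarity(adjectives, lexicon=TOY_LEXICON):
    """Average polarity of the scored adjectives, or None if none could be scored."""
    scores = score_adjectives(adjectives, lexicon)
    return mean(scores) if scores else None


# Made-up adjective lists, not taken from the project's data.
female_adjectives = ["beautiful", "young", "charming", "mysterious"]
male_adjectives = ["brave", "ruthless", "evil", "young"]
print("female:", mean_polarity(female_adjectives))
print("male:", mean_polarity(male_adjectives))
```

Skipping adjectives that are missing from the lexicon keeps the average from being pulled toward zero by unknown words, at the cost of ignoring part of the vocabulary.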

Women are generally described more positively, especially with extremely positive adjectives. This may be because female characters are frequently described with positive adjectives such as 'beautiful', 'attractive', 'young' or 'charming', since appearance is often particularly important for female characters. Male characters, in contrast, are not described in terms of appearance as much.


Follow the history of feminism and see what was happening in the film industry at the same time

  • 1908-1935

    Suffrage Movement, 19th Amendment

    Women in the US demanded suffrage while the film industry was just being born. The first narrative film had been made shortly before; its story was about a man who robbed a train.

  • 1935-1945

    World War II

    War made the situation for women even worse. The same happened in the film industry: fewer opportunities, fewer jobs and less representation.

  • 1945-1990


    Civil Rights and Equal Pay

    Feminist movements for various women's rights, such as civil rights and equal pay, arose around the world. However, progress in the movie industry was rather slow. Technological development seemed to be the focus. The first color 3D film was born.

  • 1990-now

    New Women, Me Too and More

    The film industry was booming. With more and more feminist movements, female representation in film improved considerably. We are glad to see the 'Me Too' movement in the movie industry. But it is still not sufficient.

  • It is
    going on!


Bechdel Test

The Bechdel Test, or Bechdel-Wallace Test, sometimes called the Mo Movie Measure or Bechdel Rule, is a simple test with the following three criteria:
(1) It has to have at least two women in it.
(2) They talk to each other.
(3) They talk about something besides a man.
The test was popularized by Alison Bechdel's comic Dykes to Watch Out For, in a strip published in 1985.
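
The three criteria build on one another, which can be sketched as a small scoring function. The function names below are our own, and the 0 to 3 rating is meant to mirror the scale used by bechdeltest.com as we understand it, where 3 means the movie passes; treat that mapping as an assumption.

```python
def bechdel_rating(has_two_women, they_talk, not_about_a_man):
    """Count how many criteria are met in order, stopping at the first one that fails."""
    rating = 0
    for criterion in (has_two_women, they_talk, not_about_a_man):
        if not criterion:
            break
        rating += 1
    return rating


def passes_bechdel(rating):
    """A movie passes only when all three criteria are met."""
    return rating == 3


# Two women who talk to each other, but only about a man: rating 2, does not pass.
rating = bechdel_rating(has_two_women=True, they_talk=True, not_about_a_man=False)
print(rating, passes_bechdel(rating))
```

Stopping at the first failed criterion reflects the fact that each criterion only makes sense once the previous one holds.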

In her article Gendered Media, Julia T. Wood points out three themes in the general media regarding women:
(1) Women are underrepresented, which falsely implies that men are the cultural standard and women are unimportant or invisible.
(2) Men and women are stereotyped in ways that reflect and reinforce socially endorsed gender ideas.
(3) Portrayals of male-female interactions reinforce traditional roles while normalizing violence against women.

The Bechdel test has become a commonly used tool for evaluating the representation of women in media. Here we use it as a point of comparison with our analysis of the CMU movie data. Although the test is effective, it is not yet widely known among general audiences. The dataset we used is fetched from the API of bechdeltest.com, where the person reporting each movie already knows about the test. The movies in this dataset are therefore very likely preselected by people who are already aware of gender inequality, and such people are more likely to submit movies that pass the test. Despite this selection bias, the Bechdel test dataset is still worth analyzing.

Of the 9065 movies in the Bechdel dataset, 56.68% passed the test. If we join the data with the CMU movie dataset, we get a similar result: 54.4% of the 3563 matched movies passed. We can say that the Bechdel dataset is a reasonable sample for measuring the representation of female characters in movies for those who care about gender inequality.
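
The comparison above comes down to computing a pass rate before and after an inner join. A minimal sketch follows; the rows are invented placeholders, and joining on title and year is an assumption made here for illustration, since the key actually used to match the two datasets is not stated.

```python
def pass_rate(rows):
    """Percentage of rows whose rating is 3, meaning all three criteria are met."""
    return 100 * sum(row["rating"] == 3 for row in rows) / len(rows)


def inner_join(bechdel_rows, cmu_rows, key=("title", "year")):
    """Keep the Bechdel rows whose key also appears in the CMU rows."""
    cmu_keys = {tuple(row[k] for k in key) for row in cmu_rows}
    return [row for row in bechdel_rows if tuple(row[k] for k in key) in cmu_keys]


# Invented placeholder rows (NOT real movies or real ratings).
bechdel_rows = [
    {"title": "Movie A", "year": 1995, "rating": 3},
    {"title": "Movie B", "year": 2001, "rating": 1},
    {"title": "Movie C", "year": 2010, "rating": 3},
    {"title": "Movie D", "year": 2012, "rating": 2},
]
cmu_rows = [
    {"title": "Movie A", "year": 1995},
    {"title": "Movie B", "year": 2001},
    {"title": "Movie E", "year": 1980},
]

matched = inner_join(bechdel_rows, cmu_rows)
print(pass_rate(bechdel_rows), len(matched), pass_rate(matched))
```

Comparing the two rates shows whether the matched subset behaves like the full Bechdel dataset. It does not remove the selection bias introduced by who submits movies in the first place.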

Looking at the average Bechdel test result from 1874 to 2021, the trend is upward. One interesting drop occurs around 1930, between the two world wars. Similar to what we observed before in the CMU movie dataset, more war-themed movies were produced at the time. This also echoes Wood's claim that men are more associated with violence in the media. The good news is that in the late 20th century, from 1970 onward, the Bechdel test score keeps increasing. The criteria of the Bechdel test first appeared in 1985, so producers and writers in the movie industry may have been considering them when launching a project.

Modern feminism covers more than simple interactions between women. We must remember that even if a movie passes the Bechdel test, the script still needs to capture the nuance and complexity of female characters as women in real life. From the results in the previous section, we know that equal representation of the two genders still has a long way to go. On-screen representation can improve relatively easily, and it shows how feminist theory influences social trends. For the female crew working off-screen, the importance of representation is more difficult to measure. They are not a reflection of society but the actual underrepresented women in the workplace.


The analysis of the gender gap in the movie industry shows that while there is improvement, the gap is still present in many areas. The percentage of men among cast, directors, and writers is significantly higher than that of women, with some variation between genres and countries.

Female on-screen representation as measured by the Bechdel test score looks promising. However, the word clouds and the sentiment analysis show that the emphasis on the appearance of female characters still reflects stereotypes of women, such as princesses in fairytales or the femme fatale.

The cultivation theory says that the more we see a particular representation, the more we will believe it is important and true. We should strive for a better representation of men and women to break down society's inequalities.

Better representation not only shapes a better society for everyone; movie companies can also profit from it. The media research agency Shift7, in collaboration with the leading entertainment agency CAA, published research (https://shift7.com/media-research) showing that female-led films outperform at the box office. This again reminds us of the importance of the off-screen female crew in the industry, which still needs rigorous study.

We hope our research will further the study of gender inequality in the film industry.


@Yinghui Jiang

@Yichen Wang

@Sophia Rui-Xue Ly

@David Rochinha Chaves


1. CMU Movie Summary Corpus:  http://www.cs.cmu.edu/~ark/personas/ contains character metadata, plot summaries and character tropes.
2. IMDb website:  https://www.imdb.com/interfaces/ contains relevant information about movies, writers, directors, etc.
3. Movie Bechdel Test Scores:  https://www.kaggle.com/datasets/alisonyao/movie-bechdel-test-scores contains information of IMDb movie id and whether it passes the Bechdel Test or not. It's used as an indicator for the active presence of women in the movie industry.
4. Index of Names:  https://www.cs.cmu.edu/Groups/AI/areas/nlp/corpora/names/ to identify writers' and directors' gender from names.
5. Worldwide female employment in labour market:  https://ourworldindata.org/female-labor-supply to compare with women's employment in film industry.
6. "Gendered Media: The Influence of Media on Views of Gender" by Julia T. Wood:  https://www1.udel.edu/comm245/readings/GenderedMedia.pdf