GamerGate: Hate in gamer communities, the case of the KiA subreddit

The datastory: https://epfl-ada.github.io/ada-2025-project-othertagada/

Abstract

This project aims to understand the events and mechanisms in and around "GamerGate", an online harassment campaign against feminism, diversity, and progressivism in video game culture that played out on multiple online platforms, including Reddit. We chose this topic because GamerGate functions as a "model", i.e., a blueprint for the coordinated, polarized hate campaigns seen on social media today. More specifically, we examine the evolution of the campaign, using data to uncover how it unfolded across the platform and how various communities responded to, interacted with, and influenced each other during the escalation. Our goal is to better understand hate on the internet and, if possible, identify mitigation strategies.

Research questions

To what extent do linguistic patterns and user sentiment diverge between Pro-GamerGate (KiA) and Anti-GamerGate (GiA) communities? How does this compare to other related subreddits?
Is the hostility and negative sentiment in GamerGate debates widely distributed across the user base, or disproportionately generated by a hyperactive core of "super-participants" who create the majority of posts?
To what extent do Pro- and Anti-GamerGate communities function as linguistic echo chambers? Can a classifier distinguish between posts from r/KotakuInAction and r/GamerGhazi with high accuracy, indicating a distinct separation in vocabulary and rhetoric?
Many articles, such as this post, suggest that GamerGate was a strategy led by extremists to spread their ideas and recruit more people to their cause. Did it work? To what extent was the gaming community involved in the GamerGate controversy drawn towards alt-right spheres?
What happens to subreddits created to discuss specific events once those events are no longer relevant? Do they become completely inactive? If not, what topics do they discuss?

Additional datasets

Pushshift: this dataset was generated by Pushshift, which uses the Reddit API to extract and save post data, creating a publicly accessible snapshot of Reddit (obtained here). We selected the same time period as the first dataset (Jan 2014 to April 2017). It contains a log and metadata of all posts made during that period, including usernames, titles, post bodies and more. This enables much richer analysis of text content (for example, keyword analysis), of users (whose posts we can track by username), and of total post volume over time.

To make it easier to work with the Pushshift dataset, we only kept the following attributes:

Label	Description
SUBREDDIT	The subreddit of the post
USERNAME	Username of the post author
POST_ID	Unique id of the post
TIMESTAMP	Time of the post
TITLE	Text of the post title
BODY_TEXT	Text of the post body
NUM_COMMENTS	Number of comments under the post
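
As an illustration of the reduction described above, here is a minimal sketch that maps raw Pushshift submission records to the labels in the table and keeps only posts inside the study window. It assumes the dump is newline-delimited JSON and that the labels correspond to the raw fields `subreddit`, `author`, `id`, `created_utc`, `title`, `selftext` and `num_comments`; that mapping, and the helper names `slim_record` and `slim_lines`, are assumptions made for this example and not necessarily what the project's processing scripts do.

```python
import json
from datetime import datetime, timezone

# Assumed mapping from raw Pushshift submission fields to the table labels.
FIELD_MAP = {
    "subreddit": "SUBREDDIT",
    "author": "USERNAME",
    "id": "POST_ID",
    "created_utc": "TIMESTAMP",
    "title": "TITLE",
    "selftext": "BODY_TEXT",
    "num_comments": "NUM_COMMENTS",
}

# Study window: January 2014 through April 2017 (end bound is exclusive).
START = datetime(2014, 1, 1, tzinfo=timezone.utc)
END = datetime(2017, 5, 1, tzinfo=timezone.utc)


def slim_record(raw):
    """Return the reduced record, or None if the post is outside the window."""
    created = datetime.fromtimestamp(int(raw["created_utc"]), tz=timezone.utc)
    if not (START <= created < END):
        return None
    slim = {label: raw.get(field) for field, label in FIELD_MAP.items()}
    slim["TIMESTAMP"] = created.isoformat()  # store a readable UTC timestamp
    return slim


def slim_lines(lines):
    """Parse newline-delimited JSON and yield only in-window, reduced records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = slim_record(json.loads(line))
        if record is not None:
            yield record
```

Because `slim_lines` is a generator, it can be fed an open file handle and processes the dump one line at a time without loading it fully into memory.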

Methods

To make it easier to work with our datasets and to focus our attention on relevant data:

Keep only the posts of the top 10 subreddits that interact the most with either r/kotakuinaction (the main pro-GamerGate subreddit) or r/gamerghazi (its main anti-GamerGate counterpart)
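
The selection step above could look roughly like the following sketch. It assumes that "interaction" is measured as the number of hyperlinks exchanged, in either direction, between a subreddit and one of the two target subreddits, and that the two targets themselves are kept alongside the top 10. The edge format `(source, target)` and the function names are hypothetical.

```python
from collections import Counter

TARGETS = {"kotakuinaction", "gamerghazi"}


def top_interacting(edges, targets=TARGETS, k=10):
    """Rank subreddits by hyperlinks exchanged with any target subreddit."""
    counts = Counter()
    for source, target in edges:
        source, target = source.lower(), target.lower()
        if source in targets and target not in targets:
            counts[target] += 1
        elif target in targets and source not in targets:
            counts[source] += 1
    return [name for name, _ in counts.most_common(k)]


def keep_posts(posts, kept_subreddits):
    """Keep only posts whose SUBREDDIT is in the selected set (case-insensitive)."""
    kept = {name.lower() for name in kept_subreddits}
    return [post for post in posts if post["SUBREDDIT"].lower() in kept]
```

A typical call would be `keep_posts(posts, set(top_interacting(edges)) | TARGETS)`, so that the two central subreddits remain in the filtered data.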

To understand the "players" and their relationships:

User Similarity: A heatmap of user similarity is generated to measure the overlap of user bases between different subreddits.
Clustering: Users are clustered using data_gamergate (large) based on behavioral features, including average link sentiment and LIWC features.
Activity Metrics: Histograms of posts per user are compared between KotakuInAction (KiA) and GamerGhazi (GiA), and cross-referenced with link sentiment to identify whether high-volume users drive negativity.
Deleted Content Analysis: Deleted and non-deleted users/posts are compared using LIWC to check for differences.
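
Two of the measures above can be sketched with the standard library alone. The first function pair computes user-base overlap between subreddits; Jaccard similarity is used here as one plausible overlap measure, which is an assumption, since the similarity metric behind the heatmap is not specified. The second computes the share of a subreddit's posts written by its most active users. Records are assumed to be dicts with the `SUBREDDIT` and `USERNAME` labels; all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def users_by_subreddit(posts):
    """Map each subreddit to the set of usernames that posted in it."""
    users = defaultdict(set)
    for post in posts:
        users[post["SUBREDDIT"]].add(post["USERNAME"])
    return users


def jaccard_matrix(users):
    """Jaccard similarity |A & B| / |A | B| for every pair of subreddits."""
    sims = {}
    for a, b in combinations(sorted(users), 2):
        union = users[a] | users[b]
        sims[(a, b)] = len(users[a] & users[b]) / len(union) if union else 0.0
    return sims


def top_user_share(posts, subreddit, fraction=0.01):
    """Share of a subreddit's posts written by its most active users."""
    counts = Counter(p["USERNAME"] for p in posts if p["SUBREDDIT"] == subreddit)
    if not counts:
        return 0.0
    n_top = max(1, int(len(counts) * fraction))  # always keep at least one user
    top_posts = sum(count for _, count in counts.most_common(n_top))
    return top_posts / sum(counts.values())
```

Posts whose author appears as `[deleted]` would probably need to be filtered out first, otherwise they count as a single very active user shared by every subreddit.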

To map "how it played out" over time:

Event Detection: Identification of spikes in total post volume correlated with a timeline of real-world GamerGate events.
Dynamic Network Visualization: A network graph with a time slider to visualize the structural evolution of the community.
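
A simple way to approach the event detection step is sketched below: a day is flagged as a spike when its post count exceeds the mean of the preceding days by a number of standard deviations, and flagged days are then paired with timeline events that fall within a small tolerance. The window length, the threshold, the tolerance and the function names are illustrative assumptions, not the project's actual parameters.

```python
from datetime import timedelta
from statistics import mean, stdev


def detect_spikes(daily_counts, window=7, threshold=3.0, min_sigma=1.0):
    """Indices of days exceeding the trailing mean by `threshold` deviations."""
    spikes = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        # Floor the deviation so a perfectly flat history does not flag noise.
        sigma = max(stdev(history), min_sigma)
        if daily_counts[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes


def match_events(spike_dates, events, tolerance_days=2):
    """Pair each spike date with events dated within `tolerance_days` of it."""
    tolerance = timedelta(days=tolerance_days)
    return {
        spike: [name for name, day in events.items() if abs(day - spike) <= tolerance]
        for spike in spike_dates
    }
```

Here `events` is expected to be a dict mapping an event name to a `datetime.date`, for instance built from the scraped timeline. A spike that matches no event maps to an empty list, which makes unexplained surges easy to spot.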

To analyze speech patterns and misogyny (linguistic and sentiment analysis):

Statistical Hypothesis Testing (T-tests): The distribution of LIWC categories (Sexual, Swear, Anger, Sad) in KiA is compared against the global distribution using t-tests, with p-values used to verify quantitatively whether the discourse in KiA is distinct (e.g., significantly more toxic or misogynistic).
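
To make the test concrete, here is a minimal sketch of a two-sample comparison for one LIWC category. It uses Welch's unequal-variance variant, which is an assumption on our part since the exact t-test variant is not stated, and it only computes the statistic and the degrees of freedom. Turning these into a p-value requires the t distribution, which `scipy.stats.ttest_ind(a, b, equal_var=False)` provides directly.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df
```

In this setting `sample_a` would hold the per-post scores of one category (for example Anger) in KiA and `sample_b` the scores of the comparison population.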

To understand what was being discussed (topic modeling):

TF-IDF Matrix: Used to weigh word importance within posts.
Topic Extraction: Identification of "Top Topics" per subreddit and an analysis of topic evolution per month to see how the narrative shifted.
Topic Classification: A model is trained to predict the subreddit from the topic, with accuracy used as a measure of discourse distinctiveness.
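
The TF-IDF weighting mentioned above can be illustrated with a small self-contained sketch. It uses the plain textbook form, term frequency times log(N / document frequency), and a deliberately naive tokenizer; both are assumptions made for this example. A library implementation such as scikit-learn's `TfidfVectorizer` uses a smoothed variant and would likely be the practical choice, with its matrix then passed to a classifier.

```python
import math
import re
from collections import Counter

TOKEN = re.compile(r"[a-z']+")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return TOKEN.findall(text.lower())


def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def top_terms(doc_weights, k=5):
    """The k highest-weighted terms of one document (ties broken by name)."""
    ranked = sorted(doc_weights.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]
```

With this weighting, a term that occurs in every document gets weight zero, so only vocabulary that separates documents (or, when posts are grouped by subreddit, communities) stands out.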

Proposed timeline

Week	Dates	Tasks
1	6.11-12.11	Find target population for deeper analysis; make more utils for time analysis, plotting and other general functionalities needed for the project
2	13.11-19.11	Start in-depth time analysis of our target population; define attack/interaction window (to be able to relate interactions between two subreddits)
3	20.11-26.11	Start work on the datastory (probably with a github.io website); use a classifier to try to create groups for our target population (by topic, size, other metrics...); continue analyzing the target population with a focus on "troublemakers"
4	27.11-03.12	Look for conflict-sparking trends and possible alliances; continue work on both the datastory and further analysis of the target population (the bulk of our work should be this week)
5	4.12-10.12	Finalize the datastory, including text, interactive graphs and images; wrap up analysis of our target population (time allocated for ideas not part of our initial planning); make sure all helpers and supporting code are mostly finalized
6	11.12-17.12	Fix bugs (if any); verify website layout and code clarity; avoid adding new features/content and focus on correctness of the project

Individual contributions

Robin Herberich:

Readme for P3
Setting up datastory website, work on layout of website and general features
Writing datastory introduction and data presentation
Pushshift datawrangling, exploration, processing scripts and related hyperlink dataset clean-up
Extracting posts per day per subreddit dataset and plot posts per day for a few subreddits.

Maguette Diouf:

Prediction of link sentiment
Analysis of feature importance in logistic regression
Analysis of Gamergate Speech

Katia Häfliger:

Writing structure of results notebook
Analysis of political implications of users
Monthly analysis of topics
Writing script to create a .txt file from the post dataset for each subreddit

Matteo Simonet:

Scraping of events data from an online timeline
Selection of relevant subreddits for analysis
Visualization of the network and volume of posts during the conflict, visualization of relationships between subreddits
Analysis of political implications of users

Jérémie de Faveri:

Styling the website
Posts per user analysis: Does it follow the power law?
Analysis: Difference in negativity between power and light users
Analysis: Which subreddits are more moderated?
