Personalized Warnings Could Bring Down Hate Speech On Twitter


New research has found that personalized warnings for Twitter users could help reduce the amount of hate speech on the platform. The findings come from a group of researchers at New York University’s Center for Social Media and Politics.

The research found that issuing a carefully worded disclaimer or warning can deter people from using hateful or abusive language. This could prove to be an effective tool for curbing violent rhetoric on platforms like Twitter. However, it’s still early days.

Mustafa Mikdat Yildirim, the lead author of the paper, explained how sending warnings could help alleviate the problem of hate speech.


“Even though the impact of warnings is temporary, the research nonetheless provides a potential path forward for platforms seeking to reduce the use of hateful language by users,” he said.

Researchers created test accounts with profile names like “hate speech warner” before warning individuals

The research began with the identification of accounts that were close to suspension for violating Twitter’s hate speech rules.

Researchers sought candidates who had used at least one word from the “hateful language dictionaries” over a one-week period. Additionally, these users had to follow at least one account that had recently been suspended for violating Twitter’s guidelines.
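The two selection criteria above can be expressed as a simple filter. Here is a minimal Python sketch; the `HATEFUL_DICTIONARY` and `SUSPENDED_ACCOUNTS` sets and the `is_candidate` helper are illustrative stand-ins, not anything published by the researchers.

```python
# Hypothetical stand-ins for the paper's data sources (not from the study itself).
HATEFUL_DICTIONARY = {"slurword1", "slurword2"}   # hateful language dictionaries
SUSPENDED_ACCOUNTS = {"@suspended_user"}          # accounts recently suspended by Twitter

def is_candidate(tweets_last_week, following):
    """A user qualifies if they tweeted at least one dictionary word
    in the past week AND follow at least one suspended account."""
    used_hateful_word = any(
        word in HATEFUL_DICTIONARY
        for tweet in tweets_last_week
        for word in tweet.lower().split()
    )
    follows_suspended = any(acct in SUSPENDED_ACCOUNTS for acct in following)
    return used_hateful_word and follows_suspended

# A user meeting both criteria qualifies; one meeting neither (or only one) does not.
print(is_candidate(["that guy is a slurword1"], ["@suspended_user", "@friend"]))  # True
print(is_candidate(["hello world"], ["@suspended_user"]))                         # False
```

Note that both conditions must hold: using flagged language alone, or following a suspended account alone, is not enough to make a user a candidate for a warning.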


The researchers then created test accounts with names like “hate speech warner” and tweeted warnings at the offending users. The wording of the tweets differed between the test accounts, but the core idea was the same – to discourage hate speech. The warnings also informed users that an account they follow had been suspended.

The authors of the research had around 100 followers and no affiliation with any organization or NGO

“The user [@account] you follow was suspended, and I suspect that this was because of hateful language,” one of the samples from the research reads. Other warnings include, “If you continue to use hate speech, you might get suspended temporarily” or “If you continue to use hate speech, you might lose your posts, friends and followers, and not get your account back.”

The paper states that each of the accounts created by the co-authors had around 100 followers. Moreover, none of the accounts was affiliated with any organization, maintaining a neutral stance. The researchers point out that such warnings could have a bigger impact if they came from Twitter itself or an NGO.


We’re still a long way from knowing whether this method can produce a lasting decline in online hate speech. But it’s a promising starting point for social media giants looking to expand on their existing frameworks for combating hate speech and misinformation.