Tech detection of hate speech not fool-proof

Hyderabad: Automatic hate speech detection systems developed by social media giants and even by academia are quite easy to fool, said researchers.

The field of applied hate speech detection is no more than three or four years old. Facebook, for instance, started aggressively developing and deploying its detection algorithms only a few years ago.

Before that, detection was done entirely by human content moderators, who went through flagged posts manually.

Present detection techniques focus mostly on abusive or offensive words and racial slurs, and flag posts containing them. Hence, in theory, these systems are not difficult to fool. For instance, the statement “Kill all Indians” would most likely be flagged as hate speech. But removing the spaces between the words and adding the word “love” would give a different result: “KillallIndians love” is likely to be missed by most algorithms in their current form. This vulnerability was discovered by researchers from Aalto University in Finland in 2018.
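The evasion described above can be illustrated with a toy word-level scorer. This is only a minimal sketch and not the method used by Facebook, Twitter or the Aalto researchers: the word weights, the threshold and the function names are all hypothetical, chosen purely to show why joining words and appending a benign word can push a post below a flagging threshold.

```python
import re

# Hypothetical word weights, invented for illustration only.
# Positive weights mark words treated as abusive; negative weights
# mark words a model might associate with benign text.
WORD_WEIGHTS = {"kill": 1.0, "love": -0.5}

# Hypothetical cut-off: posts scoring at or above this are flagged.
THRESHOLD = 0.5


def tokenize(text):
    """Lower-case the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def score(text):
    """Sum the weights of known words; unknown words contribute 0."""
    return sum(WORD_WEIGHTS.get(token, 0.0) for token in tokenize(text))


def is_flagged(text):
    """Return True when the post's score reaches the threshold."""
    return score(text) >= THRESHOLD


# The original statement contains the known word "kill".
print(is_flagged("Kill all Indians"))

# With the spaces removed, "killallindians" is an unknown token,
# and the appended "love" lowers the score further.
print(is_flagged("KillallIndians love"))
```

In this sketch the first post is flagged and the second is not, because the scorer only recognises whole words it has seen before. Real systems are far more complex, but the same weakness applies to any approach that depends on clean word boundaries.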

India has had a massive hate speech problem. Given the country’s multilingual internet user base, Facebook and Twitter have often been weaponised by some groups to propagate abuse and hate against certain communities. For instance, during the National Register of Citizens (NRC) exercise, Facebook was criticised for not taking down posts that referred to Bengali Muslims as “insects”, “vermin” and “gang-rapists”. Facebook has been particularly criticised for Islamophobic content.

Equality Labs, a South Asian human rights group, found that such content was the biggest source of hate speech on Facebook in India, accounting for 37 percent of it. The group also noted anti-Dalit posts on reservations that Facebook did not take down either.

Across borders, the Californian company was accused of fanning the flames of anti-Rohingya sentiment during the Rohingya crisis in Myanmar. Twitter, too, has faced similar criticism. Activist Kavitha Krishnan, for instance, complained of Twitter’s bias against marginalised groups such as Dalits, women and religious minorities.

Over the past couple of months, many Indian users of the micro-blogging website have opened accounts on open-source rival Mastodon for its purportedly robust “anti-abuse” features.

Most abusive Facebook posts slip under the radar because the company relies on users to flag hate speech themselves.

Unless these posts are reported by other users, and later reviewed by content moderators, they are unlikely to be taken down.
