[personal profile] rfmcdonald
Earlier, I shared Yorkshire Ranter Alex Harrowell's post looking at a definition of trolling on Facebook.

Defining trolls as those who get banned for trolling, a pragmatic solution if nothing else, the authors obtained a large corpus of comments from three high-volume sources: CNN, a gamer news site, and Breitbart. (Clearly they weren’t about to risk not finding enough trolls.) They paid people to classify the comments on various metrics, derived a lot of algorithmic metrics as well, and used both to train a machine learning model to guess which users were likely to be banned down the line.

The results are pretty fascinating. For a start, there are two kinds of troll – ones who troll-out fast, explode, and get banned, and ones whose trollness develops gradually. But it always develops, getting worse over time.

In general, we can conclude that trolls of all kinds post too much, they obsess about relatively few topics, they are often off topic, and their prose is unreadable as measured by an automated index of readability. Readability was one of the strongest predictors they found. They also generate lots of replies and monopolise attention.
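For a sense of what an automated readability measure looks like, here is a minimal sketch of the Automated Readability Index (ARI), one widely used formula. The summary above does not say which index was used, so treating ARI as the measure is an assumption, and the sentence and word splitting here is deliberately crude. Higher scores mean harder-to-read text.

```python
import re


def automated_readability_index(text):
    """Rough ARI score; returns None for text with no words or sentences.

    ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
    The regex-based tokenisation is an illustrative simplification.
    """
    # Treat runs of ., ! or ? as sentence boundaries.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Treat runs of letters, digits and apostrophes as words.
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        return None
    # ARI counts letters and digits only, so apostrophes are skipped.
    chars = sum(1 for word in words for ch in word if ch.isalnum())
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    print(automated_readability_index("The cat sat. The dog ran."))
```

Short words in short sentences score low, while long rambling sentences full of long words score high.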

Not surprisingly, predictions are harder the further the moment of the ban is into the future. However, the classifier was most effective looking at the last 5 to 10 posts – it actually lost forecasting skill if you gave it more data. Fortunately, because trolling is a progressive condition that tends to get worse, scoring the last 10 comments on a rolling basis is a valid strategy.
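The rolling strategy described above can be sketched in a few lines. This is only an illustration under stated assumptions: the per-comment score is a hypothetical number between 0 and 1 standing in for whatever a trained classifier would produce, and the class name and the simple averaging are mine, not the paper's.

```python
from collections import deque


class RollingTrollScore:
    """Keeps each user's most recent comment scores and averages them.

    The per-comment score is a hypothetical stand-in for a classifier output.
    """

    def __init__(self, window=10):
        self.window = window
        self.recent = {}  # user name -> deque of that user's latest scores

    def add(self, user, comment_score):
        """Record a new comment score and return the user's rolling average."""
        scores = self.recent.setdefault(user, deque(maxlen=self.window))
        scores.append(comment_score)  # the oldest score drops out when full
        return sum(scores) / len(scores)


if __name__ == "__main__":
    tracker = RollingTrollScore(window=10)
    for score in [0.1, 0.2, 0.6, 0.8, 0.9]:
        print(round(tracker.add("example_user", score), 2))
```

Because only the latest window of comments is kept, a user whose behaviour is getting worse sees their average climb quickly, and older, milder comments stop diluting it.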


Harrowell's post links to the paper and offers more analysis of it, including graphics.