Internet Content Moderation 101

Since Facebook, Twitter and YouTube have all been vocal (to various degrees) about staffing up the human element of their content moderation teams, here are a few things to understand about how these systems typically work. Most of this is based on my time at YouTube (which ended almost five years ago, so nothing here should be considered a definitive statement of current operations), but I found our peer companies approached it similarly. Note, I’m going to focus on user generated/shared content, not advertising policies. It’s typical that ads have their own, separate criteria. This is more about text, images & video/audio that a regular user would create, upload and publish.

What Is Meant By Content Moderation

Content Moderation or Content Review is a term applied to content (text, images, audio, video) that a user has uploaded, published or shared on a social platform. It’s distinct from Ads or Editorial (eg finding content on the site to feature/promote if such a function exists within an org), which typically have separate teams and guidelines for when they review content.

The goal of most Content Moderation teams is to enforce the product’s Community Standards or Terms of Service, which state what can and cannot be shared on the platform. As you might guess, there are black-and-white cases and gray areas in all of this, which means there are guidelines, training and escalation policies for human reviewers.

When Do Humans Get Involved In The Process

It would be very rare (and undesirable) for humans to (a) review all the content shared on a site and (b) review content pre-publish – that is, when a user tries to share something, having it “approved” by a human before it goes live on the site/app.

Instead, companies rely upon content review algorithms which do a lot of the heavy lifting. The algorithms attempt to “understand” the content being created and shared. At the point of creation there are limited signals – who uploaded it (account history or lack thereof), where it was uploaded from, the content itself and other metadata. As the content lives within the product, more data comes in – who is consuming it, whether it is being flagged by users, whether it is being shared by users and so on.

These richer signals factor into the algorithm continuing to tune its conclusion about whether a piece of content is appropriate for the site or not. Most of these systems have user flagging tools which factor heavily into the algorithmic scoring of whether content should be elevated for review.
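To make the "limited signals at upload, richer signals later" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Signals` fields, the `risk_score` function and every weight are made up, and real systems use trained models rather than hand-set rules.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    # Known at upload time (hypothetical fields).
    account_age_days: int
    prior_strikes: int
    # Accrue only after the content is live.
    views: int = 0
    user_flags: int = 0


def risk_score(s: Signals) -> int:
    """Return a 0-100 score; higher means more likely to need review."""
    score = 0
    if s.account_age_days < 7:
        score += 20  # brand-new accounts have no history to lean on
    score += min(s.prior_strikes * 15, 30)
    if s.views > 0:
        flags_per_1000_views = s.user_flags * 1000 // s.views
        score += min(flags_per_1000_views * 5, 50)  # user flags weigh heavily
    return min(score, 100)


at_upload = Signals(account_age_days=2, prior_strikes=0)
after_flags = Signals(account_age_days=2, prior_strikes=0, views=1000, user_flags=10)
print(risk_score(at_upload), risk_score(after_flags))
```

The same piece of content scores higher once users start flagging it, which is the "continuing to tune its conclusion" behavior described above.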

Most broadly, you can think about a piece of content as being Green, Yellow or Red at any given time. Green means the algorithm thinks it’s fine to exist on the site. Yellow means it’s questionable. And Red, well, red means it shouldn’t be on the site. Each of these designations is fluid and not perfect. There are false positives and false negatives all the time.

To think about the effectiveness of a Content Policy as *just* the quality of the technology would be incomplete. It’s really a policy question decided by people and enforced at the code level. Management needs to set thresholds for the divisions between Green, Yellow and Red. They determine whether an unknown new user should default to be trusted or not. They conclude how to prioritize human review of items in the Green, Yellow or Red buckets. And that’s where humans mostly come into play…
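One way to picture "policy decided by people, enforced at the code level" is a small configuration object that management owns. This is only a sketch under assumptions: it presumes some 0-100 risk score already exists, and the `Policy` fields, the threshold values and the `bucket` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    yellow_at: int         # scores at or above this are Yellow
    red_at: int            # scores at or above this are Red
    new_user_penalty: int  # 0 means unknown new users are trusted by default


def bucket(score: int, is_new_user: bool, policy: Policy) -> str:
    """Map a risk score to green/yellow/red under a given policy."""
    adjusted = score + (policy.new_user_penalty if is_new_user else 0)
    if adjusted >= policy.red_at:
        return "red"
    if adjusted >= policy.yellow_at:
        return "yellow"
    return "green"


lenient = Policy(yellow_at=50, red_at=85, new_user_penalty=0)
strict = Policy(yellow_at=30, red_at=70, new_user_penalty=15)

print(bucket(40, True, lenient))  # green
print(bucket(40, True, strict))   # yellow
```

The algorithm and the content are identical in both calls. Only the policy changed, and with it the outcome.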

What’s a Review Queue?

Human reviewers help create training sets for the algorithms but their main function is continually staffing the review queues of content that the algorithm has spit out for them. Queues are typically broken into different buckets based on priority of review (eg THIS IS URGENT, REVIEW IN REAL TIME 24-7) as well as characteristics of the reviewers – trained in different types of content review, speak different languages, etc. It’s a complex factory-like system with lots of logic built in.
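A rough sketch of that routing logic, using Python's standard `heapq` module. The `Item` and `ReviewQueues` names, the priority numbers and the language/category labels are assumptions for illustration, not a description of any company's actual tooling.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Item:
    priority: int  # lower number = more urgent (0 = review right now)
    content_id: str = field(compare=False)
    language: str = field(compare=False)
    category: str = field(compare=False)


class ReviewQueues:
    def __init__(self):
        self._queues = {}  # (language, category) -> heap of Items

    def add(self, item: Item) -> None:
        key = (item.language, item.category)
        heapq.heappush(self._queues.setdefault(key, []), item)

    def next_for(self, languages: set, categories: set):
        """Pop the most urgent item this reviewer is trained to handle."""
        candidates = [
            q for (lang, cat), q in self._queues.items()
            if q and lang in languages and cat in categories
        ]
        if not candidates:
            return None
        most_urgent_queue = min(candidates, key=lambda q: q[0])
        return heapq.heappop(most_urgent_queue)


queues = ReviewQueues()
queues.add(Item(2, "vid-1", "en", "spam"))
queues.add(Item(0, "vid-2", "en", "violence"))
queues.add(Item(1, "vid-3", "de", "spam"))

picked = queues.next_for({"en"}, {"spam", "violence"})
print(picked.content_id)  # vid-2
```

An English-speaking reviewer trained on spam and violence gets the urgent violence item first and never sees the German one, which waits for a reviewer with that language.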

The amount of content coming onto the platform and the algorithmic thresholds needed to trigger a human review influence how much content goes into a review queue. The number of human reviewers, their training/quality, and the effectiveness of the tools they work in determine how quickly content gets reviewed.

So basically when you hear about “10,000 human reviewers being added” it can be (a) MORE content is going to be reviewed [thresholds are being changed to put more content into review queues] and/or (b) review queue content will be reviewed FASTER [same content but more humans to review].
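The two levers can be shown with back-of-the-envelope arithmetic. All numbers below are invented for illustration, and the function names are hypothetical. Real queue math also has to account for shifts, languages and item difficulty.

```python
def daily_inflow(uploads_per_day: int, sent_to_review_per_10k: int) -> int:
    """Items entering review queues per day, set by volume and thresholds."""
    return uploads_per_day * sent_to_review_per_10k // 10_000


def daily_capacity(reviewers: int, items_per_reviewer_per_day: int) -> int:
    """Items the review team can clear per day."""
    return reviewers * items_per_reviewer_per_day


uploads = 10_000_000

# Baseline: 0.5% of uploads go to review, 100 reviewers.
inflow = daily_inflow(uploads, 50)
capacity = daily_capacity(100, 400)
print(inflow - capacity)  # backlog grows by 10000 items per day

# Lever (b): same content, twice the reviewers, so queues drain faster.
print(inflow - daily_capacity(200, 400))  # -30000, i.e. spare capacity

# Lever (a): spend the new capacity on looser thresholds (1% goes to review).
print(daily_inflow(uploads, 100) - daily_capacity(200, 400))  # 20000
```

Note that doing both at once can leave the backlog worse than before, so a headline reviewer count says little without the threshold change that goes with it.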

Do These Companies Actually Care About This Stuff

The honest answer is Yes But…

Yes but Content Operations is typically a cost center, not a revenue center, so it gets managed to a cost exposure and can be starved for resources.

Yes but Content Operations can sometimes be thought of as a “beginner” job for product managers, designers, engineers so it gets younger, less influential staffing which habitually rotates off after 1-2 years to a new project.

Yes but lack of diversity and misaligned incentives in senior leadership and teams can lead to an under-assessing of the true cost (to brand, to user experience) of “bad” content on the platform.

Why Straight-Up Porn Is The Easiest Content To Censor…But Why “Sexual” Content Is Tough

Because there are much better places to share porn than Twitter, Facebook and YouTube. And because algorithms are actually really good at detecting nudity. However, content created for sexual gratification that doesn’t expressly have nudity involved is much tougher for platforms. Did I ever write about creating YouTube’s fetish video policy? That was an interesting discussion…

What Are My ‘Best Practices’ for Management To Consider?

Make it a dashboard level metric – If the CEO and her team are looking at content safety metrics alongside usage, revenue and so on, it’ll prove that it matters and it’ll be staffed more appropriately.
Talk in #s not percentages – These HUGE platforms always say “well, 99% of our content is safe” but what that really means is “1% of a gazillion is still a really large number.” The minimization framework – which is really a PR thing – betrays the true goals of taking this stuff seriously.
Focus on preventing repeat infringement and recovering quickly from initial infringement – No one expects these systems to be perfect and I think it’s generally good to trust a user until they prove themselves to be non-trustworthy. And then hit them hard. Twitter feels especially poor at this – there are so many gray-area users on the system at any given time.
Management should spend time in the review queues – When I was leading product at YouTube I tried to habitually spend time in the content review queues because I didn’t want to insulate myself from the on-the-ground realities. I saw lots of nasty stuff but also maintained an appreciation for what our review teams and users had to go through.
Response times are the new regulatory framework – I wonder if there’s a role for our government to not regulate content but to regulate response time to content flagging. There’s a ton of complexity here and regulations can create incentives to *not* flag content, but it’s an area I’m noodling about.
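To illustrate the dashboard, "#s not percentages" and response-time ideas from the list above, here is a small sketch of what such a report might compute. The `dashboard` function, its field names and all figures are hypothetical.

```python
from statistics import median


def dashboard(total_items: int, violating_items: int,
              flag_response_minutes: list[int]) -> dict:
    """Summarize content safety in absolute numbers, not just a percentage."""
    return {
        "safe_percent": round(100 * (total_items - violating_items) / total_items, 2),
        "violating_items": violating_items,
        "median_flag_response_minutes": median(flag_response_minutes),
        "slowest_flag_response_minutes": max(flag_response_minutes),
    }


report = dashboard(
    total_items=500_000_000,
    violating_items=5_000_000,
    flag_response_minutes=[12, 45, 30, 600, 20],
)
for name, value in report.items():
    print(name, value)
```

"99% safe" and "5,000,000 violating items" describe the same made-up platform, but only the second number conveys the scale of the problem. The slowest response time is reported next to the median for the same reason.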

Hope that helps folks understand these systems a bit more. If you have any questions, reach out to me on Twitter.

Update: My friend Ali added some great best practices on how you treat the content reviewers!

Hunter Walk
