Armchair moderation: or, how I learned not to worry and love user-generated content.

What do Facebook and reddit have in common? A lot but, namely, user-generated content and all of the hiccups and hazards associated with it. How do you, as a platform, responsibly manage information that you’re ultimately not creating?

Mark Zuckerberg, in an effort to address the above, recently announced changes to Facebook’s news algorithm. The purpose, he says, is to (re)focus Facebook’s social media platform on building common ground. Facebook will, in addition to this shift, crowdsource journalistic credibility, relying on its users to police potentially dubious news articles. It’s no secret that Facebook’s disruption as a social media platform has led to increased scrutiny, particularly with respect to dubious and widely shared news articles during the 2016 US Presidential Election, and that this move is intended as a first step towards addressing those concerns.

I don’t presume to have the precise answers to Facebook’s content problems. However, Change My View has presented our moderation team with many opportunities to creatively manage user-generated content. Our subreddit is heavily curated by design, but we curate with an eye towards facilitating robust conversation. Any moderation process must be both scalable and frictionless wherever possible. This parallels, I suspect, Facebook’s goals.

I’d like to think out loud, for a moment, about the ways in which my experience moderating a subreddit, coupled with my professional background, could offer some insight into “moving the needle” on Facebook’s content curation.

There are two main issues with Facebook’s latest move:

  • Crowdsourcing journalistic trust is a bad idea. Plus, if people can game your algorithm, they can game your survey or whatever.
  • Throwing people further into feedback loops (i.e., their bubbles) is a bad idea, at least if you care about self-sorting and news bubbles (which I do and think Facebook/Twitter should.)

We all know phenomena like sorting and information bubbles exist. There’s no novelty in pointing this out, so I won’t rehash the argument and will instead point to (a) The Big Sort by Bill Bishop and (b) this post, based on a Council of Europe report on information disorder (colloquially referred to as “fake news”), which covers what researchers identify as the causes of social media bubbles and their impact writ large.

I work in risk management at a large company, so I have a modicum of experience solving problems that result from algorithms, as well as from a large volume of data coming at you at high velocity. None of my suggestions below reflect my company’s opinion or practice, but I would like to apply some of my learned experience as an attorney (none of this is legal advice) and a risk manager to Facebook’s (and Twitter’s) problems and see if I can’t brainstorm some ideas.

First Principles and Burden-Shifting

My first step would be to look into my company’s current practices for other compliance hurdles. I know Facebook has its toes dipped into money processing and a marketplace. I have no idea if they’ve determined the applicability of commonly associated laws — anti-bribery, anti-money laundering, know-your-customer, trademark infringement, sanctions, etc. — to their benign platform. However, if they have found ways in which they must comply with these laws/regulations, it would behoove the company to see how many of their current practices can be leveraged with respect to this other kind of information management, or at the least used to triage until they come up with a creative solution.

My second step is to see where algorithms and automation make sense and where they don’t. This is where my background as a moderator on online communities — including Change My View — comes into play. I don’t believe by any stretch of the imagination that cultivating a good online community can be boiled down to an algorithm. I do believe, however, that algorithms and automated processes are good for culling large swaths of information into more manageable portion sizes for human beings.

For example, on reddit we use a moderator queue in conjunction with template replies for manual mod removals. The mod queue has some automated triggers, such as derogatory names and ethnic slurs, as well as some key phrases we’ve noticed highly correlate with bad behavior (e.g., “reading comprehension” keys closely with people accusing other people of having poor reading comprehension and, effectively, being dumb, which violates a rule we have against personal or ad hominem attacks.)

This means that we as human beings do not have to go out of our way to remove those posts. Effectively, we shift the burden of proof onto our users, who must appeal the removal and demonstrate to us that their use of these derogatory names was not, in fact, intended to be derogatory. For example, they may have been quoting illustrative song lyrics that incorporated the term. It’s a lot easier to manage appeals because they are fewer in number and typically are only made by people who care to have a conversation. They might not succeed in their appeal, but at the least we’re working with other human beings at the other end of that computer, and we’ve sent a visible signal that there is an aspiration towards adult conversation.
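To make the burden-shifting concrete, here is a minimal sketch in Python. It is not how reddit's mod queue is actually implemented; the `ModQueue` class, its method names, and the single trigger phrase are all assumptions made up for illustration. The point is only the shape of the flow: a trigger removes the post automatically, and a human gets involved only if the author appeals.

```python
import re
from dataclasses import dataclass, field

# Hypothetical trigger list; a real one would be maintained by the mod team.
TRIGGER_PATTERNS = [re.compile(r"\breading comprehension\b", re.IGNORECASE)]


@dataclass
class ModQueue:
    removed: dict = field(default_factory=dict)  # post_id -> removed text
    appeals: list = field(default_factory=list)  # (post_id, reason) pairs

    def submit(self, post_id, text):
        """Return True if the post is published, False if auto-removed."""
        if any(p.search(text) for p in TRIGGER_PATTERNS):
            self.removed[post_id] = text
            return False
        return True

    def appeal(self, post_id, reason):
        """The author, not a moderator, takes the first step after a removal."""
        if post_id not in self.removed:
            raise KeyError(post_id)
        self.appeals.append((post_id, reason))

    def resolve(self, post_id, restore):
        """A human moderator decides; returns the text if it is restored."""
        text = self.removed.pop(post_id)
        self.appeals = [a for a in self.appeals if a[0] != post_id]
        return text if restore else None
```

Moderators only ever look at `appeals`, which should be far shorter than the stream of submitted posts.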

(Note the reputation Change My View has and how it does not require Facebook’s “common ground.” Or, at the least, the common ground is how one has a conversation, not the views themselves.)

Community Standards

Likewise, we do rely on our users to flag potential rule violations. This is exceptionally helpful because we can’t meaningfully catch every violation across so many threads and posts. This is a place where crowdsourcing is effective: it funnels information into your queue. However, if this were the end of the line, we’d have an erratic enforcement framework. Some people believe everything is a personal attack and would remove everything. Others think that it’s only a personal attack if it comes in the most technical of phrasings. Rules only work when applied consistently. Otherwise, expectations are prohibitively haphazard; you can’t tailor your behavior to a moving target.

Our policy is coherent and therefore can be anticipated by our users precisely because we use moderation guidelines to enforce rules, but we take advantage of our user base to highlight behavior we couldn’t suss out manually. This works because our user-base has internalized the goals and culture of our subreddit. They want these rules enforced because it makes the experience enjoyable and productive. If you look at how people report feeling when they use Facebook and Twitter, this is a worthwhile consideration (more on that at the end.)
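The division of labor described here (users nominate, moderators decide) can be sketched roughly as follows. This is an illustration under my own assumptions, not a description of reddit's reporting system: `build_review_queue` and the `(post_id, reporter_id)` input format are hypothetical. Note that a report never removes anything by itself; it only moves a post up the list a human will look at.

```python
from collections import Counter


def build_review_queue(reports):
    """Order reported posts for human review.

    reports: iterable of (post_id, reporter_id) pairs.
    Repeat reports from the same user count once, so one person
    cannot push a post to the top of the queue alone.
    """
    distinct = {(post_id, reporter_id) for post_id, reporter_id in reports}
    counts = Counter(post_id for post_id, _ in distinct)
    # Most-reported first; ties broken by post_id so the order is stable.
    return sorted(counts, key=lambda post_id: (-counts[post_id], post_id))
```

Every post in the returned list still goes through the same moderation guidelines, which is what keeps enforcement consistent.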

Internet Literacy

A third space worth considering is squishier, mostly because it’s anecdotal and therefore the most experimental. It touches somewhere between psychology and (I think) generational differences, though I don’t want to highlight the latter too much. I occupy this weird Old Millennial space where I can remember growing up on the Internet but not being raised on it. Contrary to popular baby boomer opinion, I did library research in elementary school (with books, yes), subscribed to a paper newspaper up until 2016, and can recall dial-up and begging my mom to get off the phone so I could go to keyword: Nick.

Publication used to be something that required gatekeepers, and web 2.0 is fundamentally predicated on eradicating this barrier, consequently democratizing — among other things — journalism, or at least something kind of (?) like it. Folks like my parents, in my experience, take for granted that the mere publication of a thought indicates veracity — presumably because their lived experience, up until now, had a far more cultivated news environment. You could use publication, by and large, as a proxy for having gone through some editorial rigor. The consequence today is that the “internet literacy” of this generation is probably a bit dodgy, and I likewise think we’ve done a poor job of ensuring younger generations don’t fall victim to the same practice. This dovetails into a concern I have over the lack of critical thinking taught in schools, generally, but this is a song and dance everybody has heard of and tangential to what I’m getting at here.

So What?

The final step is to take this rough analysis and see if there are any practical rules to glean from it. None of these are exhaustive or intended to be silver bullets. Rather, they are “first principles.” They’re minimally intrusive and intended, in part, to test the above theories, see what sticks, and move the needle from there:

  • Begin by adopting some form of CAPTCHA for the news feed. The goal here is to minimize the amount of automated sharing. It hits at low hanging fruit, i.e., bots. This is probably better for Twitter than for Facebook but still worth considering.
  • Do not dabble in the substantive worth of news sites and blogs. Instead, increase the barrier of entry to the news feed, and shift the burden to probable rule violators. Have a series of keywords and phrases that reasonably correlate with poor behavior and remove them automatically. This will thin out more low-hanging fruit and provide moderator groups with far more manageable queues. It’s fair to keep the appeal process simple and easy; the idea is solely to make sure that it is more likely than not a human being is doing the posting and that they can make a case that they are not violating your community rules.
  • Your community rules should draw clear, equally enforced lines. “Equally” here does not mean perfection. It means that you can demonstrate every single time that the analysis leading to a conclusion was identical, not that the conclusion itself is identical. Facts change case-by-case but the analytical framework doesn’t have to.
  • Create an aspirational, visibly-marked status. In Facebook’s case, rather than having your community decide what is trustworthy, create a separate status for news organizations that meet additional hurdles. There are strong proxies for news organizations that really do intend to be journalistic. Usually they are incorporated, or they have a headquarters, a CEO, a primary editor, etc. Have a submission process with a few key, concrete identification factors (like the verified e-mail address of the editor) to “verify” that these organizations exist and are willing to put their name and face to what they disseminate. This doesn’t prove or verify their bona fides; readers decide that (see above). But it does make it less likely they’re shit-stirrers.
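As a rough sketch of that last bullet, the submission check could be as simple as the Python below. The field names and both function names are hypothetical, picked only to mirror the proxies listed above; Facebook has no such API that I know of. The check looks solely at whether the identifying facts were supplied and deliberately says nothing about the quality of the reporting.

```python
# Hypothetical identification factors mirroring the proxies named above.
REQUIRED_FIELDS = (
    "legal_name",
    "headquarters",
    "editor_name",
    "editor_email_verified",
)


def missing_verification_fields(submission):
    """Return the required fields that are absent, empty, or False."""
    return [name for name in REQUIRED_FIELDS if not submission.get(name)]


def is_eligible_for_status(submission):
    """Eligible only when every identification factor is present."""
    return not missing_verification_fields(submission)
```

Because the same fixed list is applied to every applicant, the analysis is identical each time even when the outcome differs, which is the sense of “equally enforced” in the third bullet.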

Thus concludes my armchair moderation, but I’d like to end with one other point: people often tell us that they feel better after posting on Change My View. Studies show that people are left feeling worse after using Facebook and other social media platforms. I think this is worth considering. We’re in the early observational stages of social media’s impact on our health, but there’s some preliminary research indicating Facebook’s service leaves people feeling bad. We return for the dopamine hit. Likewise, Nicholas Carr and Cal Newport alike have made the case for negative impacts on brain plasticity and our collective ability (or inability) to focus.

That’s one way to make money, but Change My View’s approach provides an alternative argument: one facet of your product, the news facet, can still be profitable via a sense of constructive enjoyment and time well spent (or at least better spent than under the current status quo). Were I in Zuckerberg’s shoes, with more time to polish specifics, this is the argument I would make to my shareholders.

4 thoughts on “Armchair moderation: or, how I learned not to worry and love user-generated content.”

  1. > I do believe, however, that algorithms and automated processes are good for culling large swaths of information into more manageable portion sizes for human beings.

    Very good call. Particularly the way this is written — if I ever need to express the need of information processing from social media, I will likely use some keywords from here, if you don’t mind.

    I am very interested in research that focuses on this kind of work.

    > Begin by adopting some form of CAPTCHA for the news feed.

    Requiring CAPTCHA for every post makes it extremely inconvenient, particularly across multiple platforms. However, it is inherently a Good Idea that can have a positive impact.

    So, the CAPTCHA is there to distinguish between Human and Bots. Most of the troublesome Twitter Bots do not come from Bots Providing a Service, but Bots Pretending To Be Human. However, Bots can provide positive information / feedback. Hence: We need a distinction between a Potentially Harmful Lying Bot and a Bot We Can Trust.

    In addition to the CAPTCHA, I would propose the introduction of Bot Accounts: similar to “Verified” Twitter accounts, Bot Accounts exist as regular accounts but are marked as Bots. Anyone who follows this account knows that its content is not exactly / completely created by a human.

    I would also propose “flagging” accounts that do not pass a certain number of CAPTCHAs. Not fail, but *not pass*, as in giving up on a post because of the CAPTCHA. That way, whenever someone tries to follow these accounts, they can be warned that this account is suspected of being a bot account and not a real human.

    What do you think?


    1. Thanks for your comment. Let me respond where I can:

      First, you can always feel free to refer back to these blog posts. It would be great if you can share the link if only to contextualize and give them a platform to post a comment if they have anything to add.

      I like the proposal about a somewhat randomized CAPTCHA, and then pivoting to those accounts that fail to pass a threshold of valid CAPTCHAs. I’m actually not even married to the CAPTCHA itself so much as some kind of manual check that makes it more likely that the share originates with a human being than with a bot. The reason why digital advertising is so great is that it is frictionless, low cost and automated, allowing broad swaths of information to be distributed in a way that was impossible before; human beings did not have the computational power.

      I don’t want to utterly eliminate the frictionless experience of both businesses and end-users, but I do think a good place to start thinking creatively is all the spots that keep that barrier low and see if there are certain spots where we can increase the amount of manual “work” required in a way that doesn’t unduly increase that barrier but would be considered a fair trade-off to culling information. An additional “Are you a robot” mouse click seems like a very minor infringement on one’s ability to post information.


  2. I think there are some people who seem to know how to actually trigger the “reading comprehension” reply. I noticed replies where the “lack of reading comprehension” retort is actually fully applicable, rather than just being a common ad hominem. I am talking about replies which are so obviously, almost deliberately misconstruing your own meaning that it is hard to reply otherwise (or you would have to go about it in a very, rather unusually thoughtful way).
