Josh Carty, media executive at iProspect, looks at Facebook's strategy of employing AI to automate the vetting of content across its user base, identifying and isolating the negative and dangerous elements within it

Mark Zuckerberg's essay on Building Global Community offers considerable insight into Facebook's ambition as a social platform and some of the challenges it faces in helping develop safe, informed and inclusive communities.

He addresses two important concerns about the role of social media in shaping public discourse and helping prevent and respond to social harm. Zuckerberg acknowledges Facebook's powerful role in controlling what content its users see, and the site's responsibility to help users when confronted with dangerous content.

Whether it is providing more diverse content to its users or identifying signs of danger that could help save lives, Zuckerberg's essay shows that he sees a growing need to better understand the content shared on Facebook. In both cases, he points to artificial intelligence as the tool to help recognise and act against untrustworthy or dangerous content.

A range of possible solutions could constitute monitoring content on Facebook using artificial intelligence. These range from using more traditional machine learning techniques to catch hoax articles, in the same way they are already used to catch spam, to a more revolutionary (and not yet achieved) advance in artificial intelligence research. Such revolutionary research might take the form of an artificial intelligence capable of evaluating a piece of content's topic, perspective and sentiment with superhuman ability. Recognising the enormous gap between these two solutions is important, and it forces us to clarify what we mean when we talk about artificial intelligence.
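
To make the first of these approaches concrete, hoax detection can be treated much like spam filtering: train a simple classifier on labelled examples and let it score new articles. Below is a minimal sketch using scikit-learn; the articles and labels are invented for illustration and say nothing about Facebook's actual systems.

```python
# A minimal spam-style hoax classifier, assuming scikit-learn is available.
# The training articles and labels below are invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

articles = [
    "Scientists discover miracle cure doctors don't want you to know",
    "Shocking secret the government is hiding from you",
    "Local council approves new budget for road repairs",
    "Central bank holds interest rates steady amid slow growth",
]
labels = ["hoax", "hoax", "genuine", "genuine"]

# TF-IDF turns each article into a weighted bag-of-words vector;
# Naive Bayes then learns which terms are associated with each label.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(articles, labels)

# On this toy model, overlapping terms push the prediction towards 'hoax'.
print(model.predict(["Miracle cure shocks doctors worldwide"]))
```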

While recent progress in artificial intelligence spans a range of domains, from image recognition to Texas hold 'em, each solution is domain-specific. The artificial intelligence strongest at playing chess would be of no use in driving a car or recognising speech. These solutions markedly contrast with the artificial intelligence of science fiction, capable of performing tasks across a range of domains, so-called artificial general intelligence. Although Zuckerberg does not explicitly refer to this distinction, it is worth considering whether the depth and range of understanding required to evaluate content amounts to an artificial general intelligence problem.

Understanding a piece of content, its perspective on a particular topic, whether it is satire or genuine opinion, and the degree to which it might offend, seemingly requires at least human-like levels of comprehension. It would mean navigating the nuances of natural language, ethics and social norms. Creating a complete artificial general intelligence, a 'superhuman' programme capable of understanding such a broad range of nuances, is unlikely to be what Zuckerberg had in mind; nor is it needed for Facebook to better understand online content. As we consider the ways in which Facebook might go about implementing a solution, we will focus only on what seems possible in the near and medium term.

To be effective in monitoring and responding to different content, an artificial intelligence would need to cut through the noise of online media and accurately classify content into topics. It would also need some understanding of the nuance and sentiment a piece of content expresses. We will take the challenges of noise, classification and accuracy in turn, and discuss some of the ways in which Facebook might approach them.

Consider the types of signals Facebook could use to characterise content. When we refer to the noise of online media, we are also referring to a wealth of data we could use to classify it. These might be user signals, describing the way users have interacted with a piece of content, or features of the content itself. User signals might include the number of times an article has been clicked, the amount of time a user watches a video before they share it, or the number of times a post has been liked. Other useful user signals might be information about the people producing or sharing the content, their preferences and past behaviours. Facebook is not short of such signals.
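
As an illustration of what these signals might look like once gathered for a single piece of content, here is a hedged sketch; the field names and values are hypothetical, not Facebook's actual schema.

```python
# Hypothetical user-signal features for one piece of content.
# Field names and values are invented; Facebook's real signals are not public.
from dataclasses import dataclass

@dataclass
class ContentSignals:
    clicks: int               # times the article was clicked
    avg_watch_seconds: float  # average watch time before a share
    likes: int                # total likes on the post
    sharer_flag_rate: float   # fraction of sharers previously flagged

    def to_vector(self) -> list[float]:
        """Flatten the signals into a numeric vector a model can consume."""
        return [float(self.clicks), self.avg_watch_seconds,
                float(self.likes), self.sharer_flag_rate]

example = ContentSignals(clicks=15_000, avg_watch_seconds=12.5,
                         likes=3_200, sharer_flag_rate=0.02)
print(example.to_vector())
```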

Features of the content itself would also be useful in characterising it. Extracting these features, whether the words in an article or the pixels in an image, poses different challenges depending on the medium, but the task of picking out useful features from content remains the same. Choosing the best features to represent a particular medium will be essential before any machine learning algorithm can begin to classify content. Facebook's access to a diverse range of possible features, both user signals and features of the content itself, makes this unlikely to be a significant obstacle.
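
To see why features differ by medium, the sketch below extracts word counts from a piece of text and crude pixel statistics from a stand-in image. Real systems would use far richer representations (learned embeddings, for instance), so treat this purely as illustration.

```python
# Sketch: extracting simple features from two different media types.
# Real systems would use much richer representations than these.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Text: represent an article as word counts (a bag-of-words vector).
vectorizer = CountVectorizer()
text_features = vectorizer.fit_transform(["Breaking news about the election"])
print(text_features.toarray())  # one row of word counts

# Image: represent a (fake) greyscale image by simple pixel statistics.
image = np.random.rand(64, 64)  # stand-in for real pixel data
image_features = [image.mean(), image.std(), image.max(), image.min()]
print(image_features)
```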

With a robust set of features that could help characterise a piece of content, Facebook would use an algorithm to cluster content into different topics and to classify anything potentially dangerous. It is important to note that the rules for classification would not be explicitly programmed. Instead, an algorithm would be given examples of, say, dangerous and non-dangerous content, and it would uncover the relevant relationships itself. This would be no easy task. The algorithm would need to uncover nuanced semantic differences between news stories about terrorism and terrorist propaganda itself. It would need to distinguish graphic content from war photography. These challenges are particularly difficult considering the scale at which Facebook operates.
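
The crucial point, that the rules are learned rather than hand-written, can be seen in miniature with any supervised learner: we supply labelled examples and the model derives its own weights. The toy sketch below uses invented snippets; distinguishing real propaganda from reporting at Facebook's scale is vastly harder.

```python
# Sketch: classification rules are learned from labelled examples,
# not explicitly programmed. Texts and labels are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

texts = [
    "Join our cause and strike fear into the enemy",
    "Glory awaits those who take up arms with us",
    "Officials report attack claimed by militant group",
    "Analysts discuss the rise of extremist recruitment online",
]
labels = [1, 1, 0, 0]  # 1 = propaganda, 0 = news coverage (invented labels)

pipeline = Pipeline([("tfidf", TfidfVectorizer()),
                     ("clf", LogisticRegression())])
pipeline.fit(texts, labels)

# The 'rules' live in learned weights, not hand-written if/else logic.
weights = pipeline.named_steps["clf"].coef_[0]
vocab = pipeline.named_steps["tfidf"].get_feature_names_out()
top = sorted(zip(weights, vocab), reverse=True)[:5]
print(top)  # terms the model learned to associate with propaganda
```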

Any machine learning solution deployed by Facebook demands a high degree of accuracy due to the enormous scale of its operations. It is this scale that makes using artificial intelligence to review content necessary, and it comes with its own challenges. A solution with even 99.9% accuracy leaves about two million of Facebook's nearly two billion users negatively affected. The level of accuracy required, and the downside of misclassifying dangerous content, distinguish Facebook's goal from the work of other brands working on artificial intelligence.
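
The arithmetic behind that figure is simple to verify; the accuracy levels below are illustrative, not real benchmarks.

```python
# Errors at scale: even very small error rates touch millions of users.
users = 2_000_000_000  # roughly two billion, per the figure above

for accuracy in (0.99, 0.999, 0.9999):
    misclassified = users * (1 - accuracy)
    print(f"{accuracy:.2%} accuracy -> {misclassified:,.0f} users affected")
# 99.90% accuracy leaves about 2,000,000 users misclassified.
```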

In search marketing, machine learning is already essential: we use it to classify search keywords into product categories, giving us a better understanding of our clients' opportunities online. As brands continue to innovate and develop these artificial intelligence solutions, we will see increased reliance on machine learning to operate at greater scale.
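
A simplified sketch of that kind of keyword-to-category classification is shown below; the keywords, categories and choice of model are invented for illustration, not a description of iProspect's actual tooling.

```python
# Sketch: classifying search keywords into product categories.
# Keywords and categories are invented; real taxonomies are far larger.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

keywords = ["running shoes sale", "cheap trainers", "wireless headphones",
            "noise cancelling earbuds", "mens winter coat", "waterproof jacket"]
categories = ["footwear", "footwear", "audio", "audio",
              "outerwear", "outerwear"]

model = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=1))
model.fit(keywords, categories)

# Shared vocabulary ('headphones') pulls the new keyword towards 'audio'.
print(model.predict(["bluetooth headphones deal"]))
```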

While a human-level comprehension of content likely amounts to an artificial general intelligence problem, that is not what Facebook needs to develop in order to monitor content more effectively. We have looked at how further development of existing artificial intelligence research offers a clear path to this goal, and at some of the challenges Facebook must overcome along the way.