Fake news | Artificial Intelligence and Automated fake news

13 November 2018


Robots, dark advertising, image and voice synthesis: algorithms serving disinformation. How to craft automated fake news.

Automated fake news (or automated hoaxes, as we might also call them) is a reality. “Fake news”, as it is called today, is certainly not a recent invention. However, it becomes far more relevant when the mass media play an active role in its dissemination.

In the history of hoaxes that went “viral” thanks to the media, there have been some particularly sensational cases. These cases foreshadowed one of the major concerns about the mass media: their ability to shape public opinion thanks to their huge reach and pervasiveness.

A long history of fake news

The Great Moon Hoax

Known as “The Great Moon Hoax”, it is considered one of the most sensational pranks ever perpetrated by the mass media.

On 21 August 1835, a small article appeared on the second page of the New York Sun, apparently taken from the supplement of the Edinburgh Courant. The text announced amazing astronomical discoveries that (according to the Sun) had been made by the already illustrious Sir John Herschel through innovative techniques.

Despite the New York Herald’s suspicions, the hoax was officially unmasked only in 1840. While the media of the time were quite divided on the issue, the coverage they provided did contribute to the enthusiasm and general excitement that those “discoveries” created.

The Great Moon Hoax was the first sensational demonstration of the power of the mass media, which arose with the introduction of steam-powered printing presses. Before 1830, such a prank would not even have been conceivable. The whole affair foreshadowed one of the major concerns about the mass media: their ability to manipulate public opinion.

War of the Worlds

A century after the prank in the Sun, generally attributed to reporter Richard Adams Locke, the press is no longer the only medium covering the territory. Thanks to radio, the mass media reach almost every home in America.

It is October 30th, 1938, when the usual CBS radio program is interrupted by an “Intercontinental Radio News” reporter announcing that astronomers have just detected a huge blue flame erupting from the surface of Mars. It is the beginning of another colossal prank, inspired by The War of the Worlds, the novel written by H.G. Wells in 1898. The whole thing was conceived and directed by Orson Welles, and half of the American population was thrown into total chaos… or was it?

In this case, there were two pieces of “fake news”: the “live” story of the alien invasion ^[1], and the myth, still strong today, of a panic spreading on a national scale, with millions of terrified Americans.

Once again the mass media played a crucial role. The press hastened to spread the news of the panic caused by the prank, this time assisted by an even more powerful and pervasive medium: the radio.

The Sokal affair

The world of pranks has not spared even academic publications.

In 1994 Alan Sokal submitted to Social Text ^[2] an article titled Transgressing the Boundaries: Towards a Transformative Hermeneutics of Quantum Gravity. The journal (which at the time did not practice peer review) published the article in the spring of 1996.

The peculiarity of the article was that it was composed entirely of meaningless sentences. The only criteria were that the words sounded good together and that the content was aligned with the journal’s ideology.

The aim, as Sokal later declared in Lingua Franca, was to demonstrate the decline in the standards of intellectual rigor of the humanities journals of the time ^[3].

Internet and the sounding boards

Today, with the Internet, fake news operates on a totally different scale. While the Internet itself does not necessarily reach more people than radio did, it has radically lowered the entry threshold for publication. In the pre-Internet era, the mass media, for all their reach, were accessible to the masses only passively: very few people had access to radio or print in order to publish content. Today anyone can publish their ideas and, above all, anyone can act as a sounding board. As a result, news and hoaxes now spread faster than ever.

To mention some recent episodes: the case of mandrake in Bonduelle spinach (false!), the news of the death of Vitalik Buterin, founder of Ethereum (false!), the attack on Trump voters by the CEO of PepsiCo (false!), the presence of HIV in Pepsi drinks (false!), the idea that the American movement Antifa was organizing a coup d’état to seize US institutions (also false!).

Facebook’s and Google’s algorithms personalize the content they present to us, biasing it toward what “we like”. The ultimate purpose of this filtering is to show advertising that is as tailored as possible. However, it has the side effect of steadily shrinking our chances of finding content that is not aligned with our points of view. This is the so-called “filter bubble” (an expression coined by Eli Pariser): the information bubble within which Internet users risk being locked up, in a sort of intellectual isolation.

A paradox: more data, less pluralism

In the case of Facebook, the phenomenon is quite evident. The filter bubble, in which basically every conversation on Facebook is locked, is the result of a logic built into the News Feed:

“it shows users only the content that is most likely to interest them”.

The paradox is that the more content circulates on Facebook, the better the algorithm becomes at understanding which content interests us the most. So, basically, the wider the pool of content, the less likely we are to bump into something different from our (presumed) expectations.
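The feedback loop behind this paradox can be illustrated with a toy simulation. This is only a sketch under strong assumptions, not Facebook's actual ranking: the function `simulate_feed`, its parameters, and the rule that the user engages with everything shown are all hypothetical. Items are ranked by the share of past engagement their topic has received, so whatever is shown once becomes more likely to be shown again.

```python
import random

def simulate_feed(num_items=200, num_topics=10, feed_size=10, rounds=30, seed=0):
    """Toy engagement-ranked feed; returns the number of distinct topics shown per round."""
    rng = random.Random(seed)
    # Each item belongs to one topic; the catalogue covers all topics evenly
    # (20 items per topic with the defaults, more than one feed can hold).
    items = [i % num_topics for i in range(num_items)]
    # Estimated interest per topic starts out equal (assumption).
    engagement = [1.0] * num_topics
    topics_seen = []
    for _ in range(rounds):
        total = sum(engagement)
        # Score = share of past engagement; the tiny random term only breaks ties.
        scored = [(engagement[t] / total + rng.uniform(0, 1e-9), t) for t in items]
        feed = [t for _, t in sorted(scored, reverse=True)[:feed_size]]
        topics_seen.append(len(set(feed)))
        # The user can only react to what is shown, so shown topics gain weight.
        for t in feed:
            engagement[t] += 1.0
    return topics_seen

if __name__ == "__main__":
    print(simulate_feed())
```

In this model the number of distinct topics in the feed can only stay the same or shrink from one round to the next, even though the catalogue itself never changes: topics that are not shown never get the chance to earn engagement.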

The same bubble is also found in Google Search, Instagram and, to a lesser but growing extent, Twitter.

David Rohde, online news director of The New Yorker, concludes disconsolately:

“Perhaps people are overwhelmed, because there is more information available than in any other historical period, but we tend to look for news where our thoughts are reinforced. Incredibly, technology has resulted in our being exposed to less than before, and that is something I did not think possible.” (Interview with Gea Scancarello, “Only the reporters will save us from the fake news”, Pagina99, October 27, 2017).

People love sensationalized news: it attracts clicks, and those who read it are often not even interested in knowing whether it is fake or true. This kind of traffic generates revenue for the authors, further encouraging the production of fake news.

The role of advertising in fake news

As if all that were not enough, the filter bubble phenomenon is amplified by the microtargeting mechanisms of online advertising. By exploiting these mechanisms it is possible to send tailored messages to selected targets, with a higher probability of inducing a reaction (likes, shares, favorable comments). This practice goes by the name of dark advertising.

In recent votes (e.g. the US presidential elections, the Brexit referendum, the French presidential elections, the German federal elections) these mechanisms have been widely exploited. Russiagate shows that online political advertising risks becoming an instrument of pressure and interference by foreign powers in a state’s internal affairs: a new, cybernetic kind of cold war. Paraphrasing von Clausewitz, we could say that Facebook is the continuation of politics by other means.

Artificial intelligence, big data, and automated fake news

The fake news cited above was mostly due to errors or pranks, or intended as a denunciation of some kind. Much broader is the impact of disinformation campaigns, where disinformation means the spread of false and/or tendentious news with the precise intent of generating confusion and causing damage.

The diffusion

“When What Happened, the latest book by Hillary Clinton, debuted on Amazon, the reception was incredible… so incredible that Amazon did not believe it and deleted 900 of the 1,600 reviews as suspiciously false, written by people who said they loved or hated the book but had neither purchased nor read it.” (quoted from “Could AI Be the Future of Fake News and Product Reviews?”, Scientific American).

It is known that on social media news does not spread along a linear path, but through a “hub” distribution. In other words, some users are particularly “influential”, often with hundreds of thousands of followers. Posts by this type of user have a far higher chance of spreading.
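A “hub” distribution of this kind can emerge from a very simple rule. The sketch below is an illustration rather than a model of any real platform: it assumes a preferential-attachment process in which each new user follows a few existing accounts, chosen with probability proportional to how many followers they already have. The function `build_follow_graph` and all of its parameters are hypothetical.

```python
import random
from statistics import median

def build_follow_graph(num_users=2000, follows_per_user=3, seed=1):
    """Preferential attachment: new users tend to follow already-popular accounts."""
    rng = random.Random(seed)
    followers = {0: 0}
    # Each account appears in the pool once, plus once per follower,
    # so a uniform draw from the pool favours popular accounts.
    pool = [0]
    for user in range(1, num_users):
        targets = set()
        for _ in range(min(follows_per_user, user)):
            targets.add(rng.choice(pool))
        followers[user] = 0
        for target in targets:
            followers[target] += 1
            pool.append(target)
        pool.append(user)
    return followers

followers = build_follow_graph()
counts = sorted(followers.values(), reverse=True)
top_1_percent = counts[: len(counts) // 100]
print("largest account:", counts[0], "followers")
print("median account:", median(counts), "followers")
print("share of all follows held by top 1%:", round(sum(top_1_percent) / sum(counts), 2))
```

Comparing the largest account with the median one gives a feel for how unevenly reach is distributed, and therefore for why a post picked up by a hub travels so much further than one posted by a typical user.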

Figure: the sounding boards of fake news.

These influential accounts, acting as sounding boards, have become targets for so-called social bots, which, according to Chengcheng Shao of Indiana University, make up most of the accounts that spread fake news.

The spread of fake news is rapidly becoming a large-scale business, no longer limited to the occasional student spreading pranks from his garage for a few hundred dollars a day.

By now the game is shifting to outright disinformation, aimed at influencing elections or even the performance of financial markets. Automation has become a requirement; indeed, it is likely that falsifying text-based news will soon no longer be enough. Technology is almost ready to falsify stories by creating images, video, and audio completely out of thin air.

Art or counterfeiting? The image synthesis

Today’s access to big data has made it possible to develop unprecedented artificial intelligence algorithms. Almost without realizing it, we have moved from analysis algorithms to generation algorithms.

The automatic synthesis of images had already begun to gain traction with software like Pix2Pix, which uses Generative Adversarial Networks to create photorealistic images starting from drawings (this will be the subject of further study in another article). Today new technologies with increasingly surprising results come out almost every day. StackGAN, for example, implements two stages of adversarial networks: the first produces low-resolution images, and the second uses them as a basis to produce a more refined, high-resolution image that is practically photorealistic.

Figure: automatic image synthesis, bird example.

Art or counterfeiting? The synthesis of videos

As if that weren’t enough, image synthesis is now moving towards its most natural evolution: video synthesis.

Last August, musician Françoise Hardy, in all the beauty of her twenties, appeared in a YouTube video (below). The singer responded to the pressing questions of Chuck Todd (off camera) about why Trump had sent his press secretary Sean Spicer to lie publicly, speaking of “alternative facts”.

What’s wrong? Well, first of all, Françoise Hardy is now 73 (when she was 20, Todd had not been born yet and Trump obviously was not president), and the voice heard is, for those who have not already recognized it, that of Kellyanne Conway. But the surprising thing is that this is not just Kellyanne Conway’s voice dubbed over a video of Hardy: the video is entirely generated. It is, in fact, a work by Mario Klingemann, who, using (again) adversarial neural networks trained on old video clips of the singer, was able to automatically “recreate” Hardy’s face, with lip movements synchronized to the voice of Kellyanne Conway!

Another interesting technology is DeepStereo, which, from different images of a place, can “predict” and reconstruct the scene as if it were taken from different points of view.

Art or counterfeiting? Speech synthesis

Speech synthesis is already a widespread field, but the technology used commercially (such as Siri) still consists of recorded vocal fragments chained together. Being inherently limited to the set of recorded fragments, it sounds “real” only under certain circumstances, and is therefore not yet suitable for realistic “counterfeiting”.
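The limitation of chaining recorded fragments can be seen in a deliberately tiny sketch. Everything here is hypothetical: the `RECORDED` inventory, the word-level units (real systems work with much smaller units and smooth the joins), and the sample values, which are placeholders rather than real audio.

```python
# Hypothetical inventory of recorded fragments; each "recording" is a list of samples.
RECORDED = {
    "good": [0.1, 0.3, 0.2],
    "morning": [0.0, -0.2, 0.4, 0.1],
    "evening": [0.2, 0.2, -0.1],
}

def synthesize(text, fragments=RECORDED):
    """Chain recorded fragments in order; fail on anything that was never recorded."""
    audio = []
    for word in text.lower().split():
        if word not in fragments:
            raise KeyError(f"no recording for {word!r}")
        audio.extend(fragments[word])
    return audio

print(len(synthesize("Good morning")), "samples")
```

A request such as `synthesize("good night")` fails because no fragment for “night” exists, which is exactly why this approach cannot put arbitrary words into someone’s mouth.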

“Deep audio”, that is, audio generated entirely by neural networks, is another story. The idea is to make the network learn the characteristics of the voices it is fed during training, and to obtain faithful reconstructions in any context.

This means working directly on raw audio files, which is usually avoided because it requires very dense sampling, as many as 16,000 samples per second, with important structures that emerge at various time scales (see below).

Figure: raw audio has structures that emerge at various time scales.
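To get a feel for these numbers, here is a small back-of-the-envelope sketch. Only the 16,000 samples per second comes from the text; the 220 Hz tone and the 0.2 second “syllable” are illustrative assumptions.

```python
import math

SAMPLE_RATE = 16_000  # samples per second, the figure quoted in the text

def sine_wave(freq_hz, seconds, sample_rate=SAMPLE_RATE):
    """Return raw audio samples in [-1, 1] for a pure tone."""
    n = round(seconds * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

samples = sine_wave(220.0, 2.0)

# Two very different time scales inside the same signal (both values assumed):
samples_per_period = SAMPLE_RATE / 220.0        # one cycle of a 220 Hz tone
samples_per_syllable = round(0.2 * SAMPLE_RATE)  # a syllable of about 0.2 s

print(len(samples), "samples for 2 seconds of audio")
print(round(samples_per_period, 1), "samples per period of the tone")
print(samples_per_syllable, "samples per syllable")
```

A model that generates audio one sample at a time therefore has to get tens of thousands of values right for just a couple of seconds of speech, while staying coherent across patterns that span anything from a few dozen to several thousand samples.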

The technology is there: it is not yet available on a commercial scale, but it exists.

For example, Google’s DeepMind has developed WaveNet, a convolutional neural network that is fed human voices in the form of raw waveforms during training. WaveNet determines the value of each sample, step by step, by drawing from a probability distribution computed by the network.
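The sample-by-sample generation described here can be sketched as follows. This is not WaveNet itself: `toy_model` is a hypothetical stand-in for the trained network and simply favours values close to the previous sample. The 256-level mu-law quantization follows the published description of WaveNet as far as I know; everything else is an assumption made for illustration.

```python
import math
import random

LEVELS = 256  # number of quantized amplitude classes (mu-law, mu = 255)

def mu_law_encode(x, mu=LEVELS - 1):
    """Map a sample in [-1, 1] to an integer class in [0, mu]."""
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    return int(round((compressed + 1) / 2 * mu))

def mu_law_decode(k, mu=LEVELS - 1):
    """Inverse of mu_law_encode, up to quantization error."""
    compressed = 2 * k / mu - 1
    return math.copysign(math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed)

def toy_model(history):
    """Hypothetical stand-in for the trained network: returns a probability
    distribution over the 256 classes, peaked near the previous sample."""
    prev = history[-1] if history else LEVELS // 2
    weights = [math.exp(-abs(k - prev) / 4.0) for k in range(LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]

def generate(num_samples, seed=0):
    """Autoregressive sampling: each new sample is drawn given all previous ones."""
    rng = random.Random(seed)
    history = []
    for _ in range(num_samples):
        probs = toy_model(history)  # distribution over the next sample
        k = rng.choices(range(LEVELS), weights=probs)[0]
        history.append(k)           # the new sample conditions the next step
    return [mu_law_decode(k) for k in history]

audio = generate(1000)
print(len(audio), "generated samples")
```

In the real system the distribution comes from a deep stack of convolutions looking back over a long window of past samples, which is what lets it capture structure at several time scales at once; the loop around it has the same shape as the one above.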


In the end, the network is able to reproduce even “filler” sounds such as breaths, stutters, etc. (listen to an example below).

The results are much better than those of the other technologies already used by Google and, in the case of Mandarin, come quite close to human performance.

Figure: speech synthesis results, Mandarin vs. US English.

Below is a comparison of audio samples from the voice generation technologies used by Google:

Parametric technology

Concatenative technology

WaveNet

Considering also the progress in image and video generation, it is not difficult to imagine a near future in which it will be possible not only to “create” photographs and videos of never-before-seen scenes, but also to put into people’s mouths phrases “created” entirely from nothing. The world of fake news and misinformation is about to enter a radiant new era where the only limit is imagination, and the difference between “fake” and reality will be increasingly blurred.

Notes

[1]: As a way out, it was reported that the observations had been cut short by a fire that reduced the magical telescope and the entire laboratory(!) to ashes.

[2]: Social Text, a journal of post-modern cultural studies.

[3]: Subsequently, the editors of the journal, while regretting having published the piece, published an editorial in which they rejected Sokal’s conclusions about the journal’s alleged lack of rigor, claiming to have noticed the “bizarre” nature of the piece but to have considered it an honest attempt by a physics professor to venture into the field of postmodern philosophy.
