30.09.2019 - 06:48:25
Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation
30.09.2019 - 06:51:56
The Coming Age of Imaginative Machines: If you aren't following the rise of synthetic media, the 2020s will hit you like a digital blitzkrieg
The faces on the left were created by a GAN in 2014; on the right are ones made in 2018.
Ian Goodfellow and his colleagues gave the world generative adversarial networks (GANs) five years ago, way back in 2014. They did so with fuzzy and ethereal black & white images of human faces, all generated by computers. This wasn't the start of synthetic media by far, but it did supercharge the field. Ever since, the realm of neural network-powered AI creativity has repeatedly kissed mainstream attention. Yet synthetic media is still largely unknown. Certain memetic-boosted applications such as deepfakes and This Person Does Not Exist notwithstanding, it's safe to assume the average person is unaware that contemporary artificial intelligence is capable of some fleeting level of "imagination."
Media synthesis is an inevitable development in our progress towards artificial general intelligence, the first and truest sign of symbolic understanding in machines (though by far not the thing itself; rather, the organization of proteins and sugars to create the rudimentary structure of what will someday become the cells of AGI). This is due to the rise of artificial neural networks (ANNs). Popular misconceptions presume synthetic media present no new developments we've not had since the 1990s, yet what separates media synthesis from mere manipulation, retouching, and scripts is the modicum of intelligence required to accomplish these tasks. The difference between Photoshop and neural network-based deepfakes is equivalent to the difference between building a house with power tools and employing a utility robot to use those power tools to build the house for you.
Succinctly, media synthesis is the first tangible sign of automation that most people will experience.
Public awareness of synthetic media will steadily grow, and acceptance will likely sink to a nadir as more people become aware of the power of these artificial neural networks without being offered realistic debate or solutions as to how to deal with them. They've simply come too quickly for us to prepare for, hence the seemingly hasty reaction of certain groups like OpenAI with regard to releasing new AI models.
Already, we see frightened reactions to the likes of DeepNude, an app which was made solely to strip women in images down to their bare bodies without their consent. The potential for abuse (especially for pedophilic purposes) is self-evident. We are plunging headlong into a new era so quickly that we are unaware of just what we are getting ourselves into. But just what are we getting into?
Well, I have some thoughts.
I want to start with the field most people are at least somewhat aware of: deepfakes. We all have an idea of what deepfakes can do: the "purest" definition is taking one's face and replacing it with another, presumably in a video. The less exact definition is to take some aspect of a person in a video and edit it to be different. There are even deepfakes for audio, such as changing one's voice or putting words in their mouth. Most famously, this was done to Joe Rogan.
I, like most others, first discovered deepfakes in late 2017 around the time I had an "epiphany" on media synthesis as a whole. Just in those two years, the entire field has seen extraordinary progress. I realized then that we were on the cusp of an extreme flourishing of art, except that art would be largely-to-almost entirely machine generated. But along with it would come a flourishing of distrust, fake news, fake reality bubbles, and "ultracultural memes". Ever since, I've felt the need to evangelize media synthesis, whether to tell others of a coming renaissance or to warn them to be wary of what they see.
This is because, over the past two years, I realized that many people's idea of what media synthesis is really stops at deepfakes, or they only view new development through the lens of deepfakes. The reason why I came up with "media" synthesis is because I genuinely couldn't pin down any one creative/data-based field AI wasn't going to affect. It wasn't just faces. It wasn't just bodies. It wasn't just voice. It wasn't just pictures of ethereal swirling dogs. It wasn't just transferring day to night. It wasn't just turning a piano into a harpsichord. It wasn't just generating short stories and fake news. It wasn't just procedurally generated gameplay. It was all of the above and much more. And it's coming so fast that I fear we aren't prepared, both for the tech and the consequences.
Indeed, in many discussions I've seen (and engaged in) since then, there are always several people who have a virulent reaction against the prospect that neural networks can do any of this at all, or at least that it will improve to the point that it affects artists, creators, and laborers, even though we're already seeing the effects in the modeling industry alone.
Look at this gif. Looks like a bunch of models bleeding into and out of each other, right? Actually, no one here is real. They're all neural network-generated people.
Neural networks can generate full human figures, and altering their appearance and clothing is a matter of changing a few parameters or feeding an image into the data set. Changing the clothes of someone in a picture is as easy as clicking on the piece you wish to change and swapping it with any of your choice (or leaving the person wearing no clothes at all). A similar scenario applies for make-up. This is not like an old online dress-up flash game where the models must be meticulously crafted by an art designer or programmer; simply give the ANN something to work with, and it will figure out all the rest. You needn't even show it every angle or every lighting condition, for it will use common sense to figure these out as well. Such has been possible since at least 2017, though only with recent GPU advancements has it become possible for someone to run such programs in real time.
The unfortunate side effect is that the amateur modeling industry will be vaporized. Extremely little will be left, and the few who do remain will be promoted entirely because they are fleshy & real human beings. Professional models will survive for longer, but there will be little new blood joining their ranks. As such, it remains to be seen whether news and blogs speak loudly of the sudden, unexpected automation of what was once seen as a safe and human-centric industry or if this goes ignored and under-reported. After all, the news used to speak of automation in terms of physical, humanoid robots taking the jobs of factory workers, fast-food burger flippers, and truck drivers, occupations that are still in existence en masse due to slower-than-expected rollouts of robotics and a continued lack of general AI.
We needn't have general AI to replace those jobs that can be replicated by disembodied digital agents. And the sudden decline & disappearance of models will be the first widespread sign of this.
Actually, I have a hypothesis for this: media synthesis is one of the first signs that we're making progress towards artificial general intelligence.
Now don't misunderstand me. No neural network that can generate media is AGI or anything close. That's not what I'm saying. I'm saying that what we can see as being media synthesis is evidence that we've put ourselves on the right track. We never should've thought that we could get to AGI without also developing synthetic media technology.
What do you know about imagination?
As recently as five years ago, the concept of "creative machines" was cast off as impossible— or at the very least, improbable for decades. Indeed, the phrase remains an oxymoron in the minds of most. Perhaps they are right. Creativity implies agency and desire to create. All machines today lack their own agency. Yet we bear witness to the rise of computer programs that imagine and "dream" in ways not dissimilar to humankind.
Though lacking agency, this still meets the definition of imagination.
To reduce it to its most fundamental ingredients: Imagination = experience + abstraction + prediction. To get creativity, you need only add "drive". Presuming that we fail to create artificial general intelligence in the next ten years (an easy thing to assume because it's unlikely we will achieve fully generalized AI even in the next thirty), we still possess computers capable of the first three ingredients.
Someone who lives on a flat island and who has never seen a mountain before can learn to picture what one might be by using what they know of rocks and cumulonimbus clouds, making an abstract guess to cross the two, and then predicting what such a "rock cloud" might look like. This is the root of imagination.
As Descartes noted, even the strongest of imagined sensations is duller than the dullest physical one, so this image in the person's head is only clear to them in a fleeting way. Nevertheless, it's still there. Through great artistic skills, the person can learn to express this mental image through artistic means. In all but the most skilled, it will not be a pure 1-to-1 realization due to the fuzziness of our minds, but in the case of expressive art, it doesn't need to be.
Computers lack this fleeting ethereality of imagination completely. Once one creates something, it can give you the uncorrupted output.
Right now, this makes for wonderful tools and apps that many play around with online and on our phones.
But extrapolating this to the near future brings us face to face with many heavy questions, and not just of the "can't trust what you see" variety.
Because think about it.
If I'm a musical artist and I release an album, what if I accidentally recorded a song that's too close to an AI-generated track (all because AI generated literally every combination of notes)? Or, conversely, what if I have to watch as people take my music and alter it? I may feel strongly about it, yet the music has its notes changed, its lyrics changed, my own voice changed, until it might as well be an entirely different artist making that music. Many won't mind, but many will.
I trust my mother's voice, as many do. So imagine a phisher managing to steal her voice, running it through a speech synthesis network, and then calling me asking me for my social security number. Or maybe I work at a big corporation, and while we're secure, we still recognize each other's voices, only to learn that someone stole millions of dollars from us because they stole the CEO's voice and used it to wire cash to a pirate's account.
Imagine going online and at least 70% of the "people" you encounter are bots. They're extremely coherent, and they have profile images of what look to be real people. And who knows, you may even forge an e-friendship with some of them because they seem to share your interests. Then it turns out they're just bundles of code.
Oh, and those bot-people are also infesting social media and forums in the millions, creating and destroying trends and memes without much human input. Even if the mainstream news sites don't latch on at first, bot-created and bot-run news sites will happily kick it off for them. The news is supposed to report on major events, global and local. Even if the news is honest and telling the truth, how can they truly verify something like this, especially when it seems to be gaining so much traction and humans inevitably do get involved? Remember "Bowsette" from last year? Imagine if that was actually pushed entirely by bots until humans saw what looked like a happenin' kind of meme and joined in? That could be every year or perhaps even every month in the 2020s onwards.
Likewise, imagine you're listening to a pop song in one country, but then you go to another country and it's the exact same song but most of the lyrics have changed to be more suitable for their culture. That sort of cultural spread could stop... or it could be supercharged if audiences don't take to it and pirate songs/change them and share them at their own leisure.
Or maybe it's a good time to mention how commissioned artists are screwed? Commission work boards are already a race to the bottom: if a job says it pays three cents per word to write an article, you'd better list your going rate as two cents per word, and then inevitably the asking rate in general becomes two cents per word, and so on and so forth. That whole business might be over within five to ten years if you aren't already extremely established. Because if machines can mimic any art style or writing style (and then exaggerate & alter it to find some better version people like more), you'd have to really be tech-illiterate or very pro-human to want non-machine commissions.
And to go back to deepfakes and deep nudes, imagine the paratypical creep who takes children and puts them into sexual situations, any sexual situation they desire thanks to AI-generated images and video. It doesn't matter who, and it doesn't have to be real children either. It could even be themselves as a child if they still have the reference or use a de-aging algorithm on their face. It's squicky and disgusting to think about, but it's also inevitable and probably has already happened.
And my god, it just keeps going on and on. I can't do this justice, even with 40,000 characters to work with. The future we're about to enter is so wild, so extreme that I almost feel scared for humanity. It's not some far off date in the 22nd century. It's literally going to start happening within the next five years. We're going to see it emerge before our very eyes on this and other subreddits.
I'll end this post with some more examples.
Nvidia's new AI can turn any primitive sketch into a photorealistic masterpiece. You can even play with this yourself here.
Waifu Synthesis- real time generative anime, because obviously.
Few-Shot Adversarial Learning of Realistic Neural Talking Head Models | This GAN can animate any face GIF, supercharging deepfakes & media synthesis
Talk to Transformer | Feed a prompt into GPT-2 and receive some text. As of 9/29/2019, this uses the 774M parameter version of GPT-2, which is still weaker than the 1.5B parameter "full" version.
Text samples generated by Nvidia's Megatron-LM (GPT-2-8.3b). Vastly superior to what you see in Talk to Transformer, even if it had the "full" model.
Facebook's AI can convert one singer's voice into another | The team claims that their model was able to learn to convert between singers from just 5-30 minutes of their singing voices, thanks in part to an innovative training scheme and data augmentation technique. Think of it as a prototype for shifting vocalists or vocalist genders or anything of that sort.
TimbreTron for changing instrumentation in music. Here, you can see a neural network shift entire instruments and pitches of those new instruments. It might only be a couple more years until you could run The Beatles' "Here Comes The Sun" through, say, Slayer and get an actual song out of it.
AI generated album covers for when you want to give the result of that change its own album.
Neural Color Transfer Between Images [From 2017], showing how we might alter photographs to create entirely different moods and textures.
Scammer Successfully Deepfaked CEO's Voice To Fool Underling Into Transferring $243,000
"Experts: Spy used AI-generated face to connect with targets" [GAN faces for fake LinkedIn profiles]
This Marketing Blog Does Not Exist | This blog written entirely by AI is fully in the uncanny valley.
Chinese Gaming Giant NetEase Leverages AI to Create 3D Game Characters from Selfies | This method has already been used over one million times by Chinese gamers.
"Deep learning based super resolution, without using a GAN" [perceptual loss-based upscaling with transfer learning & progressive scaling], or in other words, "ENHANCE!"
Expert: AI-generated music is a "total legal clusterf*ck" | I've thought about this. Future music generation means that all IPs are open, any new music can be created from any old band no matter what those estates may want, and AI-generated music exists in a legal tesseract of answerless questions
And there's just a ridiculous amount more.
My subreddit, /r/MediaSynthesis, is filled with these sorts of stories going back to January of 2018. I've definitely heard of people coming away in shock, dazed and confused, after reading through it. And no wonder.