Episode 3 of the Baillie Gifford Prize’s ReadSmart Podcast, out Friday 29 May, has me discussing trends in artificial intelligence with the mathematician Hannah Fry. Razia Iqbal is in the chair. Available from all high-class streaming emporia; this is the Spotify one spoti.fi/3bJa5v7
A COUPLE of days before the opening of Trevor Paglen’s latest photographic installation, From “Apple” to “Anomaly”, a related project by the artist found itself splashed all over the papers.
ImageNet Roulette is an online collaboration with artificial intelligence researcher Kate Crawford at New York University. The website invites you to provide an image of your face. An algorithm will then compare your face against a database called ImageNet and assign you to one or two of its 21,000 categories.
ImageNet has become one of the most influential visual data sets in the fields of deep learning and AI. Its creators at Stanford, Princeton and other US universities harvested more than 14 million photographs from photo upload sites and other internet sources, then had them manually categorised by some 25,000 workers on Amazon’s crowdsourcing labour site Mechanical Turk. ImageNet is widely used as a training data set for image-based AI systems and is the secret sauce within many key applications, from phone filters to medical imaging, biometrics and autonomous cars.
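(For the technically curious, the phrase "training data set" cashes out in surprisingly few lines of code. The sketch below is purely illustrative — in the loose sense that it is my own, not anything Paglen or Crawford ran — and uses the off-the-shelf torchvision library to classify a photograph against the 1,000-category ILSVRC subset of ImageNet; the file name and the choice of ResNet-50 are arbitrary assumptions.)

```python
import torch
from PIL import Image
from torchvision import models, transforms

# Load a network whose weights were learned from ImageNet photographs.
model = models.resnet50(pretrained=True)
model.eval()

# The standard ImageNet preprocessing: resize, crop, normalise.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("face.jpg")            # hypothetical input photograph
batch = preprocess(img).unsqueeze(0)    # shape: (1, 3, 224, 224)

with torch.no_grad():
    probs = model(batch).softmax(dim=1)[0]

# The five ImageNet categories the network thinks most likely.
top5 = torch.topk(probs, 5)
print(top5.indices.tolist(), top5.values.tolist())
```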
According to ImageNet Roulette, I look like a “political scientist” and a “historian”. Both descriptions are sort-of-accurate and highly flattering. I was impressed. Mind you, I’m a white man. We are all over the internet, and the neural net had plenty of “my sort” to go on.
Spare a thought for Guardian journalist Julia Carrie Wong, however. According to ImageNet Roulette she was a “gook” and a “slant-eye”. In its attempt to identify Wong’s “sort”, ImageNet Roulette had innocently turned up some racist labels.
From “Apple” to “Anomaly” also takes ImageNet to task. Paglen took a selection of 35,000 photos from ImageNet’s archive, printed them out and stuck them to the wall of the Curve gallery at the Barbican in London in a 50-metre-long collage.
The entry point is images labelled “apple” – a category that, unsurprisingly, yields mostly pictures of apples – but the piece then works through increasingly abstract and controversial categories such as “sister” and “racist”. (Among the “racists” are Roger Moore and Barack Obama; my guess is that being over-represented in a data set carries its own set of risks.) Paglen explains: “We can all look at an apple and call it by its name. An apple is an apple. But what about a noun like ‘sister’, which is a relational concept? What might seem like a simple idea – categorising objects or naming pictures – quickly becomes a process of judgement.”
The final category in the show is “anomaly”. There is, of course, no such thing as an anomaly in nature. Anomalies are simply things that don’t conform to the classification systems we set up.
Halfway along the vast, gallery-spanning collage of photographs, the slew of predominantly natural and environmental images peters out, replaced by human faces. Discreet labels here and there indicate which of ImageNet’s categories are being illustrated. At one point of transition, the group labelled “bottom feeder” consists entirely of headshots of media figures – there isn’t one aquatic creature in evidence.
Scanning From “Apple” to “Anomaly” gives gallery-goers many such unexpected, disconcerting insights into the way language parcels up the world. Sometimes, these threaten to undermine the piece itself. Passing seamlessly from “android” to “minibar”, one might suppose that we are passing from category to category according to the logic of a visual algorithm. After all, a metal man and a minibar are not so dissimilar. At other times – crossing from “coffee” to “poultry”, for example – the division between categories is sharp, leaving me unsure how we moved from one to another, and whose decision it was. Was some algorithm making an obscure connection between hens and beans?
Well, no: the categories were chosen and arranged by Paglen. Only the choice of images within each category was made by a trained neural network.
This set me wondering whether the ImageNet data set wasn’t simply being used as a foil for Paglen’s sense of mischief. Why else would a cheerleader dominate the “saboteur” category? And do all “divorce lawyers” really wear red ties?
This is a problem for art built around artificial intelligence: it can be hard to tell where the algorithm ends and the artist begins. Mind you, you could say the same about the entire AI field. “A lot of the ideology around AI, and what people imagine it can do, has to do with that simple word ‘intelligence’,” says Paglen, a US artist now based in Berlin, whose interest in computer vision and surveillance culture sprang from his academic career as a geographer. “Intelligence is the wrong metaphor for what we’ve built, but it’s one we’ve inherited from the 1960s.”
Paglen fears the way the word intelligence implies some kind of superhuman agency and infallibility to what are in essence giant statistical engines. “This is terribly dangerous,” he says, “and also very convenient for people trying to raise money to build all sorts of shoddy, ill-advised applications with it.”
Asked what concerns him more, intelligent machines or the people who use them, Paglen answers: “I worry about the people who make money from them. Artificial intelligence is not about making computers smart. It’s about extracting value from data, from images, from patterns of life. The point is not seeing. The point is to make money or to amplify power.”
It is a point by no means lost on a creator of ImageNet itself, Fei-Fei Li at Stanford University in California, who, when I spoke to Paglen, was in London to celebrate ImageNet’s 10th birthday at the Photographers’ Gallery. Far from being the face of predatory surveillance capitalism, Li leads efforts to correct the malevolent biases lurking in her creation. Wong, incidentally, won’t get that racist slur again, following ImageNet’s announcement that it was removing more than half of the 1.2 million pictures of people in its collection.
Paglen is sympathetic to the challenge Li faces. “We’re not normally aware of the very narrow parameters that are built into computer vision and artificial intelligence systems,” he says. His job as artist-cum-investigative reporter is, he says, to help reveal the failures and biases and forms of politics built into such systems.
Some might feel that such work feeds an easy and unexamined public paranoia. Peter Skomoroch, former principal data scientist at LinkedIn, thinks so. He calls ImageNet Roulette junk science, and wrote on Twitter: “Intentionally building a broken demo that gives bad results for shock value reminds me of Edison’s war of the currents.”
Paglen believes, on the contrary, that we have a long way to go before we are paranoid enough about the world we are creating.
Fifty years ago it was very difficult for marketing companies to get information about what kind of television shows you watched, what kinds of drinking habits you might have or how you drove your car. Now giant companies are trying to extract value from that information. “I think,” says Paglen, “that we’re going through something akin to England and Wales’s Inclosure Acts, when what had been de facto public spaces were fenced off by the state and by capital.”
Thanks (I assume) to those indefatigable Head of Zeus people, who are even now getting my anthology We Robots ready for publication, I’m invited to this year’s Berlin International Literature Festival, to take part in Automatic Writing 2.0, a special programme devoted to the literary impact of artificial intelligence.
Amidst other mischief, on Sunday 15 September at 12:30pm I’ll be reading from a new story, The Overcast.
In 1871, the polymath and computer pioneer Charles Babbage died at his home in Marylebone. The encyclopaedias have it that a urinary tract infection got him. In truth, his final hours were spent in an agony brought on by the performances of itinerant hurdy-gurdy players parked underneath his window.
I know how he felt. My flat, too, is drowning in something not quite like music. While my teenage daughter mixes beats using programs like GarageBand and Logic Pro, her younger brother is bopping through Helix Crush and My Singing Monsters — apps that treat composition itself as a kind of e-sport.
It was ever thus, or has been ever since 18th-century Swiss watchmakers twigged that musical snuff-boxes might make them a few bob. And as each new mechanical innovation has emerged to ‘transform’ popular music, so the proponents of earlier technology have gnashed their teeth. This affords the rest of us a frisson of Schadenfreude.
‘We were musicians using computers,’ complained Pete Waterman, of the synthpop hit factory Stock Aitken Waterman in 2008, 20 years past his heyday. ‘Now it’s the whole story. It’s made people lazy. Technology has killed our industry.’ He was wrong, of course. Music and mechanics go together like beans on toast, the consequence of a closer-than-comfortable relation between music and mathematics. Today, a new, much more interesting kind of machine music is emerging to shape my children’s musical world, driven by linear algebra, statistics and generative adversarial networks — that slew of complex and specific mathematical tools we lump together under the modish (and inaccurate) label ‘artificial intelligence’.
Some now worry that artificially intelligent music-makers will take even more agency away from human players and listeners. I reckon they won’t, but I realise the burden of proof lies with me. Computers can already come up with pretty convincing melodies. Soon, argues venture capitalist Vinod Khosla, they will be analysing your brain, figuring out your harmonic likes and rhythmic dislikes, and composing songs made-to-measure. There are enough companies attempting to crack it: Popgun, Amper Music, Aiva, WaveAI, Amadeus Code, Humtap, HumOn and AI Music are all closing in on composer-less composition.
The fear of tech taking over isn’t new. The Musicians’ Union tried to ban synths in the 1980s, anxious that string players would be put out of work. The big disruption came with the arrival of Kyoko Date. Released in 1996, she was the first seriously publicised attempt at a virtual pop idol. Humans still had to provide Date with her singing and speaking voice. But by 2004 Vocaloid software — developed by Kenmochi Hideki at the Pompeu Fabra University in Barcelona — enabled users to synthesise ‘singing’ by typing in lyrics and a melody. In 2016 Hatsune Miku, a Vocaloid-powered 16-year-old artificial girl with long, turquoise twintails, went, via hologram, on her first North American tour. It was a sell-out. Returning to her native Japan, she modelled Givenchy dresses for Vogue.
What kind of music were these idoru performing? Nothing good. While every other component of the music industry was galloping ahead into a brave new virtualised future — and into the arms of games-industry tech — the music itself seemed stuck in the early 1980s, which, significantly, was when music synthesizer builder Dave Smith had first come up with MIDI.
MIDI is a way to represent musical notes in a form a computer can understand. MIDI is the reason discrete notes that fit in a grid dominate our contemporary musical experience. That maddening clockwork-regular beat that all new music obeys is a MIDI artefact: the software becomes unwieldy and glitch-prone if you dare vary the tempo of your project. MIDI is a prime example (and, for that reason, made much of by internet pioneer-turned-apostate Jaron Lanier) of how a computer can take a good idea and throw it back at you as a set of unbreakable commandments.
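To see how little room that representation leaves for nuance, here is a minimal sketch in plain Python — my own invented structure, not the actual MIDI file format — of what a MIDI-style sequencer stores: integer pitches, one project-wide tempo, and event times snapped to a tick grid.

```python
TICKS_PER_BEAT = 480      # a common sequencer resolution
TEMPO_BPM = 120           # a single tempo rules the whole project

def quantise(time_in_beats, grid=0.25):
    """Snap an event to the nearest sixteenth-note grid line."""
    return round(time_in_beats / grid) * grid

def to_ticks(beats):
    return int(beats * TICKS_PER_BEAT)

# Each note is reduced to (pitch, start, duration, velocity):
# pitch 60 is middle C, velocity runs from 0 to 127.
phrase = [
    (60, quantise(0.03), 0.5, 96),   # a slightly "early" note gets snapped to the grid
    (64, quantise(0.51), 0.5, 90),
    (67, quantise(1.00), 1.0, 100),
]

events = [(pitch, to_ticks(start), to_ticks(dur), vel)
          for pitch, start, dur, vel in phrase]
print(events)   # every human irregularity of timing has been rounded away
```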
For all their advances, the powerful software engines wielded by the entertainment industry were, as recently as 2016, hardly more than mechanical players of musical dice games of the sort popular throughout western Europe in the 18th century.
The original games used dice randomly to generate music from precomposed elements. They came with wonderful titles, too — witness C.P.E. Bach’s A method for making six bars of double counterpoint at the octave without knowing the rules (1758). One 1792 game produced by Mozart’s publisher Nikolaus Simrock in Berlin (it may have been Mozart’s work, but we’re not sure) used dice rolls randomly to select bars, producing a potential 46 quadrillion waltzes.
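The mechanics fit comfortably in a dozen lines. This toy version — my own paraphrase, not a reconstruction of the 1792 publication — rolls two dice for each of a waltz’s sixteen bars and uses the total to pick one of eleven precomposed bars from a lookup table:

```python
import random

BARS = 16              # a sixteen-bar waltz
OPTIONS_PER_BAR = 11   # two dice give totals from 2 to 12

# A hypothetical table of precomposed bars: bar_table[bar][roll - 2]
bar_table = [[f"measure_{bar + 1}_{option + 1}" for option in range(OPTIONS_PER_BAR)]
             for bar in range(BARS)]

def roll_two_dice():
    return random.randint(1, 6) + random.randint(1, 6)

waltz = [bar_table[bar][roll_two_dice() - 2] for bar in range(BARS)]
print(waltz)

# Eleven choices for each of sixteen bars gives 11 ** 16 possible waltzes,
# roughly 4.6e16: the "46 quadrillion" mentioned above.
```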
All these games relied on that unassailable, but frequently disregarded truth, that all music is algorithmic. If music is recognisable as music, then it exhibits a small number of formal structures and aspects that appear in every culture — repetition, expansion, hierarchical nesting, the production of self-similar relations. It’s as Igor Stravinsky said: ‘Musical form is close to mathematics — not perhaps to mathematics itself, but certainly to something like mathematical thinking and relationship.’
As both a musician and a mathematician, Marcus du Sautoy, whose book The Creativity Code was published this year, stands to lose a lot if a new breed of ‘artificially intelligent’ machines lives up to its name and starts doing his mathematical and musical thinking for him. But the reality of artificial creativity, he has found, is rather more nuanced.
One project that especially engages du Sautoy’s interest is Continuator by François Pachet, a composer, computer scientist and, as of 2017, director of the Spotify Creator Technology Research Lab. Continuator is a musical instrument that learns and interactively plays with musicians in real time. Du Sautoy has seen the system in action: ‘One musician said, I recognise that world, that is my world, but the machine’s doing things that I’ve never done before and I never realised were part of my sound world until now.’
The ability of machine intelligences to reveal what we didn’t know we knew is one of the strangest and most exciting developments du Sautoy detects in AI. ‘I compare it to crouching in the corner of a room because that’s where the light is,’ he explains. ‘That’s where we are on our own. But the room we inhabit is huge, and AI might actually help to illuminate parts of it that haven’t been explored before.’
Du Sautoy dismisses the idea that this new kind of collaborative music will be ‘mechanical’. Behaving mechanically, he points out, isn’t the exclusive preserve of machines. ‘People start behaving like machines when they get stuck in particular ways of doing things. My hope is that the AI might actually stop us behaving like machines, by showing us new areas to explore.’
Du Sautoy is further encouraged by how those much-hyped ‘AIs’ actually work. And let’s be clear: they do not expand our horizons by thinking better than we do. Nor, in fact, do they think at all. They churn.
‘One of the troubles with machine-learning is that you need huge swaths of data,’ he explains. ‘Machine image recognition is hugely impressive, because there are a lot of images on the internet to learn from. The digital environment is full of cats; consequently, machines have got really good at spotting cats. So one thing which might protect great art is the paucity of data. Thanks to his interminable chorales, Bach provides a toe-hold for machine imitators. But there may simply not be enough Bartok or Brahms or Beethoven for them to learn on.’
There is, of course, the possibility that one day the machines will start learning from each other. Channelling Marshall McLuhan, the curator Hans Ulrich Obrist has argued that art is an early-warning system for the moment true machine consciousness arises (if it ever does arise).
Du Sautoy agrees. ‘I think it will be in the world of art, rather than in the world of technology, that we’ll see machines first express themselves in a way that is original and interesting,’ he says. ‘When a machine acquires an internal world, it’ll have something to say for itself. Then music is going to be a very important way for us to understand what’s going on in there.’
By the end of the show, I was left less impressed by artificial intelligence and more depressed that it had reduced my human worth to base matter. Had it, though? Or had it simply made me aware of how much I wanted to be base matter, shaped into being by something greater than myself? I was reminded of something that Benjamin Bratton, author of the cyber-bible The Stack, said in a recent lecture: “We seem only to be able to approach AI theologically.”
On All Hallows’ Eve this year, at London’s Barbican Hall, the London Contemporary Orchestra, under the baton of their co-artistic director Robert Ames, managed with two symphonic pieces to drown the world and set it ablaze in the space of a single evening.
Giacinto Scelsi’s portentously titled Uaxuctum: The legend of the Maya City, destroyed by the Maya people themselves for religious reasons, evoked the mysterious and violent collapse of that once thriving civilisation; the second piece of the evening, composer and climate activist John Luther Adams’s Become Ocean, looked to the future, the rise of the world’s oceans, and good riddance to the lot of us.
Lost Worlds was a typical piece of LCO programming: not content with presenting two very beautiful but undeniably challenging long-ish works, the orchestra had elected to play behind a translucent screen onto which were projected the digital meanderings of an artistically trained neural net. Twists of entoptic colour cavorted around the half-seen musicians while a well-placed spotlight, directly over Ames’s head, sent the conductor’s gestures sprawling across the screen, as though ink were being dashed over all those pretty digitally generated splotches of colour.
Everything, on paper, pointed to an evening that was trying far too hard to be avant garde. In the execution, however, the occasion was a triumph.
The idea of matching colours to sounds is not new. The painter Wassily Kandinsky struggled for years to fuse sound and image and ended up inventing abstract painting, more or less as a by-product. The composer Alexander Scriabin was so desperate to establish his reputation as the founder of a new art of colour-music that he plagiarised other people’s synaesthetic experiences in his writings and invented a clavier à lumières (“keyboard with lights”) for use in his work Prometheus: Poem of Fire. “It is not likely that Scriabin’s experiment will be repeated by other composers,” wrote a reviewer for The Nation after its premiere in New York in 1915: “moving-picture shows offer much better opportunities.” (Walt Disney proved The Nation right: Fantasia was released in 1940.)
Now, as 2018 draws to a close, artificial intelligence is being hurled at the problem. For this occasion the London-based theatrical production company Universal Assembly Unit had got hold of a recursive neural net engineered by Artrendex, a company that uses artificial intelligence to research and predict the art market. According to the concert’s programme note, it took several months to train Artrendex’s algorithm on videos of floods and fires, teaching it the aesthetics of these phenomena so that, come the evening of the performance, it would construct organic imagery in response to the music.
While never obscuring the orchestra, the light show was dramatic and powerful, sometimes evoking (for those who enjoy their Andrei Tarkovsky) the blurriness of the clouds swamping the ocean planet Solaris in the movie of that name; then at other moments weaving and flickering, not so much like flames, but more like the speeded-up footage from some microbial experiment. Maybe I’ve worked at New Scientist too long, but I got the distinct and discomforting impression that I was looking, not at some dreamy visual evocation of a musical mood, but at the responses of single-celled life to desperate changes in their tiny environment.
As for the music – which was, after all, the main draw for this evening – it is fair to say that Scelsi’s Uaxuctum would not be everyone’s cup of tea. For a quick steer, recall the waily bits from 2001: A Space Odyssey. That music was by the Hungarian composer György Ligeti, who was born about two decades after Scelsi, and was — both musically and personally — a lot less weird. Scelsi was a Parisian dandy who spent years in a mental institution playing one piano note again and again, and Uaxuctum, composed in 1966, was such an incomprehensibly weird and difficult proposition that it went unperformed for 21 years and had never received a UK performance before this one.
John Luther Adams’s Become Ocean (2013) is an easier (and more often performed) composition – The New Yorker music critic Alex Ross called it “the loveliest apocalypse in musical history”. This evening its welling sonorities brought hearts into mouths: rarely has mounting anxiety come wrapped in so beautiful a package.
So I hope it takes nothing away from the LCO’s brave and accomplished playing to say that the visual component was the evening’s greatest triumph. The dream of “colour music” has ended in bathos and silliness for so many brilliant and ambitious musicians. Now, with the judicious application of some basic neural networking, we may at last be on the brink of fusing tone and colour into an art that’s genuinely new, and undeniably beautiful.
On paper, Pierre Huyghe’s new exhibition at the Serpentine Gallery in London is a rather Spartan effort. Gone are the fictional characters, the films, the drawings; the collaborative manga flim-flam of No Ghost Just a Shell; the nested, we’re not-in-Kansas-any-more fictions, meta-fictions and crypto-documentaries of Streamside Day Follies. In place of Huyghe’s usual stage blarney come five large LED screens. Each displays a picture that, as we watch, shivers through countless mutations, teetering between snapshot clarity and monumental abstraction. One display is meaty; another, vaguely nautical. A third occupies a discomforting interzone between elephant and milk bottle.
Huyghe has not abandoned all his old habits. There are smells (suggesting animal and machine worlds), sounds (derived from brain-scan data, but which sound oddly domestic: was that not a knife-drawer being tidied?) and a great many flies. Their random movements cause the five monumental screens to pause and stutter, and this is a canny move, because without that arbitrary grammar, Huyghe’s barrage of visual transformations would overwhelm us, rather than excite us. There is, in short, more going on here than meets the eye. But that, of course, is true of everywhere: the show’s title nods to the notion of “Umwelt” coined by the zoologist Jakob von Uexküll in 1909, when he proposed that the significant world of an animal was the sum of things to which it responds, the rest going by virtually unnoticed. Huyghe’s speculations about machine intelligence are bringing this story up to date.
That UUmwelt turns out to be a show of great beauty as well; that the gallery-goer emerges from this most abstruse of high-tech shows with a re-invigorated appetite for the arch-traditional business of putting paint on canvas: that the gallery-goer does all the work, yet leaves feeling exhilarated, not exploited — all this is going to require some explanation.
To begin at the beginning, then: Yukiyasu Kamitani, who works at Kyoto University in Japan, made headlines in 2012 when he fed the data from fMRI brain scans of sleeping subjects into neural networks. These computer systems eventually succeeded in capturing shadowy images of his volunteers’ dreams. Since then his lab has been teaching computers to see inside people’s heads. It’s not there yet, but there are interesting blossoms to be plucked along the way.
UUmwelt is one of these blossoms. A recursive neural net has been shown about a million pictures, alongside accompanying fMRI data gathered from a human observer. Next, the neural net has been handed some raw fMRI data, and told to recreate the picture the volunteer was looking at.
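For a sense of what that training might look like in code, here is a deliberately crude PyTorch sketch. The voxel count, embedding size, architecture and loss are all assumptions made for illustration; this is the general shape of the task, not the Kamitani Lab’s actual pipeline.

```python
import torch
import torch.nn as nn

FMRI_DIM = 4466        # hypothetical number of voxels per scan
IMAGE_FEAT_DIM = 512   # hypothetical size of an image-feature embedding

# Learn a mapping from brain activity to the features of the viewed picture.
decoder = nn.Sequential(
    nn.Linear(FMRI_DIM, 1024),
    nn.ReLU(),
    nn.Linear(1024, IMAGE_FEAT_DIM),
)

optimiser = torch.optim.Adam(decoder.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def train_step(fmri_batch, image_feature_batch):
    """One gradient step: predict image features from fMRI and measure the error."""
    prediction = decoder(fmri_batch)
    loss = loss_fn(prediction, image_feature_batch)
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()

# At exhibition time only the raw fMRI goes in; a separate generative model
# (not shown) then has to conjure a picture from the predicted features,
# which is why the images on Huyghe's screens shiver between guesses.
```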
Huyghe has turned the ensuing, abstruse struggles of the Kamitani Lab’s unthinking neural net into an exhibition quite as dramatic as anything he has ever made. Only, this time, the theatrics are taking place almost entirely in our own heads. What are we looking at here? A bottle. No, an elephant, no, a Francis Bacon screaming pig, goose, skyscraper, mixer tap, steam train mole dog bat’s wing…
The closer we look, the more engaged we become, the less we are able to describe what we are seeing. (This is literally true, in fact, since visual recognition works just that little bit faster than linguistic processing.) So, as we watch these digital canvases, we are drawn into dreamlike, timeless lucidity: a state of concentration without conscious effort that sports psychologists like to call “flow”. (How the Serpentine will ever clear the gallery at the end of the day I have no idea: I for one was transfixed.)
UUmwelt, far from being a show about how machines will make artists redundant, turns out to be a machine for teaching the rest of us how to read and truly appreciate the things artists make. It exercises and strengthens that bit of us that looks beyond the normative content of images and tries to make sense of them through the study of volume, colour, light, line, and texture. Students of Mondrian, Dufy and Bacon, in particular, will lap up this show.
Remember those science-fictional devices and medicines that provide hits of concentrated education? Quantum physics in one injection! Civics in a pill! I think Huyghe may have come closer than anyone to making this silly dream a solid and compelling reality. His machines are teaching us how to read pictures, and they’re doing a good job of it, too.
Can machines tell stories? I’ll be discussing AI’s movie-making ambitions with Dara O’Briain this Friday at 2.45pm. It’s a panel event, part of New Scientist Live, which runs from 22 to 25 September at ExCel.