With the rise of systems that can write, draw, and make music prompted only by a few words, creatives are increasingly worried about A.I.’s ability to devalue and even erase the need for their craft. For some, “artificial intelligence” and “A.I.” have become almost trigger words in the media.
When the BBC recently reported Paul McCartney’s announcement that “A.I. would be used” to create a “final” Beatles song with all four of the Beatles’ voices included, McCartney received so much backlash, media coverage, and concern over the process that he had to come out with a second statement on social media clarifying what he meant. No deepfake A.I. vocal of John Lennon was used, nor any sort of lyric-writing machine. Instead, the tech was able to go into an existing, though low-quality, recording of the song and pull out Lennon’s actual vocals, take out an electric buzz and other background noises, and make his vocals viable for use.
As that incident shows, A.I. use can be uncomfortable and frightening. The issue is complex: artists both embrace and resist the technology, and views range from seeing it as an exciting tool to enhance human creativity to fearing it as a technology that will eradicate the need for humans.
One of the first questions to arise is whether generative music, being trained on datasets of old songs, can be “good” or truly original. Singer-songwriter Nick Cave believes the answer is no. After a fan sent Cave a song written by ChatGPT “in the style of Nick Cave,” he responded in his weekly post to The Red Hand Files, calling it “a grotesque mockery of what it means to be human” and emphasizing the artistry, pain, and humanity that ‘true’ songwriting requires. “[ChatGPT] could perhaps in time create a song that is, on the surface, indistinguishable from an original, but it will always be a replication,” he wrote. “Songs arise out of suffering. Algorithms don’t feel, data doesn’t suffer. Writing a good song is not mimicry, or replication, or pastiche, it is the opposite. It is an act of self-murder that destroys all one has strived to produce in the past, A.I. can only mimic.”
To Nick Cave’s point, the things A.I. cannot do are what have made music so interesting and exciting in the past. Both the Rolling Stones and the Beach Boys were heavily influenced by Chuck Berry, but neither sounds anything like him or like the other. If you trained an A.I. system on all of Berry’s work, it would currently be implausible to expect it to come up with anything other than imitative Chuck Berry songs. Those who fear the future of A.I. should be comforted by the fact that true inspiration is different from data input. Evolution and boundary pushing are (currently) only possible with creative human minds at work, rethinking the ways of the past. A.I. is not able to imagine a world totally unlike any other; it can only create in reference to old ideas.
And as TikTok continues to meme-ify and commercialize music from all generations and artists are pressured by their labels to become TikTokers themselves and write for virality and sound-bites, some argue that the music industry is desperate for a breath of fresh air.
Ezra Sandzer-Bell is the creator of AudioCipher, a plugin that uses musical cryptography to turn words into melodies in a Digital Audio Workstation (DAW). While AudioCipher itself does not use A.I., it puts a spotlight on the sites that do. “Right now, there’s just so much commercial crap, and then there’s people who have no money or time so they’re just regurgitating styles that already existed and it’s a culture play,” says Sandzer-Bell. “They’re not creating a new game, they’re just playing an existing game. I’m looking for something that’s complex and rich and different and nuanced and innovative.” According to Sandzer-Bell, A.I. tools are going to revolutionize the game, giving artists the tools and freedom “to do something innovative that hasn’t been done before, and out of that we might start seeing new styles born. To me, that’s where the most novelty could exist.”
One site pushing innovation is WarpSound, an adaptive A.I. music system that was trained only on music from its own musicians (i.e., without copyright infringement). According to its founder and CEO, Chris McGarry, the system is able to compose and produce “original generative A.I. music in real time, on demand.” This “conversational creative flow,” as McGarry calls it, lets artists use A.I. as a real-time producing and writing partner, bouncing ideas off it and treating the system as a limitless source of new material that they get to shape, mold, and build on. “These machines are tools that unlock new ways of expressing and creating music, unlock new ways of interacting with it, playing with it,” says McGarry. “But no one can be a human but a human.”
In one presentation of WarpSound’s abilities, McGarry showed the site’s setup. A dial for BPM sits next to a big blue button marked “GENERATE.” Underneath are controls for lead, pad, bass, and percussion, each of which lets you adjust the volume, vibe, “wetness,” and filter and also “roll the dice” on what type of sound you’ll get. If you do so, WarpSound’s A.I. will compose and produce a bass or percussion part for you. WarpSound also allows you to change genres between dance, hip-hop, and lo-fi, mutate the sound to be more robot-like or even “slime-ified,” and add special sounds. When McGarry hit the “GENERATE” button without touching any of the dials, the system immediately started playing music. Not immediately satisfied, he went back in to mutate the sound and change the balance of the instruments.
“Conversational flow is this concept of real time dynamic generative music,” McGarry says. “What we’re seeing with ChatGPT is the power of this flow where you as a user have this idea, you’re looking for something, you text prompt ChatGPT, you get something out, and then you can refine that. So this is conceptually similar to that except with music. What’s the fastest time to creativity? It’s ‘I have an idea, I express it in language, I have the system interpret it and deliver music.’ We’re building towards a system where a consumer could iterate on that and refine it.”
Many A.I. leaders and supporters share McGarry’s vision: shift the emphasis of creativity from realizing an idea to simply having one. This could be life-changing for a creator who is disabled in some way, or who can’t afford their own equipment or music lessons. McGarry believes A.I.’s greatest benefit will be its ability to make music more accessible than ever. “I think music is our first language, even before we articulate words. I think it’s a universal, borderless, language and I think it’s our most powerful language. What we’re seeing with generative A.I. is really the ability to give everyone a way to be self-expressive with this language, and to be able to speak this language again.”
But musicians who have devoted their lives to mastering an instrument or musical skill are, understandably, concerned about the advancing tech and its potential to disrupt or even eradicate their profession. And as these generators get better and better at composing beats, jingles, or even film scores, jobs may become even scarcer for working musicians.
Though not discussing music production or creation specifically, Business Insider reported that many A.I. enthusiasts believe that if you can get ahead of the machine, there’s really no cause for concern. At the World Economic Forum’s 2023 Growth Summit, Richard Baldwin, an economist and professor at the Geneva Graduate Institute in Switzerland, said that “A.I. won’t take your job, it’s somebody using A.I. that will take your job.”
On the other side, however, people like Martin Clancy, a musician and the founding chair of the Institute of Electrical and Electronics Engineers’ (IEEE) Global A.I. Ethics Arts Committee, warn of potential cultural losses that may be overlooked. “What’s at stake,” he told The New York Times, “are things we take for granted listening to music made by humans, people doing that as a livelihood and it being recognized as a special skill.”
Nearly everyone agrees, however, that, good or bad, A.I. is going to have a huge impact on the world. Chris McGarry believes that “adaptive music” is going to play a major role in the future of A.I. across industries. A big market is game studios and Twitch streamers that want music that responds to player behavior and actions.
So instead of having the same track on a loop or hard cuts between tracks while a player works their way through the game, the player’s behavior and actions would be mapped to a system like WarpSound which would change the music, adding in kettle drums and increasing the intensity of the music, for example, as the player reaches the boss. It’s pretty remarkable to see the true “adaptiveness” of this technology, its ability to seamlessly move between ideas. Imagine one song smoothly morphing into another right when you ask it to. Switching between percussion rhythms or moving from dance mode to lo-fi, the system composes a transition, in real time, into the new sound.
Holly Herndon is an American artist and composer who completed her Ph.D. at Stanford University’s Center for Computer Research in Music and Acoustics. She worked to get ahead of the A.I. curve, recently developing what she calls her digital twin, Holly+. The voice instrument and website is described as an “experiment in communal voice ownership.” The A.I. enables anyone to upload audio and have it sung back in Herndon’s voice. Her website stresses the importance of artists, not corporations, being the ones to push new technology forward, and expresses the hope that this experiment (Holly+) will allow “artists to take control of their digital selves without obstructing experimentation with punitive copyright lawsuits.”
As both a musician and a Ph.D.-trained researcher in computer music, Herndon offers a unique perspective. She cannot imagine vocal deepfakes disappearing and even argues that “the voice is inherently communal, learned through mimesis and language, and interpreted through individuals.” Instead of being disempowered by the advancement in technology, she says that a “balance needs to be found between protecting artists, and encouraging people to experiment with a new and exciting technology. In stepping in front of a complicated issue, we think we have found a way to allow people to perform through my voice, reduce confusion by establishing official approval, and invite everyone to benefit from the proceeds generated from its use.”
Holly+ is also an economic experiment, working to understand licensing and ownership of art in the age of A.I. Anyone is able to use Holly+ free of charge for unofficial use, but “the vocal model IP for Holly+ will be owned by a DAO coop which can vote and approve official usage, and funds generated from the usage and licensing of the tools will be shared with the co-op to fund new tool development.” The ability to collaborate with your favorite artist’s voice could transform how fans and other creators interact and are inspired.
One concern with A.I. deepfakes, however, is that people will use them to say hateful things, or to endorse ideas and products that the owner of the voice may not agree with. Herndon works around this with her ability to vote to approve or disapprove of “official” usage, but it won’t always be possible to stop every infringement or misuse. Additionally, Sandzer-Bell believes that policing every use of platforms like Holly+ can be a slippery slope in terms of free speech and creative expression, and he fears a much more despicable use of the technology.
“Speech is speech. It’s up to listeners to decide [what] they want to support and if they want to listen to [hateful messages] or not and it’s always going to have to be a collective effort. The thing that worries me a lot more than saying hateful things with someone’s voice, is impersonating someone’s voice to rob their family members or something like that. I’m much more worried about that. Now people are going to have safe words and that’s just the way it’s going to be and hopefully no one will learn the safe words.”
While there is currently no tangible solution to this problem, many hope that the same technology used to create vocal deepfakes will be able to detect them in the future. And as artist-developed experiments like Holly+ run into these issues, the hope is that they can be solved in a way that fosters a safe and respectful space for both artists and continued innovation.
Experts in the field agree that generative music is headed towards more platforms like Holly+, where artists train their own A.I. in their unique style and sell access. But they are also interested in seeing how it transforms even the way we define what a “song” is. “On Spotify and Apple Music, songs are forced into boxes we don’t think about,” says Sandzer-Bell. “They can only be so long, they need a title, they fit into EPs and Albums. Artists are constrained to things we take for granted because we just think ‘that’s how songs work.’ But no, there’s other types of music. I think what could happen is music is going to introduce and bring in new genres of music. So if you can think of it, it’s going to be able to do it.”
One site that is pushing the way we think about songs and music is Dadabots, a platform that makes “raw audio neural networks that can imitate bands.” They train each neural network to generate sequences of things like raw acoustic waveforms of metal albums. On their site they explain that as their A.I. listens, it “tries to guess the next fraction of a millisecond. It plays this game millions of times over a few days. After training, we ask it to come up with its own music, similar to how a weather forecast machine can be asked to invent centuries of seemingly plausible weather patterns.” Then they take what they like from what it creates and arrange it into an album. While they didn’t ask permission to use the songs they trained on, they also were not selling any of the generated music, and they contacted the band(s) afterwards.
Additionally, they have 24/7 streams of A.I.-generated “lofi classic metal” and what they call “Relentless Doppelganger Neural Technical Death Metal.” Similarly, WarpSound offers a 24/7 streaming service, but adds the ability for users to vote on how the stream should change: whether the music should be robot-ified or crystallized, include more cowbell, or add a chainsaw. “Are they putting soul and funk musicians out of a job with this?” Sandzer-Bell asks when talking about the streams. “No, absolutely not. What they’re doing is rendering infinite music out of a cloud and streaming it to YouTube. It’s an exciting approach to thinking about what A.I. can do outside of the box.”
There remain many questions and concerns about how our lives will be impacted by the introduction and expansion of generative A.I. Part of what makes the topic so unsettling, however, is that we’re watching it unfold in real time, often playing catch-up and struggling to get ahead of the curve. “It’s something we’re all sort of tackling at the moment and trying to deal with,” Paul McCartney told the BBC. “It’s the future. We’ll just have to see where that leads.”