Gen AI

See a song

March 23rd, 2024

I recently piped the Billboard 100 API to the Genius lyrics API to GPT-4 to an Instagram account to see what computers minus ears might see in our songs.

I giggled a lot, and learned:

  • Just feeding lyrics with no other prompts only generated that cheesy digital art style.

  • At first, I thought photorealism was what I wanted, so I added prompts like award-winning photograph, Sigma 85 mm f/8, Slow Shutter Speed, Golden Hour Lighting, Uneven skin tone. Some worked, but mostly it was still cheesy digital.

  • The best ones used simple style cues for absurd media, like old-timey newspaper, wooden carving, a sweater, puppets, and, the ultra-best, felt animal characters.

  • DALL·E 3 rewrites the prompt to add more detail before executing it, so none of the images were generated from raw lyrics. That seemed to favor "mood scapes" and "kitchen sink" drawings that blended (or just cataloged) all the specific imagery in the song. No key images here. Paint It Black is not, for instance, just a red door being painted black. The row of cars is not even painted black. :/ But oddly, the rewritten prompt got it right, so something must have broken down or asserted artistic license someplace downstream:

    Visualize a red door transforming into a rich, jet-black color, eliminating all other pigments from the frame. Near the door, imagine a row of cars, also coated in black, decorated with flowers.

  • The rewritten prompt did not just add more visual detail, as I expected. Usually, it highlighted emotional and symbolic cues. That is an impressive extraction of subtext, though it is weird that it helps with the visualization. Here's a typical one, for the Tool song "Reflection":

    In this scene, a solitary individual is on the edge of a vast, empty expanse, possibly signifying a metaphorical pit of despair. Above, the sky fills with a full, radiant moon, a million little fragments of light cascading onto the individual below, reflecting off their form. This strange light can be seen as a symbol of hope and potential, breathing life into deserted spaces. The person begins to rise, there is a sense of crucifying the ego, leaving behind the negativity. The overall image should convey a journey from self-pity to self enlightenment and unity, embodying the key themes present in the lyrical content.

  • A lot of songs got screened for content violations. But just retrying once or twice tended to get around it. Probably the rewritten prompt factor?
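Two of those lessons, the simple style cue and the blind retry, can be sketched in a few lines of Python. This is a rough illustration, not the code I ran: `build_prompt`, `generate_with_retry`, and `ContentPolicyError` are hypothetical names, the prompt wording is an assumption, and `generate` stands in for whatever function actually calls the image model.

```python
from typing import Callable, Optional

# Style cues mentioned above; the exact phrasing sent to the model is assumed.
STYLE_CUES = [
    "an old-timey newspaper",
    "a wooden carving",
    "a sweater",
    "puppets",
    "felt animal characters",
]


class ContentPolicyError(Exception):
    """Hypothetical error raised when a prompt is screened for content."""


def build_prompt(lyrics: str, style: str) -> str:
    """Put one simple style cue in front of the raw lyrics."""
    return f"Illustrate these song lyrics as {style}.\n\nLyrics:\n{lyrics.strip()}"


def generate_with_retry(
    generate: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
) -> str:
    """Send the same prompt again whenever it is screened, up to max_attempts."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return generate(prompt)
        except ContentPolicyError as err:
            last_error = err  # nothing changes on our side; just try again
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

In use, `generate` would wrap the real image call and raise `ContentPolicyError` on a rejection, so `generate_with_retry(generate, build_prompt(lyrics, STYLE_CUES[4]))` would cover the felt animals case.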

Here are a few of my favorites, with an assist from Midjourney on the stock photos.

Octopus's Garden by The Beatles as a film noir

A notebook on a desk shows how DALL·E re-interpreted the lyrics of The Beatles' Octopus's Garden, and the picture it drew from that prompt.

Lyrics ~ Insta


Moonraker by Shirley Bassey as a mural on the side of a brick building

A notebook with lyrical ideas about the song Moonraker.

Lyrics ~ Insta


Smells Like Teen Spirit by Nirvana as felt animals

(I love arctic fox as an albino, and calico cat as a mulatto. That's creative! I also love trying to pick out which of those little expressions is exhibiting a mix of rebellion, self-doubt, and a search for identity, and which appear overconfident, and which seem aware of their absurdity.)

A notebook with a paragraph inspired by Smells Like Teen Spirit, and an image of an albino fox

Lyrics ~ Insta


Dream Girl Evil by Florence and the Machine as a medieval tapestry

Lyrics ~ Insta


Mother by Tori Amos made from legos

Lyrics ~ Insta


Fallin 4 U by Nicki Minaj as a child's drawing

Lyrics ~ Insta


From Russia With Love by Matt Monro as a sweater

Lyrics ~ Insta