Generative AI tools have only been around for a few years. Already, millions and millions of words have been published about the technology’s ethics, opportunities, pitfalls, biases, limitations and so on. Everyone – whether they’re a journalist, author or anonymous Twitter user – seems to have a take of varying heat.
One thing’s certain, though: the future is filled with white noise. We’re on the precipice of an era where anyone, regardless of time, wealth, knowledge or talent, can generate a lifetime’s worth of artistic work in perhaps a few weeks. The forthcoming flood of information – most of it bland and unremarkable – will be staggering, making Spotify, Netflix and YouTube’s present-day libraries look tiny in comparison.
Still, we have to hope that among all the static, original ideas will continue to shine through. Like this post on Reddit last week, for example, where a Melburnian used two AI tools to create surprisingly beautiful Miss Universe-style portraits themed as Melbourne suburbs.
We liked them so much we got in touch with their creator and asked him to generate eight extra suburb portraits exclusively for Broadsheet. The man, Greyson, didn’t want his surname or other identifying details published, but we know he’s a 33-year-old research assistant who’s lived all over Melbourne and currently rents in the CBD.
Here’s what he had to say about his Miss Melbourne project, including why the images so often fall into unsubtle stereotypes – a recurring problem when it comes to generative AI.
How long have you been making images using AI?
I dabbled in a few AI art generators last year, including Dall-E, but it was only in March this year that I began paying serious attention to them, as it had become all too apparent that their capabilities had advanced significantly. I find AI art impossible to ignore now, and the process involved in generating images can feel addictive.
How did you get into it?
I’ve long been interested in art and art history, and it’s clear AI is now ushering in a new artistic paradigm. Millions of people are using it to generate countless staggering images on the fly. And given how quickly things are advancing, it's exciting and kind of scary to think about what AI will be capable of a few years from now.
For people who don’t understand how ChatGPT and Midjourney work – how does the computer come up with details like each woman’s ethnicity, the colours of the dress, motifs and so on?
I don’t know for sure how ChatGPT comes up with certain details. But since it has been trained on large swathes of the Internet, including sites like Wikipedia and census data, it must be clued into demographics, and any associated traits and stereotypes. The negative ones ChatGPT avoids, as its programmers have implemented content moderation tools to prevent anything hateful, be it racist, sexist and so on. When it came to the suburbs, ChatGPT used glowing, inclusive terms. And yet despite the efforts of its programmers, it did on the odd occasion come up with details that some might deem problematic, such as dumpling-shaped pockets for the Box Hill costume. Midjourney failed to generate these pockets for some reason.
How did you come up with this particular idea? Are there any other interesting projects you’ve worked on?
I’ve spent hours exploring online communities devoted to AI art, and have been impressed by many of the themes and concepts people have come up with. One such concept I remember seeing concerned European stereotypes. In hindsight this was probably the genesis of the Miss Universe costumes. Something else I’ve tried in Midjourney is reconstructing events from the lives of famous historical people. Events that may never have been documented in a visual medium, and that are now considered mysterious and unsolved. If there are surviving portraits of the person, I combine them in Midjourney, which can then create a strikingly believable, photorealistic average of their features. Part of me has fun believing that in some of Midjourney’s reconstructed events, there might be clues as to what really went down.
Tell me about the process in detail, from putting a prompt into ChatGPT to looking at the final image and deciding whether it's good enough to keep. How many images do you reject and for what reasons?
I began by asking ChatGPT to design Miss Universe-style costumes but for Melbourne suburbs. The costumes had to convey the character and cultural identity of a given suburb, in a creative, glitzy, extravagant, even humorous way. I also instructed ChatGPT to be concise, and to tailor its answers as prompts for an AI art generator. These prompts I then copy-pasted into Midjourney. Some of these worked off the bat, and Midjourney was able to interpret them easily and coherently. With other prompts it struggled, and I had ChatGPT condense and simplify these. Midjourney still struggled sometimes with certain details, and I had to regenerate its output up to 10 times to get a satisfactory result. One example being Miss Heidelberg, as Midjourney had a hard time creating realistic paintbrushes.
What’s your favourite image from the set? Why?
I have lived in and often visit Brunswick, and when ChatGPT came up with Miss Brunswick’s avocado hat, I knew I was onto some potential gold. It turns out Midjourney cannot easily convert the avocado into a fashion accessory. Its first few attempts looked like mutant radiated blobs. Seeing the final image materialise was very gratifying. Dare I say, it’s perfect.