Love them or loathe them, AI art generators such as OpenAI’s Dall-E 2 are here to stay. In the space of just a year or so, the number of such programs has increased and the quality of their output has improved dramatically.
These AI tools are still in their first iterations, but they’ve already made headlines for creating photorealistic images good enough to win competitions. There’s also plenty that AI art generators can’t do, though, and it’s hyperbole to suggest they’re ever going to fully replace photographers.
Avid adopters of this technology will argue that reality is the least interesting thing to shoot for with AI art generators, and that reimagining historic works of art, or applying the style of famous artists to your own images, is all good, clean fun.
So what are your options if you want to dabble in AI image creation? We’ve got to grips with some of the most popular AI art generators, and below we’ll talk about the user experience and quality of output you can expect. After that we’ll explain more about what they are, how they work, and why they can be considered controversial.
The first AI art generator to attract widespread attention was OpenAI’s Dall-E 2, and it has gone on to push image generator boundaries. This second iteration even includes a beta version of an image editing tool that allows you to fine-tune the AI-generated results.
Overall, this is the most user-friendly AI art generator, and getting started couldn’t be easier. The main browser window has a search bar: simply type in your word prompts, and a minute or two later up pop four separate 1,024 x 1,024 pixel image variations.
All of your – or rather yours and Dall-E 2’s – creations can be kept in a panel on the right-hand side of the interface in corresponding rows of four. When you select a row, the word prompts reappear in the search bar – a really helpful feature if you want to recall your word choice and further mix up the word prompts for different results.
Click on one of the four variations and it expands for closer inspection, and you’ll also see a download button, plus Edit, Variations, Share and Save options.
The Edit option opens up the beta image editing tools, including ‘generation frame’, which Dall-E 2 calls ‘Outpainting’. This expands the original image borders with further word-prompted image squares.
The fruit of Outpainting is a larger image with additional detail – we made a six-frame image with a resolution exceeding 3,000 x 2,000 pixels, but the resolution is theoretically unlimited.
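As a rough check on that figure, the sketch below works out the canvas size for a grid of generation frames. It assumes a 3 x 2 layout of 1,024-pixel frames placed edge to edge; the function name `canvas_size` and its `overlap` parameter are our own illustration rather than anything Dall-E 2 exposes.

```python
def canvas_size(cols, rows, tile=1024, overlap=0):
    """Return (width, height) in pixels for a cols x rows grid of square frames.

    Assumption: each frame after the first in a row or column shares
    `overlap` pixels with its neighbour. overlap=0 means edge to edge.
    """
    if cols < 1 or rows < 1:
        raise ValueError("cols and rows must be at least 1")
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be between 0 and tile - 1")
    step = tile - overlap  # new pixels contributed by each extra frame
    width = tile + (cols - 1) * step
    height = tile + (rows - 1) * step
    return width, height


# Six frames arranged three across and two down, with no overlap.
print(canvas_size(3, 2))
```

With six edge-to-edge frames this comes to 3,072 x 2,048 pixels, in line with the result we saw. If frames are placed so that they overlap the existing image, the canvas would be correspondingly smaller.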
Pre-existing images can be imported to the editor, too – your own photos for example – and AI edits made to them. These generated edits are not limited to the weird and wacky – ‘change photo to Dali painting style’ or ‘make me smile instead of frowning’; you can also apply more conventional Photoshop-style edits such as selective sharpening.
You can also create variations of your favorite generated images, share images easily, and save images to collections for future reference. So while it’s easy to get started, there’s also plenty of scope to dig deeper and get more out of the tool, too.
Although there’s little to complain about regarding the user experience with Dall-E 2, it doesn’t take our top spot for producing photorealistic images straight off the bat, which is arguably the biggest con for photographers, and potentially a deal breaker. Many of the original images created in the generator have an obvious painterly feel to them, while critical details such as people’s hands and other limbs often go awry, compromising all-round believability.
We do like how faithfully Dall-E 2 can follow word prompts, but it can also take a few rounds of prompts to get a satisfactory result. Still, this is one of the few free generators for light users – after an initial 25 free credits, there’s a rolling monthly allowance of 15 free credits. If you need more, it costs $15 per 115 credits.
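To put that pricing in per-image terms, here is a back-of-the-envelope sketch. It assumes one credit buys one prompt and that each prompt returns four images, as described earlier; credits may be counted differently for edits and variations, and the function names are ours.

```python
def cost_per_credit(price_usd, credits):
    """Price of a single paid credit in US dollars."""
    return price_usd / credits


def cost_per_image(price_usd, credits, images_per_credit=4):
    """Price per generated image.

    Assumption: every credit yields a fixed batch of images (four here).
    """
    return cost_per_credit(price_usd, credits) / images_per_credit


# $15 buys 115 credits.
print(f"per credit: ${cost_per_credit(15, 115):.3f}")
print(f"per image:  ${cost_per_image(15, 115):.3f}")
```

On those assumptions a paid credit works out at roughly 13 cents, or about 3 cents per image.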
Midjourney uses the ‘Discord’ social platform, which can be accessed via your web browser or downloaded as an app for Windows, Mac, iOS and Android. The user experience is far more complex than Dall-E 2’s, but there are also benefits.
In the Discord web app, you join a chat room where you can view the creations of other users on a live feed. To create your own, you type ‘/imagine’ followed by your choice of word prompts in the message thread.
A minute later, your resulting image appears in the feed, with four options condensed into a 1,024 x 1,024 pixel image that can be opened in the browser and optionally downloaded.
Expand the resulting image for upscale and variation options for each of the four images. For example, selecting U1 upscales the top-left image to a single 1,024 x 1,024 pixel image, while V4 prompts a further four variations of the generated image.
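The U and V buttons are numbered by position in the 2 x 2 grid. The sketch below maps a button number to the region of the preview it refers to. Only U1 as the top-left image comes from our testing; the row-by-row order for 2, 3 and 4 (top-right, bottom-left, bottom-right) is an assumption, and `quadrant_box` is a hypothetical helper, not part of Midjourney or Discord.

```python
def quadrant_box(index, grid_px=1024):
    """Return the (left, top, right, bottom) pixel box for option `index` (1-4).

    Assumption: options are numbered row by row, so 1 is top-left,
    2 is top-right, 3 is bottom-left and 4 is bottom-right.
    """
    if index not in (1, 2, 3, 4):
        raise ValueError("index must be 1, 2, 3 or 4")
    half = grid_px // 2
    row, col = divmod(index - 1, 2)  # 0-based row and column in the grid
    left, top = col * half, row * half
    return (left, top, left + half, top + half)


for i in range(1, 5):
    print(f"U{i}/V{i}:", quadrant_box(i))
```

The tuple follows the left, top, right, bottom order that common image libraries use for cropping, so it could be used to pull a single preview out of the downloaded grid.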
The process is fiddlier than with other generators, but you’re likely to fine-tune your way to a satisfactory result more quickly – those variations of an image maintain attributes of the original but, in a portrait for example, might include a slight change in facial expression.
It all takes some getting used to, and the constant flow of other users’ creations has its pros and cons. In its favor, this ‘social’ side to Discord can be inspiring as other users’ creations pop up – and not just the results but also the related word prompts.
On the downside, the constant flow can be distracting and frustrating if you simply want to get on with creating, without losing your own creations in the chat room ether. There’s also content that you might not want to be exposed to – some users seem hell-bent on generating explicit images.
In terms of image quality, Midjourney has come on in leaps and bounds since it was first released, and is capable of the most photorealistic output of the image generators featured here, trumping Dall-E 2 and Stable Diffusion. Images have greater clarity, sharpness and saturation.
On the flip side, the app struggles to keep it ‘real’; no amount of word prompts such as ‘cloudy overcast light’ can override the algorithm that favors an overly saturated golden-hour glow in your landscapes.
That said, on social platforms like Instagram, where a lot of photography is hyper-stylized, Midjourney images can fit right in – we’ve seen viral ‘photographer’ accounts turn out to be AI frauds. If a high-clarity end image is all that matters, Midjourney’s Discord app is the most powerful tool available.
After a decent initial allowance of 50 free credits, Midjourney is a subscription service, with prices starting at $8 per month. It’s good value for those likely to embrace AI images, but steep compared to the competition.
Stable Diffusion is available in various guises. Its simplest form is the Stable Diffusion web app, but there’s more to be had in Stability AI’s DreamStudio beta web app, where almost all versions of Stable Diffusion are available, up to the most recent.
We’ve found Stable Diffusion’s DreamStudio to sit somewhere between OpenAI’s Dall-E 2 and Midjourney’s Discord app for its ease of use and quality of output.
It has a standalone user interface like Dall-E 2, where you can input word prompts and refine the results. There are more options to play with than in Dall-E 2 – one example being a scale of how closely the generator follows word prompts, plus options regarding the format and size of output.
The workflow side doesn’t match Dall-E 2, which groups images together and offers collection folders – here the focus is very much on the current image being created rather than managing archives.
There are also a lot of similarities in output quality between Dall-E 2 and Stable Diffusion, with both falling short of Midjourney’s Discord web app for photorealism. At a push, we’d say Stable Diffusion produces slightly better-quality images, but it shares Dall-E 2’s shortcomings when it comes to realism.
The images that pop up never cease to surprise, and many are unusable. For example, the word prompts ‘person walking dog on beach, calm seas, footprints in the sand, sunrise’ rendered an array of results, including multiple dogs, some with missing limbs, a man walking on water, and so on.
A big win is that it’s possible to continue using Stable Diffusion for free, so even if you don’t get the image you envisaged, you won’t be wasting any money.
What is an AI art generator?
Simply put, an AI art generator is a computer program that creates images – such as illustrations, paintings or photos – from word prompts using artificial intelligence.
How does an AI art generator work?
Images are generated using a neural network – a series of algorithms that recognize underlying relationships in a set of data through a process that mimics how the human brain works – trained on a huge database of images scraped from the internet.
Are AI art generators ethical?
Controversy surrounds AI art generators. There’s uncertainty around how images are sourced, and where from. To give one example, Getty’s legal proceedings against Stability AI claim that the AI company “unlawfully copied and processed millions of images protected by copyright.”
At the user level, there have been repeated examples of AI-generated photorealistic images being passed off without disclosure, and of AI-generated images being used manipulatively for fraud. Then, of course, there’s the proliferation of ‘deepfake’ images, leading us to question what is and isn’t real.