Take a good look at the sweet elderly lady artistically pictured above – real or not real? It’s hard to tell, isn’t it?
If you already know anything about AI art generators such as OpenAI’s Dall-E, you probably know they can churn out fantastical art with frightening efficiency. Recently, though, they have also been hitting the headlines for getting better at making photorealistic images, like a drone shot of surfers on a beach that won a photo contest.
Back to that image – yes, that sweet elderly lady, with a striking (and personally worrying) resemblance to my dear late grandmother, is not real. She is the product of generative AI and one of my early attempts using a text-to-image AI art generator.
One question hangs heavy for photographers like me – could AI art generators make us and our cameras insignificant, niche, or even obsolete? In a bid to find out, I have got to grips with some of the best AI art generators.
The good news for photographers is that there are currently far more scenarios that AI art generators handle less well than the headline image suggests, but the technology is only just getting started.
AI image generators – getting started
Along the way, I have found Midjourney – which runs through an app called Discord – to be the most effective at generating photorealistic results straight off the bat. Over-saturation and high clarity are the norm and are often not the most realistic, but the end results boast the sharpest detail. OpenAI’s Dall-E 2 and Stability AI’s Stable Diffusion follow word prompts more faithfully.
I started with a few simple word prompts – window light portrait, black and white, elderly lady – and a minute later, a reasonable selection of four portraits popped up. Not as believable as I expected, but not bad.
Midjourney’s Discord app contains breakout chat rooms, inside which you make your own images and see other users’ word prompts and resulting images appear on the feed. Bamboozled by the constant flow of word prompts and resulting images while waiting for my own to appear, I started noticing that others’ prompts included terms like photorealistic, 85mm f/1.2 lens, Sony lens, and so on, in addition to what I would call subject descriptions.
There is an ‘art’ to getting more realistic images from an AI art generator (excuse the poor word choice there). After adding those extra words, up popped the headline image in this article (for a closer look, see below). Gulp.
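For anyone curious how such a prompt comes together, here is a minimal sketch in Python of combining subject descriptions with photographic terms into one comma-separated prompt. The function name `build_prompt` and the exact mix of terms are illustrative assumptions; the precise prompt behind the headline image isn’t reproduced here.

```python
def build_prompt(subject_terms, photo_terms):
    """Join subject descriptions and photographic terms into one prompt string.

    Hypothetical helper for illustration only; it is not part of any
    generator's official tooling.
    """
    # Drop empty entries and stray whitespace before joining.
    parts = [term.strip() for term in list(subject_terms) + list(photo_terms) if term.strip()]
    return ", ".join(parts)


# Subject descriptions taken from the simple prompts mentioned in the article.
subject = ["window light portrait", "black and white", "elderly lady"]

# Photographic terms of the kind seen in other users' prompts.
modifiers = ["photorealistic", "85mm f/1.2 lens"]

prompt = build_prompt(subject, modifiers)
print(prompt)
```

Keeping the two lists separate makes it easy to swap lens or style terms while leaving the subject untouched.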
But the headline image is an outlier – AI art generators have plenty more weaknesses for those trying to create photorealistic results. Let’s now take a look through a variety of images I made.
AI generated images – early results
The most convincing AI generated images that I have encountered, bar none, are head-and-shoulders portraits with a somber expression. Forget asking for a toothy smile because you’ll get squared-off white blocks. There are often artifacts – blocks of sharp focus, like eyebrow hair. Full-body portraits are also problematic, especially when hands and limbs are involved – the best-known nemesis of AI images. If you want believable AI generated portraits, you’re going to need to keep it close.
Landscape, astro and drone photography
I was able to get believable landscape images when the prompts were general: rolling hills, beach, starry sky, drone shot. When prompting for specific landmarks or subjects, the problems with realism appear. Also, landscape photography is about detail and AI art generators are generally limited to a modest resolution of 1024 x 1024 pixels – AI images are for small displays only. Dall-E 2’s image editor does allow you to expand on the original generated image to a theoretically unlimited resolution by adding peripheral detail in additional image blocks, but that doesn’t improve the resolution of the original image.
You can specify lens types and brands within word prompts. At this point in time I doubt that information is fully realized in the resulting image, like lens characteristics including out-of-focus areas (although some of my Dall-E 2 efforts rendered chromatic aberrations – a real and unwanted optical flaw), but you can get a rough focal length perspective. Generally, if you take a much closer look at wildlife images – bearing in mind that the subject needs to be a specific species that actually exists – it becomes clearer that the image is generated rather than real.
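To put that resolution in perspective, the arithmetic can be sketched in a few lines of Python. The 300 pixels-per-inch figure is an assumption (a common rule of thumb for high-quality prints), and the function names are purely illustrative.

```python
def megapixels(width_px, height_px):
    """Total pixel count expressed in millions of pixels."""
    return width_px * height_px / 1_000_000


def print_size_inches(width_px, height_px, ppi=300):
    """Print dimensions in inches at a given pixel density.

    The default of 300 ppi is an assumed rule of thumb, not a fixed standard.
    """
    return width_px / ppi, height_px / ppi


# The 1024 x 1024 output size quoted in the article.
mp = megapixels(1024, 1024)
width_in, height_in = print_size_inches(1024, 1024)

print(f"{mp:.2f} megapixels")
print(f"{width_in:.1f} x {height_in:.1f} inches at 300 ppi")
```

In other words, a 1024 x 1024 image is roughly one megapixel, enough for a phone screen but only about 3.4 inches across in print at that density.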
Close up, macro
There must be a lot of macro images in an AI art generator’s bank, because the technical quality is decent. However, there are obvious errors regarding realism. You might find a butterfly perched on a flower facing in an unnatural direction, suggesting that an animal’s natural behavior is often unaccounted for. Also, if you specify a particular species – red admiral butterfly in this case – the details are usually inaccurate.
Forget about photorealistic AI generated action images. On one side of the coin you get graphics-based images that are clearly not photos, while on the other you get a photo style but of subjects with contorted limbs and motion.
I’ve learned that AI art generators do have severe limitations, which I’ll also share in more detail in a separate article, 5 things AI art generators can’t do. With output limited to 1024 x 1024 pixels and detail that doesn’t hold up under scrutiny (at least to a trained eye), these images are suited to smartphone displays only.
However, most people view images primarily on their smartphones, without expecting fakery, and that’s arguably part of the problem that photographers – especially those trying to make a living in creative industries – face from AI today. These generators could well become the go-to place for content-hungry businesses that need image concepts for their social media and marketing.
Looking further ahead, and on the assumption that the quality of output these generators churn out will only improve via updates, there will still be a need for human creativity: most importantly for our own well-being, purpose, meaning and originality, but also because these machines will need to feast off original material made by us mortals. If the internet is saturated with AI generated images, the resulting feedback loop will lead to a drop in quality.