I currently have a subscription to both Midjourney and ChatGPT but find myself using Dall-E more and more often for image generation tasks. I wanted to carry out some tests to decide if I still needed the Midjourney subscription.
Midjourney vs Dall-E | The Differences
Dall-E can be used directly within a ChatGPT chat prompt just by asking “create an image of X”. ChatGPT is available via mobile app or web app.
Midjourney runs through Discord, where you enter an /imagine command followed by the prompt.
You’ll notice that Midjourney produces four images, and you can create variations of or upscale any one of them.
Midjourney optimizes its models based on which images you choose, creating constant improvement and a path towards artistic quality.
Some people might argue that it’s over-optimized and you always end up with dystopian robots, but there’s definitely a higher quality output from Midjourney’s models currently.
Some of the images that we will see in the tests below have an elegant beauty that I believe is created by the constant feedback and optimization that Midjourney is doing with its system.
Having used both products for a while, the main difference I find is that Midjourney excels when you spend time creating short, concise, well-defined prompts.
ChatGPT is built on a large language model and has a far superior ability to understand large amounts of text and descriptions.
For example, I could go into ChatGPT, paste the contents of this article and say “create an image to illustrate this article”. This simply wouldn’t work with Midjourney. It doesn’t have the same ability to process language and relies on you being able to write good prompts.
Average generation times across the tests below were:
Dall-E: 17.2 seconds
Midjourney: 33.1 seconds
Note that these tests were carried out over a single period of time, and response times likely vary widely depending on load. Also, Midjourney is creating 4 sample images in this time for you to choose between, whereas Dall-E is creating a single large image.
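To put the two averages on the same footing, here is a minimal Python sketch that divides each average by the number of images returned per request. It uses only the two averages and the image counts quoted above; the function name and constants are my own, and the per-image figure is a rough normalization rather than a measured value, since Midjourney renders its grid of four in a single job.

```python
# Average wall-clock times from the tests in this article (seconds per request).
DALLE_AVG_SECONDS = 17.2
MIDJOURNEY_AVG_SECONDS = 33.1

# Images returned per request, as described above.
DALLE_IMAGES_PER_REQUEST = 1
MIDJOURNEY_IMAGES_PER_REQUEST = 4


def seconds_per_image(avg_seconds: float, images_per_request: int) -> float:
    """Return the average time divided by the number of images produced."""
    if images_per_request <= 0:
        raise ValueError("images_per_request must be positive")
    return avg_seconds / images_per_request


dalle_per_image = seconds_per_image(DALLE_AVG_SECONDS, DALLE_IMAGES_PER_REQUEST)
midjourney_per_image = seconds_per_image(
    MIDJOURNEY_AVG_SECONDS, MIDJOURNEY_IMAGES_PER_REQUEST
)

print(f"Dall-E: {dalle_per_image:.1f} s per image")
print(f"Midjourney: {midjourney_per_image:.1f} s per image")
```

On a per-request basis Dall-E is roughly twice as fast, but per image produced the comparison flips, which is worth keeping in mind if you usually want several candidates to choose from.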
Dall-E is capable of tweaking images based on language processing. This is still a bit hit and miss, and from my previous experience I find it works perhaps 50% of the time.
In the example above I asked it to put the character from a previous prompt (see test below) into an image. It actually created a very different character which is still somewhat based on the original computer programmer but lost many traits described in the previous prompt such as the blonde hair and blue eyes.
With Midjourney we can create variations on an image and also have options to zoom out or pan across, filling in the blank space. This is somewhat more limiting but, in its current form, perhaps more useful.
ChatGPT Plus costs $20 plus tax per month.
Midjourney has 4 tiers, and for occasional daily use I find the basic plan is adequate.
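As a rough sketch of what keeping both subscriptions adds up to over a year, the Python snippet below multiplies out the monthly prices. The $20 figure is the pre-tax ChatGPT Plus price quoted above; the $10 figure is the monthly Midjourney price I mention in the conclusion, which I am assuming here corresponds to the basic plan. Tax is not included, and the names are purely illustrative.

```python
# Monthly prices in USD, before tax.
CHATGPT_PLUS_MONTHLY_USD = 20
# Assumption: the $10 a month quoted in the conclusion is the basic plan price.
MIDJOURNEY_BASIC_MONTHLY_USD = 10


def annual_cost(monthly_usd: int, months: int = 12) -> int:
    """Return the pre-tax cost of a subscription over the given number of months."""
    return monthly_usd * months


chatgpt_yearly = annual_cost(CHATGPT_PLUS_MONTHLY_USD)
midjourney_yearly = annual_cost(MIDJOURNEY_BASIC_MONTHLY_USD)

print(f"ChatGPT Plus: ${chatgpt_yearly} per year")
print(f"Midjourney:   ${midjourney_yearly} per year")
print(f"Both:         ${chatgpt_yearly + midjourney_yearly} per year")
```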
Midjourney vs Dall-E | Image Quality Tests
For all these tests I prefixed each prompt with “Create an image of …” for Dall-E and “/imagine …” for Midjourney.
Models were ChatGPT-4 and Midjourney 5.2.
The single images are Dall-E’s and the 4 images in a square are Midjourney’s.
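The prefixing described above can be sketched in a few lines of Python. This only builds the text I typed into each tool by hand; it does not call ChatGPT or Discord, and the function and constant names are hypothetical.

```python
# Prefixes as described in this article; everything else here is illustrative.
DALLE_PREFIX = "Create an image of "
MIDJOURNEY_PREFIX = "/imagine "


def build_prompts(description: str) -> dict[str, str]:
    """Return the text entered into each tool for one test description."""
    description = description.strip()
    if not description:
        raise ValueError("description must not be empty")
    return {
        "dalle": DALLE_PREFIX + description,
        "midjourney": MIDJOURNEY_PREFIX + description,
    }


prompts = build_prompts(
    "A computer programmer cartoon character, male, blonde hair, "
    "black top, blue eyes, Pixar style"
)
print(prompts["dalle"])
print(prompts["midjourney"])
```

Using the same description for both tools keeps the comparison fair, since the only difference in input is the prefix each tool expects.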
A sleek and modern logo for a tech startup, a stylized cloud and lightning bolt, using a blue and silver color scheme, with a minimalist and futuristic
In this test I think Dall-E actually did a better job, creating a more usable logo. Midjourney seems to have missed the cloud part of the prompt completely.
A children’s book illustration showing a boy and a friendly dragon having a tea party in a magical forest
This is where Midjourney really starts to shine; it’s as if animation had moved on 20 years. The 3D rendering and lighting are simply of much higher quality. Having said that, the Dall-E image might be more suited to a children’s book illustration.
A vector background for a pitch deck, abstract geometric pattern, shades of blue and green, with subtle gradients, modern clean look
Again, in this example I think Dall-E creates a better final product. While I could probably get something close with a better prompt in Midjourney, straight out of the model Dall-E is more usable.
A artistic portrait of a woman in renaissance style, wearing an elegant gown, seated in front of a classic Italian landscape, with soft natural lighting
Dall-E’s image has weird eyes but other than that it’s quite good, though still probably unusable. Hands are particularly difficult for models to get right, and you’ll notice that the top left Midjourney image went a bit wrong.
Another interesting point is that the Midjourney women are all stunningly beautiful. The model has learnt that users will select attractive images and it’s now optimizing towards some form of collective idea of beauty.
Overall, the Midjourney images are, in my opinion, better and more usable.
A breathtaking fantasy landscape with a floating island above a crystal clear lake. The island should have a small, ancient looking castle surrounded by lush greenery and waterfalls cascading down to the lake
This is the kind of prompt that AI image generators excel at: imaginary worlds and scenes which don’t require any form of accuracy and are open to interpretation.
With this prompt being a little bit longer, I thought Midjourney might struggle with the detail of waterfalls cascading down to a lake, but it did an excellent job. Perhaps the only criticism is a lack of variety in this particular response, with all four images following the same common Midjourney look. Adjusting the style and writing a more descriptive prompt would likely get you wherever you wanted to go.
The Dall-E image is good, but again it feels a few years behind, as if it were a clip from an older animated film. Those years will probably equate to months in terms of AI training, and the model is likely to improve significantly.
A futuristic cityscape at night with flying cars, towering skyscrapers with neon lights, and a diverse crowd of people and robots walking on the streets, lively and vibrant atmosphere
These are the types of images I could play with all day on both systems. Both designs are great and you can take your pick based on personal taste.
A realistic mockup of a new fly fishing reel, black matt metal, futuristic spokes, stylish, advanced features, cutting edge design
I think Midjourney really nailed this task. The Dall-E image isn’t particularly useful because the spokes aren’t right and it’s not really added much creativity to existing products. Midjourney on the other hand has created four products that I want to buy.
A plate of Lasagna with garlic bread, beautifully presented, Michelin star, find dining presentation, on a white plate with artistic garnishing and attention to detail
The Dall-E image here is bordering on useless; it couldn’t be used as inspiration for someone creating an actual meal.
Midjourney, on the other hand, created exactly the kind of designs I was imagining, with jus and pea sprouts worthy of any fine dining establishment. If you want to impress someone, perhaps try this next time you are making a meal.
A computer programmer cartoon character, male, blonde hair, black top, blue eyes, Pixar style
The Dall-E image here is very usable; the fact that it’s on a white background means I could cut it out and put it on to something. It is, however, another example of just how far ahead Midjourney currently is. The Midjourney images could be screenshots from an actual Pixar movie.
The shading and rendering are better, the characters are more human and the lighting is exceptional. Midjourney really excels at this type of prompt; it is like working with a more talented artist.
Through these tests we’ve been able to compare Midjourney and Dall-E, and it’s evident that each platform has distinct strengths and weaknesses. Dall-E is integrated seamlessly with ChatGPT and offers user-friendly access and quicker results. Its ability to understand and interpret complex prompts, thanks to the underlying language model, is a significant advantage, especially for tasks that require comprehension of extensive descriptions.
Midjourney, though it requires a bit more effort in crafting concise and precise prompts, excels in artistic quality and detail. The continuous feedback and optimization process in Midjourney seems to contribute to its higher quality output, particularly noticeable in more artistic or detailed tasks like animation-style illustrations and complex scenery.
The speed and cost aspects also differentiate the two platforms. Many users, such as myself, will already have a ChatGPT Plus subscription, and Dall-E is really just a bonus on that subscription. Do I really need Midjourney as well? At the moment I can’t bring myself to cancel it. For $10 a month it’s an incredible tool that is still, in my opinion, the most capable AI model for artistic illustration and creative image generation.
The choice between Dall-E and Midjourney depends largely on the user’s requirements.
For tasks requiring quick, straightforward image generation for business purposes, or prompts built from complex textual descriptions, Dall-E is more suitable. For users for whom artistic quality, detail and style are paramount, Midjourney still has a significant edge.
Both platforms demonstrate the rapid advancements in AI-driven image generation, each evolving and improving in its own way. As AI technology continues to advance, the gap in capabilities and quality between these platforms may narrow, offering even more powerful tools for creators and professionals.