It's Magic
By Dirty Old Biker
Introduction

It's past time for another model comparison. This time I didn't do every model, like I did on the last one. I tried to focus on the ones that I perceive to be used the most. Of course, I have no real idea which those are, so it's mostly based on my opinion, lol.

The Prompt

The Explanation

Whenever I see a prompt suggestion (for example, a Vincent prompt, for those who know Vincent), I read it and start to like aspects of it, and then I uncover the problems with it. The most common issue I have with prompt suggestions is that they almost always seem to revolve around a lone subject and a desolate landscape, or a post-apocalyptic city. What is this obsession with solitude and destruction? I think it's supposed to put you in a contemplative mood. Maybe I'm just too shallow, and crave instant gratification. I don't often want to be trapped in thoughts of my own destruction.

I found such a suggestion the other day and liked what it could be, but I had to make some changes. The crumbling remains of a city became a beautiful city park. Her torn and tattered robes became a pretty sundress, and so on.

The Prompt Text

A powerful sorceress, her eyes glowing with arcane energy, conjures a storm of raw magic above a beautiful city park. Her pretty sundress whips around her as lightning illuminates the dark, sky. Her bare feet ground her magic. The style is a blend of gritty realism for the harsh landscape and fantastical elements for the magic. Vivid color gradients in the storm clouds shift from bruised purples to electric blues. Cinematic lighting emphasizes the dramatic intensity of the scene.

The Comparison

GPT 1.5 Low

GPT 1.5 is always a safe bet. It rarely messes up, and the outputs are usually beautiful. In these three images, the scene is clear, albeit a little dark. The subject and the background are very well-defined and detailed. There is plenty of drama. The sorceress is not inherently evil. Rather, you're left wondering what her story is.
Why is she standing in a stream of water, barefoot, and channeling environmental forces of unknown magnitude? Personally, I really want to know what happens next.

Grok Imagine

The new kid on the block, Grok is pretty reliable. It may not always be quite as stunning as GPT, but it's solid. What I like about it is that it rarely hallucinates. The things it draws generally turn out looking like they're supposed to. Also, there often seems to be enough drama to sell the image. It's not boring. Where I have trouble with this model is that I'm always tempted to compare it to GPT, and it doesn't usually win that comparison. The images aren't quite as bright. They don't catch your heart the way GPT's do. I'd love to hear opinions on this, especially differing opinions.

Flux Pro Ultra

Ultra always makes stylistically smooth, perfect images embellished with a bit of overdramatization (e.g., the woman is floating). I think it overshot the mark on this prompt. It looks okay, but there's something off, and I'm not sure how to describe it. For one thing, she looks too small in the picture, and as we all know, Flux has trouble with limbs, especially when they're thinner. In the second image, you almost can't see her legs. The dress doesn't look right to me in either one. I'm likely being overly picky.

Flux Klein, Flux 2 Flex, Flux 2 Pro, & Flux 2 Max

These are the newer generation of Flux. The images are all "correct", but less interesting.

• Flux Klein (top-left): The woman almost looks like a robot. Her posture is boring, and there's no facial expression. The sky is quite a lot calmer, too. It's like this is the safe-for-children version. Stuff is happening, but it's not really that dangerous.
• Flux 2 Flex (top-right): In some ways, this is quite similar to Klein. Her dress is not very disturbed. The chaos is a little diminished. That said, the magic sigils are exciting, her posture and stance make her look more engaged in the situation, and her environment definitely looks more disturbed.
• Flux 2 Pro (bottom-left): This one is weird. The woman's posture and stance are just as boring as in Klein, but the chaos is bigger. Flex is a higher quality model, but (at least in this case) Pro has a better image from a content point of view.
• Flux 2 Max (bottom-right): There is nothing wrong with this image, in my opinion. It has everything I want to see and it looks great. This easily competes with Grok, and surpasses Ultra.

Hunyuan Image 3 & HiDream Full

This is an interesting pair. On the one hand, there's Hunyuan, which delivered a fantastic image full of everything. There's nothing boring about it, and there's plenty of drama. The background looks turbulent despite the bright day, and the woman looks amazing. Ignoring the one extra finger, everything else about her looks perfect. In my opinion, this is almost as good as GPT. Then there's HiDream. I'm used to that model producing overly clean, bland images, but this one is not bland at all. I'm pretty impressed by it, to be honest.

Imagen 4 Ultra & Ideogram v3 Balanced

Sometimes, Imagen does a great job, and other times, it doesn't measure up. I'm not as happy with it as I would have liked. The Ideogram one, however, turned out rather interesting, even though she's floating slightly.

Lucid Origin Standard, Lucid Origin Ultra, & Minimax Image 01

I like the first image. I think, for a test model, the image turned out far better than I expected. Comparing it to #2, I'm not sure if I like 2 better than 1. In 1, her fingers are a mess, and it looks like she's floating a bit, but the tumultuous sky looks better. In 2, her fingers are also a mess, and she doesn't look like she's the one controlling the magic. I'm getting very similar vibes from the Minimax image as from the Lucid Ultra image.

Nano Banana, Nano Banana Pro, & Seedream 4.5

I'm not disappointed with the first image. It's better than Flux 2 Pro and Flex, in my opinion. Nano Banana Pro (NBP), on the other hand, is the only model that added noteworthy bystanders. There are a couple of other images that have people very far off in the distance, but here, they are close by and seemingly watching the show. That said, the woman's stance is lame. I don't mind the ugly, gritty mud, but she's just standing there like another robot. Is it worth 160 credits? No. Seedream did very well, though. The image is exciting and full of drama. Everything looks great.

Qwen Image, Z-Image Turbo, & P-Image

While the Qwen image looks really good, the other two kinda missed. The women in them don't look like they're involved in the magic, except for their eyes. I think the P-Image version looks like a pretty lady taking a walk during a lightning storm. At least with Z-Image Turbo (ZIT), her eyes look like they should.

Wan 2.2, Wan 2.5, & Wan 2.6

In this scenario, 2.2 seems to lean more towards anime. I did a few, and they were all anime. 2.5 and 2.6 both lean more toward realism. The woman in 2.2 also looks the least like she's meant to. She could just be standing there enjoying the weather, whereas the other two are very involved in it. 2.2 makes me think of Klein a lot in this image.

Final Thoughts

As we can clearly see from all the model comparisons I've done so far, each model has its own strengths and weaknesses. If you want consistency and reliability with undeniable quality, you would probably want to go with GPT 1.5. If you want a more realistic slant along with stylized beauty, you might pick Flux Pro Ultra, and so on.