Midjourney V7 enhanced with GPT 2.0
By Sealine
Recently I started experimenting with creating images in Midjourney V7 and then editing and enhancing them with GPT 2.0 image to image. I had primarily been using GPT 2.0 since its release. It's still a model I love, and it produces fantastic results. However, I've found that the images it produces, especially with simple prompts, can start to look like everyone else's. You start thinking: I know that face, I know that clothing style, I've seen that background, and so on. More detailed prompts and styles can minimize or even eliminate this issue. But I can't visualize a finished product. I often have only a vague idea of what I want, which means detailed prompts aren't always possible, especially at the start. My images grow and evolve with each generation as I modify, change and enhance my prompt.

Midjourney V7, in my experience, produces some interesting artistic images even with minimal prompting. It can often take images in a direction I didn't think of. The model also produces some beautiful female faces that are different from the typical GPT output I'm used to. On the negative side, it can be inconsistent, with the 4 images you receive from each run varying wildly. There can also be glaring issues, and I've found the images are often flat. Looking at a batch of images created with Midjourney V7, I often find myself thinking, I love that, but... And that is where GPT 2.0 image to image comes in handy.

In this blog I'm going to present a series of images that were created with Midjourney V7 and enhanced with GPT 2.0 image to image.

First up, I wanted something unique for an elephant image and was not happy with anything my prompting skills were producing, so I ran the prompt through Midjourney V7. Here are the three images I didn't choose to enhance:

Next, the image on the left is the fourth Midjourney image and the one I chose to enhance. It's a great image.
It got the patterns I was asking for, but the image read a little flat to me and the elephants didn't stand out enough from the background. Using the image on the left as my reference in GPT 2.0 Medium 1k, I used an incredibly simple prompt: "Keeping style and patterns of original, give the image more depth and dynamic composition. Elephants are the focus." The result is on the right, and that is the image I chose to use for my post.

Next up, I was working on a vague theme of teal. I ran a simple prompt through Midjourney, and the following are the three in the set I didn't use.

On the left is the image I chose to use. Her face drew me in, with that wistful glance and those soft features. However, it was more teal than I wanted and, again, the image read flat to me. With the reference and a simple prompt into GPT 2.0 Medium 1k: "Enhance this image with depth. Remove some teal tint in skin. Eyes and lips match the teal overview of image. Captivating beauty." Results on the right.

In the next batch I ran in Midjourney, I found two images I loved. This was a really vague prompt that let the model do its thing, which also means the results vary more than in my previous examples. The two images I didn't use from the four-image set are below:

The first image I chose to use is on the left. Beautiful, captivating face, one of my favorite things about Midjourney V7, but the floral headpiece is muddy, and what's with the mini racoon? So again, into GPT 2.0 Medium 1k with the reference and a simple prompt: "Enhance beautiful blossoms on her head. Replace racoon with a beautiful bouquet. Add depth. Beautiful and captivating." Resulting image on the right.

The second image from the set I chose to use is on the left. It is a great image. I'd be fine using it. Nice painterly style, beautiful eye color, overall great look. But I figured, why not see what GPT 2.0 Medium 1k could do with it in image to image? The prompt was, "Enhance this beauty.
Add depth. Beautiful and captivating." Results on the right. In this match-up, I really love both images and would be a bit torn choosing one to use. Feel free to share your opinion in the comments.

This brings us to the mermaid picture in the header. I only have two of the unused set for this image because I hated one so much that I deleted it before I knew I would be writing this blog. Since I did prompt for a mermaid, I have no idea what Midjourney was thinking. However, Midjourney knocked it out of the park with the interpretation I chose to use. It wasn't right for what I was working on at the time, but it was a take on the prompt I never would have thought of. But that hand hanging over the tub had to be fixed! The prompt I used to enhance the Midjourney reference was: "Correct image so only four fingers visible in hand hanging over edge of bathtub. Blend seamlessly. Keep mermaid face the same, including expression and eyes. Change nothing else."

I appreciate you taking the time to read my blog! Please share your thoughts in the comments and clap my blog (up to 20 times) if you enjoyed it!