I’ve been very fortunate to be involved early on with three of the latest AI text-to-image art systems. First was MidJourney, second was Dall-e 2, and now I’ve been able to get early access to Stable Diffusion. From a tech perspective, each of these generates a different style of image and has its own strengths and weaknesses. What’s more curious is observing the culture that has formed around each of these tools.
When I got on MidJourney, there was a sense of shared exploration. We all seemed to be getting in on something new and unique, and we were all working together to figure it out. MidJourney attracted enthusiasts who wanted to learn and explore together. There was a lot of collaboration and openness as we experimented with different prompts and learned how to get what we wanted out of the system.
When I first got access to Dall-e and started joining the surrounding unofficial communities, there was a strikingly different tone compared to the folks using MJ. Because access was restricted, there were a lot more scams, with people taking advantage of that scarcity. People were charging money to run prompts, charging money for invites (that didn’t exist), and there was a much bigger sense of trying to use Dall-e for commercial purposes.
SD is the epitome of tech bro culture. Everything surrounding their release was pure hype. Folks who had requested beta access were granted the ability to join their Discord server, but we still had to wait over 24 hours before the bot even came online. During this time, the mods, founder, and server staff continually teased us with SD’s generations and kept building the hype. It seemed like they were focused on growing a fan base first rather than making sure the product was refined before launch. The founder’s messaging about bringing in “influencers” and statements about how “well funded” they are only reinforced this impression.
Combining glitch art with the AI was only a matter of time. In my early experiments, I ran the AI output through my glitching process. With Dall-E 2 producing incredibly photorealistic renderings, I decided to try having it produce glitched images directly.
Turns out Dall-E 2 can produce some really nice glitch images! Just adding the words “glitch, data mosh” to the prompt will get the AI to produce that aesthetic. Combining those terms with “vaporwave” or “synthwave” can also produce dynamic color ranges with a variety of distortions.
In the limited time I spent experimenting, the resulting renderings were stunning. In addition to creating glitch images, Dall-E 2 does a great job of generating variations on existing glitch images.
I’ve found myself using Dall-E 2 to produce variations of existing images more than to create new ones. While it is extremely powerful at image generation, the ability to make variations on existing images sets it apart from the other text-to-image AI systems.
I was very fortunate to get access to two very powerful text-to-image AIs within a short period of time. I had been on MidJourney for a few weeks before my email to join OpenAI’s Dall-E 2 platform arrived. I honestly forgot I even applied to Dall-E so it was a very pleasant surprise.
Dall-E 2 is a much more powerful AI. It produces incredibly detailed renderings that could easily be confused with real-life photographs. The accuracy is stunning and frightening at the same time.
Unicorn Loafers generated by Dall-E 2
While experimenting with different prompts, I decided to try the same prompt in both systems to see what images they would produce. From these experiments, I got the impression that MidJourney produces more abstract renderings whereas Dall-E is more literal. With a single-word prompt like “nothing”, MidJourney produced images that evoked the emotion or meaning behind the word, whereas Dall-E produced literal examples of it.
As you can see above, the differences are distinct. I know the AIs were trained on different models, but when working with them it is important to know where their strengths lie.
If I want to create an incredibly realistic-looking literal thing, I will use Dall-E. If I want a more abstract, concept-art-style rendering, I will use MidJourney. That’s not to say either cannot produce images like the other, but I’ve found each to be stronger at one thing, and they suit very different purposes.