Comparing Image Generation Tools

Overview

Intro

As AI-generated images have rapidly evolved in the past few years, the tools used to create them have developed just as quickly. Nearly all AI image generation tools can create amazing images specific to a user’s query. So which tool should be used? There is no silver bullet; each image generation tool comes with tradeoffs. This post highlights the important differences between four of the most popular tools: Bing Image Creation, Midjourney, Stable Diffusion WebUI, and ComfyUI. Bing Image Creation and Midjourney offer a streamlined experience, while Stable Diffusion WebUI and ComfyUI allow users to dig deeper into image generation.

What are your thoughts on AI image generation tools? Share them in the comments below the post.

Quick Comparison

A high-level view of some of the key differences between the tools:

Category	ComfyUI	SD WebUI	Midjourney	Bing
Configurability	Very High	High	Medium	Low
Learning Curve	Steep	Medium	Simple	Very Simple
Free				Free w/ time quota
Opensource & Extensions
Prompts Unrestricted

Tool by Tool Breakdown

Bing Image Creation

Bing image creation

Bing Image Creation is a very simple way to get started generating AI images. Using DALL-E 3 under the hood, this service is extremely good at understanding user prompts. With other types of AI image generation tools, there is often a disconnect between what the text prompt states and what the image generation outputs. While clever prompt engineering can help alleviate this disconnect, DALL-E 3 excels at text interpretation. This makes it very simple to produce distinct styles, as can be seen at https://designer.microsoft.com/image-creator. Simplicity is a key strength of this tool. If you are just starting out with creating AI images, Bing Image Creation is a great place to start. Its simple interface more than gets the job done for many use cases. For example, generating a fun cartoon image and sharing it on social media is simple and quick with Bing Image Creation. However, if you would like to dig in and gain more control over the image generation, the simple interface is more limiting than the other options presented.

Midjourney

Midjourney is a paid service that is able to turn prompts into breathtaking AI art. At first glance this service may look similar to Bing Image Creation, but there are a number of important differences. One key difference is that Midjourney is more geared towards creating AI art than Bing Image Creation. While Bing is more easily able to understand prompts, Midjourney is generally able to create more realistic and artistic images, as seen below at the Midjourney Community Showcase.

Midjourney community showcase

Midjourney also offers more tools when creating images. One such tool is keyword prompting, as seen below. Midjourney also features full-fledged documentation.

Midjourney documentation

A primary strength of Midjourney is that it allows users to create incredible AI artwork without having to deal with much of the tinkering and complexity of an open source tool. You get more control over the output compared to simpler tools such as Bing Image Creation, at the cost of more complex text prompting and slightly less consistent results in image generation.

Stable Diffusion WebUI

Stable Diffusion WebUI is the most popular open source AI image generation tool. As opposed to the previous options, Stable Diffusion WebUI is typically installed on a user’s own system. While this is a less convenient option, it gives users more control and more options. One such added capability is ControlNet support. ControlNets can directly control aspects of image generation such as poses, colors, and style.

So why is Stable Diffusion WebUI the most popular open source AI image generation framework? The ingenuity and simplicity of the interface. Despite the relatively simple interface, users can generate images with the precision they wish through extensions and configurations not available in closed source tools. The catch is that it’s more difficult and time-consuming to learn the ropes than with the closed source options mentioned. With some effort, users can match the image generation quality of closed tools.

Civit AI is a community that shares both image generation models and configurations, making it easy to get started with Stable Diffusion WebUI. For example, in the image below a user has shared an image generated with Stable Diffusion WebUI. The post also includes the exact generation settings, which can be copied and pasted into Stable Diffusion WebUI. This allows anyone to replicate the image creation exactly for free.

Civit AI Post
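To make the copy and paste idea more concrete, here is a minimal Python sketch that splits a block of shared generation settings into its parts. It assumes the text follows the layout commonly seen in shared Stable Diffusion WebUI posts: the prompt first, an optional line starting with "Negative prompt:", and a final line of comma-separated "Key: value" pairs beginning with "Steps:". The function name `parse_generation_settings` and the sample values are hypothetical, and settings whose values contain commas are not handled.

```python
def parse_generation_settings(text: str) -> dict:
    """Split shared generation text into prompt, negative prompt, and settings.

    Assumed layout (not guaranteed for every post):
      <prompt, possibly several lines>
      Negative prompt: <negative prompt>
      Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 12345, Size: 512x768
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]

    # Assumption: the settings are on the last line and start with "Steps:".
    settings_line = ""
    if lines and lines[-1].startswith("Steps:"):
        settings_line = lines.pop()

    prompt_lines, negative_lines = [], []
    in_negative = False
    for line in lines:
        if line.startswith("Negative prompt:"):
            in_negative = True
            line = line[len("Negative prompt:"):].strip()
        (negative_lines if in_negative else prompt_lines).append(line)

    # Naive split: breaks if a value itself contains ", ".
    settings = {}
    if settings_line:
        for item in settings_line.split(", "):
            key, _, value = item.partition(": ")
            settings[key.strip()] = value.strip()

    return {
        "prompt": "\n".join(prompt_lines),
        "negative_prompt": "\n".join(negative_lines),
        "settings": settings,
    }


# Hypothetical example text, written for illustration only.
example_text = """a watercolor fox in a forest
Negative prompt: blurry, low quality
Steps: 20, Sampler: Euler a, CFG scale: 7, Seed: 12345, Size: 512x768"""

parsed = parse_generation_settings(example_text)
print(parsed["prompt"])
print(parsed["settings"]["Seed"])
```

Reusing the same seed, sampler, and model is what makes the result reproducible, which is why shared posts tend to list all of them together.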

ComfyUI

The last tool covered in this article is ComfyUI. This is another open source tool with most of the same benefits and drawbacks that Stable Diffusion WebUI has. As the image above suggests, the main difference between Stable Diffusion WebUI and ComfyUI is the interface choice. ComfyUI’s interface is more complex and can quickly get convoluted if you are not careful. At first glance, the added complexity doesn’t seem worth it for the end user. However, building AI image workflows through ComfyUI allows a deeper understanding and configuration of AI image generation. For many advanced use cases ComfyUI is an excellent option. For example, automatically generating a mesh for a face can easily be done with ComfyUI and the Impact Pack extension, as can be seen below.

Mediapipe workflow

Additionally, the node interface of ComfyUI offers increased performance when reusing models for image generation. ComfyUI is an excellent choice when users want to use AI image generation as a fine-point tool, as opposed to more casual use cases.
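One way to picture why a node graph can help when reusing models is a toy executor that caches each node's output and only re-runs nodes whose settings or inputs changed. The Python sketch below is an illustration of that general idea only. It is not ComfyUI's actual code, and the names `Node`, `CachingExecutor`, `load_model`, and `sample` are all hypothetical.

```python
class Node:
    """A step in a workflow: a function, its upstream nodes, and its settings."""

    def __init__(self, name, func, inputs=(), **params):
        self.name = name
        self.func = func
        self.inputs = tuple(inputs)
        self.params = params

    def signature(self):
        # Identifies this node by its name, settings, and everything upstream.
        return (
            self.name,
            tuple(sorted(self.params.items())),
            tuple(dep.signature() for dep in self.inputs),
        )


class CachingExecutor:
    """Runs nodes and remembers results so unchanged nodes are not re-run."""

    def __init__(self):
        self.cache = {}
        self.log = []  # names of nodes that actually executed

    def run(self, node):
        key = node.signature()
        if key in self.cache:
            return self.cache[key]
        args = [self.run(dep) for dep in node.inputs]
        result = node.func(*args, **node.params)
        self.cache[key] = result
        self.log.append(node.name)
        return result


def load_model(checkpoint):
    # Stand-in for an expensive model load.
    return f"weights<{checkpoint}>"


def sample(model, prompt, seed):
    # Stand-in for image generation.
    return f"image({model}, {prompt!r}, seed={seed})"


loader = Node("load_model", load_model, checkpoint="example.safetensors")
first = Node("sample", sample, inputs=[loader], prompt="a red fox", seed=1)
second = Node("sample", sample, inputs=[loader], prompt="a blue fox", seed=1)

executor = CachingExecutor()
executor.run(first)
executor.run(second)
print(executor.log)  # the model loader appears once, the sampler twice
```

In this sketch, changing only the prompt re-runs the sampling step while the loaded model is served from the cache, which is the kind of saving the node-based approach makes easy to reason about.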