Comparing DALL-E 2 and Stable Diffusion 2.0
In this post, I compare two popular text-to-image models on a single prompt
I've been writing frequently about the development of text-to-image AI models lately. This weekend, I decided to compare two popular models (DALL-E 2 and Stable Diffusion 2.0) on a single prompt.
I typed “A man in his forties eating an ice cream”. Nothing too surreal (this could be a concept for a commercial stock image). This is what I got:
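For readers who want to reproduce the query programmatically rather than through a web UI, here is a minimal sketch of how the same prompt could be sent to DALL-E 2 through OpenAI's image-generation endpoint. The helper name `dalle2_payload` is my own, and the request is only built, not sent (you would POST it with your API key); Stable Diffusion 2.0 can instead be run locally, for example via the `diffusers` library.

```python
import json

# The exact prompt used in this comparison
PROMPT = "A man in his forties eating an ice cream"

def dalle2_payload(prompt: str, n: int = 1, size: str = "1024x1024") -> dict:
    """Build the JSON body for OpenAI's image-generation endpoint
    (POST https://api.openai.com/v1/images/generations)."""
    return {"model": "dall-e-2", "prompt": prompt, "n": n, "size": size}

# Build (but do not send) the request body
payload = dalle2_payload(PROMPT)
print(json.dumps(payload, indent=2))
```

Sending this payload with an `Authorization: Bearer <API key>` header returns URLs to the generated images, one per requested sample.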
The skin texture looks realistic, but beyond that the “uncanny valley” effect is undeniable. The ice cream also looks rather rough.
This is a bit better overall, but look at the strange orange eyebrows and mismatched eyes…
Among all the renderings, this is the only image featuring a non-Caucasian man. Overall, it is not bad, though the eyes still look unnatural and mismatched.