Google's ImageFX and Whisk

DECEMBER 17, 2024

The AI image generation landscape is experiencing explosive growth, with nearly 15.5 billion AI-generated images created to date and approximately 34 million new images being generated daily. Google has introduced two powerful tools in this space: ImageFX and Whisk. What's particularly fascinating is how these tools produce remarkably similar results from identical prompts, suggesting a sophisticated underlying model that consistently interprets prompts and generates images in a highly structured way. The tests below reuse my reliable test prompts, and you will quickly see where Google has added limitations. No real humans can be generated. And don't even think about trying to remix Disney's IP. You can get around both limits by just generating a human in a rodent costume or a South American soccer player.

Castle

Neuschwanstein castle, lightning, pixar style, volumetric lighting, unreal engine, hyper realistic, hyper detailed, maximum details, photorealistic, 8k, rimlight

ImageFX Interpretation

Whisk Interpretation

Family

Family on their laptops while sitting around a Christmas tree with presents underneath and looking worried because they have to finish up work, realistic, 4k

ImageFX Interpretation

Whisk Interpretation

Mickey

Evil mickey mouse taking a selfie in Disneyland surrounded by shocked families, realistic, 70s style polariod, 8k

ImageFX Interpretation

Whisk Interpretation

Messi

Lionel messi wearing his Argentina uniform floating in the air with beams of light behind him posed like Jesus. Sunrise breaking behind him and a soft halo behind his head, in the style of a gothic stained glass window of a church, volumetric lighting, unreal engine, hyper realistic, hyper detailed, maximum details, photorealistic, 8k, rimlight, maximum details

ImageFX Interpretation

Whisk Interpretation

Robot

rusty robot with bow tie, portrait, 8k, ultra realism, chrome background

ImageFX Interpretation

Whisk Interpretation

What's particularly intriguing about these results is the remarkable consistency between ImageFX and Whisk outputs. When given identical prompts, both tools generate images that are strikingly similar in composition, structure, and overall interpretation. This unusual level of consistency suggests that both tools are leveraging the same sophisticated underlying AI model, which appears to have a very structured approach to interpreting and generating images.

The examples above demonstrate this consistency across a wide range of prompts, from architectural scenes with Neuschwanstein Castle to complex character compositions with the Mickey Mouse scenario, and from emotional family scenes to stylized portraits. In each case, both tools produce results that share remarkably similar layouts, lighting, and compositional elements.
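If you want to put a number on that similarity rather than eyeball it, one option is a perceptual hash. Below is a minimal sketch of an average hash in Python. It assumes each image has already been downscaled to a small grayscale grid; the function names and the toy 4x4 grids are hypothetical stand-ins, not real outputs, and nothing here is an API that ImageFX or Whisk exposes.

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is brighter than the grid's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Count the positions where two equal-length hashes differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


def similarity(pixels_a, pixels_b):
    """Fraction of matching hash bits, from 0.0 (none) to 1.0 (all)."""
    bits_a = average_hash(pixels_a)
    bits_b = average_hash(pixels_b)
    return 1 - hamming_distance(bits_a, bits_b) / len(bits_a)


# Hypothetical 4x4 grayscale grids (0-255) standing in for two downscaled outputs.
imagefx_grid = [
    [200, 210, 40, 30],
    [190, 220, 50, 20],
    [60, 70, 180, 200],
    [40, 30, 210, 230],
]
whisk_grid = [
    [195, 205, 45, 35],
    [185, 60, 55, 25],
    [65, 75, 175, 190],
    [45, 35, 205, 225],
]

print(similarity(imagefx_grid, whisk_grid))  # 0.9375 (15 of 16 bits match)
```

An average hash only captures the coarse light and dark layout, which is roughly what "similar composition" means here. It says nothing about style or fine detail, so treat the score as a rough signal rather than a verdict.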

This consistency between ImageFX and Whisk represents a significant achievement in AI image generation technology. It suggests that Google has developed a highly reliable image generation model that interprets and renders complex prompts in a predictable way. This level of reliability and predictability could be particularly valuable for professional applications where consistent results are crucial.

You can try both tools yourself at labs.google/fx/tools/image-fx for ImageFX and labs.google/fx/tools/whisk for Whisk. The similarity in results between these tools showcases the robustness of Google's approach to AI image generation, setting a new standard for consistency and reliability in the field.