Whisk, a new AI image generation tool from Google, lets users create images by supplying photos instead of text descriptions. The experimental tool is available in the United States through Google Labs.
Whisk Key Features
How Whisk Works
- Drag and drop images to define the subjects, scene, and style of the desired image.
- Create items like stickers, enamel pins, and digital plush toys.
- Built with Google’s Gemini AI and the Imagen 3 image generation model.
Unique Capabilities
- Lets users prompt with images rather than written text.
- Automatically analyzes uploaded images and writes detailed text descriptions of them.
- Allows creative “remixing” of visual elements.
- Offers flexibility to edit and refine generated images.
Technical Process
- Image Analysis: Gemini AI analyzes uploaded images.
- Text Captioning: Produces detailed text captions for the images.
- Image Generation: The Imagen 3 model uses these captions to generate new images.
- Creative Output: The new image reflects the “essence” of the input images rather than reproducing them exactly, so details such as height, hairstyle, or skin tone may differ.
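
The caption-then-generate flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Google's implementation: Whisk's internals and API are not described here, so every name below (WhiskInputs, build_prompt, run_pipeline, and the two stand-in functions) is hypothetical, and the captioning and generation steps are replaced with simple placeholders where a real system would call Gemini and Imagen 3.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class WhiskInputs:
    """The three image roles a user supplies (file names are placeholders)."""
    subject: str
    scene: str
    style: str


def build_prompt(captions: Dict[str, str]) -> str:
    """Merge the per-role captions into a single text prompt."""
    return (
        f"Subject: {captions['subject']}. "
        f"Scene: {captions['scene']}. "
        f"Style: {captions['style']}."
    )


def run_pipeline(
    inputs: WhiskInputs,
    caption_image: Callable[[str], str],
    generate_image: Callable[[str], str],
) -> str:
    """Caption each input image, combine the captions, then generate."""
    # Steps 1-2: image analysis and text captioning (a captioning model's job).
    captions = {role: caption_image(path) for role, path in vars(inputs).items()}
    # Step 3: the image model only ever sees text, never the original pixels.
    prompt = build_prompt(captions)
    return generate_image(prompt)


# Hypothetical stand-ins for the two models, so the sketch runs offline.
def fake_caption(path: str) -> str:
    return f"a detailed description of {path}"


def fake_generate(prompt: str) -> str:
    return f"<image generated from: {prompt}>"


result = run_pipeline(
    WhiskInputs(subject="cat.png", scene="beach.png", style="watercolor.png"),
    fake_caption,
    fake_generate,
)
print(result)
```

Because the generator in this sketch receives only the captions, anything the captions leave out cannot carry over to the output, which is one way to understand why the result captures the “essence” of the inputs rather than an exact copy.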
Availability and Limitations
- Currently available only in the United States via Google Labs.
- Designed for rapid visual exploration, not pixel-perfect editing.
- Targeted at creators, designers, and anyone interested in AI-powered image generation.
Important Note: According to Google, Whisk is a “creative tool” meant to let people explore visual ideas quickly and freely, encouraging a playful approach to AI image generation.