Google has unveiled Whisk, its latest AI experiment for Labs testers, which takes a different approach to image generation. The tool lets users “prompt with images”: rather than writing a long text prompt, they upload images that define the subject, the scene, and the style. Using Gemini and Imagen 3, Whisk extracts the essential characteristics of the uploaded images, offering an alternative to traditional text-based prompts.
Whisk is designed not to replicate the uploaded images exactly. The generated result may differ in attributes such as the subject's height, hairstyle, and skin tone. Because these details can matter to users, Whisk includes a “review and edit” feature that lets them refine the output to better suit their vision. Once an image is generated, Whisk also produces a detailed description of how it was created. The tool is meant to make image generation faster and to encourage users to remix their creations for further inspiration.
As of December 16, U.S. Labs testers can sign up to explore Whisk and its capabilities.
New Veo and Imagen Versions
In addition to Whisk, Google has announced new versions of its Imagen and Veo models: an updated Imagen 3 and Veo 2. The latest iteration of Imagen 3 promises “brighter” and “better composed” images that follow user prompts more closely and render richer detail. The update is rolling out globally in ImageFX within Google Labs.
These enhancements come several months after Imagen 3’s quiet launch in the U.S., where it gained visibility through Google’s Vertex AI platform. Google says it remains committed to safety and has implemented measures to prevent the generation of illegal or offensive imagery.
Meanwhile, Veo 2 broadens the capabilities of Google’s AI video generator. The new version lets users create “incredibly high-quality” videos across a range of subjects and styles, at resolutions up to 4K. Users can give detailed shot descriptions, such as specifying an “18mm lens,” and Veo 2 will produce the corresponding wide-angle shot. The model also has a better understanding of expressions and movement, and it is designed to hallucinate unwanted details less often.
Veo 2 is rolling out today in Google Labs’ VideoFX, with the pool of testers expanding gradually. Interested users can join the waitlist for access. Looking ahead, Google has hinted at plans to bring Veo 2 to YouTube Shorts and other products in the coming year, an integration it has discussed previously.