I tried Copilot Vision, and it could change how you use Windows forever

Adding visual capabilities to artificial intelligence presents a unique set of challenges. The question often arises: should AI observe our every action? While many would prefer a more selective approach, the potential for AI to assist visually when needed is undeniably appealing. Microsoft’s latest offering, Copilot Vision, stands out as a remarkable advancement in AI-driven visual assistance.

Unveiling Copilot Vision

During an event celebrating both the launch of Copilot and Microsoft’s 50th anniversary, the company introduced the Copilot Vision update for its Windows and mobile applications. The feature lets users point their camera at objects so the AI can identify them in real time. Copilot, which is built on OpenAI’s GPT generative models, also received updates across memory, search, personalization, and visual capabilities.

Having watched Copilot Vision in action, we think this update is among the most significant in the suite, even though it will roll out in two phases.

(Image credit: Future / Lance Ulanoff)

Currently accessible through the Windows Desktop app, Copilot Vision can recognize the applications you are using. After activating Copilot, either by clicking the icon or pressing the designated key, you can select the new eyeglasses icon. Doing so reveals a list of open applications; in our case, Blender 3D and Clipchamp were running at the same time. Importantly, while Copilot knows which apps are open, it does not continuously monitor them.

(Image credit: Future / Lance Ulanoff)

Upon selecting Blender 3D, the experience transformed. Copilot demonstrated an impressive ability to discern the application in use, responding intelligently based on the specific project. For instance, while working on a 3D coffee table design, we inquired about making the design more traditional. Despite providing minimal context, Copilot’s response was contextually rich and relevant.

When we shifted our focus to adding annotations in the app, Copilot seamlessly adapted its response, guiding us to the icon for adding annotations without disrupting our workflow. This capability could significantly enhance productivity, allowing users to maintain their focus without the need to pause for external searches or lengthy explanations.

Looking Ahead

(Image credit: Future / Lance Ulanoff)

In another demonstration, we directed Copilot Vision to our Clipchamp project and asked how to enhance video transitions. Rather than responding with text, Copilot Vision pointed to the transitions tool with an animated arrow and guided us through the necessary steps. This feature is still in development and did not always work flawlessly, but its potential to change how users interact with applications is evident.

(Image credit: Future / Lance Ulanoff)

Demonstrations of upcoming features have hinted at Copilot Vision’s ability to go deeper into applications like Photoshop, pointing out tools with precision. This evolution of AI assistance could redefine how users interact with software, offering a more intuitive and visually guided experience. The current version of Copilot Vision is already available and can recognize the applications and projects in use, but the more advanced features have no specific timeline yet. Still, after seeing them work in real time, we are eager for them to arrive.
