“If it’s this easy, why don’t more Windows apps use a PC’s NPU?” — Microsoft MVP demonstrates how he added meaningful AI to an app in just 10 minutes

AI PCs equipped with specialized Neural Processing Units (NPUs) are increasingly becoming a staple in the tech landscape. These devices are not just about enhanced performance; they open the door to innovative applications that leverage on-device AI capabilities. While the Copilot+ tools integrated into Windows 11 serve as a primary example of utilizing the NPU, a recent exploration into third-party applications reveals a broader spectrum of possibilities.

In a recent discussion, Microsoft MVP Lance McCarthy emphasized just how easy Microsoft’s Windows AI APIs make it to integrate AI into applications. According to McCarthy, these APIs let developers harness the power of the NPU without the complexities typically associated with cloud APIs or custom models, a simplicity that is particularly appealing for developers looking to add AI features to their apps.

As outlined in McCarthy’s blog, the only prerequisite for using these APIs is that the user has a Copilot+ PC with a capable NPU, a requirement many people already meet, particularly those who have bought a new Copilot+ laptop in the past year.

Adding AI to a Windows app is way easier than I thought

McCarthy highlights several AI tools that are readily available for developers. Among these is Phi Silica, a local language model that brings many Large Language Model (LLM) features to the desktop environment. Another notable tool is AI Text Recognition, which excels in optical character recognition (OCR), enabling the conversion of physical documents into searchable text.
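To give a flavor of how lightweight these APIs are, here is a minimal C# sketch of calling Phi Silica through the Windows App SDK. It is based on the experimental `Microsoft.Windows.AI` surface as documented at the time of writing; namespaces and member names have shifted between SDK releases, so treat this as illustrative rather than authoritative, and note it only runs on a Copilot+ PC.

```csharp
// Hedged sketch: prompting the on-device Phi Silica model via the
// Windows App SDK's Windows AI APIs. Requires a Copilot+ PC; type and
// member names follow the experimental docs and may differ per release.
using Microsoft.Windows.AI;
using Microsoft.Windows.AI.Text;

// Ensure the local model is present (triggers a download if needed).
if (LanguageModel.GetReadyState() != AIFeatureReadyState.Ready)
{
    await LanguageModel.EnsureReadyAsync();
}

using LanguageModel model = await LanguageModel.CreateAsync();

// The prompt runs entirely on-device, on the NPU-backed local model.
var result = await model.GenerateResponseAsync(
    "Summarize this comic's alt text in one sentence.");
Console.WriteLine(result.Text);
```

There is no endpoint, API key, or model file to manage, which is the core of McCarthy’s point: the heavy lifting is handled by Windows itself.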

Additionally, AI Imaging tools offer functionalities such as image scaling, sharpening, and object extraction, while Windows Studio Effects can be seamlessly integrated into applications to enhance camera and audio quality.

One of the standout examples of these tools in action is McCarthy’s own Xkcd Viewer app, which allows users to browse xkcd comics with features for saving and sharing. Recognizing the need for accessibility, McCarthy incorporated AI-powered image descriptions, transforming the experience for visually impaired users. He notes that simply adding text readouts would not do justice to the artistic nuances of the comics, which rely heavily on visual elements.

This is a perfect use case for the Image Description service, which understands the context of the image and then describes it in a way that can be entertaining for a vision-impaired user! It tries conveying the comedy behind the image, which is better than a plain screenreader.

Lance McCarthy
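As a rough illustration of what such an integration might look like, here is a hedged C# sketch using the Windows App SDK’s `ImageDescriptionGenerator`. The enum value and the property carrying the returned text are assumptions based on the experimental documentation, and this is not necessarily how McCarthy’s own Xkcd Viewer is written.

```csharp
// Hedged sketch: generating an accessibility description for an image
// with the Windows App SDK's Image Description service. API names are
// assumptions and may vary between experimental SDK releases.
using Microsoft.Windows.AI.Imaging;
using Windows.Graphics.Imaging;

async Task<string> DescribeComicAsync(SoftwareBitmap bitmap)
{
    var generator = await ImageDescriptionGenerator.CreateAsync();

    // Wrap the decoded comic frame in the buffer type the API expects.
    ImageBuffer buffer = ImageBuffer.CreateCopyFromBitmap(bitmap);

    // Ask for a detailed, context-aware description of the image.
    var response = await generator.DescribeAsync(
        buffer, ImageDescriptionKind.DetailedDescription);

    return response.Text; // property name may differ across SDK versions
}
```

The returned string can then be attached to the comic view as its automation/accessibility name, so screen readers announce the generated description instead of a bare image label.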

McCarthy’s modification of the app took a mere ten minutes but added significant value, showcasing how accessible AI can enhance the user experience. For those interested in delving deeper into the development process, McCarthy’s blog offers a wealth of information, making it an engaging read for seasoned developers and novices alike.

