multimodal capabilities

AppWizard
December 8, 2025
Last week, a demonstration of Android XR glasses took place at Google's Hudson River office, showcasing features such as visual assistance and gyroscopic navigation. The glasses are being distributed as part of a developer kit for Android developers, and Google aims to integrate them with Android phones and smartwatches by 2026. The AI glasses strategy covers two types of device: one focused on audio and camera features, and another incorporating a display for visual cues. Developer Preview 3 of the Android XR SDK is set to launch soon, supporting a wide range of existing third-party Android apps. The glasses can display navigation routes and driver information for Uber rides, and the Gemini assistant provides contextual information as soon as the glasses are put on. The Samsung Galaxy XR headset gains new features like PC Connect and travel mode, while Xreal's Project Aura glasses offer a 70-degree field of view and access to Android apps. The anticipated price for Project Aura could be around ,000, with a potential launch late next year.
AppWizard
November 24, 2025
Gemini 3 Pro has reclaimed the top of the AI benchmarks, outperforming previous models and leading the LMArena and WebDev Arena leaderboards. It introduces a new AI Mode in Google Search that enhances user interactions with multimodal responses. The model features advanced reasoning and independent coding capabilities, enabling it to tackle complex queries and manage multi-step tasks autonomously. Gemini 3 Pro achieved a score of 1,501 points on the LMArena leaderboard, surpassing Grok 4.1 Thinking, and scored 37.5% on Humanity's Last Exam without tool use. In the WebDev Arena it holds an Elo score of 1,487, and it excels in various coding benchmarks. Gemini 3 is available in Google AI Studio, Vertex AI, and the Gemini CLI, and serves as the foundation for Google Antigravity, an AI-powered integrated development environment. It is currently rolling out to Google's consumer products, including Search and the Gemini app, with plans for broader availability in the U.S.
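For developers reaching the model through Google AI Studio or Vertex AI, access goes through the google-genai Python SDK. Below is a minimal sketch, assuming an API key from Google AI Studio; the "gemini-3-pro-preview" model identifier is an assumption and may differ by platform and release stage.

```python
# pip install google-genai -- a sketch, not a definitive integration.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

# The model name below is an assumption; check the model list in AI Studio.
response = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Outline a multi-step plan to refactor a small web app.",
)
print(response.text)
```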
Winsage
August 6, 2025
NVIDIA has partnered with OpenAI to optimize the gpt-oss models for NVIDIA GPUs, enabling rapid inference and supporting millions of users on NVIDIA RTX AI PCs. The gpt-oss-20b and gpt-oss-120b models, trained on NVIDIA H100 GPUs, are open-weight reasoning models that can handle context lengths of up to 131,072 tokens. Users can run these models through frameworks like Ollama, which provides a user-friendly interface for experimentation. The models are optimized for RTX GPUs and support applications such as web search and coding assistance. Developers can also access the models via Microsoft AI Foundry Local and other frameworks, with NVIDIA contributing to open-source projects to improve performance.
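As a concrete illustration of the Ollama route, here is a minimal sketch using the official ollama Python client; it assumes a local Ollama server is running and that the model has already been pulled (e.g. `ollama pull gpt-oss:20b`).

```python
# pip install ollama -- a sketch of local chat with an open-weight model.
import ollama

# Sends one chat turn to the locally served gpt-oss-20b model.
response = ollama.chat(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Explain what an open-weight model is."}],
)
print(response["message"]["content"])
```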
AppWizard
August 6, 2025
Samsung is expanding its One UI 8 Beta starting August 11, initially for the Galaxy S24 series, Galaxy Z Flip 6, and Galaxy Z Fold 6 in markets including India, Korea, the U.K., and the U.S. In September, the beta will expand to the Galaxy S23 series, Z Fold 5, Z Flip 5, Tab S10 series, Galaxy A36 5G, and Galaxy A35 5G. Samsung also plans to extend One UI 8 to more Galaxy Watch models later this year. The update will feature advanced multimodal capabilities, enhanced Galaxy AI functionality, and updates to core apps like Samsung Health and Calendar, along with a refreshed Quick Share UI and improved accessibility options. Users must sign up through the Samsung Members app to participate in the beta program.
AppWizard
May 28, 2025
The One UI 8 beta is now available for Galaxy S25 models in select regions, featuring enhanced AI capabilities, a user experience tailored to different device types, and proactive suggestions. The rollout is limited to regions including Germany, Korea, the U.K., and the U.S., and excludes the Galaxy S25 Edge; a stable version is expected to launch with new foldable devices this summer. Key features include multimodal capabilities, enhanced Now Bar and Now Brief features, local data processing options, and improvements to the Auracast feature. The Reminder app will consolidate tasks into a single interface, Quick Share will receive enhancements, and multitasking is improved. Additional features include improved file search, a redesign of Samsung Internet, new Calendar features, better accessibility options, and social health management through Samsung Health. More features may be revealed as the beta progresses.
AppWizard
May 20, 2025
Google's advancements in AI Mode and Search include the rollout of a custom version of the Gemini 2.5 model, enabling users to pose complex queries. Project Astra introduces multimodal capabilities, allowing real-time conversations with Gemini using device cameras. The AI can process website elements and perform tasks like adding items to shopping carts or creating travel itineraries. Users can ask AI Mode to find specific tickets or activities tailored to their interests, with curated suggestions provided. AI Mode will enhance online shopping by integrating with the Shopping Graph, offering visually engaging product listings and a virtual "try-on" feature. Users can upload photos to experiment with outfits and track prices for products. The virtual try-on experience is being rolled out in Search Labs for U.S. users, with full availability of AI Mode in the U.S. starting today.
AppWizard
May 2, 2025
Gemini Live has transitioned from a voice-based AI assistant to a multimodal platform that can process camera feeds and screen-sharing inputs, enhancing user interactions with visual context. It requires an Android device with at least 2 GB of RAM and Android 10 or later, along with a Google One AI Premium subscription for access to camera and screen-sharing features. These features are complimentary for Google Pixel 9 and Samsung Galaxy S25 users, and newer Pixel devices may offer a trial for Gemini Advanced. To share a live video feed, users must launch Gemini, tap the Live icon, select the Camera button, and ensure the desired items are visible. For screen sharing, users open the relevant app or screen, activate Gemini, and select Share screen with Live. Gemini can summarize content and answer questions based on the shared screen. The multimodal capabilities are particularly beneficial for scenarios requiring detailed descriptions, positioning Gemini Live competitively alongside other AI platforms.
AppWizard
April 14, 2025
Google is experimenting with a redesign of its AI Mode search launcher, integrating it into the main Search bar of its Android app to encourage engagement with its AI functionalities. In the updated interface, users tap the Search bar to enter AI Mode, while the Lens and voice icons are removed. AI Mode was first introduced in early March for a select group of early-access users. The AI Mode entry point remains distinct from the traditional Search function and aims to help users understand complex subjects through advanced reasoning and analytical capabilities, delivering succinct answers with relevant links and more detailed explanations when needed. Google plans to integrate its Gemini technology into AI Mode to add multimodal capabilities for more comprehensive responses to complex queries.
AppWizard
February 4, 2025
AI-powered search engine Perplexity has launched a new feature called Perplexity Assistant, available for Android devices, which integrates reasoning, search capabilities, and app functionality. The assistant can perform multi-app actions, such as hailing rides and searching for songs, and can set reminders by creating calendar entries. It uses the phone's camera for contextual inquiries and maintains context across tasks, such as researching restaurants and then making reservations. The assistant is initially free and available in 15 languages. CEO Aravind Srinivas acknowledged that some features may not perform as expected and said improvements are planned. Perplexity has also introduced Sonar, an API service for enterprises, and acquired Read.cv. Founded in 2022, Perplexity has raised over 0 million in funding and processes over 100 million queries weekly. The company faces legal challenges from publishers, including lawsuits from News Corp and a cease-and-desist order from The New York Times, but emphasizes its commitment to respecting publisher content through a revenue-sharing program.
AppWizard
February 3, 2025
Perplexity AI has launched a new Android app called the Perplexity Assistant, available on the Google Play Store. The app is designed to assist with various tasks through voice, text, and camera interactions, and it can converse in 15 languages. It utilizes Perplexity’s proprietary search engine to provide real-time web information and maintain context across multiple tasks. Users can perform activities such as booking rides, identifying objects, and making restaurant reservations through voice commands. The app is free and aims to integrate Perplexity’s AI into users' daily workflows. Perplexity has also introduced an API called Sonar for businesses and acquired the professional social media platform Read.cv.
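For businesses evaluating the Sonar API, the sketch below shows one plausible call pattern using Perplexity's OpenAI-compatible chat completions endpoint; the endpoint URL and the "sonar" model name reflect Perplexity's public API documentation and may change, so verify them before use.

```python
# pip install requests -- a minimal sketch, not a definitive integration.
import requests

resp = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
    json={
        "model": "sonar",  # model name per public docs; an assumption here
        "messages": [{"role": "user", "content": "What changed in One UI 8?"}],
    },
    timeout=30,
)
resp.raise_for_status()
# The response follows the OpenAI chat completions shape.
print(resp.json()["choices"][0]["message"]["content"])
```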