language models Archives

AppWizard

June 18, 2026

‘[It] is going to change a lot about how games are made’: Epic merges Unreal Engine 5 with Unreal Engine for Fortnite to give game devs around the world Unreal Engine 6

Epic Games unveiled developments for Unreal Engine 6 at the State of Unreal event in Chicago, highlighting its evolution from Unreal Engine 5. The new engine will incorporate features from Fortnite and UEFN (Unreal Editor for Fortnite), which allows users to create game levels easily. Unreal Engine 6 will adopt open standards for tools, code, and APIs to simplify development across industries. The anticipated release is set for 2027, with early access expected by the end of that year. Verse, a new scripting language, will be central to the gameplay programming model, while C++ will remain foundational. The Scene Graph will replace the existing gameplay framework, and artificial intelligence will play a larger role, with the UE5.8 release introducing the MCP server plugin for deploying large language models.

AppWizard

June 13, 2026

Gemini 3.5 Flash lands on Google’s Android coding rankings, but it’s 3x the cost for slower performance

Google has released benchmark results for evaluating AI models in Android coding, revealing that Gemini 3.5 Flash is the most resource-intensive model on the list yet ranks only sixth overall. The benchmarks indicate that Gemini 3.5 Flash has higher latency and a 9% performance gap compared to its predecessor, Gemini 3.1 Pro Preview, despite being marketed as a faster alternative. In terms of cost, Gemini 3.5 Flash averages 355.9 tokens per benchmark run at approximately 7.1, while Gemini 3.1 Pro Preview uses only 73.3 tokens at about a third of that cost. The top-ranked models are GPT 5.5, GPT 5.4, and Gemini 3.1 Pro Preview, with Claude Opus 4.7 in fourth place. The rankings feature both open-weight and closed-weight models, and the list is otherwise unchanged since the last release, apart from the removal of GPT 5.3 Codex.
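The token and cost figures above can be put side by side with a few lines of arithmetic. The sketch below is illustrative only: the token averages are the ones quoted in the summary, while the 3x cost multiple is an assumption taken from the headline rather than from the raw leaderboard data, and the variable names are hypothetical.

```python
# Token averages as quoted in the summary above (unit as reported by the leaderboard).
FLASH_TOKENS = 355.9   # Gemini 3.5 Flash, average tokens per benchmark run
PRO_TOKENS = 73.3      # Gemini 3.1 Pro Preview, average tokens per benchmark run

# Assumption: the "3x the cost" multiple comes from the headline, not from raw data.
ASSUMED_COST_MULTIPLE = 3.0


def ratio(numerator: float, denominator: float) -> float:
    """Return how many times larger the numerator is than the denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# How many times more tokens Flash consumes per run than Pro Preview.
token_multiple = ratio(FLASH_TOKENS, PRO_TOKENS)

# If cost is about 3x while tokens are higher still, the effective
# per-token price of Flash must be lower than that of Pro Preview.
implied_price_multiple = ASSUMED_COST_MULTIPLE / token_multiple

print(f"Token multiple: {token_multiple:.1f}x")
print(f"Implied per-token price multiple: {implied_price_multiple:.2f}x")
```

Under these assumptions, Gemini 3.5 Flash uses roughly 4.9 times as many tokens per run, so a 3x cost multiple would imply a per-token price of about 0.6 times that of Gemini 3.1 Pro Preview. The higher bill would then come from token volume, not from a higher unit price.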

Winsage

June 11, 2026

Microsoft is bringing AI features to more Windows 11 PCs — just in case you were under the impression that AI was being cut back

Microsoft is testing a new feature that allows developers to implement local language models on non-Copilot+ PCs running Windows 11. The Language Model APIs can now operate on any Windows 11 device with a compatible Nvidia GPU, specifically targeting GeForce RTX 30 series and newer models with at least 6 GB of video RAM. This initiative aims to democratize access to AI capabilities across a broader range of Windows 11 PCs, although not all PCs will gain access to exclusive Copilot+ AI functionalities.
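The hardware rule described above (Nvidia GPU, GeForce RTX 30 series or newer, at least 6 GB of video RAM) can be expressed as a simple predicate. This is a minimal sketch of the stated requirement only: the `Gpu` type and `meets_requirements` function are hypothetical names for illustration and are not part of any Windows or Nvidia API.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the summary above.
MIN_RTX_SERIES = 30
MIN_VRAM_GB = 6.0


@dataclass(frozen=True)
class Gpu:
    """Hypothetical description of a GPU; not a real Windows data structure."""
    vendor: str
    rtx_series: Optional[int]  # e.g. 30 for GeForce RTX 30 series, None if not RTX
    vram_gb: float


def meets_requirements(gpu: Gpu) -> bool:
    """Return True if the GPU satisfies the requirement as described in the summary."""
    return (
        gpu.vendor.strip().lower() == "nvidia"
        and gpu.rtx_series is not None
        and gpu.rtx_series >= MIN_RTX_SERIES
        and gpu.vram_gb >= MIN_VRAM_GB
    )


if __name__ == "__main__":
    candidates = [
        Gpu("NVIDIA", 30, 6.0),   # meets both thresholds
        Gpu("NVIDIA", 20, 8.0),   # series too old
        Gpu("NVIDIA", 40, 4.0),   # not enough video RAM
    ]
    for gpu in candidates:
        print(gpu, "->", meets_requirements(gpu))
```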

Winsage

June 9, 2026

Hybrid AI Agents, Orchestration, and the Real Reason Microsoft is Fixing Windows 11 ⭐

Over the past weekend, the author reviewed session videos from Build 2026, focusing on Windows 11. The sessions were categorized into three areas: Windows app development, developer productivity, and agentic AI. The introduction of hybrid agents, which combine local AI sub-agents with cloud-based components, was a notable development. Microsoft's partnership with Qualcomm led to the launch of Windows 10 on Arm-based PCs branded as Always Connected PCs, which evolved into Copilot+ PCs with Nuvia-powered chips for Windows 11. The Copilot+ PCs emphasize on-device AI experiences through powerful neural processing units (NPUs). The introduction of second-generation Snapdragon X2-based PCs is underway, although adoption is slower than expected. Microsoft confirmed that Windows 11 will integrate AI agents, including local AI agents and hybrid agents. The capabilities of local AI models are being enhanced, and Windows AI APIs will now support CPUs and GPUs in addition to NPUs. New Nvidia chipsets will enable large language models to run locally on PCs. The orchestration of local and cloud AI components is essential for efficient AI workload management. Microsoft is addressing user pain points to support a robust local foundation for hybrid AI, aiming to improve Windows 11 overall.

Winsage

June 2, 2026

PHANTOMPULSE RAT Uses UAC Bypass to Hijack Windows Systems

PHANTOMPULSE is a remote access trojan (RAT) used as a final payload in multi-stage attacks targeting Windows environments. It employs advanced post-exploitation techniques, including process injection, User Account Control (UAC) bypass, and a decentralized blockchain-based command-and-control (C2) mechanism. The malware utilizes three process injection techniques: module stomping, manual DLL mapping, and debug-driven execution. It avoids detection by using direct system calls instead of standard Windows APIs and incorporates a hardware breakpoint mechanism to disable security protections like AMSI and ETW. PHANTOMPULSE retrieves its C2 server address from Ethereum-based transaction data but has a vulnerability that allows defenders to hijack communication. It achieves persistence through scheduled tasks and supports self-healing capabilities. Privilege escalation is done using the "schuac" UAC bypass technique. The malware conducts system reconnaissance focusing on cryptocurrency wallets and messaging apps but does not directly steal credentials. It shows signs of AI-assisted development and shares tactics with DPRK-aligned threat groups, particularly in targeting cryptocurrency platforms.

Winsage

June 2, 2026

New PHANTOMPULSE RAT Campaign Uses UAC Bypass in Windows Attacks

The REF6598 intrusion set has revealed a Remote Access Trojan (RAT) named PHANTOMPULSE, which is distributed via malicious Obsidian plugins. It uses advanced evasion techniques, including a blockchain-based command and control (C2) channel and a public User Account Control (UAC) bypass to infiltrate Windows systems. The malware disables security measures like the Antimalware Scan Interface (AMSI) and Windows Lockdown Policy (WLDP) through a hardware-breakpoint technique, allowing it to avoid detection by signature-based memory scanners. PHANTOMPULSE hides its core files in encrypted registry blobs and creates scheduled tasks that appear as .NET Framework updates. Its decentralized C2 framework queries public blockchains for operational data, but it lacks sender authentication, which can be exploited by defenders. The malware inventories the system for antivirus software and targets high-value applications. It employs a UAC bypass technique called “schuac” to gain elevated permissions. Analysts attribute this campaign to DPRK-aligned threat actors, particularly the BlueNoroff group, due to its focus on cryptocurrency wallets and blockchain exploitation. Defenders can identify new infrastructure by searching blockchain ledgers for a specific hex signature associated with the malware's C2 encryption routine.
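The defensive hunt mentioned above, searching ledger data for a known hex signature, can be sketched as a byte-level scan over transaction payloads. Everything specific here is an assumption: the signature value is a placeholder (the real one is not given in the summary), the `hash` and `input` keys are an assumed record shape, and the function names are hypothetical. Fetching transactions from a node or explorer is out of scope.

```python
from typing import Dict, Iterable, List

# Placeholder only: the actual signature is not stated in the summary above.
HYPOTHETICAL_SIGNATURE_HEX = "deadbeef"


def hex_to_bytes(value: str) -> bytes:
    """Decode a hex string, tolerating an optional 0x prefix. Raises ValueError if invalid."""
    value = value.strip().lower()
    if value.startswith("0x"):
        value = value[2:]
    return bytes.fromhex(value)


def find_matching_transactions(
    transactions: Iterable[Dict[str, str]], signature_hex: str
) -> List[str]:
    """Return the hashes of transactions whose input data contains the signature bytes."""
    needle = hex_to_bytes(signature_hex)
    if not needle:
        raise ValueError("signature must not be empty")
    matches = []
    for tx in transactions:
        try:
            payload = hex_to_bytes(tx.get("input", ""))
        except ValueError:
            continue  # skip malformed or odd-length hex instead of aborting the scan
        # Searching decoded bytes avoids false hits at half-byte (nibble) offsets.
        if needle in payload:
            matches.append(tx.get("hash", "<unknown>"))
    return matches


if __name__ == "__main__":
    sample = [
        {"hash": "0x01", "input": "0x1122deadbeef3344"},  # contains the placeholder
        {"hash": "0x02", "input": "0x11223344"},          # does not
        {"hash": "0x03", "input": "0x0deadbeef0"},        # nibble-shifted, not a byte match
    ]
    print(find_matching_transactions(sample, HYPOTHETICAL_SIGNATURE_HEX))
```

Decoding to bytes before searching is a deliberate choice: a plain substring search on the hex text would also match the third sample, where the pattern only appears shifted by half a byte.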

AppWizard

May 26, 2026

Google ranks the best AI for building Android apps, and the winner isn’t Gemini

Google launched the Android Bench benchmarking portal in March to help software developers evaluate AI models for Android app development. The leaderboard was updated last week to include open-weight models and new metrics for latency, tokens, and cost. Matthew McCullough, Google's VP of Product for Android Development, stated that the goal is to provide a benchmark for evaluating large language models (LLMs) in Android development. As of May 18, GPT 5.5 is the top AI model for Android app development, with Gemini 3.1 Pro and GPT 5.4 tied for second place. Android Bench evaluates LLMs based on real-world challenges and tasks sourced from public GitHub repositories. Other benchmarking tools in the Android ecosystem include Jetpack Microbenchmark, Jetpack Macrobenchmark, Firebase Performance Monitoring, Android Vitals, Apptim, and Android Performance Analyzer. The overall benchmark score on Android Bench is calculated using four core values: Confidence Interval Range, Average Latency Score, Average Total Tokens Score, and Average Cost. The test harness for Android Bench is publicly available on GitHub.

Winsage

May 26, 2026

Microsoft’s Copilot obsession backfired, and now it’s frantically erasing it from Windows

Microsoft has integrated its AI assistant, Copilot, into various products, including Bing and Windows 11, since early 2023. However, user dissatisfaction has led the company to shift its focus back to addressing core issues with Windows 11. Despite an aggressive rollout of Copilot across multiple platforms, it struggled to compete with specialized AI tools as users preferred solutions that could autonomously complete tasks. This resulted in backlash from users, earning Microsoft the nickname "Microslop." In response, Microsoft has initiated the "Windows K2" project to reallocate resources from Copilot to improve Windows 11, scaling back AI implementations and allowing users to customize their experience.

AppWizard

May 21, 2026

Google just tested a bunch of new AI models for Android app coding – here are the rankings

Google has updated its "Android Bench" rankings, introducing new AI models for Android app development, including open-weight models. The latest rankings, as of May 18, 2026, show GPT 5.5 at the top, surpassing GPT 5.4 and Gemini 3.1 Pro by nearly 2%. The update adds three metrics: average latency (time taken to complete 100 tasks across 10 runs), average total tokens (token consumption for a complete benchmark run across 10 iterations), and average cost (cost per benchmark run in US dollars at the time of testing). GPT 5.5 has a score of 74, with an average latency of 15.5 and total tokens of 64.5. In comparison, GPT 5.4 has a score of 72.4, with an average latency of 21.2 and total tokens of 64.2. Gemini 3.1 Pro Preview has the same score as GPT 5.4, with an average latency of 11.5 and total tokens of 75.4. GPT 5.5 costs more than twice as much as Gemini 3.1 Pro for equivalent work. The rankings also include other models like Claude Opus 4.7, GPT 5.3 Codex, and GLM 5.1, which has emerged as the highest scorer among the newly added open-weight models, closely followed by Kimi K2.6. Google plans to update the rankings monthly.
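The "nearly 2%" lead can be read either as a gap in score points or as a relative difference, and the two are not the same number. The short sketch below uses only the three scores quoted in this entry; the variable names are illustrative.

```python
# Scores as quoted in the entry above.
scores = {
    "GPT 5.5": 74.0,
    "GPT 5.4": 72.4,
    "Gemini 3.1 Pro": 72.4,
}

# Rank by score, highest first; sorted() is stable, so ties keep insertion order.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
leader, leader_score = ranked[0]
runner_up_score = ranked[1][1]

# Gap in score points versus gap relative to the runner-up's score.
lead_points = leader_score - runner_up_score
lead_relative_pct = lead_points / runner_up_score * 100

print(f"Leader: {leader}")
print(f"Lead in points: {lead_points:.1f}")
print(f"Lead relative to runner-up: {lead_relative_pct:.1f}%")
```

With these figures the lead is 1.6 points, or about 2.2% relative to the runner-up score, so "nearly 2%" most plausibly refers to the gap in points.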

AppWizard

May 20, 2026

Google now lets you vibe code native Android apps in AI Studio

Google has added the ability to create native, Kotlin-based Android applications in Google AI Studio using a prompt-based interface. Users can test apps on their devices or refine them in Android Studio. The platform integrates with the Android SDK and supports large language models (LLMs), making it accessible to non-developers. AI Studio uses technology from Gemini in Android Studio, enabling the use of mobile device features like sensors and GPS. It includes an integrated Android emulator for app previews and allows users to connect their Android phones for testing. Users can download app code as a zip file for further development in Android Studio. AI Studio connects directly to the Google Play Console for app uploads, which requires a Google Play developer account. Future updates will enable app sharing and integrations with Firebase for enhanced functionality.