video generation

AppWizard
February 18, 2025
Google is exploring the incorporation of video generation capabilities into its AI-powered digital assistant, Gemini, as indicated by references to "videogen" found in the code of Google app version 16.6.23. The term "robin" is also linked to this feature, suggesting a connection to Gemini's existing capabilities. Google currently offers video generation through its Google Vids platform, which helps users turn concepts into videos, but integrating video generation directly into Gemini could streamline that experience. The feature is not yet live, and its release timeline is uncertain.
Winsage
December 26, 2024
Copilot+ PCs are the first personal computers to run small language models (SLMs) directly on-device, allowing for quicker interactions without relying on the cloud. Microsoft has introduced the AI Dev Gallery, which offers over 25 samples to help developers integrate on-device AI features into applications on Windows 10 and 11. The gallery requires building the project in Visual Studio, with at least 20GB of free storage and a multi-core CPU; a GPU with 8GB of VRAM is recommended for heavier models but is not mandatory for lighter ones. The app has two operational modes: Sample and Models. Testing image-generation models typically involves around 5GB of downloads, while a smaller image-upscaling model under 100MB was tested successfully, completing the process in under 30 seconds with peak RAM usage of 1GB. The resulting image resolution was 9272x4900, but clarity issues were noted, especially with text. The application lacks features for previewing images at larger sizes or downloading outputs directly. A model named Detect Human Pose was able to identify body positions within images, including desktop screenshots. Running these models effectively demands substantial storage and a capable CPU, and the practicality of downloading large models for niche use cases remains questionable.
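As a rough illustration of the on-device inference pattern these samples demonstrate, the sketch below runs a small image-upscaling model locally with Python's onnxruntime; the model file path, the single-output assumption, and the tensor layout are hypothetical stand-ins, not the gallery's actual C# samples.

```python
# Illustrative sketch of on-device image upscaling with onnxruntime.
# Assumptions: a local super-resolution ONNX model (the path is hypothetical)
# that takes a float32 NCHW RGB tensor and returns a single upscaled tensor.
import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("models/upscaler.onnx",
                               providers=["CPUExecutionProvider"])

img = Image.open("input.png").convert("RGB")
x = np.asarray(img, dtype=np.float32) / 255.0          # HWC, values in 0..1
x = np.transpose(x, (2, 0, 1))[np.newaxis, ...]        # reshape to NCHW

input_name = session.get_inputs()[0].name
(y,) = session.run(None, {input_name: x})              # inference runs entirely on-device

y = np.clip(np.squeeze(y, 0), 0.0, 1.0)
out = Image.fromarray((np.transpose(y, (1, 2, 0)) * 255).astype(np.uint8))
out.save("upscaled.png")
print("Output resolution:", out.size)
```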
AppWizard
November 17, 2024
A new online game titled Oasis has emerged, utilizing artificial intelligence (AI) to generate every frame, creating a unique experience similar to Minecraft. Players explore a 3D world made of square blocks, mining resources and crafting items, with the environment evolving in real-time based on their actions. Players can also upload their own images to seed personalized scenes. The game engine has been trained on millions of hours of gameplay, learning to replicate actions like moving and breaking blocks. Players may encounter peculiarities such as unexpected items appearing in their inventory and a landscape that warps when not directly observed. The developers view Oasis as an exploration of AI's potential in gaming, with aspirations for AI-generated content that adapts to user preferences in real-time. Oasis is currently available for play, though players may need to join a queue.
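To make that mechanism concrete, here is a heavily simplified, hypothetical sketch of the action-conditioned, frame-by-frame loop such a system implies; WorldModel and read_action are illustrative placeholders rather than Decart's actual code, and the short frame history hints at why areas off-screen can warp.

```python
# Conceptual sketch of an action-conditioned, frame-by-frame world model loop
# of the kind Oasis describes. WorldModel, its dummy output, and read_action()
# are hypothetical stand-ins, not the real Oasis API.
import numpy as np

class WorldModel:
    """Placeholder for a model that predicts the next frame from
    a short history of recent frames plus the player's latest action."""
    def predict_next_frame(self, frames: list, action: str) -> np.ndarray:
        return np.zeros((360, 640, 3), dtype=np.uint8)  # dummy frame

def read_action() -> str:
    return "move_forward"  # in a real client this comes from keyboard/mouse input

model = WorldModel()
frames = [np.zeros((360, 640, 3), dtype=np.uint8)]      # seed frame (or an uploaded image)

for step in range(200):                                  # ~10 seconds at 20 fps
    action = read_action()
    next_frame = model.predict_next_frame(frames[-8:], action)
    frames.append(next_frame)                            # only a short history conditions the
                                                         # next frame, so unseen areas can drift
```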
AppWizard
October 31, 2024
Decart and Etched have developed a unique version of Minecraft that features real-time content generation, allowing players to experience unexpected transformations in the game environment due to AI "hallucinations." This innovation is powered by a model called Oasis, trained on extensive Minecraft gameplay data, enabling the AI to understand game mechanics without traditional coding. The current demo has limitations such as low resolution and short play sessions, but the companies are optimistic about future enhancements through advancements in chip design. Etched is working on a new chip that aims to improve performance by tenfold, which could lead to longer gameplay, fewer hallucinations, and better resolution. This chip is designed specifically for AI applications, focusing on inference rather than training. Despite skepticism about achieving the projected performance gains, Decart and Etched envision the potential for creating real-time virtual assistants like doctors or tutors. A demo of their AI-generated Minecraft experience is available for public exploration.
AppWizard
October 7, 2024
A user named Indiegameplus showcased an AI-generated remaster of Grand Theft Auto IV on the aivideo subreddit, featuring realistic graphics created from text prompts. The video, produced with Runway's Gen-3 Alpha model, shows a character moving through a hyper-realistic urban environment. This technology highlights the potential for AI in gaming, suggesting that future game development may increasingly integrate AI-driven graphics pipelines. Nvidia has indicated that future versions of its Deep Learning Super Sampling (DLSS) technology will include neural rendering capabilities, potentially transforming the gaming landscape.
Winsage
October 4, 2024
Flux from Black Forest Labs is an AI image generation model that has gained popularity for its high-quality visuals since launch. Previously available only online because of its resource demands, Flux can now run locally thanks to Forge, which lets users download the model for offline use. Forge is a refined version of the Automatic1111 Stable Diffusion package, featuring a simplified installation process via the Pinokio launcher for Windows. The application offers a user-friendly Gradio interface for customizing image parameters and includes editing tools for in-painting and image manipulation. Users can generate images in 60 to 90 seconds, and the software performs efficiently on modest hardware, such as an Nvidia RTX 4060 graphics card with 8GB of VRAM. Flux is free to use aside from electricity costs and excels at generating realistic images, particularly in handling text and intricate details.
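For readers who prefer a scriptable route over the Forge WebUI, below is a minimal sketch of running Flux locally with Hugging Face's diffusers library; the model ID, step count, and offloading settings are assumptions aimed at modest GPUs, not a verified equivalent of the Forge setup described above.

```python
# Sketch of running Flux locally in Python via Hugging Face diffusers.
# Assumptions: the FLUX.1-schnell checkpoint and the settings below,
# chosen to fit cards with around 8GB of VRAM.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell",   # distilled variant tuned for few steps
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()            # offload idle components to system RAM

image = pipe(
    prompt="a storefront with a neon sign reading 'OPEN 24 HOURS', photorealistic",
    num_inference_steps=4,                 # schnell works with very few steps
    guidance_scale=0.0,
    height=768,
    width=1024,
).images[0]

image.save("flux_local.png")
```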
AppWizard
August 7, 2024
The artificial intelligence video-generation market in mainland China is projected to grow from 8 million yuan in 2021 to 9.3 billion yuan by 2026, according to research firm LeadLeo. ByteDance's Jimeng is a key player in this market, showcasing its capabilities through various applications. Recent tests revealed mixed results in video quality, with one prompt producing a distorted three-second clip of a woman walking in Tokyo, while another prompt successfully generated a coherent three-second video of woolly mammoths in a snowy meadow. The performance of platforms like Jimeng will be monitored as they face technological and regulatory challenges.