AI training

TrendTechie
February 7, 2025
Meta is alleged to have unlawfully used pirated books to train its AI models, downloading at least 81.7 terabytes of data from torrent sources, including 35.7 terabytes from Z-Library. Internal communications reveal employee concerns about the legality of downloading pirated content, while leadership reportedly concealed the activity and operated in "stealth mode." Plaintiffs are seeking to question Meta personnel again, claiming previous testimony was misleading, and assert that Meta's copyright infringement may include distributing pirated books. Meta defends its actions as "fair use" and intends to challenge the allegations. Additionally, Meta and its products are prohibited in the Russian Federation due to being designated as extremist.
TrendTechie
February 4, 2025
Meta is facing a 2023 lawsuit over the training of its large language model, Llama, with allegations that it used pirated content from torrent trackers. A judge has ordered the release of original documents, which reveal internal discussions about whether using torrents for AI training was appropriate. An engineer raised concerns about using torrents on a corporate laptop, confirming that pirated content was used. There are indications that Mark Zuckerberg may have approved the use of such materials. Among the sources of pirated content was LibGen, a repository of pirated books and articles. Meta is defending its actions by citing the legal doctrine of "fair use."
Winsage
November 23, 2024
Windows Insiders in the Dev Channel are receiving a preview of AI features for Copilot Plus PCs, including Recall and Click to Do. Recall captures snapshots of user activity, which can then be retrieved through natural language queries and a scrollable timeline. It is an optional feature requiring user consent, initially available only on Qualcomm-powered Copilot Plus PCs, with plans to support Intel and AMD systems later. Users can control their snapshots, including deleting them and excluding specific apps from recording, while sensitive information is automatically protected. Microsoft says that no snapshots are sent to the cloud or used for AI training, emphasizing user privacy and security. The Click to Do feature lets users perform actions on text and images in captured snapshots and will eventually enable broader interactions with screen content. Recall's introduction was delayed due to security concerns, but Microsoft has since strengthened its security framework, and users can opt in to the feature or uninstall it.
Winsage
November 20, 2024
Microsoft unveiled significant updates to its cloud and AI services at the Ignite conference in Chicago, including enhancements to Microsoft 365 Copilot, new AI agents, and plans to use Nvidia’s Blackwell GB200-powered AI servers. Nearly 70% of the Fortune 500 now use Microsoft 365 Copilot. New features include Copilot Actions for summarizing meetings and consolidating communications, and advanced AI agents such as the Interpreter for Teams and the Employee Self-Service Agent. Microsoft introduced the Azure AI Foundry SDK for building AI applications and announced the Windows 365 Link, a compact PC for cloud services, set to go on sale starting in April 2025. The company also launched the Microsoft Security Exposure Management platform and enhanced security for AI applications with the Data Loss Prevention feature for Microsoft 365 Copilot. Microsoft’s shares have increased by 12% over the past year.
Winsage
August 1, 2024
Reddit has blocked Bing from accessing its data, but this is not due to an exclusive arrangement with Google. Google’s partnership with Reddit is valued at approximately million, allowing Google to enhance search results on Reddit using its AI capabilities. Reddit CEO Steve Huffman expressed frustration over Microsoft’s reluctance to negotiate for access to Reddit’s data. While Microsoft’s Bing has reduced its use of AI-generated summaries, OpenAI’s SearchGPT has secured a deal with Reddit to display search results incorporating its data. Microsoft adheres to the robots.txt standard, and Reddit’s robots.txt rules restrict crawling and scraping of its content. The standoff may impact Microsoft’s ambitions in the search engine market and could lead to challenges in negotiating with other digital publishers. Investors are concerned about Microsoft’s investments in AI without clear returns, with a study suggesting that up to 30% of AI projects may be abandoned by 2025.