PostgreSQL

Winsage
June 27, 2025
Lyon, the third-largest city in France, is transitioning from Microsoft’s Windows and Office suite to open-source alternatives such as Linux, OnlyOffice, NextCloud, and PostgreSQL. This move is part of a broader strategy among European governments to enhance digital sovereignty and reduce reliance on American technology firms amid concerns over data security and autonomy. The city is developing a collaborative suite called Territoire Numérique Ouvert in partnership with regional digital organizations, which will be hosted in local data centers. The migration is already underway, with municipal workstations moving from Windows to Linux and Microsoft Office being replaced by OnlyOffice. Lyon expects direct cost savings from the migration, including extending the lifespan of municipal hardware and reducing electronic waste. More than half of the public contracts related to the project have been awarded to firms in the Auvergne-Rhône-Alpes region, and all of them have gone to French companies. Training for approximately 10,000 civil servants began in June 2025 as part of the transition to Linux.
Tech Optimizer
June 24, 2025
Snowflake plans to acquire Crunchy Data, a company specializing in enterprise-grade support and cloud-native solutions for PostgreSQL; exact financial terms of the deal have not been disclosed. Crunchy Data, established in 2012 and based in Charleston, South Carolina, employs approximately 100 professionals and offers enterprise support for mission-critical PostgreSQL deployments, cloud-native software, advanced security and compliance features, and data protection and recovery tools. The acquisition aims to integrate Crunchy Data’s PostgreSQL database into Snowflake’s AI Data Cloud to streamline data transfer and support advanced AI development on the PostgreSQL platform. The move follows Databricks’ recent purchase of Neon, a developer-first serverless PostgreSQL company, and reflects growing interest in PostgreSQL as a foundation for agentic AI. PostgreSQL is recognized for its reliability and versatility and has been ranked the most popular database for two consecutive years, currently used by 49% of developers.
Tech Optimizer
June 24, 2025
A bug in MySQL, reported in June 2005, remains unfixed after 20 years and is classified as “S2 (Serious).” This ongoing issue has led some developers to consider switching to PostgreSQL, which is known for its advanced features and active development. The persistence of this bug raises concerns about the long-term viability of MySQL for mission-critical applications and may affect user confidence in the platform's stability and security.
Tech Optimizer
June 24, 2025
The task involved querying a table named user_events to retrieve the most recent event for every user. A conventional SQL query returned correct results but performed poorly in production. Its inner query groups the entire table by user to find each user's latest event time, requiring Postgres to scan 100 million rows and perform 5 million aggregations, producing a temporary result of 5 million rows. The outer query then compares every row against this large intermediate result, and these repeated evaluations across the full dataset significantly degrade performance.
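A minimal sketch of that pattern, assuming a user_events table with user_id and event_time columns (the table name comes from the summary; the column names are illustrative):

```sql
-- Illustrative slow pattern: latest event per user.
-- The inner query aggregates the whole table into one row per user;
-- the outer query then matches every event row against that large result.
SELECT e.*
FROM user_events e
JOIN (
    SELECT user_id, MAX(event_time) AS latest_time
    FROM user_events
    GROUP BY user_id
) latest
  ON latest.user_id = e.user_id
 AND latest.latest_time = e.event_time;
```

In PostgreSQL, this access pattern is commonly rewritten with DISTINCT ON (user_id) ... ORDER BY user_id, event_time DESC or a window function, ideally backed by an index on (user_id, event_time), so the latest row per user can be located without aggregating the entire table.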
Tech Optimizer
June 23, 2025
The extended statistics feature in PostgreSQL allows for the collection of additional statistics on specific sets of table columns, which is beneficial for datasets with implicit relationships between columns. For instance, in the power plant dataset, the primary_fuel column is linked to the country column, affecting query results and row count estimates. When using extended statistics, more accurate cardinality estimates can be achieved, such as improving the estimate for Norway from 93 to 1 row after implementing statistics on country and primary_fuel. Extended statistics can be defined in three types: MCV (Most Common Values), ndistinct, and dependencies. MCV is effective for common value combinations, while ndistinct is useful for estimating group counts in operations like GROUP BY. Despite their advantages, extended statistics are rarely used due to concerns about the time-consuming ANALYZE command and the complexity of determining when to create these statistics. Two rules of thumb guide the creation of appropriate statistics: Rule 1 suggests creating statistics based on index definitions, while Rule 2 focuses on real-world filter patterns. The extension concept involves collecting created object IDs and managing the timing for adding statistics definitions to the database. A columns_limit parameter and a stattypes parameter help manage the computational cost of generating extended statistics. Testing the extension showed that running ANALYZE took longer with the extension activated, particularly when including dependencies. Deduplication procedures were introduced to minimize redundant statistics, resulting in modest gains in time and a significant reduction in the volume of statistics. Comparisons with another statistics collector, joinsel, indicated that while it provides some benefits, it lacks the full capabilities of extended statistics, particularly in terms of dependencies.
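As a brief illustration of the feature itself (not of the extension's internals), extended statistics on the two correlated columns from the power plant example could be declared as follows; the table name power_plants and the statistics object name are assumptions:

```sql
-- Collect all three kinds of extended statistics on the correlated columns.
-- ndistinct improves GROUP BY group-count estimates; dependencies and mcv
-- improve row estimates for filters that combine both columns.
CREATE STATISTICS stat_country_fuel (ndistinct, dependencies, mcv)
    ON country, primary_fuel
    FROM power_plants;

-- The statistics are only populated when the table is analyzed.
ANALYZE power_plants;
```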
Tech Optimizer
June 21, 2025
Reynold Xin, co-founder of Databricks, highlighted the outdated nature of online transaction processing (OLTP) databases, which have not evolved significantly since the 1990s and face issues like over-provisioning and performance challenges. Databricks is introducing Lakebase, a product that separates compute from storage to enhance the efficiency of transactional databases, particularly for AI applications. Lakebase allows for instantaneous branching of databases, significantly improving workflow efficiency. Built on open-source Postgres, it supports various open storage formats and offers a copy-on-write capability to reduce costs. The separation of compute and storage is essential as streaming data becomes more integral to enterprises, enabling scalability and timely insights. Databricks aims to manage the entire data lifecycle, ensuring data remains within its ecosystem for rapid reporting and analytics. The integration of Lakebase with existing infrastructure enhances developer experience and operational maturity. The architecture supports extensive experimentation at minimal cost, fostering innovation. As AI agents become more prevalent, the focus on data evaluation and reliability will grow, necessitating a deeper examination of model accuracy.
Tech Optimizer
June 21, 2025
The Amazon Aurora PostgreSQL-Compatible Edition supports managed blue/green deployments to minimize downtime and risks during updates. This deployment strategy involves creating a staging environment (green) that mirrors the production database (blue) through logical replication. The blue environment represents the current production database, while the green environment incorporates updates without changing the application endpoint. After validating changes, the green environment can be promoted to production. In case of issues post-upgrade, a rollback plan is essential, as the managed blue/green deployment feature does not provide built-in rollback functionality. A manual rollback cluster can be established using self-managed logical replication to maintain synchronization with the new version after a switchover. Before the switchover, two clusters exist: the blue cluster (production) and the green cluster (staging). After the switchover, three clusters are present: the old blue cluster (original production), the new blue cluster (updated production), and the blue prime (rollback) cluster (a clone of the old blue cluster). To implement the solution, prerequisites include a cluster parameter group for the new version database with logical replication enabled and familiarity with the Aurora cloning feature. The process involves creating a blue/green deployment, stopping traffic on the blue cluster, performing the switchover, deleting the blue/green deployment, cloning the old blue cluster to create the blue prime cluster, and establishing logical replication from the new blue cluster to the blue prime cluster. Limitations of the managed blue/green deployment include the inability to replicate certain DDL operations and the need to handle endpoint changes manually if a rollback is required. Setting up the rollback cluster incurs additional downtime. To roll back to the blue prime cluster, application traffic must be ceased, the application or DNS records updated, the subscription on the blue prime cluster dropped, and sequence values manually updated if necessary. This process is not automatic and requires careful planning and testing. In production, it is advisable to retain the new blue prime cluster until all applications have transitioned successfully, and the old blue cluster can be backed up for compliance before deletion. For testing purposes, all clusters should be deleted to avoid additional charges.
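A minimal sketch of the self-managed logical replication used for the rollback path, assuming an application database named appdb; the publication, subscription, endpoint, and user names are placeholders:

```sql
-- On the new blue (promoted) cluster: publish all tables.
-- Logical replication must be enabled in the cluster parameter group
-- (rds.logical_replication = 1) as noted in the prerequisites.
CREATE PUBLICATION rollback_pub FOR ALL TABLES;

-- On the blue prime (rollback) cluster, which was cloned from the old blue
-- cluster and therefore already contains the data: subscribe without the
-- initial table copy so it simply stays in sync after the switchover.
CREATE SUBSCRIPTION rollback_sub
    CONNECTION 'host=new-blue-cluster-endpoint port=5432 dbname=appdb user=repl_user password=***'
    PUBLICATION rollback_pub
    WITH (copy_data = false);

-- If a rollback is needed: stop application traffic, repoint the application
-- or DNS to the blue prime cluster, then drop the subscription:
--   DROP SUBSCRIPTION rollback_sub;
```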
Tech Optimizer
June 21, 2025
EDB has enhanced its data platform, EDB Postgres AI, to integrate transactional, analytical, and AI workloads. The new engine for PostgreSQL offers independent scalability from cloud storage and is optimized for columnar formats like Iceberg and Delta Lake. It utilizes Apache DataFusion for efficient query execution, achieving speeds up to 30 times faster than traditional PostgreSQL and providing 18 times greater cost efficiency for cold transactional data storage. The analytics accelerator allows analytical queries to be processed by Apache DataFusion within PostgreSQL. EDB claims a total cost of ownership that is six times more favorable and transactional performance that is 30 percent faster than SQL Server. PostgreSQL is evolving to support analytics alongside its traditional transactional capabilities, with EDB's goal being to provide tools for seamless analytics within the PostgreSQL framework without redefining it as an analytics database.