Making It Easier for Developers to Build AI Apps with Postgres

With the increasing demand for AI applications, developers often find themselves navigating unfamiliar territory. Timescale, a prominent open-source PostgreSQL database vendor, has introduced a suite of tools aimed at simplifying this journey for developers who lack AI expertise. Its latest release, the pgai Vectorizer, brings the embedding process into Postgres itself, so developers can create, store, and manage vector embeddings alongside relational data without external tools or additional infrastructure.

The Vectorizer builds on pgvector, the open-source extension that adds vector search to Postgres. Timescale's position is that vector capabilities alone are not enough for developers new to AI. Avthar Sewrathan, AI and developer product lead at Timescale, emphasizes that the responsibility of building AI applications often falls on software developers who may not have a background in AI or machine learning.

Embedding Creation on Autopilot

Sewrathan describes the pgai Vectorizer as a tool that automates embedding creation with a single SQL query, allowing developers to set it and forget it. As new data enters their tables, embeddings are automatically generated, alleviating concerns about data synchronization and scaling. This automation not only streamlines workflows but also addresses the operational challenges that developers face when integrating AI into their applications.
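To make the "set it and forget it" idea concrete, here is a minimal sketch of one common way to keep embeddings in step with a table: triggers enqueue changed rows, and a worker drains the queue. It is not pgai's implementation or API. It uses Python's built-in sqlite3 module as a stand-in for Postgres, and the table names, the process_queue helper, and the fake_embed function are all hypothetical.

```python
import json
import sqlite3


def fake_embed(text):
    """Hypothetical stand-in for an embedding model call; returns a tiny vector."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
CREATE TABLE doc_embeddings (doc_id INTEGER PRIMARY KEY, embedding TEXT NOT NULL);
CREATE TABLE embedding_queue (doc_id INTEGER NOT NULL);

-- Every insert or body update on the source table enqueues the row.
CREATE TRIGGER docs_after_insert AFTER INSERT ON docs
BEGIN
    INSERT INTO embedding_queue (doc_id) VALUES (NEW.id);
END;

CREATE TRIGGER docs_after_update AFTER UPDATE OF body ON docs
BEGIN
    INSERT INTO embedding_queue (doc_id) VALUES (NEW.id);
END;
""")


def process_queue(conn):
    """Embed each pending row once, upsert the result, and clear its queue entries."""
    pending = [r[0] for r in conn.execute("SELECT DISTINCT doc_id FROM embedding_queue")]
    for doc_id in pending:
        row = conn.execute("SELECT body FROM docs WHERE id = ?", (doc_id,)).fetchone()
        if row is not None:
            conn.execute(
                "INSERT OR REPLACE INTO doc_embeddings (doc_id, embedding) VALUES (?, ?)",
                (doc_id, json.dumps(fake_embed(row[0]))),
            )
        conn.execute("DELETE FROM embedding_queue WHERE doc_id = ?", (doc_id,))
    conn.commit()
    return len(pending)


# The application only writes to the source table; the queue fills by itself.
conn.execute("INSERT INTO docs (body) VALUES ('hello postgres')")
conn.execute("UPDATE docs SET body = 'hello pgai' WHERE id = 1")
processed = process_queue(conn)
```

The application code never mentions embeddings. It writes to docs, and a background worker calling process_queue on a schedule would keep doc_embeddings current.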

With pgai Vectorizer, developers can:

  • Manage all data for AI applications—vectors, metadata, event data—on a familiar PostgreSQL platform.
  • Automatically synchronize data changes to vector embeddings in real time.
  • Switch between embedding models for rapid testing and experimentation without changing application code.
  • Track model versions to ensure backward compatibility during rollouts, facilitating smooth transitions.

Web Begole, CTO at MarketReader, praises the pgai Vectorizer, stating, “It promises to streamline our entire AI workflow, from embedding creation to real-time synchronization, allowing us to deliver AI applications faster and more efficiently.”

“There are really tough engineering challenges that need to be overcome if you want to build a production-grade application.”
Avthar Sewrathan, AI product lead for Timescale

Sewrathan identifies four critical tasks that the pgai Vectorizer can replace:

  • Building an ETL pipeline: Automating the ingestion of source documents or images and orchestrating calls to AI models to create embeddings.
  • Chunking and formatting: Configuring data into the appropriate format and size for embedding models with minimal coding effort.
  • Scaling and managing the embedding creation pipeline: Handling queuing and rate limits for large volumes of embeddings automatically.
  • Synchronization: Ensuring that embeddings and corresponding metadata remain up to date, with notifications for any discrepancies.
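Two of the tasks above, chunking and batching work for a rate-limited model API, can be sketched in a few lines. This is an illustration under simple assumptions (character-based chunks with a fixed overlap, fixed-size batches), not the Vectorizer's own chunking logic. The function names chunk_text and batched are hypothetical.

```python
from typing import Iterator, List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into character chunks; consecutive chunks share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield fixed-size batches so each model call stays under a request limit."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
batches = list(batched(chunks, batch_size=2))
```

A real pipeline would also wait or back off between batches when the provider signals a rate limit. That part is omitted here.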

Sewrathan argues that many databases are merely adding vector search capabilities without addressing the broader challenges of building AI applications. He asserts that the pgai suite aims to provide developers with comprehensive tools that encompass not just vector search, but also scaling, updating, and data synchronization.

Initially Just pgai

Launched last June, the pgai tool suite was designed to help PostgreSQL developers become proficient in AI engineering. Initially focused on simplifying semantic search and retrieval-augmented generation (RAG) tasks, pgai has since expanded to support models from several providers, including OpenAI, Ollama, Anthropic, and Cohere, with plans to add more in the future.

Improved Scaling

Another significant addition, pgvectorscale, enhances PostgreSQL’s ability to handle large-scale AI use cases. By introducing specialized data structures and algorithms for vector search, it aims to deliver performance comparable to dedicated vector databases while maintaining cost efficiency.

By storing indexes on solid-state disks rather than entirely in memory, pgvectorscale offers substantial cost savings compared with traditional in-memory indexes. It is written in Rust, which also opens the project to the growing Rust developer community.
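For readers new to vector search, the sketch below shows the baseline that specialized indexes are designed to beat: an exact search that scores the query against every stored vector by cosine similarity. It is not pgvectorscale's algorithm, and the function names are hypothetical. The cost of this full scan grows with the number of vectors, which is why large-scale systems rely on index structures instead.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int) -> List[Tuple[str, float]]:
    """Exact nearest-neighbor search: score every vector, keep the k most similar."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = top_k([1.0, 0.0], vectors, k=2)
```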

All Open Source

As an open-source initiative, the pgai tools not only reduce infrastructure costs but also streamline development processes, allowing smaller teams to achieve more with less custom code. Sewrathan notes that the automation provided by the Vectorizer can significantly decrease the number of developers needed for complex tasks, enhancing productivity and innovation.

In a recent Timescale benchmark, PostgreSQL with pgvector and pgvectorscale demonstrated superior performance and cost-effectiveness compared to standalone vector databases, reinforcing the advantages of integrating AI capabilities within the PostgreSQL environment. By choosing Timescale’s solutions, developers retain access to a full spectrum of data types and operational features essential for deploying robust production applications.

Tech Optimizer