The Search Stack That Nearly Broke Us — Until We Did This

Scaling search isn’t just about adding bigger servers; sometimes you need the right tools. When we first launched finlight.me, our real-time financial news API, Postgres full-text search was more than adequate: fast, easy to set up, and a natural fit for our early architecture. But as article volume grew and search demands became more complex, we started hitting its limits. This article chronicles our move from Postgres full-text search to OpenSearch, the challenges we faced along the way, and the decision to keep Postgres as our source of truth, which turned out to be pivotal.

Initially, we used Postgres’ built-in full-text search to handle article queries. By combining titles and content into a single tsvector column, we could run efficient searches that were robust to casing, word endings, and keyword order, all of which a naive LIKE '%query%' search struggles with. Incoming search queries were converted into tsquery values, and Postgres ranked and returned the relevant results. This setup met our needs well: fast response times, minimal overhead, and a neat fit with our ingestion and storage system.
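To make that concrete, here is a minimal sketch of the kind of setup described above. The articles table, column, and index names are illustrative rather than our actual schema, and the generated column requires Postgres 12 or newer.

```python
# A minimal sketch of the early setup. The `articles` table, column and index
# names are illustrative, not finlight's actual schema; the generated column
# needs Postgres 12+.
import psycopg2

conn = psycopg2.connect("dbname=news")  # placeholder connection string

SETUP_SQL = """
-- Combine title and content into one tsvector and index it with GIN.
ALTER TABLE articles
    ADD COLUMN IF NOT EXISTS search_vector tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''))
    ) STORED;

CREATE INDEX IF NOT EXISTS articles_search_idx
    ON articles USING GIN (search_vector);
"""

SEARCH_SQL = """
-- Turn the user's input into a tsquery, then rank and return matches.
SELECT id, title, ts_rank(search_vector, query) AS rank
FROM articles, websearch_to_tsquery('english', %s) AS query
WHERE search_vector @@ query
ORDER BY rank DESC
LIMIT 20;
"""

with conn, conn.cursor() as cur:
    cur.execute(SETUP_SQL)
    cur.execute(SEARCH_SQL, ("fed rate decision",))
    for article_id, title, rank in cur.fetchall():
        print(article_id, title, rank)
```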

However, as our article count surged into the hundreds of thousands, we began to notice performance degradation, revealing the cracks in our initial architecture.

Pain Points with Scaling Postgres

Initially, Postgres full-text search managed our expanding dataset reasonably well. Yet, as the number of articles grew, we faced significant challenges. The primary issue stemmed from how users combined free-text queries with additional filters such as publish date ranges, specific sources, or metadata fields. While Postgres excelled at indexing individual fields and utilized GIN indexes to enhance full-text search, we soon encountered a critical limitation: Postgres does not permit the combination of a GIN index with a regular B-Tree index in a composite index. This restriction hindered our ability to optimize both types of queries concurrently, forcing the database to either select a suboptimal plan or revert to sequential scans, which became painfully slow as our dataset expanded.
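The sketch below shows the shape of the problem, with hypothetical index and column names: the full-text index and the metadata index use different access methods, so no single composite index can cover a query that mixes both, and the planner is left choosing between them or combining their results at runtime.

```python
# A sketch of the query shape that hurt us; schema and index names are
# hypothetical. Each index uses a single access method, so there is no
# composite index that covers both the tsvector match and the metadata filters.
import psycopg2

conn = psycopg2.connect("dbname=news")

INDEX_SQL = """
-- Two separate indexes, two different access methods:
CREATE INDEX IF NOT EXISTS articles_search_idx
    ON articles USING GIN (search_vector);      -- full-text
CREATE INDEX IF NOT EXISTS articles_meta_idx
    ON articles (published_at, source);         -- B-Tree for metadata filters
"""

COMBINED_SQL = """
-- Free text plus metadata filters in one query. The planner has to lean on
-- one index (or bitmap-combine the two) and filter the rest row by row,
-- which degrades badly when both the text match and the date range are broad.
EXPLAIN ANALYZE
SELECT id, title
FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', %s)
  AND published_at >= %s
  AND source = %s
ORDER BY published_at DESC
LIMIT 20;
"""

with conn, conn.cursor() as cur:
    cur.execute(INDEX_SQL)
    cur.execute(COMBINED_SQL, ("interest rates", "2024-01-01", "reuters"))
    print("\n".join(row[0] for row in cur.fetchall()))
```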

Index Management Nightmare: Optional Parameters and Growing Complexity

The flexibility of our API, which allowed users to combine various filters like publish date, source, and free-text search, added another layer of complexity. With every search parameter being optional, we faced a combinatorial explosion of potential query patterns. To maintain acceptable performance, we had to create distinct indexes for the most common combinations of parameters. Each new searchable field necessitated designing new indexes, analyzing query plans using EXPLAIN ANALYZE, and manually validating performance. This ongoing index tuning became tedious and unsustainable, and despite our efforts, we remained unable to efficiently optimize full-text search combined with metadata filters in a single query.
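The sketch below illustrates why, with hypothetical parameter and column names: every optional filter doubles the number of query shapes the database may see, and each shape favours a different index.

```python
# A simplified sketch of the optional-filter problem; parameter and column
# names are illustrative, not finlight's real API surface.
from typing import Optional


def build_article_query(
    text: Optional[str] = None,
    source: Optional[str] = None,
    published_from: Optional[str] = None,
    published_to: Optional[str] = None,
):
    """Assemble WHERE clauses only for the filters the caller actually set."""
    clauses, params = [], []
    if text:
        clauses.append("search_vector @@ websearch_to_tsquery('english', %s)")
        params.append(text)
    if source:
        clauses.append("source = %s")
        params.append(source)
    if published_from:
        clauses.append("published_at >= %s")
        params.append(published_from)
    if published_to:
        clauses.append("published_at <= %s")
        params.append(published_to)

    where = " AND ".join(clauses) if clauses else "TRUE"
    sql = (
        "SELECT id, title FROM articles "
        f"WHERE {where} ORDER BY published_at DESC LIMIT 20"
    )
    return sql, params


# Four optional filters already mean 2**4 = 16 possible query shapes,
# and each shape favours a different index.
print(build_article_query(text="earnings", source="reuters")[0])
```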

Pagination Collapse and User-Visible Slowness

As article volumes continued to rise, another bottleneck emerged: pagination. Our API exposed a page parameter that mapped directly to SQL OFFSET. This worked well at low offsets, but performance deteriorated sharply as users requested deeper pages. Ironically, the feature meant to keep responses small, returning one slice of results at a time, ended up slowing everything down: each paginated request forced Postgres to scan, count, and skip thousands of rows before returning results, recalculating large portions of the query plan on every call. Queries that once executed in under a second ballooned to tens of seconds, turning an infrastructure problem into a user experience failure. We recognized that even a finely tuned Postgres setup could not sustain the fast, flexible search capabilities required for our anticipated growth.
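Roughly, the endpoint looked like the sketch below, with illustrative names: the page number translated straight into OFFSET, so Postgres had to produce and discard everything before the requested slice.

```python
# A sketch of our page-based endpoint; names are illustrative. The page
# number maps straight onto OFFSET, so Postgres must produce and discard
# every row before the requested slice on every call.
import psycopg2

PAGE_SIZE = 20

PAGE_SQL = """
SELECT id, title
FROM articles
WHERE search_vector @@ websearch_to_tsquery('english', %s)
ORDER BY published_at DESC
LIMIT %s OFFSET %s;
"""


def fetch_page(conn, query: str, page: int):
    # page 1 -> OFFSET 0, page 500 -> OFFSET 9980: the deeper the page,
    # the more rows are scanned, sorted and thrown away before the LIMIT.
    offset = (page - 1) * PAGE_SIZE
    with conn.cursor() as cur:
        cur.execute(PAGE_SQL, (query, PAGE_SIZE, offset))
        return cur.fetchall()


conn = psycopg2.connect("dbname=news")
print(fetch_page(conn, "inflation", page=500))
```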

It became evident that we required a system specifically designed for search — one optimized for free-text queries, filtering, and fast pagination at scale. Having previously worked with Elasticsearch on freelance projects, we understood the advantages of dedicated search engines: inverted indexes, efficient scoring algorithms, and robust query flexibility. Consequently, we opted for OpenSearch, the community-driven fork of Elasticsearch, appreciating both its technical strengths and favorable licensing model.

Simultaneously, we made a crucial architectural decision: to decouple the write path from the read path. Postgres would serve as our single source of truth for ingested and processed articles, ensuring data integrity and consistency, while OpenSearch would function as the read-optimized layer, facilitating fast and flexible search without overburdening our ingestion pipeline. This approach allowed us to utilize the most suitable tools for each requirement: a reliable, relational database for storage and ingestion, and a high-performance search engine for querying.
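Here is a minimal sketch of what that decoupling looks like, assuming the opensearch-py client and the hypothetical articles table from earlier; retries and error handling are omitted for brevity.

```python
# A minimal sketch of the decoupled write path, assuming the opensearch-py
# client and the hypothetical `articles` table from earlier; retries and
# error handling are omitted for brevity.
import psycopg2
from opensearchpy import OpenSearch

pg = psycopg2.connect("dbname=news")
search = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])


def ingest_article(article: dict):
    # 1. Durable write to the source of truth first.
    with pg, pg.cursor() as cur:
        cur.execute(
            "INSERT INTO articles (id, title, content, source, published_at) "
            "VALUES (%(id)s, %(title)s, %(content)s, %(source)s, %(published_at)s)",
            article,
        )
    # 2. Best-effort index into the read-optimised layer. If this step fails,
    #    the article can always be re-indexed from Postgres later.
    search.index(index="articles", id=article["id"], body=article)


ingest_article({
    "id": "a-123",
    "title": "ECB holds rates steady",
    "content": "Full article text ...",
    "source": "reuters",
    "published_at": "2025-06-06T12:00:00Z",
})
```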

Testing Phase: Starting Small and Learning Fast

Before fully committing to production, we deployed OpenSearch in a minimal-resource testing environment: a single-node cluster with limited RAM, designed solely for evaluation and tuning. In this setup, we quickly identified behaviors indicative of the system’s scaling needs. Over time, we observed missing indexes and declining search performance — likely symptoms of memory pressure and resource eviction events on the hosting side. Rather than a setback, these early tests validated a critical lesson: while OpenSearch could provide the performance we required, it necessitated production-grade resources for reliable operation. Testing with minimal resources allowed us to fine-tune index mappings, validate query performance, and plan capacity based on real-world behavior. It also reinforced our architectural choice to keep Postgres as the source of truth, ensuring that even if the search layer required recovery or rebuilding, the core data remained secure and consistent.

Scaling OpenSearch for Production

Equipped with insights from our testing phase, we transitioned to a production-grade OpenSearch deployment with resources aligned to our growth trajectory. We expanded the cluster to include multiple nodes, allocated sufficient RAM, and optimized index mappings to enhance both write and query performance. With this new configuration, search response times improved dramatically — even complex queries with deep pagination returned results in milliseconds rather than seconds.
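As an illustration of the kind of mapping involved (not our exact configuration): analyzed text fields carry the free-text search, keyword and date fields stay cheap to filter and sort on, and a slightly longer refresh interval trades a little freshness for write throughput.

```python
# An illustrative mapping, not finlight's actual configuration: analyzed text
# fields carry the free-text search, keyword/date fields stay cheap to filter
# and sort, and a longer refresh interval favours write throughput.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.indices.create(
    index="articles",
    body={
        "settings": {
            "number_of_shards": 3,     # spread across the multi-node cluster
            "number_of_replicas": 1,   # one copy per shard for resilience
            "refresh_interval": "5s",  # trade a little freshness for indexing speed
        },
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "content": {"type": "text"},
                "source": {"type": "keyword"},
                "published_at": {"type": "date"},
            }
        },
    },
)
```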

The data flow also evolved: after articles passed through our real-time article processing pipeline — where they were collected, enriched, and analyzed — they were immediately directed into OpenSearch for rapid retrieval. Postgres remained the single source of truth, reliably storing all raw and processed data, while OpenSearch acted as the read-optimized layer tailored for search performance. We implemented regular snapshotting of the OpenSearch indices, ensuring that as our article base expanded, we could swiftly recover from failures or rebuild indexes without incurring downtime. Treating OpenSearch as an advanced cache rather than the primary database provided us with flexibility: we could evolve search schemas, rebuild indexes, or adjust mappings without jeopardizing core data integrity. Over time, as traffic increased and our dataset grew, the new architecture consistently performed well under load.
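Snapshotting itself is straightforward to script. The sketch below assumes an S3-backed repository via the repository-s3 plugin; repository, bucket, and snapshot names are placeholders.

```python
# A sketch of the snapshot setup, assuming an S3-backed repository via the
# repository-s3 plugin; repository, bucket and snapshot names are placeholders.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Register the snapshot repository once.
client.snapshot.create_repository(
    repository="articles-backup",
    body={"type": "s3", "settings": {"bucket": "my-opensearch-snapshots"}},
)

# Take regular snapshots of the articles index (typically on a schedule).
client.snapshot.create(
    repository="articles-backup",
    snapshot="articles-2025-06-06",
    body={"indices": "articles", "include_global_state": False},
)
```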

Currently, our system distinctly separates responsibilities among ingestion, storage, and retrieval. Postgres continues to serve as the single source of truth, reliably storing all raw and processed article data within a normalized relational structure. Articles traverse our real-time article processing pipeline — where they are scraped, enriched, and analyzed — before being fed into OpenSearch for optimized search performance. OpenSearch manages all user-facing search queries, enabling us to deliver swift, flexible results even during peak loads. Regular snapshotting, thoughtful index management, and a multi-node deployment ensure that our search infrastructure remains resilient and scalable as our dataset expands. By decoupling the write and read sides of the architecture and selecting the most appropriate tools for each need, we have constructed a system that is fast, reliable, and poised for ongoing growth.
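Finally, here is roughly what a user-facing query looks like on the read side, using the field names from the mapping sketch above (again assumptions, not our live schema): relevance scoring goes in the must clause, exact filters in the filter clause, and results are sorted by publish date; for deep pages, one would typically use search_after on that sort key rather than a growing offset.

```python
# An illustrative read-path query using the field names from the mapping
# sketch above (assumptions, not the live schema): relevance scoring goes in
# `must`, cheap exact filters in `filter`, and results are sorted by date.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

response = client.search(
    index="articles",
    body={
        "size": 20,
        "query": {
            "bool": {
                "must": [
                    {
                        "multi_match": {
                            "query": "interest rates",
                            "fields": ["title^2", "content"],
                        }
                    }
                ],
                "filter": [
                    {"term": {"source": "reuters"}},
                    {"range": {"published_at": {"gte": "2025-01-01"}}},
                ],
            }
        },
        "sort": [{"published_at": "desc"}],
        # For deep pages, pass the last sort value back via `search_after`
        # instead of growing a from/size offset.
    },
)

for hit in response["hits"]["hits"]:
    print(hit["_id"], hit["_source"]["title"])
```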

Lessons Learned: Advice for Builders

  • Start simple, but design with scale in mind. Postgres full-text search served us well early on — but flexibility in design made migration possible later without major pain.
  • Separate read and write paths as early as practical. Trying to make a single database handle everything becomes exponentially harder as complexity grows.
  • Use the right tool for the job. A relational database excels at storage and consistency; a search engine excels at flexible retrieval and ranking.
  • Don’t underestimate optional query complexity. Supporting flexible API filters sounds simple until you have to index every possible combination.
  • Test lean, scale smart. Early testing with minimal resources taught us what production-grade OpenSearch really needed — and avoided costly surprises.
  • Keep a reliable source of truth. Having Postgres behind OpenSearch allowed us to rebuild, heal, and extend our search infrastructure without risking core data integrity.

If you’re building scalable APIs or working with large search datasets, I would be interested to hear how you are tackling similar challenges. Please feel free to share your thoughts or experiences in the comments!
