Back to Blog

I have a recurring conversation. Someone shows me an architecture diagram — a clean little web app, three or four engineers behind it — and there’s a Redis instance for caching, a separate queue (RabbitMQ or SQS), Elasticsearch for a single search box, a vector DB for the “AI feature” the founder promised, and somewhere in the middle, Postgres, doing the actual work.

Every one of those extra boxes is a subscription, a deployment, a backup story, a monitoring dashboard, an upgrade window, and a 2 a.m. incident waiting to happen. And in maybe 80% of the cases I see, every single one of them could just be Postgres.

This isn’t a “Postgres is magic” post. Postgres has limits, and I’ll get to them. But before you reach for another piece of infrastructure, it’s worth knowing how much the boring old elephant can actually do.

Postgres sitting on top of a pile of retired stack components


The thing people forget about Postgres

Postgres is not just a relational database. It’s an extensible object-relational database, and that word — extensible — is the part that matters. You get the standard ACID-compliant core, but you also get a healthy ecosystem of extensions that bolt on entire new capabilities without leaving the database.

In other words: it’s the kind of system you can keep adding to instead of replacing. That’s a rare property in this industry.

Here’s what I’ve actually replaced with it, in real projects.


NoSQL: JSONB and GIN

The classic argument for MongoDB is “I have unstructured data.” Fine. You can store unstructured data in Postgres, too, using the JSONB type. The B is the important part — it’s a decomposed binary representation, not a string you have to re-parse on every query.

CREATE TABLE events (
    id        bigserial PRIMARY KEY,
    payload   jsonb NOT NULL,
    received  timestamptz NOT NULL DEFAULT now()
);

CREATE INDEX events_payload_gin ON events USING GIN (payload jsonb_path_ops);

A GIN (Generalized Inverted Index) on a JSONB column behaves like the index at the back of a textbook — keys point directly to the rows that contain them. So you can do this:

SELECT id, payload->>'customer_id'
FROM events
WHERE payload @> '{"type": "checkout_completed"}'
  AND received > now() - interval '7 days';

…and join it against your normal relational tables in the same query, in the same transaction. You get the schema flexibility of a document store without giving up referential integrity. Most “we need MongoDB” projects I’ve seen would have been perfectly happy with this.

JSONB documents being indexed by a GIN tree


Background jobs: FOR UPDATE SKIP LOCKED

This one’s a little hidden gem. People reach for Redis or RabbitMQ for a job queue because “a SQL queue will deadlock.” That was true 15 years ago. It hasn’t been for a long time.

WITH next_job AS (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
UPDATE jobs
SET status = 'running', started_at = now()
FROM next_job
WHERE jobs.id = next_job.id
RETURNING jobs.*;

SKIP LOCKED tells Postgres: grab the first available row, lock it, and if you hit a row another worker already has — don’t wait, just skip it. The result is a wait-free worker pool sitting on top of a plain table. I’ve run systems doing thousands of jobs per second this way. Operationally it’s lovely: jobs are just rows you can SELECT, retry, audit, and back up like anything else.

If you need a queue and you already have Postgres, you probably don’t need another piece of infrastructure.


You don’t need Elasticsearch to power a search bar. You might if you’re indexing the New York Times archive, but for the search field on your SaaS dashboard? tsvector and tsquery are right there.

ALTER TABLE articles
ADD COLUMN search_doc tsvector
GENERATED ALWAYS AS (
    to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
) STORED;

CREATE INDEX articles_search_idx ON articles USING GIN (search_doc);

SELECT id, title
FROM articles
WHERE search_doc @@ plainto_tsquery('english', 'postgres performance');

Postgres handles stemming (so “running” matches “run”), stop words, and ranking. Add the pg_trgm extension and you get fuzzy matching that survives typos — useful when a user searches for “postgresss” instead of “postgres.” For 90% of in-app search use cases, this is more than enough, and it lives next to your data, in the same transaction, with the same backup story.


Vector search for AI features

If you’re shipping an AI feature, the temptation is to drop in a hosted vector database. Fine for prototypes — painful in production, because now you have a “hybrid search problem”: you want documents that are semantically similar to a query, but only the ones owned by the current user, only from the last 30 days, only in this project. Doing that across two systems over the network is slow, expensive, and ugly.

pgvector gives you vectors as a native column type, plus HNSW indexes for approximate nearest neighbor search:

CREATE EXTENSION vector;

ALTER TABLE documents ADD COLUMN embedding vector(1536);

CREATE INDEX documents_embedding_hnsw
ON documents USING hnsw (embedding vector_cosine_ops);

SELECT id, title
FROM documents
WHERE owner_id = $1
  AND created_at > now() - interval '30 days'
ORDER BY embedding <=> $2
LIMIT 10;

That’s the whole hybrid search problem solved with a single query. Your vector data sits next to your relational data, and your row-level security policies apply to both at once.

A toolbox labeled PostgreSQL with various database-shaped tools inside


Geospatial: PostGIS

If you do anything with maps or routing, this isn’t even a “Postgres can do it too” situation. PostGIS is the industry standard. It has been for years.

The GiST index is the trick: it draws bounding boxes around your geometries and rejects the obvious non-matches before doing any expensive geometric math. “All coffee shops within this polygon” goes from a server-melting query to milliseconds.

I’ve seen teams adopt commercial GIS systems that, under the hood, were running PostGIS the whole time.


Time-series with BRIN and partitioning

Telemetry, logs, IoT events — the usual reflex is “we need a time-series database.” Often you don’t. Postgres has declarative partitioning, and for time-ordered data the BRIN (Block Range Index) is a serious power tool.

A B-tree index stores an entry for every row. A BRIN index stores the min and max value for each block of data on disk. For sequentially-inserted time-series data, that’s all you need — when you query a range, Postgres skips entire physical chunks of the table that can’t possibly contain matching rows.

CREATE INDEX events_received_brin
ON events USING BRIN (received)
WITH (pages_per_range = 128);

Tiny index, huge table, very fast range scans. Pair it with monthly or daily partitions and you have something that comfortably handles billions of rows for most “I need a TSDB” workloads. If you cross into TimescaleDB territory, that’s also just an extension on top of Postgres — same engine, same tools.


Dashboards: materialized views, not Snowflake

A standard view re-runs its query every time. A materialized view runs the heavy query once and stores the result on disk. For dashboards that aggregate over a lot of data, this is the difference between “snappy” and “the database is on fire.”

CREATE MATERIALIZED VIEW daily_sales AS
SELECT date_trunc('day', occurred_at) AS day,
       region,
       sum(amount) AS total
FROM sales
GROUP BY 1, 2;

CREATE UNIQUE INDEX daily_sales_day_region ON daily_sales (day, region);

REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales;

CONCURRENTLY is the magic word — it refreshes in the background without locking out readers. Most “we need a data warehouse” stories I’ve heard could have started with this and grown into something heavier only if it actually became necessary.


The API layer: PostgREST, PG_GraphQL

This one’s more controversial, but worth a mention. Tools like PostgREST and the pg_graphql extension can generate a fully working REST or GraphQL API directly from your schema. Combine that with row-level security policies and a JWT-based auth layer, and your “backend” can shrink to a handful of SQL files.

I wouldn’t build a full product this way, but for internal tools, admin panels, and prototypes, it removes an absurd amount of boilerplate.


When not to do this

Here’s the honest part. Postgres scales beautifully vertically — bigger box, faster disks, you’ll get a long way. It scales much less gracefully horizontally. If you genuinely need:

  • millions of writes per second across a sharded cluster,
  • sub-millisecond in-memory caches for huge concurrent loads,
  • globally distributed multi-region active-active writes,

…then you need the specialized tools. That’s fine. But “fine” applies to a much smaller number of teams than the architecture-Twitter discourse would have you believe. Most products live and die long before they hit those limits.

The other place to be careful: don’t pile every workload onto a single Postgres instance and then act surprised when your OLTP traffic slows down because someone is also running a 12-hour analytical query on it. Split workloads by replica or by database when it matters.


The actual takeaway

The interesting question isn’t “can Postgres replace everything?” It can’t, and it shouldn’t try.

The interesting question is: for the next piece of infrastructure I’m about to add to my stack, do I actually need it, or am I just out of practice with Postgres? In my experience, the answer is “I’m out of practice” about half the time. The other half, you reach for the specialized tool with a clear conscience and a much simpler core.

Fewer moving parts means fewer incidents, fewer cloud bills, and fewer phone calls at 2 a.m. That’s worth a lot.


I work with databases for a living — mostly SQL Server in operations work, Postgres in product work. If you’ve got a stack you’re trying to simplify (or a Postgres performance problem that’s haunting you), I’m easy to find on LinkedIn.