# PostgreSQL Consulting — Architecture, Performance, and Migration for Production Workloads

PostgreSQL consulting means schema and architecture design for new applications, performance tuning when existing databases slow down, replication and high-availability configuration for systems that cannot afford downtime, and migration from Oracle, SQL Server, or DynamoDB to escape licensing costs or platform lock-in. FreedomDev has been doing this work since PostgreSQL 8 (2005). We design clusters that scale, write the indexes that turn slow queries fast, configure Patroni-managed failover that survives real outages, and ship migrations off proprietary databases that pay for themselves in 18-30 months. Remote-first. Flat-rate. Source code and runbook handed to your team.

---

## Capabilities

### PostgreSQL Architecture and Schema Design — Get It Right Once

The most expensive database mistakes happen at the schema design stage, before the first row is inserted. Choices that look reasonable in week one — a single `events` table without partitioning, JSONB fields where columns would be cleaner, no index strategy beyond primary keys — turn into rewrites in year three, when the table has 200 million rows and the application has built business logic on top of the original shape. FreedomDev designs PostgreSQL schemas that handle the workload you have today and the workload you will have in three years. That means (pulled together in the schema sketch at the end of this subsection):

- **Proper normalization for OLTP, strategic denormalization for analytical reads.** Third normal form is the default. Materialized views and read-optimized denormalized tables cover the analytical query path. Refresh them on a schedule (`REFRESH MATERIALIZED VIEW CONCURRENTLY` so production reads are not blocked) rather than recomputing on every query.
- **Partitioning before you cross 100 million rows, not after.** PostgreSQL 10 introduced declarative partitioning; PostgreSQL 11-16 refined it to the point where it should be the default for time-series data, multi-tenant event streams, and any append-heavy workload. Range partitioning by `created_at` (daily, weekly, or monthly partitions depending on volume) keeps individual partitions small enough for fast index lookups and lets `DROP PARTITION` replace what would otherwise be hour-long DELETE operations.
- **Indexing strategy designed against the query plan.** B-tree for equality and range queries on scalar columns. GIN for JSONB containment queries and full-text search. GiST for geospatial data via PostGIS. BRIN for naturally ordered data (timestamps, sequential IDs) — roughly 10x smaller than B-tree and faster on the right query shape. Partial indexes for the common `WHERE status = 'active'` filter pattern that scans 5% of the table.
- **JSONB used surgically, not reflexively.** JSONB is the right answer for semi-structured data where the schema is genuinely flexible (event payloads, third-party API responses, configuration objects with varying shapes). It is the wrong answer for data with a stable schema — relational columns are faster, smaller, and clearer in queries. We push back on JSONB-everywhere designs when they surface in early architecture reviews.
- **Row-Level Security for multi-tenant isolation.** Multi-tenant SaaS that relies on application-layer tenant filtering is a bug waiting to happen — one missing `WHERE tenant_id = ?` clause and a customer reads another customer's data. PostgreSQL's RLS policies make tenant isolation a database-level guarantee enforced on every query, including the ones developers forget to filter.

**Real example.** Last year we designed the data layer for an Oregon-based agricultural IoT company collecting sensor readings from 2,400 field-deployed devices at 5-minute intervals. The naive design would have been a single `readings` table absorbing all 12 million rows per month. We partitioned by device region and day, used BRIN indexes on the timestamp column, and built a retention policy that compresses readings older than 90 days into hourly aggregates using TimescaleDB's continuous aggregates. The system runs on a single 4-vCPU instance, and queries spanning "the last 24 hours across all my devices" return in under 100ms.
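A minimal schema sketch pulling those choices together; table and column names are hypothetical, not from a client engagement:

```sql
-- Range-partitioned events table with BRIN, partial, and GIN indexes,
-- plus a Row-Level Security tenant-isolation policy.
CREATE TABLE events (
    id         bigserial,
    tenant_id  uuid        NOT NULL,
    status     text        NOT NULL,
    payload    jsonb       NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (id, created_at)  -- the partition key must be part of the PK
) PARTITION BY RANGE (created_at);

-- One partition per month; pg_partman automates creation and retention.
CREATE TABLE events_2026_05 PARTITION OF events
    FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');

-- BRIN on the naturally ordered timestamp: tiny index, fast range scans.
CREATE INDEX events_created_brin ON events USING BRIN (created_at);

-- Partial index for the hot "active rows only" filter pattern.
CREATE INDEX events_active_idx ON events (tenant_id) WHERE status = 'active';

-- GIN for containment queries against the semi-structured payload.
CREATE INDEX events_payload_gin ON events USING GIN (payload);

-- Tenant isolation enforced by the database, not the application.
ALTER TABLE events ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON events
    USING (tenant_id = current_setting('app.tenant_id', true)::uuid);
```

The application sets `app.tenant_id` per connection (`SET app.tenant_id = '...'`) and every query against `events` is filtered automatically, including the ones nobody remembered to filter.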

### Performance Tuning — `pg_stat_statements`, EXPLAIN ANALYZE, and the Real Diagnostic Process

When a PostgreSQL database slows down, the answer is rarely "buy bigger hardware." It is almost always a missing index, an outdated statistics histogram, a query that explodes into a cartesian product because someone forgot a JOIN condition, or an N+1 pattern in application code whose cost blew up once row counts crossed a threshold. Our tuning process (the key diagnostic queries are sketched after this list):

1. **Enable `pg_stat_statements` and let it gather a representative window.** A week of production traffic captured by this extension surfaces every query the database has run, ranked by total time, mean time, calls, rows returned, and shared blocks read. The query that runs 50 times a second and takes 20ms is usually a bigger problem than the query that runs once a day and takes 5 minutes. `pg_stat_statements` makes that obvious; developer intuition usually misses it.
2. **Pull `EXPLAIN (ANALYZE, BUFFERS)` for the top 10 by total time.** The plan output tells you whether the optimizer is using indexes, what it estimates versus what actually returns (row-count estimation errors above 10x are a statistics problem), where buffer reads versus disk reads land, and the actual time per node. We read these plans top to bottom; most performance problems show up clearly.
3. **Fix in order of impact.** Index creation is usually first; we use `CREATE INDEX CONCURRENTLY` so the operation does not lock the table. Statistics refresh (`ANALYZE`, or tuning `default_statistics_target`) is second when row-count estimates are skewed. Query rewrites are third, when the SQL itself is structurally inefficient.
4. **Tune vacuum and autovacuum.** PostgreSQL's MVCC model leaves dead tuples behind that vacuum reclaims. On write-heavy tables, the default autovacuum thresholds are too conservative — by the time autovacuum kicks in, the table is bloated, indexes are bloated, and query performance has degraded measurably. We tune `autovacuum_vacuum_scale_factor` per table (often down to 0.05 for hot tables) so vacuum runs frequently enough to keep bloat under 10%.
5. **Add connection pooling.** The most common production failure mode on PostgreSQL is connection exhaustion. Each PostgreSQL backend process costs roughly 10MB of memory; a misconfigured application pool can quickly exceed `max_connections` and start rejecting requests. PgBouncer in transaction-pooling mode multiplexes thousands of application connections onto a much smaller pool of backend connections. We deploy PgBouncer in front of every production PostgreSQL instance we manage.
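A condensed sketch of the queries behind steps 1, 3, and 4 (table names and thresholds are illustrative, not universal recommendations):

```sql
-- Step 1: top queries by total execution time. pg_stat_statements must be
-- in shared_preload_libraries and created in the database first.
SELECT calls,
       round(total_exec_time::numeric, 1) AS total_ms,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       rows,
       left(query, 80)                    AS query_head
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Step 3: build the missing index without taking a write lock on the table.
CREATE INDEX CONCURRENTLY orders_customer_created_idx
    ON orders (customer_id, created_at DESC);

-- Step 4: per-table autovacuum tuning for a hot, write-heavy table:
-- vacuum at ~5% dead tuples instead of the 20% global default.
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor  = 0.05,
    autovacuum_analyze_scale_factor = 0.02
);
```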
**Real diagnostic output from a recent engagement** (PostgreSQL 15, e-commerce platform's product search query):

```
BEFORE (no GIN index on tsvector, 4.2M product rows, query runs 90x/minute):

Sort  (cost=4127.43..4131.18 rows=1500 width=412) (actual time=842.18..843.05 rows=1500)
  Sort Method: external merge  Disk: 8456kB
  ->  Seq Scan on products  (cost=0.00..127425.43 rows=1500 width=412) (actual time=2.42..839.83 rows=1500)
        Filter: ((to_tsvector('english'::regconfig, name || ' ' || description)) @@ to_tsquery('english'::regconfig, 'wireless & headphones'::text))
        Rows Removed by Filter: 4198500

AFTER (CREATE INDEX product_search_idx ON products USING GIN (to_tsvector('english', name || ' ' || description)); plus query rewrite to use a STORED generated column):

Limit  (cost=8.43..28.45 rows=20 width=412) (actual time=0.842..1.247 rows=20)
  ->  Bitmap Heap Scan on products  (cost=8.43..2841.18 rows=1500 width=412) (actual time=0.841..1.234 rows=1500)
        Recheck Cond: (search_tsv @@ to_tsquery('english'::regconfig, 'wireless & headphones'::text))
        ->  Bitmap Index Scan on product_search_idx  (cost=0.00..8.05 rows=1500 width=0) (actual time=0.812..0.812 rows=1500)
```

Query time dropped from 843ms to 1.2ms. The site's product search went from feeling broken to feeling instant. Total work: four hours, including the index build during a maintenance window.
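The `search_tsv` column in the AFTER plan comes from the generated-column rewrite. A minimal reconstruction of that pattern (the `coalesce` guard and exact DDL are our additions, not the client's verbatim code):

```sql
-- Precompute the tsvector into a STORED generated column so the GIN index
-- covers it and queries stop recomputing to_tsvector for every row.
-- Note: adding a STORED column rewrites the table; schedule it in a
-- maintenance window.
ALTER TABLE products
    ADD COLUMN search_tsv tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', name || ' ' || coalesce(description, ''))
    ) STORED;

-- Build the index without blocking writes (cannot run inside a transaction).
CREATE INDEX CONCURRENTLY product_search_idx
    ON products USING GIN (search_tsv);

-- The search query then filters on the indexed column directly:
SELECT id, name
FROM products
WHERE search_tsv @@ to_tsquery('english', 'wireless & headphones')
LIMIT 20;
```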

### Migration from Oracle, SQL Server, and DynamoDB to PostgreSQL

The 2026 PostgreSQL migration market is dominated by three patterns (syntax-level examples follow this list):

- **Oracle → PostgreSQL.** The classic license-cost-driven migration. Oracle Database Standard Edition 2 lists at $17,500 per processor; Enterprise Edition is $47,500 per processor plus options (Partitioning, RAC, In-Memory). A 16-processor Oracle deployment lands in seven-figure annual licensing. PostgreSQL handles every workload Oracle handles, at zero licensing cost. The migration cost — schema conversion via `ora2pg`, PL/SQL to PL/pgSQL rewrites for stored procedures, application code review for Oracle-specific syntax (`NVL` → `COALESCE`, `DECODE` → `CASE`, `(+)` outer joins → ANSI JOINs, sequence and IDENTITY behavior, `DBMS_OUTPUT` → logging) — is one-time, and most enterprises recover it within the 18-30 month window cited above.
- **SQL Server → PostgreSQL.** The mid-market license-cost-driven migration, documented in detail in [`/services/sql-consulting/oregon`](/services/sql-consulting/oregon) — process, tooling, real timelines. Common driver: SQL Server Standard Edition per-core licensing has tightened year over year; mid-market companies running 8-16 cores see 30-50% TCO reduction by moving to PostgreSQL.
- **DynamoDB → PostgreSQL.** The 2025-2026 pattern. Companies that adopted DynamoDB in its early years are now hitting query flexibility limits, transaction model constraints, and the cost surprise that arrives when read/write capacity scales linearly with traffic. PostgreSQL on AWS RDS or Aurora replaces DynamoDB for workloads where ad-hoc queries, transactional consistency, or relational joins matter more than DynamoDB's specific operational profile. The migration is non-trivial because DynamoDB's data model is fundamentally different; we model the target schema relationally and write custom ETL to extract from DynamoDB Streams into the target. Typical project: 12-20 weeks.

For all three patterns we follow the same disciplined approach: a 1-2 week parallel run in which both databases receive the same writes, validation by row count and checksum, cutover during a documented maintenance window, and 7-day post-cutover monitoring with the old database kept read-only as the rollback option.
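A taste of the syntax-level work, using hypothetical tables (Oracle originals as comments, PostgreSQL rewrites as runnable SQL):

```sql
-- Oracle:
--   SELECT NVL(c.phone, 'n/a'),
--          DECODE(o.status, 'A', 'active', 'inactive')
--   FROM   customers c, orders o
--   WHERE  c.id = o.customer_id(+);   -- legacy Oracle outer-join syntax

-- PostgreSQL:
SELECT COALESCE(c.phone, 'n/a'),
       CASE o.status WHEN 'A' THEN 'active' ELSE 'inactive' END
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id;

-- Oracle:      INSERT INTO orders (id, total) VALUES (orders_seq.NEXTVAL, 99);
-- PostgreSQL:  nextval('orders_seq'), or better, an IDENTITY column:
INSERT INTO orders (total) VALUES (99);  -- id GENERATED ALWAYS AS IDENTITY
```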

---

## Use Cases

### Replication, High Availability, and Disaster Recovery

PostgreSQL's replication primitives are good enough that the old "no PostgreSQL HA solution exists" claim (true in 2010) is now obsolete. Streaming replication, logical replication, and the Patroni ecosystem give us the building blocks for any HA architecture, from "single primary with one read replica" to "multi-region with sub-second failover."

- **Streaming replication for hot standby and HA.** Physical replication via WAL streaming gives byte-for-byte standby copies of the primary, with sub-second replication lag in typical deployments. Standbys can serve read queries (hot standby), distributing read load across the cluster. For HA, we deploy Patroni — the open-source orchestration layer that handles automatic failover when the primary dies, leader election, and the safety guarantees that prevent split-brain scenarios. Patroni-managed PostgreSQL clusters can detect a primary failure and promote a standby in 15-30 seconds.
- **Logical replication for zero-downtime version upgrades.** PostgreSQL major-version upgrades (15 to 16, 16 to 17) require dump/restore or a parallel-version strategy. Logical replication makes the parallel-version strategy clean: stand up a PostgreSQL 16 cluster, configure logical replication from the existing PostgreSQL 15 primary, let it catch up, validate the data, switch the application connection string during a maintenance window, and decommission the old cluster. Total downtime: minutes, not hours. (A minimal sketch follows this list.)
- **Cross-region disaster recovery.** Asynchronous streaming replication to a geographically separate replica protects against region-wide outages. Replication lag is typically 1-5 seconds depending on inter-region latency. We test failover quarterly — if the runbook only works when the engineer who wrote it is online, it does not actually work.
- **Backups that have been restored.** Continuous WAL archiving plus daily base backups via `pgBackRest` or `Barman`, landing in S3 or Azure Blob with cross-region replication and customer-managed KMS keys. We test restores quarterly. A backup that has never been restored is not a backup; it is a hope.
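The logical-replication upgrade path as a minimal sketch (cluster names and connection details are placeholders):

```sql
-- On the existing PostgreSQL 15 primary: publish every table.
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the new PostgreSQL 16 cluster (schema restored first, e.g. via
-- pg_dump --schema-only): subscribe and let the initial copy run.
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=pg15-primary dbname=app user=replicator'
    PUBLICATION upgrade_pub;

-- Back on the primary: watch the replication slot catch up before
-- scheduling the cutover window.
SELECT slot_name, active, confirmed_flush_lsn
FROM pg_replication_slots;
```

Once lag is effectively zero and validation passes, the cutover itself is a connection-string change.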

### PostgreSQL Extensions for Specific Workloads — PostGIS, TimescaleDB, pgvector

A meaningful slice of what makes PostgreSQL the right choice in 2026 is the extension ecosystem. We have deployed:

- **PostGIS** for geospatial workloads: routing logistics companies, agricultural IoT, location-aware retail analytics, asset tracking for distributed equipment. PostGIS gives PostgreSQL spatial indexes (GiST on geometry columns), spatial joins (`ST_Intersects`, `ST_Within`, `ST_DWithin`), and routing functions (`pgRouting` for shortest-path queries). PostGIS on a single PostgreSQL instance frequently replaces what would otherwise be a multi-product GIS stack.
- **TimescaleDB** for time-series workloads: IoT sensor data, application metrics, financial tick data, log analytics. TimescaleDB layers on top of PostgreSQL to add transparent partitioning (hypertables), columnar compression (10x storage reduction on time-series data), continuous aggregates (precomputed rollups refreshed incrementally), and time-bucketed query functions. We use it whenever the workload is "lots of writes ordered by time, queries that span time ranges."
- **pgvector** for AI embedding storage and semantic search (sketched after this list). The 2024-2026 wave of AI applications brought pgvector into mainstream production use. Storing embeddings in a `vector(1536)` column with an HNSW or IVFFlat index lets PostgreSQL serve k-nearest-neighbor queries at scale — the foundational operation in RAG (retrieval-augmented generation) applications. We have deployed pgvector for legal document search, customer support knowledge base retrieval, and product recommendation engines. For mid-market companies that do not need a dedicated vector database, pgvector on PostgreSQL is usually sufficient and dramatically simpler operationally.
- **pg_partman** for automated partition management. The companion to time-series partitioning — `pg_partman` creates new partitions on a schedule (daily, weekly, monthly) and drops old ones according to retention policy, eliminating the operational burden of manual partition creation.
- **pgaudit** for audit logging that meets PCI and HIPAA requirements. The `pgaudit` extension logs every SELECT, INSERT, UPDATE, and DELETE against in-scope tables with user, query, and timestamp; we ship those logs to immutable storage. It is the PostgreSQL-native answer to "we need audit trails for compliance."
- **Other extensions we deploy less frequently but worth knowing**: `pg_cron` (scheduled jobs inside the database), `pg_trgm` (trigram fuzzy text search), `hstore` (key-value pairs, mostly superseded by JSONB), `tablefunc` (pivot operations via `crosstab`), and `postgres_fdw` (querying a remote PostgreSQL server through foreign tables; other FDWs cover non-PostgreSQL sources).
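A minimal pgvector sketch: a hypothetical table, 1,536-dimension embeddings, an HNSW index, and the k-nearest-neighbor query at the heart of RAG retrieval.

```sql
CREATE EXTENSION IF NOT EXISTS vector;

-- Document chunks with their embedding vectors (the dimension must match
-- whatever embedding model the application uses).
CREATE TABLE doc_chunks (
    id        bigserial PRIMARY KEY,
    content   text NOT NULL,
    embedding vector(1536) NOT NULL
);

-- HNSW index for approximate nearest-neighbor search, cosine distance.
CREATE INDEX doc_chunks_embedding_idx
    ON doc_chunks USING hnsw (embedding vector_cosine_ops);

-- Top-5 most similar chunks; $1 is the query embedding, bound as a
-- parameter by the application.
SELECT id, content
FROM doc_chunks
ORDER BY embedding <=> $1
LIMIT 5;
```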

### PostgreSQL on Managed Cloud — RDS, Aurora, Azure, GCP

About 70% of new PostgreSQL deployments we work on in 2026 are managed cloud services rather than self-hosted. The trade-offs (a quick extension-availability check follows this list):

- **AWS RDS for PostgreSQL.** Mature and predictable, with support for most of the extensions we typically need (PostGIS and pgvector among them; TimescaleDB is not available on RDS itself, so time-series workloads there lean on native partitioning or Timescale's own AWS Marketplace offering). Read replicas, automated backups, point-in-time recovery, Multi-AZ for HA. Limitations: no superuser access (some extensions and operations are restricted), and Multi-AZ uses synchronous block-level replication that costs latency on every write.
- **AWS Aurora PostgreSQL.** RDS's bigger sibling — Aurora reimplements PostgreSQL's storage layer for cloud-native operation. Benefits: faster failover (typically under 30 seconds), six-way replicated storage across three AZs, and read scaling via low-latency read replicas. Costs: roughly 20% more than RDS, some extension compatibility quirks, and vendor lock-in to AWS. The right choice for workloads with bursty read patterns or HA-critical requirements.
- **Azure Database for PostgreSQL Flexible Server.** Microsoft's managed PostgreSQL. Competitive with RDS on features; better integration with Microsoft Entra ID (formerly Azure Active Directory) and Azure DevOps; usually the right choice when the rest of the stack is Azure-native.
- **Google Cloud SQL for PostgreSQL.** GCP's offering. We have deployed it for clients running data warehouses adjacent to BigQuery. Solid feature set; integrates well with the GCP ecosystem.
- **Self-hosted on EC2 or bare metal.** Still the right answer for workloads with predictable steady-state utilization that can amortize the operational cost, regulated industries that require complete control, or very large databases (10TB+) where managed-service pricing becomes prohibitive. We design and operate self-hosted clusters when the math works.
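One check we run before committing a workload to any managed service: confirm the instance actually offers the extensions the design depends on (the list here is illustrative).

```sql
-- Works the same on RDS, Aurora, Azure Flexible Server, and Cloud SQL.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name IN ('postgis', 'vector', 'pg_partman', 'pg_cron', 'pgaudit')
ORDER BY name;
```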

### Custom PostgreSQL Development — When Consulting Becomes Development

A meaningful fraction of engagements start as "we need PostgreSQL consulting" and become "we need PostgreSQL plus the application that sits on it." FreedomDev does both. The work that recurs:

- **Custom REST and GraphQL APIs over PostgreSQL.** Hand-rolled in Node.js (Fastify, NestJS), Python (FastAPI), or Go. We avoid heavy ORMs in performance-critical paths; raw SQL with parameter binding outperforms most ORMs by 30-50% on hot queries.
- **Real-time data pipelines** using `LISTEN/NOTIFY` for application notifications and logical replication for downstream system synchronization.
- **Reporting layers and data marts** modeled in PostgreSQL using star-schema patterns, refreshed by dbt or custom ETL, serving Power BI, Looker, Metabase, or Superset.
- **Embedded analytics** — customer-facing dashboards where each customer sees only their own data, enforced by Row-Level Security at the database layer rather than application-layer filtering.
- **PostgreSQL as the queue.** For systems where Redis or RabbitMQ would be over-engineering, PostgreSQL's `FOR UPDATE SKIP LOCKED` pattern gives you a transactional job queue inside the database you already run (sketched after this list). We have deployed this for clients processing 10-100k jobs/day.
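The claim step of that queue pattern, as a minimal sketch with a hypothetical `jobs` table; locked rows are skipped, so concurrent workers neither block each other nor grab the same job:

```sql
-- Each worker runs this to claim the next pending job.
WITH next_job AS (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
    ORDER BY created_at
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
UPDATE jobs
SET status = 'running', started_at = now()
FROM next_job
WHERE jobs.id = next_job.id
RETURNING jobs.id, jobs.payload;
```

On success the worker marks the row `done` in a follow-up UPDATE; a periodic sweep returns stale `running` rows to `pending` so crashed workers do not strand jobs.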

---

## Key Stats

- **2005 (PostgreSQL 8)**: First production PostgreSQL deployment
- **12-18 typical**: PostgreSQL client engagements supported concurrently
- **12 TB single-instance**: Largest production PostgreSQL we have tuned
- **4 in the last 24 months**: Oracle → PostgreSQL migrations shipped
- **7 in the last 18 months**: pgvector applications deployed

---

## Frequently Asked Questions

### What is PostgreSQL consulting and when do companies hire a PostgreSQL consultant?

PostgreSQL consulting covers four categories of work: schema and architecture design for new applications, performance tuning when existing databases slow down, replication/HA/DR configuration for systems that cannot afford downtime, and migration from other database platforms. Companies hire PostgreSQL consultants when their internal team lacks deep PostgreSQL operational experience (most common in companies that adopted PostgreSQL recently after running Oracle or SQL Server), when a specific high-stakes engagement is on the line (a major migration, a production performance crisis, a compliance audit), or when they need an outside review of a design before committing to it.

### How is PostgreSQL different from MySQL, SQL Server, and Oracle?

PostgreSQL is the most standards-compliant of the four. Its SQL implementation covers more of the SQL standard than MySQL's or SQL Server's, including window functions, recursive CTEs (`WITH RECURSIVE`), and stricter constraint enforcement; transactions are fully ACID by default, with none of the per-storage-engine caveats MySQL historically carried. Compared to MySQL: PostgreSQL handles complex queries (multi-way joins, subqueries, analytical workloads) more efficiently; MySQL traditionally had a simpler operational profile, but the gap has narrowed. Compared to SQL Server and Oracle: PostgreSQL is free, open-source, cross-platform, and has a faster-moving extension ecosystem. Most new application development in 2026 chooses PostgreSQL.

### Can PostgreSQL handle enterprise-scale workloads?

Yes. PostgreSQL runs at single-instance scales into the tens of terabytes; with sharding (the Citus extension or application-level sharding), it scales horizontally. Notable production users include Apple, Instagram, Reddit, NASA, and the European Space Agency, along with major banks and large US healthcare exchanges. The "is PostgreSQL enterprise-ready" question was answered ten years ago; the current question is "what is the right operational architecture for *my* enterprise workload" — which is what consulting addresses.

### How long does an Oracle to PostgreSQL migration take?

Stored procedure volume matters more than data volume. A 200-table OLTP database with 50 stored procedures takes 12-16 weeks end-to-end. A similar database with 800 stored procedures and significant PL/SQL business logic takes 6-12 months, because PL/SQL → PL/pgSQL conversion is hand-rewriting, not automatic translation. Data volume matters for cutover-window planning; logical replication keeps the cutover window to a few hours at most, regardless of database size.

### What is pgvector and when should we use it?

pgvector is a PostgreSQL extension that adds a `vector` data type and approximate-nearest-neighbor search via HNSW or IVFFlat indexes. It is the right choice for AI applications where you need to store embedding vectors (commonly 1,536 or 3,072 dimensions, the sizes of OpenAI's current embedding models) and run similarity search against them. The killer use case is RAG: storing document chunks as embeddings and retrieving the top-k similar chunks for a given query embedding. For mid-market companies already running PostgreSQL, pgvector eliminates the need for a separate vector database (Pinecone, Weaviate, Chroma) and dramatically simplifies the operational architecture.

### Should I use AWS RDS, Aurora, or self-host?

RDS for most workloads — the operational simplicity outweighs the slight performance disadvantage vs self-hosting. Aurora when you need faster failover or have bursty read patterns and want low-latency read replicas. Self-host when you have 10TB+ databases where managed-service pricing becomes prohibitive, when you need PostgreSQL extensions that the managed services do not support, or when you have regulatory requirements demanding complete control over the underlying infrastructure. Most of our 2026 deployments are RDS.

### How is FreedomDev's PostgreSQL consulting different from postgres.ai, Cybertec, or Pythian?

postgres.ai is Nikolay Samokhvalov's firm — among the deepest PostgreSQL specialists in the world. Cybertec (Austrian) is similarly deep. Pythian is a large multi-database consulting firm that includes PostgreSQL among many platforms. We compete on a different axis: we are an integrated software development company that does PostgreSQL at expert level. When the database problem is downstream of an application problem (which it usually is), we can fix both layers without coordinating with a separate development team. We are also based in West Michigan with flat-rate, US-based engineers — relevant for mid-market US companies that want a vendor in their time zone and pricing model.

---

**Canonical URL**: https://freedomdev.com/technologies/postgresql

_Last updated: 2026-05-12_