
PostgreSQL Performance Optimization: Complete Guide to Database Tuning

By Tech Blog

PostgreSQL is a powerful open-source database, but achieving optimal performance requires understanding indexing, query optimization, and system tuning. This guide covers essential techniques to maximize your PostgreSQL performance.

Why PostgreSQL Performance Matters

  • User Experience: Slow queries lead to poor UX and user abandonment
  • Cost Efficiency: Optimized databases require fewer resources
  • Scalability: Proper optimization enables handling more traffic
  • Business Impact: Fast databases directly improve conversion rates

1. Index Optimization

Indexes are crucial for query performance but require careful planning.

1.1 B-Tree Indexes (Default)

-- Create basic index
CREATE INDEX idx_users_email ON users(email);

-- Composite index for multiple columns
CREATE INDEX idx_posts_author_date ON posts(author_id, created_at DESC);

-- Partial index for specific conditions
CREATE INDEX idx_active_users ON users(email) WHERE is_active = true;

-- Expression index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));

1.2 Index Types

-- GIN index for full-text search
CREATE INDEX idx_posts_content_gin ON posts USING GIN(to_tsvector('english', content));

-- GiST index for geometric data
CREATE INDEX idx_locations_gist ON locations USING GIST(coordinates);

-- BRIN index for large sequential data
CREATE INDEX idx_logs_time_brin ON logs USING BRIN(created_at);

-- Hash index for equality comparisons (crash-safe since PostgreSQL 10)
CREATE INDEX idx_users_id_hash ON users USING HASH(id);

1.3 Index Maintenance

-- Check index usage
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;

-- Remove an index once you have confirmed it is unused
DROP INDEX idx_unused_index;

-- Rebuild bloated index (takes a lock; REINDEX INDEX CONCURRENTLY avoids blocking on PostgreSQL 12+)
REINDEX INDEX idx_users_email;

-- Rebuild all indexes on table
REINDEX TABLE users;

2. Query Optimization

2.1 EXPLAIN ANALYZE

-- Basic EXPLAIN
EXPLAIN SELECT * FROM users WHERE email = 'user@example.com';

-- EXPLAIN ANALYZE (actually runs the query)
EXPLAIN ANALYZE
SELECT u.name, COUNT(p.id) as post_count
FROM users u
LEFT JOIN posts p ON u.id = p.author_id
WHERE u.is_active = true
GROUP BY u.id, u.name
ORDER BY post_count DESC
LIMIT 10;

-- EXPLAIN with all options
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, FORMAT JSON)
SELECT * FROM posts WHERE created_at > NOW() - INTERVAL '7 days';

2.2 Query Optimization Techniques

-- ❌ Avoid: SELECT *
SELECT * FROM users WHERE id = 1;

-- ✅ Better: Select only needed columns
SELECT id, name, email FROM users WHERE id = 1;

-- ❌ Avoid: OR conditions (may not use indexes)
SELECT * FROM posts WHERE author_id = 1 OR category_id = 5;

-- ✅ Better: Use UNION (it deduplicates rows matching both conditions, preserving OR semantics)
SELECT * FROM posts WHERE author_id = 1
UNION
SELECT * FROM posts WHERE category_id = 5;

-- ❌ Avoid: Functions on indexed columns
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

-- ✅ Better: Use expression index or store lowercase
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

2.3 JOIN Optimization

-- Prefer a JOIN over a subquery when the planner does not rewrite it for you
-- ❌ Potentially slow subquery
SELECT * FROM posts
WHERE author_id IN (SELECT id FROM users WHERE is_active = true);

-- ✅ Faster JOIN
SELECT p.* FROM posts p
INNER JOIN users u ON p.author_id = u.id
WHERE u.is_active = true;

-- Use EXISTS for checking existence
-- ❌ Slow COUNT
SELECT * FROM users u
WHERE (SELECT COUNT(*) FROM posts WHERE author_id = u.id) > 0;

-- ✅ Faster EXISTS
SELECT * FROM users u
WHERE EXISTS (SELECT 1 FROM posts WHERE author_id = u.id);

3. VACUUM and Maintenance

3.1 VACUUM Operations

-- Manual VACUUM (reclaims space)
VACUUM users;

-- VACUUM ANALYZE (updates statistics)
VACUUM ANALYZE users;

-- VACUUM FULL (rewrites table, locks table)
VACUUM FULL users;

-- Check table and external (index/TOAST) sizes -- a rough bloat proxy; the pgstattuple extension measures bloat exactly
SELECT schemaname, tablename,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size,
  pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename) - pg_relation_size(schemaname||'.'||tablename)) AS external_size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

3.2 Autovacuum Configuration

-- Check autovacuum settings
SHOW autovacuum;
SHOW autovacuum_naptime;
SHOW autovacuum_vacuum_threshold;

-- Configure per-table autovacuum
ALTER TABLE users SET (
  autovacuum_vacuum_scale_factor = 0.1,
  autovacuum_analyze_scale_factor = 0.05
);

-- Monitor autovacuum activity
SELECT schemaname, relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
ORDER BY last_autovacuum DESC NULLS LAST;
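Autovacuum fires on a table once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples` (per the PostgreSQL docs). A small sketch of that formula, with hypothetical tuple counts, shows why the per-table scale factor above matters on large tables:

```typescript
// Dead-tuple count at which autovacuum triggers for a table:
// threshold + scaleFactor * liveTuples (the formula from the PostgreSQL docs).
function autovacuumTriggerPoint(
  threshold: number,   // autovacuum_vacuum_threshold (default 50)
  scaleFactor: number, // autovacuum_vacuum_scale_factor (default 0.2)
  liveTuples: number   // pg_class.reltuples estimate
): number {
  return Math.round(threshold + scaleFactor * liveTuples)
}

// With defaults, a 1M-row table vacuums only after ~200,050 dead tuples:
console.log(autovacuumTriggerPoint(50, 0.2, 1_000_000)) // 200050
// The per-table setting above (scale factor 0.1) halves that:
console.log(autovacuumTriggerPoint(50, 0.1, 1_000_000)) // 100050
```

This is why the default scale factor is often too lazy for very large tables: dead tuples pile up for a long time before autovacuum runs.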

4. Connection Pooling

4.1 PgBouncer Configuration

# pgbouncer.ini
[databases]
mydb = host=localhost port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
reserve_pool_size = 5
reserve_pool_timeout = 3
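For choosing default_pool_size, a common starting point (from the PostgreSQL wiki's connection-count guidance) is `(core_count * 2) + effective_spindle_count`. A sketch of that rule of thumb; treat the result as a number to benchmark from, not a fixed answer:

```typescript
// Rule-of-thumb pool size from the PostgreSQL wiki:
// connections = (core_count * 2) + effective_spindle_count.
// A starting point for benchmarking, not a definitive setting.
function suggestedPoolSize(coreCount: number, spindleCount: number): number {
  return coreCount * 2 + spindleCount
}

// An 8-core server with a single SSD (counted as one "spindle"):
console.log(suggestedPoolSize(8, 1)) // 17
```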

4.2 Application-Level Pooling

// Using pg-pool with Node.js
import { Pool } from 'pg'

const pool = new Pool({
  host: 'localhost',
  port: 5432,
  database: 'mydb',
  user: 'postgres',
  password: 'password',
  max: 20, // Maximum pool size
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
})

// Query with automatic connection management
async function getUser(id: number) {
  const result = await pool.query('SELECT * FROM users WHERE id = $1', [id])
  return result.rows[0]
}

// Transaction with connection from pool
async function createPost(userId: number, title: string, content: string) {
  const client = await pool.connect()
  try {
    await client.query('BEGIN')
    const result = await client.query(
      'INSERT INTO posts (author_id, title, content) VALUES ($1, $2, $3) RETURNING id',
      [userId, title, content]
    )
    await client.query('UPDATE users SET post_count = post_count + 1 WHERE id = $1', [userId])
    await client.query('COMMIT')
    return result.rows[0]
  } catch (e) {
    await client.query('ROLLBACK')
    throw e
  } finally {
    client.release()
  }
}

5. Table Partitioning

5.1 Range Partitioning

-- Create partitioned table
CREATE TABLE logs (
  id BIGSERIAL,
  user_id INTEGER,
  action TEXT,
  created_at TIMESTAMP NOT NULL
) PARTITION BY RANGE (created_at);

-- Create partitions
CREATE TABLE logs_2024_01 PARTITION OF logs
  FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE logs_2024_02 PARTITION OF logs
  FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Create index on each partition (or create it once on the parent; PostgreSQL 11+ cascades it)
CREATE INDEX idx_logs_2024_01_user ON logs_2024_01(user_id);
CREATE INDEX idx_logs_2024_02_user ON logs_2024_02(user_id);

-- Query automatically uses correct partition
SELECT * FROM logs WHERE created_at >= '2024-01-15' AND created_at < '2024-01-20';
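Writing monthly partition DDL by hand gets tedious; the statements above follow a fixed pattern, so a small generator works. This helper is illustrative only (not part of PostgreSQL); it mirrors the logs table naming used above:

```typescript
// Generate a CREATE TABLE ... PARTITION OF statement for one month,
// matching the naming pattern used above (logs_YYYY_MM). Illustrative helper.
function monthlyPartitionDDL(parent: string, year: number, month: number): string {
  const pad = (n: number) => String(n).padStart(2, '0')
  const from = `${year}-${pad(month)}-01`
  const next = month === 12 ? { y: year + 1, m: 1 } : { y: year, m: month + 1 }
  const to = `${next.y}-${pad(next.m)}-01`
  return (
    `CREATE TABLE ${parent}_${year}_${pad(month)} PARTITION OF ${parent}\n` +
    `  FOR VALUES FROM ('${from}') TO ('${to}');`
  )
}

console.log(monthlyPartitionDDL('logs', 2024, 3))
// CREATE TABLE logs_2024_03 PARTITION OF logs
//   FOR VALUES FROM ('2024-03-01') TO ('2024-04-01');
```

In practice you would run the generated DDL from a scheduled job so next month's partition always exists before rows arrive; the pg_partman extension automates the same idea.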

5.2 List Partitioning

-- Partition by category
CREATE TABLE products (
  id SERIAL,
  name TEXT,
  category TEXT,
  price DECIMAL
) PARTITION BY LIST (category);

CREATE TABLE products_electronics PARTITION OF products
  FOR VALUES IN ('electronics', 'computers', 'phones');

CREATE TABLE products_clothing PARTITION OF products
  FOR VALUES IN ('clothing', 'shoes', 'accessories');

6. Configuration Tuning

6.1 Memory Settings

# postgresql.conf

# Shared buffers (start around 25% of RAM)
shared_buffers = 4GB

# Effective cache size (50-75% of RAM; a planner hint, not an allocation)
effective_cache_size = 12GB

# Work memory (per sort/hash operation, so a single query can use several multiples)
work_mem = 64MB

# Maintenance work memory (VACUUM, CREATE INDEX)
maintenance_work_mem = 512MB

# WAL buffers
wal_buffers = 16MB
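The percentages behind the values above can be sketched as a quick calculator. This is a rule of thumb only; real tuning depends on the workload, and the function name is illustrative:

```typescript
// Rough starting values from total RAM in GB, per the 25% / 50-75%
// guidelines quoted above. Benchmark before trusting them.
function memoryStartingPoints(totalRamGB: number) {
  return {
    sharedBuffersGB: Math.round(totalRamGB * 0.25),
    effectiveCacheSizeGB: Math.round(totalRamGB * 0.75),
  }
}

// A 16 GB server yields the config shown above:
console.log(memoryStartingPoints(16)) // { sharedBuffersGB: 4, effectiveCacheSizeGB: 12 }
```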

6.2 Query Planner Settings

# Cost-based optimizer settings
random_page_cost = 1.1          # default 4.0 assumes spinning disks; lower for SSD
effective_io_concurrency = 200  # for SSD

# Parallel query settings
max_parallel_workers_per_gather = 4
max_parallel_workers = 8

7. Monitoring and Diagnostics

7.1 Slow Query Log

-- Enable slow query logging
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1s
SELECT pg_reload_conf();

-- Check slow queries (requires the pg_stat_statements extension;
-- on PostgreSQL 13+ the timing columns are total_exec_time, mean_exec_time, max_exec_time)
SELECT query, calls, total_exec_time, mean_exec_time, max_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

7.2 Connection Monitoring

-- Active connections
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';

-- Long-running queries
SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '5 minutes'
ORDER BY duration DESC;

-- Kill a long-running query (substitute the pid from the query above)
SELECT pg_terminate_backend(12345);

8. Caching Strategies

8.1 Application-Level Caching

import { Redis } from 'ioredis'

const redis = new Redis()

async function getUserWithCache(userId: number) {
  // Try cache first
  const cached = await redis.get(`user:${userId}`)
  if (cached) return JSON.parse(cached)

  // Cache miss: query the database (pool from section 4.2)
  const user = await pool.query('SELECT * FROM users WHERE id = $1', [userId])

  // Cache for 5 minutes
  await redis.setex(`user:${userId}`, 300, JSON.stringify(user.rows[0]))

  return user.rows[0]
}
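The same cache-aside pattern can be sketched without Redis, using an in-memory Map with TTLs. This is illustrative only (an in-process cache is not shared across servers the way Redis is); `loadUser` stands in for the database query:

```typescript
// Minimal cache-aside with TTL, using a Map in place of Redis.
type Entry<T> = { value: T; expiresAt: number }

class TtlCache<T> {
  private store = new Map<string, Entry<T>>()
  constructor(private ttlMs: number) {}

  get(key: string): T | undefined {
    const e = this.store.get(key)
    if (!e) return undefined
    if (Date.now() > e.expiresAt) {
      this.store.delete(key) // expired: evict lazily on read
      return undefined
    }
    return e.value
  }

  set(key: string, value: T): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs })
  }
}

const userCache = new TtlCache<{ id: number; name: string }>(300_000) // 5 minutes

async function getUserCached(
  id: number,
  loadUser: (id: number) => Promise<{ id: number; name: string }> // stands in for the DB query
) {
  const hit = userCache.get(`user:${id}`)
  if (hit) return hit             // cache hit: skip the database
  const user = await loadUser(id) // cache miss: load, then populate
  userCache.set(`user:${id}`, user)
  return user
}
```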

8.2 Materialized Views

-- Create materialized view
CREATE MATERIALIZED VIEW user_stats AS
SELECT
  u.id,
  u.name,
  COUNT(p.id) as post_count,
  COUNT(c.id) as comment_count
FROM users u
LEFT JOIN posts p ON u.id = p.author_id
LEFT JOIN comments c ON u.id = c.user_id
GROUP BY u.id, u.name;

-- Create index on materialized view
CREATE INDEX idx_user_stats_post_count ON user_stats(post_count DESC);

-- Refresh materialized view
REFRESH MATERIALIZED VIEW user_stats;

-- Concurrent refresh (non-blocking)
REFRESH MATERIALIZED VIEW CONCURRENTLY user_stats;

9. Integration with Modern Frameworks

9.1 Next.js Integration

The pg connection pool plugs straight into Next.js route handlers:

// lib/db.ts
import { Pool } from 'pg'

export const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
})

// app/api/users/route.ts
import { pool } from '@/lib/db'

export async function GET() {
  const result = await pool.query('SELECT id, name, email FROM users LIMIT 100')
  return Response.json(result.rows)
}

9.2 Docker Deployment

Run PostgreSQL under Docker with a custom configuration file:

# docker-compose.yml
version: '3.8'
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: mydb
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./postgresql.conf:/etc/postgresql/postgresql.conf
    command: postgres -c config_file=/etc/postgresql/postgresql.conf
    ports:
      - "5432:5432"

volumes:
  postgres_data:

10. Best Practices Checklist

  • ✅ Create indexes on frequently queried columns
  • ✅ Use EXPLAIN ANALYZE to understand query plans
  • ✅ Implement connection pooling
  • ✅ Configure autovacuum appropriately
  • ✅ Monitor slow queries regularly
  • ✅ Use partitioning for large tables
  • ✅ Implement application-level caching
  • ✅ Optimize memory settings for your workload
  • ✅ Use prepared statements to prevent SQL injection
  • ✅ Regular backups and monitoring
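On the prepared-statements point: with node-postgres (the pg library used in section 4), you can pass a named query object, so the SQL text contains only placeholders and user input travels separately in `values`. A sketch of that shape (the function name is illustrative):

```typescript
// Build a named prepared-statement object in the shape node-postgres accepts.
// The SQL text holds only $1-style placeholders; user input goes in `values`
// and is never concatenated into the statement text.
function userByEmailQuery(email: string) {
  return {
    name: 'user-by-email', // naming the statement lets the server reuse its plan
    text: 'SELECT id, name FROM users WHERE email = $1',
    values: [email],
  }
}

// Even hostile input stays inert, because it never touches the SQL text:
const q = userByEmailQuery("'; DROP TABLE users; --")
console.log(q.text.includes('DROP')) // false
```

You would execute it as `pool.query(userByEmailQuery(email))`; the database sees the parameter as a value, not as SQL.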

Conclusion

PostgreSQL performance optimization is an ongoing process. Start with proper indexing, monitor query performance, implement connection pooling, and tune configuration based on your workload. Regular maintenance and monitoring ensure your database scales efficiently with your application growth.
