Postgres audit logging at scale: pgaudit vs triggers vs logical decoding
"Add an audit log" is the kind of request that sounds trivial and then eats a quarter of engineering time. On a small Postgres database you can turn on log_statement = 'all', grep the logs, and ship it. On a database doing 50,000 queries per second, that same setting will fill your disk in an afternoon and tank write throughput along the way.
Everyone writing at scale lands on one of three approaches: the pgaudit extension, trigger-based audit tables, or logical decoding piped into something downstream. They have different strengths and different failure modes. This post is the decision framework we'd give a team picking between them today.
A quick word on log_statement before we get to the real options: it's the default answer and it's usually wrong. It logs every query to the server log, including successful SELECTs on busy tables. It has no structured output, no filtering by object or role, and the overhead of writing to csvlog at high query rates is nontrivial. If you're auditing for compliance, you need structured fields, object-level filtering, and the ability to separate "all SELECTs" from "DDL on this one table." log_statement gives you none of that.
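For reference, turning it on is a single global switch with no per-object or per-role scoping, which is exactly the problem:

```sql
-- The blunt instrument: one cluster-wide setting, no filtering by object, role, or statement class.
ALTER SYSTEM SET log_statement = 'all';
SELECT pg_reload_conf();
```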
Option 1: pgaudit
pgaudit is a Postgres extension maintained by Crunchy Data. It hooks into the executor and the utility command path, emitting structured audit records to the Postgres log in a parsable format.
The killer feature is object-level audit. You can audit every statement against a specific table or role without auditing everything else:
```sql
-- Object-level audit: pgaudit logs any statement that exercises a privilege
-- granted to the audit role named in pgaudit.role (here, 'auditor')
CREATE ROLE auditor NOLOGIN;
GRANT SELECT, INSERT, UPDATE, DELETE ON accounts TO auditor;

-- Session-level audit scoped to one role: log all DDL run by deploy_user
ALTER ROLE deploy_user SET pgaudit.log = 'ddl';
```
You get session-level logging (the classic "log everything this session does") and object-level logging (the compliance-focused "log every touch of these tables") as two independent modes.
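If you want the session-level mode cluster-wide, the setup is a couple of GUCs once the extension is loadable. A minimal sketch, where the `'write, ddl'` class list is just an example and `shared_preload_libraries` itself needs a restart:

```sql
-- pgaudit must already be in shared_preload_libraries (restart required).
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- Session-level: log every write and every DDL statement, cluster-wide.
ALTER SYSTEM SET pgaudit.log = 'write, ddl';
SELECT pg_reload_conf();
```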
When to pick pgaudit: You need structured audit output, you control the Postgres configuration (not always true on managed services — see below), and your audit scope is describable in terms of roles and objects.
The trade-offs:
- It's supported on AWS RDS, Cloud SQL, and Azure, but usually as an opt-in parameter group change with a restart.
- Output goes to the Postgres log. You need a log shipper (Vector, Fluent Bit, Promtail) to get it anywhere useful.
- Session-level `pgaudit.log = 'all'` on a busy database produces a lot of log volume. Plan for it — we've seen teams accidentally fill a 500 GB disk in a week.
- It audits what Postgres saw, which isn't always what the application intended. A query that errored out before execution doesn't get audited in the same way as one that succeeded.
The perf overhead is real but bounded. The pgaudit maintainers publish benchmark numbers in the 1-5% range for session logging. Object-level logging is negligible.
Option 2: Trigger-based audit tables
Triggers are the simplest thing that could possibly work. Write an AFTER INSERT/UPDATE/DELETE trigger on each table you care about, and write a row into a parallel *_audit table:
```sql
CREATE TABLE accounts_audit (
    id       bigserial PRIMARY KEY,
    op       char(1) NOT NULL,                               -- 'I', 'U', 'D'
    ts       timestamptz NOT NULL DEFAULT clock_timestamp(),
    actor    text NOT NULL DEFAULT current_user,
    row_id   bigint NOT NULL,
    old_data jsonb,
    new_data jsonb
);

CREATE FUNCTION accounts_audit_fn() RETURNS trigger AS $$
BEGIN
    INSERT INTO accounts_audit (op, row_id, old_data, new_data)
    VALUES (
        substring(TG_OP FROM 1 FOR 1),
        -- NEW is null for DELETE and OLD is null for INSERT, so branch on TG_OP
        CASE WHEN TG_OP = 'DELETE' THEN OLD.id ELSE NEW.id END,
        CASE WHEN TG_OP IN ('UPDATE', 'DELETE') THEN to_jsonb(OLD) END,
        CASE WHEN TG_OP IN ('INSERT', 'UPDATE') THEN to_jsonb(NEW) END
    );
    RETURN NULL;  -- return value is ignored for AFTER triggers
END; $$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_audit_trigger
AFTER INSERT OR UPDATE OR DELETE ON accounts
FOR EACH ROW EXECUTE FUNCTION accounts_audit_fn();
```
This has real virtues: it's portable, it works on every managed Postgres service, the audit data is queryable with standard SQL, and the trigger runs in the same transaction as the change — so you can't get an audit row without the change, or vice versa.
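For example, the "every change to account X" lookup this enables is plain SQL, using the column names from the schema above (the id 42 and one-year window are illustrative):

```sql
-- Full change history for one account over the last year.
SELECT ts, actor, op, old_data, new_data
FROM accounts_audit
WHERE row_id = 42
  AND ts >= now() - interval '1 year'
ORDER BY ts;
```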
When to pick triggers: You're auditing data modifications (not reads), you want transactional integrity between change and audit, and your write rate is moderate (under 1,000 writes/sec per audited table is safe; above that you should measure).
The trade-offs:
- Write amplification. Every INSERT on `accounts` now writes two rows — the actual change and the audit row. This doubles your WAL volume on audited tables. At high write rates this is noticeable.
- Bloat. Audit tables are insert-only in principle, but they grow without bound. You'll need partitioning plus `DROP PARTITION` for retention (see the sketch after this list), or batched DELETEs, which leave dead tuples for VACUUM to clean.
- Read auditing is hard. You can't fire a trigger on SELECT. For read auditing you need pgaudit or something at the application layer.
- The audit is coupled to the transactional DB. Compliance queries ("show me every change to account X in the last year") compete with production workload for buffer cache and I/O. On a busy database this is the thing that eventually pushes you to move audit data out — which is what Option 3 is for.
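Here is what the partitioned variant can look like: a minimal sketch assuming monthly range partitions on `ts`, with the same columns as `accounts_audit` above and illustrative `_p` names:

```sql
-- Partitioned audit table: retention becomes a metadata operation, not a DELETE.
CREATE TABLE accounts_audit_p (
    id       bigserial,
    op       char(1) NOT NULL,
    ts       timestamptz NOT NULL DEFAULT clock_timestamp(),
    actor    text NOT NULL DEFAULT current_user,
    row_id   bigint NOT NULL,
    old_data jsonb,
    new_data jsonb,
    PRIMARY KEY (id, ts)              -- the partition key must be part of the PK
) PARTITION BY RANGE (ts);

CREATE TABLE accounts_audit_p_2025_01 PARTITION OF accounts_audit_p
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');

-- Retention: detach and drop whole months instead of deleting rows.
ALTER TABLE accounts_audit_p DETACH PARTITION accounts_audit_p_2025_01;
DROP TABLE accounts_audit_p_2025_01;
```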
The hstore-based audit trigger library from 2ndQuadrant is a well-tested generalization of the pattern above. If you go the trigger route, start there rather than rolling your own.
Option 3: Logical decoding + CDC
Logical decoding turns Postgres's write-ahead log into a stream of logical change events. You attach a replication slot, pick an output plugin (pgoutput built in since PG 10, or wal2json for JSON output), and read change events as they happen.
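You can watch the mechanism with nothing but psql, using the built-in `test_decoding` plugin. This is a throwaway sketch with an illustrative slot name; it requires `wal_level = logical`, and a real pipeline would use `pgoutput` or `wal2json` with a proper consumer:

```sql
-- Create a throwaway slot, make some changes in another session, then peek at the stream.
SELECT pg_create_logical_replication_slot('audit_peek', 'test_decoding');

SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('audit_peek', NULL, NULL);

-- Always drop experimental slots: an abandoned slot pins WAL forever.
SELECT pg_drop_replication_slot('audit_peek');
```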
The common deployment is Debezium running as a Kafka Connect source, piping changes to Kafka and from there into whatever downstream system wants them — an analytics warehouse, a search index, a ClickHouse cluster for audit storage, or all three.
When to pick logical decoding: You have a high write rate, you're already running Kafka or a similar streaming platform, you want audit data decoupled from the transactional database, and you're willing to spend a week on setup before it's reliable.
The trade-offs:
- Lowest per-row overhead. The WAL is being written anyway; logical decoding just streams it. No trigger overhead, no extra INSERTs in the primary.
- Deferred, not transactional. The audit event for a change lands downstream after the change itself commits. If the slot falls behind or the consumer dies, you accumulate WAL — and an abandoned replication slot will eventually fill your disk. This is the most common failure mode and you need monitoring for it on day one (a minimal lag check is sketched after this list).
- Schema change handling is annoying. Debezium and other consumers have to track Postgres schema changes to decode the WAL correctly. This is mature on pgoutput/Debezium but still requires care.
- Setup cost is real. Expect to spend meaningful time on Kafka, Debezium configuration, schema registry, dead-letter queues, and slot monitoring before you trust it in production.
- Managed-Postgres support varies. AWS RDS and Cloud SQL support logical decoding. Aurora Postgres supports it with some restrictions on the DDL that can flow through.
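On the monitoring point: the day-one check can be a single query against `pg_replication_slots`, alerting when retained WAL keeps growing (the alert threshold is yours to pick):

```sql
-- How much WAL each replication slot is forcing the server to keep around.
SELECT slot_name,
       active,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```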
If you're already running Kafka, Option 3 is almost always the right long-term answer. If you're not, pgaudit or triggers may be a better fit until your scale forces the question.
Decision matrix
| Requirement | pgaudit | Triggers | Logical decoding |
|---|---|---|---|
| Read auditing (SELECT) | ✅ | ❌ | ❌ |
| Write auditing (INSERT/UPDATE/DELETE) | ✅ | ✅ | ✅ |
| DDL auditing | ✅ | ⚠️ event triggers | ⚠️ not decoded natively; needs workarounds |
| Transactional guarantee (audit commits with change) | ⚠️ logged at execution, even on rollback | ✅ | ❌ (eventual) |
| Per-row overhead at high write rate | Low | Medium (write amp) | Lowest |
| Operational cost | Low (log shipper) | Low | High (Kafka stack) |
| Audit data separate from transactional DB | ❌ (goes to Postgres log) | ❌ | ✅ |
| SOC 2 / HIPAA / PCI friendly | ✅ | ✅ | ✅ |
Most teams we talk to land here:
- Compliance-driven read auditing, single-digit-thousand QPS: pgaudit.
- Write auditing a small number of sensitive tables with strong consistency needs: triggers.
- Scale, decoupling, or you already have a stream processing layer: logical decoding.
Plenty of teams run more than one. It's common to use pgaudit for session-level compliance logging and triggers for critical business-object audit trails.
Why audit data almost never belongs in your transactional Postgres
Regardless of which option you pick, the output of an audit pipeline should usually land somewhere other than the Postgres cluster it's observing:
- Bloat and VACUUM pressure. Audit tables are write-heavy and rarely updated, but every insert in Postgres creates a tuple that VACUUM will eventually touch. At billions of rows, this fights your real workload for buffer cache and I/O.
- Retention is expensive. Dropping old audit data requires either partitioning (and operational work to roll partitions) or batched DELETEs (and the bloat that follows).
- Query patterns are wrong. Compliance queries typically aggregate over huge time ranges and filter by actor or object. Postgres row-storage is the wrong physical layout for that.
- Cost per GB. Managed Postgres storage is priced for OLTP, not for the bulk columnar scan patterns audit queries produce.
A columnar analytics store — ClickHouse, BigQuery, Snowflake — is a better long-term home for audit data. For ClickHouse specifically, we wrote up the schema we're using for Basira's command-audit log last week: partition by month, sort by (tenant, actor, time), compress the payload columns with ZSTD, and let TTL handle retention. It's cheap, it's fast for the queries compliance actually runs, and it doesn't fight your production Postgres for resources.
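As a rough sketch of what that looks like on the ClickHouse side (column names and the 18-month TTL are illustrative; the full schema is in the previous post):

```sql
CREATE TABLE audit_events
(
    tenant     LowCardinality(String),
    actor      String,
    event_time DateTime64(3),
    object     String,
    op         LowCardinality(String),
    payload    String CODEC(ZSTD(3))               -- compressed audit payload
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_time)                  -- partition by month
ORDER BY (tenant, actor, event_time)               -- sort key matches the common filters
TTL toDateTime(event_time) + INTERVAL 18 MONTH;    -- retention handled by TTL
```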
The usual pipeline ends up looking like this: Postgres → pgaudit output (or triggers, or Debezium) → log shipper / stream → ClickHouse (or similar) → queryable for compliance and incident review.
FAQ
Does pgaudit work on AWS RDS / Cloud SQL / Azure? Yes on all three, but it needs to be enabled via a parameter group change plus restart. See the AWS docs, Cloud SQL flags, and Azure flexible server audit docs.
Is there a performance hit from trigger-based audit? Roughly 2× the WAL volume for audited tables, plus the trigger execution cost. At moderate write rates (under 1,000 writes/sec per table), this is invisible. At 10,000+ writes/sec, measure carefully.
How do I handle GDPR "right to be forgotten" with audit logs? This is genuinely hard. The audit record is the thing you're being asked to alter. The defensible approach is PII pseudonymization (replace actor with a stable hash, keep the audit structure) plus a published retention policy so that after N months the original actor mapping is unrecoverable even inside your systems.
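A minimal sketch of the pseudonymization half, assuming the `pgcrypto` extension and a salt held outside the database (the salt and email below are placeholders):

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Stable pseudonym: the same identity and salt always hash to the same value,
-- so audit history stays linkable without storing the identity itself.
SELECT encode(digest('salt-from-kms' || 'alice@example.com', 'sha256'), 'hex')
       AS actor_pseudonym;
```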
Can I use logical decoding without running Kafka? Yes. You can consume a slot with any language that speaks the replication protocol — psycopg supports it in Python, and there are Go and Node libraries. But you lose the reliability guarantees Kafka gives you, and you become the person responsible for a custom consumer. For most teams, Kafka (or Redpanda, or Kinesis) is worth the operational cost.
What about tamper-evidence? Can I prove audit records haven't been altered? None of the three options gives you this for free. Hash-chain each audit row (store the hash of the previous row alongside each new row) and periodically anchor the latest hash to an external system — another database, an object-storage bucket, or a transparency log. Applies equally to pgaudit-captured logs, trigger-written tables, and CDC-derived events.
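For the trigger-based variant, the chaining can live in a second trigger. A rough sketch assuming `pgcrypto`, with illustrative names, and ignoring the concurrency caveat that two simultaneous inserts can read the same previous row:

```sql
ALTER TABLE accounts_audit
    ADD COLUMN prev_hash bytea,
    ADD COLUMN row_hash  bytea;

CREATE FUNCTION accounts_audit_hash_fn() RETURNS trigger AS $$
BEGIN
    -- Link this row to the newest existing audit row.
    SELECT row_hash INTO NEW.prev_hash
    FROM accounts_audit ORDER BY id DESC LIMIT 1;

    -- Hash this row's content together with that link.
    NEW.row_hash := digest(
        coalesce(NEW.prev_hash, '\x00'::bytea) ||
        convert_to(concat_ws('|', NEW.op, NEW.ts, NEW.actor, NEW.row_id,
                             NEW.old_data, NEW.new_data), 'UTF8'),
        'sha256');
    RETURN NEW;
END; $$ LANGUAGE plpgsql;

CREATE TRIGGER accounts_audit_hash
BEFORE INSERT ON accounts_audit
FOR EACH ROW EXECUTE FUNCTION accounts_audit_hash_fn();
```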
What's next
If you're deciding where to store audit data that came out of Postgres, the ClickHouse schema we walked through in the previous post is a good default. If you want to understand what Postgres is actually doing in real time — not just what it changed, but what it's executing right now and how that queue is moving — that's a different problem, and pg_stat_activity alone isn't enough to answer it. More on that soon.