Implementing Transparent SQL Audit Logging

Regulatory compliance frameworks—SOC 2, HIPAA, PCI DSS, GDPR—universally require the ability to answer a simple question: who changed what, and when? Application-level audit logging, where the application code explicitly records each mutation before or after executing it, is the most common approach and also the most fragile. It depends on every developer remembering to add logging to every code path that modifies data. It misses changes made through database migrations, administrative SQL sessions, and ORM bulk operations. It breaks silently when a new table is added without corresponding audit instrumentation. In our database security assessments, we consistently recommend a database-native audit mechanism that operates transparently, without any cooperation from the application layer. PostgreSQL's trigger system and JSONB data type make this straightforward to implement and maintain.

The Architecture

The pattern centres on a single audit_log table with a generic schema: timestamp, table name, operation type (INSERT, UPDATE, DELETE), the old row state, the new row state, and the identity of the user who performed the operation. The old and new row states are captured using row_to_json(OLD) and row_to_json(NEW) within a trigger function, producing complete JSONB snapshots of the row before and after the mutation. For UPDATE operations, we additionally compute a diff—a JSONB object containing only the columns that changed, with their before and after values. This diff is invaluable for audit review: instead of comparing two complete row snapshots to determine what changed, the reviewer sees exactly which fields were modified. The trigger function is attached to each audited table as an AFTER trigger, ensuring it fires only after the mutation has been validated by all constraints and other triggers.
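A minimal sketch of the pattern follows. Table and column names (audit_log, accounts) are illustrative, and to_jsonb is used as the jsonb-native equivalent of row_to_json(...)::jsonb:

```sql
-- Generic audit table; one row per mutation on any audited table.
CREATE TABLE audit_log (
    id          bigint GENERATED ALWAYS AS IDENTITY,
    created_at  timestamptz NOT NULL DEFAULT now(),
    table_name  text        NOT NULL,
    operation   text        NOT NULL,   -- 'INSERT' | 'UPDATE' | 'DELETE'
    old_row     jsonb,                  -- NULL for INSERT
    new_row     jsonb,                  -- NULL for DELETE
    changed     jsonb,                  -- UPDATE diff: {col: {"old": .., "new": ..}}
    changed_by  text                    -- application user, may be NULL
);

CREATE OR REPLACE FUNCTION audit_trigger() RETURNS trigger AS $$
DECLARE
    old_j jsonb;
    new_j jsonb;
    diff  jsonb;
BEGIN
    IF TG_OP IN ('UPDATE', 'DELETE') THEN
        old_j := to_jsonb(OLD);
    END IF;
    IF TG_OP IN ('INSERT', 'UPDATE') THEN
        new_j := to_jsonb(NEW);
    END IF;

    -- For UPDATEs, keep only the columns whose values actually changed.
    IF TG_OP = 'UPDATE' THEN
        SELECT jsonb_object_agg(e.key,
                   jsonb_build_object('old', old_j -> e.key, 'new', e.value))
          INTO diff
          FROM jsonb_each(new_j) AS e
         WHERE e.value IS DISTINCT FROM old_j -> e.key;
    END IF;

    INSERT INTO audit_log (table_name, operation, old_row, new_row, changed, changed_by)
    VALUES (TG_TABLE_NAME, TG_OP, old_j, new_j, diff,
            current_setting('app.current_user_id', true));

    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

-- Attach to each audited table:
CREATE TRIGGER accounts_audit
AFTER INSERT OR UPDATE OR DELETE ON accounts
FOR EACH ROW EXECUTE FUNCTION audit_trigger();
```

Because the function relies only on TG_OP and TG_TABLE_NAME and serialises the whole row to jsonb, the same function body serves every audited table; adding a new table to the audit scope is a single CREATE TRIGGER statement.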

The most elegant aspect of the pattern is how it captures user identity. In a web application, the database connection is typically shared via a connection pool, and the PostgreSQL session user is a service account rather than the end user. To bridge this gap without modifying the trigger function's signature, we use current_setting('app.current_user_id', true) within the trigger. The application sets this variable at the start of each transaction, after acquiring a connection from the pool: SET LOCAL app.current_user_id = '...'. (SET LOCAL has no effect outside a transaction block, so it must run after BEGIN.) The SET LOCAL scope ensures the value is bound to the current transaction and automatically cleared when the transaction completes, preventing identity leakage between requests sharing the same pooled connection. The second parameter to current_setting (true) makes the function return NULL rather than raising an error if the variable is not set, which handles the case of administrative sessions or migrations that do not set the application user context.
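The transaction-scoped handshake, sketched from the application's side (the user identifier and statement are illustrative):

```sql
BEGIN;
-- Bind the end-user identity to this transaction only; it reverts
-- automatically at COMMIT or ROLLBACK, so the pooled connection
-- carries no identity into the next request.
SET LOCAL app.current_user_id = 'user-1842';

UPDATE accounts SET email = 'new@example.com' WHERE id = 42;
COMMIT;
```

One practical wrinkle: SET LOCAL does not accept bind parameters, so application code that wants to pass the identifier as a query parameter can use the equivalent function form, SELECT set_config('app.current_user_id', $1, true), where the third argument true gives the same transaction-local scope.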

Edge Cases and Operational Concerns

Several edge cases require careful handling. Foreign key cascades are the most important: when a parent row is deleted and child rows are cascade-deleted, the trigger on the child table fires for each deleted row with the correct OLD values, but the user identity context reflects whoever deleted the parent. This is actually the correct audit behaviour—the cascade was caused by the parent deletion—but it must be documented in the audit policy so that reviewers understand why child row deletions appear under the same user and timestamp as the parent. Soft deletes (setting a deleted_at timestamp rather than removing the row) are captured as UPDATE operations, which is correct. Hard deletes are captured as DELETE operations with the full old row state preserved in the audit log, providing a complete record even after the data itself is gone. This is a critical compliance property: the audit log must survive the deletion of the data it describes.

Performance impact is the primary operational concern. Each audited mutation now performs an additional INSERT into the audit_log table. For write-heavy tables, this can double the effective write load. We mitigate this in two ways. First, the audit_log table is partitioned by month using PostgreSQL's native declarative partitioning, which keeps each partition at a manageable size and allows old partitions to be archived or detached without affecting current writes. Second, the JSONB columns are stored with TOAST compression, which is automatic for values exceeding roughly 2 KB. For tables with many columns or large text fields, TOAST compression reduces the storage overhead of the audit log significantly. Indexing the audit log requires careful thought: a GIN index on the JSONB columns enables flexible querying but adds write overhead. We typically recommend a B-tree index on (table_name, created_at) for time-range queries and defer GIN indexes to a read replica used for audit review and compliance reporting.
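A sketch of the partitioned layout (partition names are illustrative, the column list is abbreviated, and DETACH ... CONCURRENTLY requires PostgreSQL 14 or later):

```sql
-- Partitioning must be declared when the table is created; the
-- partition key (created_at) has to appear in the table definition.
CREATE TABLE audit_log (
    created_at  timestamptz NOT NULL DEFAULT now(),
    table_name  text        NOT NULL,
    operation   text        NOT NULL,
    old_row     jsonb,
    new_row     jsonb,
    changed_by  text
) PARTITION BY RANGE (created_at);

-- One partition per month, created ahead of time by a scheduled job
-- or a tool such as pg_partman.
CREATE TABLE audit_log_2024_07 PARTITION OF audit_log
    FOR VALUES FROM ('2024-07-01') TO ('2024-08-01');

-- B-tree index for time-range review queries; GIN indexes on the
-- jsonb columns are deferred to the reporting replica.
CREATE INDEX ON audit_log (table_name, created_at);

-- Archival: detach an old partition without blocking current writes.
ALTER TABLE audit_log DETACH PARTITION audit_log_2024_07 CONCURRENTLY;
```

An index created on the partitioned parent is propagated to every existing and future partition automatically, so the indexing policy only needs to be stated once.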

The final consideration is tamper resistance. An audit log stored in the same database as the data it monitors is vulnerable to an attacker or insider who has superuser access: they can delete or modify audit records to cover their tracks. For environments where this threat is in scope, we recommend streaming audit log entries to an external append-only store—a separate PostgreSQL instance with revoked DELETE and UPDATE privileges on the audit table, an object storage bucket with versioning and object lock enabled, or a dedicated SIEM system. The database trigger captures the data; the replication mechanism ensures it survives compromise of the primary database. This layered approach provides the immediacy and simplicity of trigger-based capture with the tamper resistance of external storage, meeting the requirements of even the most stringent compliance frameworks without imposing undue complexity on the application development team.
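On the separate PostgreSQL instance, the append-only property can be approximated with ordinary privileges (role names here are hypothetical):

```sql
-- Ingestion role: may add audit records but never alter or remove them.
CREATE ROLE audit_writer LOGIN;
GRANT INSERT ON audit_log TO audit_writer;

-- Review role: read-only access for compliance reporting.
CREATE ROLE audit_reader LOGIN;
GRANT SELECT ON audit_log TO audit_reader;
```

Note the limits of this control: the table owner and any superuser on the audit instance can still grant themselves UPDATE and DELETE, which is why object-locked storage or a SIEM remains the stronger option when insiders with administrative access to both systems are in scope.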