CREATE INDEX CONCURRENTLY and what locks it requires
When managing databases, especially in production environments, adding indexes
is a commonly performed task to improve query performance. A standard
CREATE INDEX
command in PostgreSQL locks the table against any kind of
updates (INSERT
, UPDATE
, DELETE
) for the duration of the index creation.
This can lead to significant downtime for applications relying on updating
that table.
PostgreSQL offers an alternative - CREATE INDEX CONCURRENTLY
. This command
allows you to build an index without blocking write operations on the table,
making it the only option for systems where writes are constant throughout the
day.
Understanding CREATE INDEX CONCURRENTLY
The magic behind CREATE INDEX CONCURRENTLY
is in its multi-phase approach:
- Initial Scan and Build: It performs an initial scan of the table and
starts building the index structure. During this phase, it only requires a
ShareUpdateExclusiveLock
on the table. This lock mode blocks schema changes (likeALTER TABLE
) andVACUUM FULL
, but it allowsINSERT
,UPDATE
andDELETE
operations to continue normally. - Waiting for Transactions: After the initial build, it waits for all transactions that started before this phase began to complete. This ensures that the index includes changes made by those transactions.
- Second Scan: It performs a second scan of the table to incorporate changes made by transactions that occurred during the initial build phase.
- Finalization: Briefly locks the table to ensure it’s using the latest updates, after which the index is marked as ready for use.
Lock Requirements
While CREATE INDEX CONCURRENTLY
avoids the heavy AccessExclusiveLock
used
by the standard CREATE INDEX
, however, it still needs an exclusive lock in the
finalization stage:
ShareUpdateExclusiveLock
(Mode 4): This is held on the table being indexed for most of the operation. It blocks schema changes andVACUUM FULL
but allows reads and writes (SELECT
,INSERT
,UPDATE
,DELETE
).- Brief
AccessExclusiveLock
(Mode 8): Towards the very end, when finalizing the index and making it visible in the system catalogs, it needs to acquire a briefAccessExclusiveLock
. This lock does block all other operations - but this lock is typically held for a very short duration, without noticeable effect on the performance of the system.
Why Use It?
The primary benefit is minimizing downtime. By allowing writes during the
lengthy index build process, CREATE INDEX CONCURRENTLY
is essential for
adding indexes to busy tables in production environments without disrupting
application availability or performance.
Trade-offs
It also requires two transactions. If either transaction fails, the index might be
left in an “invalid” state, requiring cleanup (DROP INDEX
) before retrying.
- Slower: Building an index concurrently takes significantly longer than a
standard
CREATE INDEX
due to the extra scans and waiting phases. For hot tables, there are likely going to be multiple passes before the index is close enough to the latest transaction and being able to enter its finalization stage. - Higher CPU/IO Load: The process consumes more system resources over a longer period.
- Cannot Run Inside a Transaction Block:
CREATE INDEX CONCURRENTLY
must be run outside of an explicit transaction (BEGIN
/COMMIT
). - Potential Failure: If something goes wrong (e.g., unique constraint violation during the second scan, or the final locking phase times out), it can leave behind an invalid index that needs manual cleanup.
- Blocked by long running transactions: If you have longer running transactions, those will slow down the creation of the index significantly, as the index has to wait for all previously open transactions to wrap. This is problematic if the table targetted by the index creation is particularly hot and requires multiple scans to catch up.
Conclusion
CREATE INDEX CONCURRENTLY
is a powerful tool in PostgreSQL for maintaining
performance without sacrificing availability. While it requires careful
consideration due to its longer duration, higher resource usage, and inability
to run within a transaction, its ability to avoid blocking writes makes it the
preferred method for adding indexes to live production tables. Understanding
its locking behaviour, as well as what transactions are blocking the index
creation means you can make an informed decision between CREATE INDEX
and
CREATE INDEX CONCURRENTLY
.