Postgres advisory locks
Jul 14, 2016 · 3 minute read
Postgres employs various types of locks to control shared access to data. Most of the time, Postgres takes care of choosing the correct lock on its own, but the locking API is public and application developers can leverage it as well.
There is, however, one type of lock not used by Postgres itself: advisory locks. They are intended only for applications and Postgres does not enforce their use — it only provides functions to manipulate them and automatically cleans them up at the end of the session.
Advisory locks come in handy when you need to lock more than a single row but less than the whole table. Our use in Enectiva is related to UPSERTing: whenever we get a new reading, we need to update derived consumption data. This involves updating a few rows and adding several others. Postgres does not support an UPSERT operation, so we need to mimic it with a DELETE and an INSERT wrapped in a transaction.
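To make this concrete, here is a minimal sketch of such a mimicked UPSERT; the table and column names (consumptions, meter_id, period_start, value) are made up for illustration and are not our actual schema.

```sql
-- Hypothetical schema: consumptions(meter_id, period_start, value)
-- with PRIMARY KEY (meter_id, period_start).
BEGIN;

DELETE FROM consumptions
 WHERE meter_id = 42
   AND period_start = '2016-07-01';

INSERT INTO consumptions (meter_id, period_start, value)
VALUES (42, '2016-07-01', 17.3);

COMMIT;
```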
That’s all well and good until two workers try to update the same interval of the same meter at the same time. In the simplest case, there will be at least one record that both workers want to update (= delete + insert). If the operation were an actual UPDATE, the last one would simply win, but because we have to run a DELETE + INSERT, a slight semantic difference complicates things. The application sees the DELETE + INSERT as operating on one logical record (one primary key value), but Postgres sees two different records which happen to have the same primary key value.
At the beginning, the first worker automatically locks the record it wants to delete, and the second worker has to wait. The lock is released when the first worker's transaction commits. The second worker's DELETE then proceeds and succeeds, because the record is not there anymore — but in the meantime the first worker has replaced it with a new record. When the second worker tries to insert its version, it runs into the uniqueness constraint of the primary key.
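Spelled out with the same hypothetical table as above, the interleaving looks roughly like this:

```sql
-- A: and B: mark the two workers; time flows top to bottom.
-- A: BEGIN;
-- A: DELETE FROM consumptions WHERE meter_id = 42 AND period_start = '2016-07-01';
--    (A now holds the row lock)
-- B: BEGIN;
-- B: DELETE FROM consumptions WHERE meter_id = 42 AND period_start = '2016-07-01';
--    (B blocks, waiting for A's row lock)
-- A: INSERT INTO consumptions VALUES (42, '2016-07-01', 17.3);
-- A: COMMIT;
--    (B's DELETE resumes and deletes zero rows: the old row is gone
--     and A's new row was not visible to B's statement)
-- B: INSERT INTO consumptions VALUES (42, '2016-07-01', 18.1);
--    ERROR: duplicate key value violates unique constraint "consumptions_pkey"
```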
Advisory locks FTW. We could lock the whole table during an update, but that would drastically reduce the throughput — one meter at a time. An advisory lock allows us to lock only the rows of a single meter — one operation on each meter at a time. It is not a perfect solution, but it is an acceptable one (and could be improved).
Every transaction operating on the consumption data needs to acquire the corresponding advisory lock. Postgres takes care of managing the locks, but it assumes that every application actually tries to acquire them; enforcing the locks is the applications' responsibility. We simply derive the lock ID from the meter ID, because all our queries operate on a single meter. Each worker that needs to touch that meter's data asks Postgres for the corresponding lock. If the lock is denied, the worker retries; otherwise it takes the lock, does its work, and then releases the lock.
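Assuming the meter ID is an integer that can serve directly as the lock key (the 42 below is just an example), the try-and-retry variant might look like this:

```sql
SELECT pg_try_advisory_lock(42);  -- true if the lock for meter 42 was granted
-- if false: back off and retry; if true: run the DELETE + INSERT, then
SELECT pg_advisory_unlock(42);    -- hand the lock back explicitly
```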
Postgres provides a set of functions for manipulating advisory locks. We have found that transaction-scoped locks are the easiest to use, because Postgres automatically releases them when the transaction ends, thus avoiding any zombie locks. Apart from this advantage, advisory locks are also faster than writing to some sort of custom lock table (and much less error-prone).
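A sketch of what the transaction-scoped variant might look like, using pg_advisory_xact_lock and the same hypothetical table as above:

```sql
BEGIN;

-- Blocks until the lock for meter 42 is free; no explicit unlock needed.
SELECT pg_advisory_xact_lock(42);

DELETE FROM consumptions
 WHERE meter_id = 42
   AND period_start = '2016-07-01';

INSERT INTO consumptions (meter_id, period_start, value)
VALUES (42, '2016-07-01', 17.3);

COMMIT;  -- the advisory lock is released automatically here
```

Because the lock is tied to the transaction, there is nothing to clean up even if the transaction aborts halfway through.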
For use in Ruby, there is a gem wrapping advisory locks. It has the apt name with_advisory_lock. It provides a shim for MySQL as well, but if you’re content with Postgres-only functionality you might want to implement the thirty lines of code yourself.
We're looking for developers to help us save energy
If you're interested in what we do and you would like to help us save energy, drop us a line at jobs@enectiva.cz.