Database Migration Lock - Quick Troubleshooting Guide
Quick Fix (When Tests Hang)
Option 1: Clean Database Restart (Recommended)
# Stop all database containers
docker-compose -f docker-compose.test.yml down -v
# Wait 5 seconds
sleep 5
# Start fresh
docker-compose -f docker-compose.test.yml up -d test-db
# Wait for database to be ready
sleep 15
# Run tests
npm test
Option 2: Manual Lock Release
If you don't want to restart the container:
# Release the lock directly
PGPASSWORD=testpassword psql -h 127.0.0.1 -p 5433 -U testuser -d testdb \
-c "DELETE FROM knex_migrations_lock;"
# Verify lock is released
PGPASSWORD=testpassword psql -h 127.0.0.1 -p 5433 -U testuser -d testdb \
-c "SELECT * FROM knex_migrations_lock;"
# Run tests
npm test
Option 3: Run Tests Sequentially
If locks keep occurring:
# Run tests without parallelization
npm test -- --threads false
# Or limit to single thread
npm test -- --threads 1
Understanding the Lock
What is knex_migrations_lock?
PostgreSQL table that prevents concurrent migrations:
-- This table exists in your database
SELECT * FROM knex_migrations_lock;
-- Output (when locked):
-- is_locked | locked_by
-- -----------+-----------
-- true | worker-1
Why It Happens
- Multiple test runners start simultaneously
- All try to run migrations at the same time
- First one acquires lock, others wait
- If first one crashes, lock stays held (stale lock)
- All subsequent tests hang waiting for lock
Timeline Example
10:00:00 - Test session 1 starts, acquires migration lock ✓
10:00:01 - Test session 2 starts, waits for lock...
10:00:02 - Test session 1 crashes without releasing lock
10:00:03 - Test session 2 still waiting... (will timeout after 10s)
10:00:13 - Test session 2 fails with "Migration table is already locked"
Prevention Strategies
1. Run Tests Serially (Most Reliable)
Edit backend/vitest.config.js:
export default defineConfig({
test: {
// Disable threading to prevent concurrent migration attempts
threads: false,
// Or limit to single worker
workers: 1,
globals: true,
environment: 'node',
setupFiles: './vitest.setup.js',
globalSetup: './global-setup.js',
testTimeout: 10000,
},
});
Trade-off: Tests run slower but never have lock conflicts
2. Increase Retry Timeout (Current Approach)
Edit backend/global-setup.js:
const MAX_RETRIES = 10; // Increase from 5
const RETRY_DELAY_MS = 2000;
for (let i = 0; i < MAX_RETRIES; i++) {
try {
// Force free any stale locks
await dbInstance.migrate.forceFreeMigrationsLock();
await dbInstance.migrate.latest();
break;
} catch (error) {
if (i === MAX_RETRIES - 1) {
console.error('Failed after max retries');
throw error;
}
await new Promise(r => setTimeout(r, RETRY_DELAY_MS));
}
}
Total wait time: 10 retries × 2 seconds = 20 seconds max
Trade-off: Tests wait longer but usually succeed
3. Use Separate Test Databases
For parallel test workers:
# Instead of single testdb for all workers
# Create testdb_worker_0, testdb_worker_1, testdb_worker_2, etc.
Implementation: Use environment variable for worker ID:
// vitest.setup.js
const workerId = process.env.VITEST_WORKER_ID || '0';
process.env.DB_NAME = `testdb_worker_${workerId}`;
Trade-off: Complex setup, much faster parallel execution
CI/CD Configuration
GitHub Actions Example
name: Tests
on: [push, pull_request]
jobs:
test:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:15
env:
POSTGRES_PASSWORD: testpassword
POSTGRES_DB: testdb
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/checkout@v3
- uses: actions/setup-node@v3
with:
node-version: '18'
- run: npm install
# Run tests serially in CI to avoid lock contention
- run: npm test -- --threads false
env:
DB_HOST: localhost
DB_PORT: 5432
DB_USER: postgres
DB_PASSWORD: testpassword
DB_NAME: testdb
Debugging
Check Current Lock Status
# Connect to test database
PGPASSWORD=testpassword psql -h 127.0.0.1 -p 5433 -U testuser -d testdb
# Inside psql:
-- See all processes
SELECT pid, usename, state, query FROM pg_stat_activity;
-- Find blocking locks
SELECT blocking_locks.pid, blocked_locks.pid
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_locks blocking_locks ON blocking_locks.locktype = blocked_locks.locktype
AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
AND blocking_locks.pid != blocked_locks.pid
WHERE NOT blocked_locks.granted;
-- Check migration lock table
SELECT * FROM knex_migrations_lock;
Kill Hanging Processes
# From psql terminal
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'active' AND query LIKE '%migration%';
Production vs. Test Differences
| Aspect | Test | Production |
|---|---|---|
| Parallel runs | Yes (multiple test workers) | No (single server) |
| Frequent restarts | Yes (after each test suite) | No (days/months) |
| Migration frequency | Every test run | On deployment only |
| Lock timeout | 10 seconds (tests) | 60 seconds (deployments) |
| Recovery needed | Automatic on restart | Manual unlock |
Long-Term Solution: Separate Migration Service
For production, consider separate deployment step:
# 1. Run migrations (once, before app starts)
npm run migrate:latest
# 2. Start N application instances (no migrations)
npm run start:server
npm run start:server
npm run start:server
This prevents ANY lock contention because migrations are pre-done.
References
- Knex.js Migrations: https://knexjs.org/guide/migrations.html
- PostgreSQL Advisory Locks: https://www.postgresql.org/docs/current/functions-admin.html
- Vitest Parallel Execution: https://vitest.dev/config/#threads