AuditCore
high | CWE-362 | OWASP A04:2021 (Insecure Design) | TOCTOU: Time-of-Check Time-of-Use

How to fix Race Conditions in Web APIs

Send 20 simultaneous POSTs to /coupons/redeem and the user gets the discount 20 times. Send 5 concurrent /transfer requests for $100 against a $200 balance and the user transfers $500. Race conditions are functional bugs that become security bugs when money or inventory is at stake. The fix: database transactions, locks, and idempotency.

What are Race Conditions in Web APIs?

A race condition occurs when two or more concurrent operations interleave in a way the developer didn't anticipate. The classic form is Time-of-Check Time-of-Use (TOCTOU): `if (balance >= amount) { balance -= amount; }`. Two concurrent requests both pass the check (both see balance=$100), then both deduct ($100 - $100 - $100 = -$100). The server has allowed a double withdrawal.

In web apps this happens because most runtimes handle requests concurrently: Node.js is single-threaded but async (requests interleave at every `await`), Python typically runs multiple workers or threads, and Go uses goroutines. Any database work split between a SELECT and a later UPDATE is vulnerable. So are filesystem `if exists then create` patterns, in-memory deduplication, and cache-then-write flows.
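The interleaving can be reproduced in a few lines. The following is a minimal sketch, assuming an in-memory `balance` variable in place of a database row and `asyncio.sleep(0)` as a stand-in for an awaited query; the function name `withdraw` is illustrative only.

```python
import asyncio

balance = 100  # shared state, standing in for a row read from the database


async def withdraw(amount: int) -> bool:
    global balance
    if balance >= amount:          # time of check
        await asyncio.sleep(0)     # stand-in for an awaited DB or network call
        balance -= amount          # time of use
        return True
    return False


async def main() -> list:
    # Two requests arrive at (almost) the same moment
    return await asyncio.gather(withdraw(100), withdraw(100))


results = asyncio.run(main())
print(results, balance)  # [True, True] -100
```

Both calls pass the check before either one deducts, so a single thread is enough to overdraw the account.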

Race conditions are typically classified as functional bugs but become serious vulnerabilities when they affect: balance/payments (double-spend), inventory (oversell), authorization (privilege change), uniqueness constraints (duplicate accounts), rate limiting (brute force amplification). PortSwigger's research papers on 'limit overrun' attacks document dozens of real-world cases.

What an attacker can do

The concrete impact of leaving Race Conditions in Web APIs unpatched.

Double-spend / multi-redemption

Same coupon used N times, same balance withdrawn N times, same gift card redeemed N times.

Inventory oversell

Last item sold to multiple users; refunds + customer support cost.

Rate-limit bypass

Concurrent requests beat per-user counters; brute force, scraping, abuse all amplified.

Authorization bypass during state transitions

A user's role changes mid-request (for example, access is revoked): the permission check passed before the change, so the sensitive operation completes under stale permissions.

How do I know if I'm vulnerable?

Manual: identify check-then-act patterns. `if (balance >= x)` followed by deduction. `if (existing == null)` followed by insert. `if (count < limit)` followed by increment. Each is suspect — does the check + act happen atomically?

Automated: AuditCore's Race Condition Tester sends 20 concurrent identical requests to a target endpoint and checks for: more state changes than expected (multiple charges), uniqueness violations (multiple users with same email), or limit overruns. Findings show the race window in milliseconds.

How to fix Race Conditions in Web APIs

5 ordered steps. Apply them in order — each builds on the previous.

  1. Use database transactions with appropriate isolation level

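    Wrap check-then-act in a transaction. The default isolation level (READ COMMITTED in PostgreSQL, REPEATABLE READ in MySQL InnoDB) often isn't enough on its own: use SERIALIZABLE or row-level locking.

    PostgreSQL: `BEGIN ISOLATION LEVEL SERIALIZABLE` or `SELECT ... FOR UPDATE`. MySQL: `SELECT ... FOR UPDATE`. With `FOR UPDATE`, a concurrent transaction blocks until the lock holder commits or rolls back. With SERIALIZABLE, a conflicting transaction fails with a serialization error (SQLSTATE 40001 in PostgreSQL) that you can catch and retry.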

    Python (step 1):
    from sqlalchemy.orm import Session

    class InsufficientFunds(Exception):
        """Raised when the source account cannot cover the transfer."""

    def transfer(session: Session, from_id: int, to_id: int, amount: int):
        # Account is assumed to be your mapped model with `id` and `balance` columns
        if amount <= 0:
            raise ValueError("amount must be positive")
        with session.begin():
            # Lock rows in a consistent order (lowest id first) to avoid deadlocks
            # when two transfers touch the same accounts in opposite directions
            locked = {}
            for acct_id in sorted({from_id, to_id}):
                # SELECT ... FOR UPDATE locks the row until commit or rollback
                locked[acct_id] = (
                    session.query(Account).filter_by(id=acct_id).with_for_update().one()
                )
            from_acct, to_acct = locked[from_id], locked[to_id]

            if from_acct.balance < amount:
                raise InsufficientFunds()

            from_acct.balance -= amount
            to_acct.balance += amount
            # commit on context exit; concurrent transfers serialize on the same rows
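If you choose SERIALIZABLE rather than row locks, the retry mentioned above might look like the sketch below. It assumes the driver exposes the SQLSTATE as `pgcode` (as psycopg2 does) or `sqlstate` (as psycopg 3 does), possibly wrapped in an exception with an `.orig` attribute (as SQLAlchemy does); `run_with_retry` and `sqlstate_of` are hypothetical helper names, not part of any library.

```python
import random
import time

SERIALIZATION_FAILURE = "40001"  # PostgreSQL SQLSTATE for serialization_failure


def sqlstate_of(exc):
    """Best-effort SQLSTATE lookup; the attribute names are assumptions about the driver."""
    for candidate in (exc, getattr(exc, "orig", None)):
        for attr in ("pgcode", "sqlstate"):
            code = getattr(candidate, attr, None)
            if code:
                return code
    return None


def run_with_retry(txn_fn, attempts=5, base_delay=0.01):
    """Call txn_fn() until it succeeds or the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return txn_fn()
        except Exception as exc:
            # Only serialization failures are retried; everything else propagates
            if sqlstate_of(exc) != SERIALIZATION_FAILURE or attempt == attempts:
                raise
            # Exponential backoff with jitter so the retries do not collide again
            time.sleep(base_delay * (2 ** attempt) * random.random())
```

`txn_fn` should open and commit its own transaction, so that each retry starts from a fresh snapshot.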
  2. Use atomic database operations (single-statement updates)

    Avoid SELECT-then-UPDATE entirely when possible. Use `UPDATE ... WHERE balance >= amount` — atomic at the database level.

    Affected-rows count tells you whether the update applied. 0 rows = condition failed.

    SQL (step 2):
    -- ❌ Race-prone: SELECT then UPDATE
    SELECT balance FROM accounts WHERE id=42;
    -- check in app code
    UPDATE accounts SET balance = balance - 100 WHERE id=42;
    
    -- ✅ Atomic: condition in WHERE clause
    UPDATE accounts
    SET balance = balance - 100
    WHERE id = 42 AND balance >= 100;
    -- if affected_rows = 0, balance was insufficient; bail out
    
    -- ✅ Inventory decrement
    UPDATE products
    SET stock = stock - 1
    WHERE id = 99 AND stock > 0;
    -- 0 rows = sold out
    
    -- ✅ Coupon single-use
    UPDATE coupons
    SET used = used + 1, last_used_by = $user_id
    WHERE code = $code AND used < max_uses;
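To show how application code reads the affected-rows count, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `accounts` table mirrors the SQL above; the `debit` helper is a hypothetical name, and other drivers expose the row count in their own way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts (id, balance) VALUES (42, 100)")
conn.commit()


def debit(conn, account_id, amount):
    """Return True if the debit applied, False if the balance was insufficient."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    with conn:  # commits on success, rolls back on exception
        cur = conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?",
            (amount, account_id, amount),
        )
    # rowcount is the number of rows the UPDATE actually changed
    return cur.rowcount == 1


print(debit(conn, 42, 100))  # True: balance goes from 100 to 0
print(debit(conn, 42, 100))  # False: the WHERE condition no longer holds
```

The check and the write are one statement, so there is no window between them for a second request to slip into.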
  3. Use unique constraints for uniqueness invariants

    If you need 'one user per email', let the database enforce it via UNIQUE INDEX. Catch the integrity error in app code.

    Don't use `if not exists then insert`: that's the canonical race. A database-level constraint holds no matter how requests interleave.

    JavaScript (step 3):
    // ❌ Race window between SELECT and INSERT
    const existing = await User.findOne({ email });
    if (existing) throw new Error("Email taken");
    await User.create({ email, name });
    
    // ✅ Let DB enforce uniqueness
    try {
      await User.create({ email, name });
    } catch (err) {
      if (err.code === 11000 /* mongo dup key */ || err.code === '23505' /* pg */) {
        throw new Error("Email taken");
      }
      throw err;
    }
    
    // SQL: ensure UNIQUE INDEX on email column
    // CREATE UNIQUE INDEX idx_users_email ON users (LOWER(email));
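The same pattern can be tried end to end without a server. This is a minimal sketch using Python's built-in `sqlite3` module; the `users` table, the `register` helper, and the `EmailTaken` exception are illustrative names, and the index matches the `LOWER(email)` index shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL, name TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (LOWER(email))")


class EmailTaken(Exception):
    """Raised when the unique index rejects a duplicate email."""


def register(conn, email, name):
    try:
        with conn:  # commits on success, rolls back on exception
            cur = conn.execute(
                "INSERT INTO users (email, name) VALUES (?, ?)", (email, name)
            )
        return cur.lastrowid
    except sqlite3.IntegrityError as exc:
        # The database, not a prior SELECT, decides whether the email is free
        raise EmailTaken(email) from exc


user_id = register(conn, "ada@example.com", "Ada")
print("created user", user_id)
```

A second `register(conn, "ADA@example.com", ...)` raises `EmailTaken`, because the index compares the lowercased address.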
  4. Use Redis SETNX or distributed locks for cross-process coordination

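    When the resource is shared across multiple app servers (not just DB rows), use a distributed lock: Redis or PostgreSQL advisory locks.

    A single-instance Redis lock (`SET key value NX PX ttl`, the atomic successor to SETNX followed by EXPIRE) is simple. Redlock spreads the lock across several independent Redis nodes so that it can survive the loss of one of them. In both cases pick a TTL comfortably longer than the critical section: if the TTL expires while the work is still running, a second caller can acquire the lock.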

    JavaScript (step 4):
    import { randomUUID } from "node:crypto";
    import Redis from "ioredis";
    const redis = new Redis();
    
    async function withLock(key, ttlMs, fn) {
      const lockKey = `lock:${key}`;
      const value = randomUUID(); // unique token so only the owner can release
      const acquired = await redis.set(lockKey, value, "PX", ttlMs, "NX");
      if (!acquired) throw new Error("Resource busy");
      try {
        return await fn();
      } finally {
        // Release only if we still own it (TTL didn't expire)
        const script = `
          if redis.call("GET", KEYS[1]) == ARGV[1] then
            return redis.call("DEL", KEYS[1])
          end
          return 0
        `;
        await redis.eval(script, 1, lockKey, value);
      }
    }
    
    // Usage:
    await withLock(`coupon:${couponCode}`, 5000, async () => {
      // single redemption serialized across all servers
    });
  5. Combine with idempotency keys for client-side resilience

    Clients may retry after network failures. An idempotency key ensures the retry is a no-op, not a duplicate operation.

    See /fix/business-logic-abuse for full idempotency middleware. With idempotency + atomic DB ops, double-submit becomes harmless.

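    JavaScript (step 5):
    // Combined defense: idempotency middleware + atomic SQL
    // Assumes an Express `app`, a node-postgres style `db`, and the
    // requireIdempotencyKey middleware from /fix/business-logic-abuse
    app.post("/charge",
      requireIdempotencyKey,         // first defense: dedupe at HTTP layer
      async (req, res) => {
        // Reject zero or negative amounts: a negative value would credit the account.
        // Amounts are assumed to be integers (e.g., cents).
        const amount = Number(req.body.amount);
        if (!Number.isInteger(amount) || amount <= 0) {
          return res.status(400).json({ error: "Invalid amount" });
        }
    
        // Second defense: check and deduct in one atomic statement
        const result = await db.query(`
          UPDATE accounts
          SET balance = balance - $1
          WHERE id = $2 AND balance >= $1
          RETURNING balance
        `, [amount, req.user.account_id]);
    
        if (result.rowCount === 0) {
          return res.status(400).json({ error: "Insufficient funds" });
        }
        return res.json({ newBalance: result.rows[0].balance });
      }
    );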
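For readers who want to see the idempotency half without the middleware, the sketch below stores the key in the same transaction as the atomic debit. It uses Python's built-in `sqlite3` module; the `idempotency_keys` table, its columns, and the `charge` helper are assumptions made for illustration, not the middleware referenced above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY, new_balance INTEGER);
    INSERT INTO accounts (id, balance) VALUES (1, 200);
""")


def charge(conn, idempotency_key, account_id, amount):
    """Return the new balance, or None when funds are insufficient."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    try:
        with conn:  # one transaction: key claim and debit commit or roll back together
            # Claiming the key first makes a duplicate request fail on the primary key
            conn.execute(
                "INSERT INTO idempotency_keys (key) VALUES (?)", (idempotency_key,)
            )
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?",
                (amount, account_id, amount),
            )
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (account_id,)
            ).fetchone()
            new_balance = row[0] if cur.rowcount == 1 else None
            conn.execute(
                "UPDATE idempotency_keys SET new_balance = ? WHERE key = ?",
                (new_balance, idempotency_key),
            )
            return new_balance
    except sqlite3.IntegrityError:
        # A retry of a request already processed: replay the stored result
        row = conn.execute(
            "SELECT new_balance FROM idempotency_keys WHERE key = ?", (idempotency_key,)
        ).fetchone()
        return row[0]


print(charge(conn, "req-123", 1, 100))  # 100
print(charge(conn, "req-123", 1, 100))  # 100 again: replayed, not charged twice
```

Because the key and the debit share a transaction, a crash between them cannot leave a claimed key without its charge, or a charge without its key.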

How to verify the fix

Run AuditCore — the Race Condition Tester sends 20 concurrent requests to flagged endpoints. Findings include the actual count of state changes vs expected.

Manual: identify a critical endpoint (charge, redeem, transfer). Use Burp Turbo Intruder or `xargs -P20 curl ...` to send 20 parallel requests. Check the database — does it reflect 20 changes (race vulnerable) or 1 (safe)? Look for 'last item' inventory bugs by ordering the same item from 2 browser tabs simultaneously.
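If you prefer a script to Burp or `xargs`, the manual test can be sketched with the Python standard library. The harness below releases all threads at the same instant with a barrier; the URL, port, and JSON payload in `redeem_once` are hypothetical placeholders for your own staging endpoint, and both function names are illustrative.

```python
import threading
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fire_concurrently(request_fn, n=20):
    """Run request_fn() in n threads released at the same instant; return all results."""
    barrier = threading.Barrier(n)

    def worker():
        barrier.wait()  # hold every thread here until all n are ready
        try:
            return request_fn()
        except Exception as exc:  # record failures (e.g. HTTP 4xx) instead of losing them
            return exc

    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(worker) for _ in range(n)]
        return [f.result() for f in futures]


def redeem_once(url="http://localhost:3000/coupons/redeem",
                body=b'{"code": "WELCOME10"}'):
    # Hypothetical staging endpoint and payload; replace with your own
    req = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

Calling `fire_concurrently(redeem_once, 20)` against a test environment returns 20 results; more than one success status for a single-use coupon suggests the endpoint is still racy. Confirm against the database as described above.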

Frequently asked questions

Are race conditions a security bug or just a functional bug?

Both. They're functional bugs in design but become security vulnerabilities when they affect money, inventory, authorization, or rate limits. PortSwigger's 'limit overrun' research documents how attackers turn race conditions into real-world exploits — coupon abuse, NFT mint bypass, balance withdrawal.

Doesn't Node.js single-threaded async prevent races?

It is single-threaded but asynchronous. While `await db.query()` is pending, the event loop processes another request. Both requests see the same DB state, both pass the check, and both update. It is the same problem as a multi-threaded race; the interleaving just happens at `await` points.

Is SERIALIZABLE isolation always safe?

Safe but expensive. Concurrent serializable transactions fail with a serialization error on conflict, so your code must catch it and retry. For most cases, row-level locks (`FOR UPDATE`) are simpler and equally safe for the locked rows. They do not cover rows that don't exist yet, so pair them with unique constraints for insert races.

What about rate limiting: can attackers bypass it with concurrent requests?

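Yes, if the rate limit is implemented as 'read the counter, check it against the limit, then increment'. An atomic INCR (Redis) avoids this because every request gets a distinct count back. A sliding window built on a sorted set (ZADD plus ZREMRANGEBYSCORE and ZCARD, run atomically in a Lua script or MULTI block) is even better.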
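A fixed-window limiter built on atomic INCR might look like the sketch below. It assumes a redis-py style client passed in as `client` (only its `incr` and `expire` methods are used); the key prefix, the limits, and the `allow_request` name are illustrative.

```python
def allow_request(client, user_id, limit=5, window_seconds=60):
    """Fixed-window limiter: increment first, then compare the returned count."""
    key = f"ratelimit:{user_id}"
    # INCR is atomic in Redis, so concurrent callers each receive a distinct count
    count = client.incr(key)
    if count == 1:
        # First hit in this window: start the expiry clock
        client.expire(key, window_seconds)
    return count <= limit
```

With redis-py this would be called as `allow_request(redis.Redis(), "user-1")`. One caveat of this sketch: if the process dies between `incr` and `expire`, the key never expires, which is one reason to move both calls into a Lua script.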

Can I use eventual consistency (DynamoDB) safely?

For non-critical data (counters, popularity scores), yes. For money/auth/uniqueness, no — eventual consistency means concurrent reads see different states. Use strongly-consistent reads + conditional writes, or a strongly-consistent DB for these operations.
