
WebSocket Gap Detection in Crypto: A Complete Guide

Learn how to detect WebSocket data gaps in crypto trading APIs, handle missed updates, and build resilient real-time order book pipelines with Python examples.

Uncle Solieditor · voc · 06.05.2026
◈   Contents
  1. → What Is a WebSocket Gap and Why It Matters
  2. → Common Causes of WebSocket Data Gaps
  3. → Detecting Gaps with Binance Update ID Tracking
  4. → Reconnection with Exponential Backoff
  5. → Full Order Book Sync with Gap Recovery
  6. → Gap Detection Patterns Across Exchanges
  7. → Frequently Asked Questions
  8. → Conclusion

If you've ever run a trading bot and noticed your order book slowly drifting out of sync — showing prices that don't match reality — you've already experienced a WebSocket gap. It's one of those silent killers in algorithmic trading: no error thrown, no crash, just stale data feeding decisions that should never have been made. Understanding how to detect and recover from these gaps is what separates a reliable trading system from one that occasionally blows up in subtle ways.

What Is a WebSocket Gap and Why It Matters

A WebSocket gap occurs when your client misses one or more sequential update messages from the exchange. In order book streaming — the most common use case — every update carries sequence identifiers. Binance, for example, tags each depth update with firstUpdateId (U) and lastUpdateId (u). If the U of the next message doesn't equal the previous u + 1, you've got a gap: a span of order book changes your system never processed.

This matters because crypto order books are built incrementally. You start with a REST snapshot, then apply a stream of delta updates. Miss a delta and your local book diverges from reality. On a volatile pair like BTC/USDT or ETH/USDT, a few missed updates during a large candle can mean your bid/ask spread calculations are completely wrong, your liquidation estimates are off, or your arbitrage logic is working against a phantom book.
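To make the divergence concrete, here is a minimal sketch using invented prices and quantities (not real market data). The helper name apply_delta is hypothetical; it assumes Binance-style [price, quantity] string pairs where a quantity of zero removes the level.

```python
def apply_delta(book, levels):
    """Apply [price, qty] pairs to one side of a book; qty "0" removes the level."""
    for price, qty in levels:
        if float(qty) == 0:
            book.pop(price, None)
        else:
            book[price] = qty
    return book

# Illustrative bid side taken from a snapshot (values are invented)
snapshot_bids = {"100.0": "2.0", "99.5": "1.0"}

deltas = [
    [["100.0", "0"]],    # delta 1: the 100.0 level is fully consumed
    [["99.0", "3.0"]],   # delta 2: a new level appears at 99.0
]

# Client A receives every delta
full = dict(snapshot_bids)
for d in deltas:
    apply_delta(full, d)

# Client B misses delta 1 and only applies delta 2
partial = dict(snapshot_bids)
apply_delta(partial, deltas[1])

best_full = max(full, key=float)
best_partial = max(partial, key=float)
print(best_full, best_partial)  # 99.5 100.0
```

Client B still believes there is a bid at 100.0 that no longer exists, and nothing in later deltas will remove it unless that exact price level happens to change again.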

Bybit and OKX use similar sequence-based mechanisms, though their field names differ. Bybit's WebSocket depth feed uses a seq field in its snapshot and delta messages. OKX uses checksum validation on top of sequence tracking, giving you an extra layer of integrity checking. The principle is universal: track the sequence, detect breaks, and resync when they occur.

Common Causes of WebSocket Data Gaps

The most dangerous gap isn't the one that crashes your bot — it's the one that silently corrupts your order book state. Always validate sequence continuity, not just connection health.

Detecting Gaps with Binance Update ID Tracking

The cleanest way to detect gaps is to track the lastUpdateId from each message and compare it against the firstUpdateId of the next. Here's a working Python implementation against Binance's spot depth stream:

import asyncio
import json
import websockets

class BinanceGapDetector:
    def __init__(self, symbol="BTCUSDT"):
        self.symbol = symbol.lower()
        self.last_update_id = None
        self.gap_count = 0
        self.ws_url = f"wss://stream.binance.com:9443/ws/{self.symbol}@depth"

    async def connect(self):
        async with websockets.connect(self.ws_url) as ws:
            print(f"Connected to Binance {self.symbol} depth stream")
            async for message in ws:
                await self.handle_message(json.loads(message))

    async def handle_message(self, data):
        first_id = data.get("U")  # firstUpdateId
        last_id = data.get("u")   # lastUpdateId

        if self.last_update_id is None:
            self.last_update_id = last_id
            print(f"Initialized at update ID: {last_id}")
            return

        expected_first = self.last_update_id + 1
        if first_id != expected_first:
            missed = first_id - expected_first
            self.gap_count += 1
            print(f"[GAP #{self.gap_count}] Expected ID {expected_first}, got {first_id}")
            print(f"  Missed approximately {missed} update(s)")
            await self.on_gap_detected(first_id, expected_first)

        self.last_update_id = last_id

    async def on_gap_detected(self, received_id, expected_id):
        # Override this to trigger REST snapshot resync
        print(f"Action needed: resync order book snapshot")

async def main():
    detector = BinanceGapDetector("ETHUSDT")
    await detector.connect()

asyncio.run(main())

This gives you the foundation. The on_gap_detected method is your hook: in production, override it to fetch a fresh REST snapshot and rebuild your local book from that baseline. Note that Binance's documentation requires you to discard buffered updates where u is less than or equal to the snapshot's lastUpdateId, so your recovery logic needs to account for that ordering.
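The filtering step for buffered updates could look roughly like this. It is a sketch under assumptions: events were buffered in arrival order as dicts carrying the U and u fields, and the function name select_replayable and the sample IDs are hypothetical.

```python
def select_replayable(buffered, snapshot_last_update_id):
    """Return the buffered depth events to apply on top of a snapshot.

    Raises ValueError if the first usable event starts after the snapshot,
    which means the snapshot is too old and must be fetched again.
    """
    # Drop events that are fully covered by the snapshot
    usable = [e for e in buffered if e["u"] > snapshot_last_update_id]
    if usable and usable[0]["U"] > snapshot_last_update_id + 1:
        raise ValueError("snapshot is older than the first buffered event")
    return usable

buffered = [
    {"U": 95, "u": 99},    # entirely before the snapshot
    {"U": 100, "u": 104},  # straddles the snapshot's lastUpdateId
    {"U": 105, "u": 110},
]
replay = select_replayable(buffered, snapshot_last_update_id=102)
print([(e["U"], e["u"]) for e in replay])  # [(100, 104), (105, 110)]
```

The first surviving event is allowed to straddle the snapshot ID. What matters is that it covers lastUpdateId + 1, so no update between the snapshot and the stream is lost.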

Reconnection with Exponential Backoff

Gap detection without recovery logic is just logging. The real value is building automatic reconnection that fetches a fresh snapshot, replays any buffered updates, and resumes cleanly. Exponential backoff prevents hammering the exchange during an outage — something that will get your IP temporarily banned on Binance and Bybit if you're not careful.

import asyncio
import websockets
import json
from datetime import datetime, timezone

async def managed_websocket(url, message_handler, max_retries=10):
    """
    WebSocket connection manager with exponential backoff.
    Awaits message_handler(data) for each incoming message.
    Returns "RESYNC" if the handler reports a gap (caller should resync).
    """
    retries = 0
    base_delay = 1.0

    while retries < max_retries:
        try:
            async with websockets.connect(
                url,
                ping_interval=20,
                ping_timeout=10,
                close_timeout=5
            ) as ws:
                retries = 0  # Reset on successful connection
                # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
                ts = datetime.now(timezone.utc).isoformat()
                print(f"[{ts}] WebSocket connected")

                async for raw_message in ws:
                    data = json.loads(raw_message)
                    result = await message_handler(data)
                    if result == "RESYNC":
                        print("Gap detected — triggering resync and reconnect")
                        return "RESYNC"

        except websockets.ConnectionClosedError as e:
            retries += 1
            delay = min(base_delay * (2 ** retries), 60)
            print(f"Connection closed (code {e.code}). Retry {retries}/{max_retries} in {delay:.1f}s")
            await asyncio.sleep(delay)

        except OSError as e:
            retries += 1
            delay = min(base_delay * (2 ** retries), 60)
            print(f"Network error: {e}. Retry {retries}/{max_retries} in {delay:.1f}s")
            await asyncio.sleep(delay)

    raise RuntimeError(f"WebSocket failed after {max_retries} retries")

Set ping_interval=20 on your websockets.connect() call. Binance closes connections that go 60 seconds without a pong response. Bybit requires a ping/pong heartbeat every 20 seconds or it drops the connection silently.
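For reference, the delay schedule produced by the backoff formula in managed_websocket can be worked out in isolation. The sketch below uses the same base and cap; the backoff_delay name and the optional jitter parameter are additions for illustration and are not part of the function above.

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0, jitter=0.0):
    """Delay in seconds before retry number `attempt` (1-based)."""
    delay = min(base * (2 ** attempt), cap)
    # Optional jitter (an assumption, off by default) spreads out clients
    # that would otherwise all reconnect at the same moment
    return delay + random.uniform(0, jitter * delay)

schedule = [backoff_delay(n) for n in range(1, 8)]
print(schedule)  # [2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]
```

With the default of ten retries, the worst case spends roughly six minutes waiting before the RuntimeError is raised, which is worth knowing when you size alerting around it.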

Full Order Book Sync with Gap Recovery

The production-grade approach combines WebSocket gap detection with REST snapshot fallback. When a gap is detected, you pause stream processing, fetch a fresh snapshot from the REST API, and resume applying only the updates with IDs greater than the snapshot's lastUpdateId. This is the pattern recommended in Binance's official WebSocket documentation and works equivalently on Bybit and OKX with their respective snapshot endpoints.

import asyncio
import aiohttp
import websockets
import json

class OrderBookManager:
    def __init__(self, symbol="BTCUSDT"):
        self.symbol = symbol
        self.last_update_id = None
        self.bids = {}  # price -> quantity
        self.asks = {}  # price -> quantity
        self.initialized = False
        self.buffered_updates = []

    async def fetch_snapshot(self):
        """Fetch REST snapshot from Binance and reset local state."""
        url = f"https://api.binance.com/api/v3/depth?symbol={self.symbol}&limit=1000"
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as resp:
                resp.raise_for_status()
                data = await resp.json()

        self.last_update_id = data["lastUpdateId"]
        self.bids = {p: q for p, q in data["bids"] if float(q) > 0}
        self.asks = {p: q for p, q in data["asks"] if float(q) > 0}
        self.initialized = True
        print(f"Snapshot loaded: lastUpdateId={self.last_update_id}, "
              f"bids={len(self.bids)}, asks={len(self.asks)}")

    def apply_update(self, data):
        """Apply a depth update. Returns 'GAP', 'SKIP', or 'OK'."""
        first_id = data["U"]
        last_id = data["u"]

        if not self.initialized:
            return "SKIP"

        # Discard updates that predate our snapshot
        if last_id <= self.last_update_id:
            return "SKIP"

        # Gap detected
        if first_id > self.last_update_id + 1:
            missed = first_id - self.last_update_id - 1
            print(f"GAP DETECTED: missed {missed} update(s) "
                  f"(IDs {self.last_update_id+1} to {first_id-1})")
            return "GAP"

        # Apply bid updates
        for price, qty in data["b"]:
            if float(qty) == 0:
                self.bids.pop(price, None)
            else:
                self.bids[price] = qty

        # Apply ask updates
        for price, qty in data["a"]:
            if float(qty) == 0:
                self.asks.pop(price, None)
            else:
                self.asks[price] = qty

        self.last_update_id = last_id
        return "OK"

    def best_bid_ask(self):
        if not self.bids or not self.asks:
            return None, None
        best_bid = max(self.bids.keys(), key=float)
        best_ask = min(self.asks.keys(), key=float)
        return best_bid, best_ask

    async def run(self):
        ws_url = f"wss://stream.binance.com:9443/ws/{self.symbol.lower()}@depth"

        while True:
            try:
                # Connect first, then fetch the snapshot: events that arrive
                # during the REST call wait in the websockets receive queue,
                # so nothing between snapshot and stream is lost.
                async with websockets.connect(ws_url, ping_interval=20) as ws:
                    await self.fetch_snapshot()

                    async for raw in ws:
                        data = json.loads(raw)
                        result = self.apply_update(data)

                        if result == "GAP":
                            print("Resyncing order book...")
                            break  # Close this connection, then resync

                        if result == "OK":
                            bid, ask = self.best_bid_ask()
                            if bid and ask:
                                spread = float(ask) - float(bid)
                                print(f"Best bid: {bid} | Best ask: {ask} | Spread: {spread:.2f}")

            except (websockets.ConnectionClosed, aiohttp.ClientError, OSError) as exc:
                print(f"Connection problem ({exc}), resetting...")

            # Short pause so repeated resyncs do not hammer the REST endpoint
            await asyncio.sleep(2)

async def main():
    manager = OrderBookManager("BTCUSDT")
    await manager.run()

asyncio.run(main())

Platforms like VoiceOfChain that aggregate real-time signals across multiple exchanges run exactly this kind of gap-aware stream management internally. When you're consuming pre-aggregated signals rather than raw exchange feeds, you're implicitly benefiting from this infrastructure — gaps on the source feed are handled before the signal reaches you.

Gap Detection Patterns Across Exchanges

Gap detection field names by exchange

Exchange          | Sequence Field                | Snapshot Endpoint                 | Resync Strategy
Binance           | U (first) / u (last)          | GET /api/v3/depth                 | Drop updates where u <= snapshot lastUpdateId
Bybit             | seq in snapshot, seq in delta | GET /v5/market/orderbook          | Resync if delta seq != snapshot seq + 1
OKX               | seqId per message             | GET /api/v5/market/books          | Resync if seqId gap > 0; validate checksum
Coinbase Advanced | sequence in L2 channel        | REST product book snapshot        | Resync on any sequence discontinuity
Bitget            | u field in depth update       | GET /api/v2/spot/market/orderbook | Compare against previous u value

OKX adds a checksum field to each depth update — a CRC32 over the top 25 bids and asks. Even if your sequence tracking doesn't catch a subtle corruption, computing this checksum client-side and comparing it against OKX's value will surface the discrepancy. It's an extra 5 lines of code and worth implementing if you're running high-frequency strategies on OKX.
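A client-side check along those lines might look like the sketch below. It rests on assumptions that should be verified against OKX's current documentation: that the checksum input interleaves bid and ask levels as price:size fields joined by colons, that a side with fewer levels simply contributes nothing once it runs out, and that the result is compared as a signed 32-bit integer. The function names and the sample levels are hypothetical.

```python
import zlib
from itertools import zip_longest

def okx_checksum_string(bids, asks, depth=25):
    """Interleave the top `depth` bid and ask levels as price:size fields."""
    parts = []
    for bid, ask in zip_longest(bids[:depth], asks[:depth]):
        if bid is not None:
            parts += [bid[0], bid[1]]
        if ask is not None:
            parts += [ask[0], ask[1]]
    return ":".join(parts)

def signed_crc32(text):
    """CRC32 of `text`, reinterpreted as a signed 32-bit integer."""
    value = zlib.crc32(text.encode("utf-8"))
    return value - (1 << 32) if value >= (1 << 31) else value

# Invented levels, best price first, kept as the strings the feed sent
bids = [["3366.1", "7"], ["3366", "6"]]
asks = [["3366.8", "9"]]
payload = okx_checksum_string(bids, asks)
print(payload)  # 3366.1:7:3366.8:9:3366:6
local = signed_crc32(payload)
```

Comparing local against the checksum value in the incoming message, and resyncing on a mismatch, gives you an integrity check that does not depend on sequence numbers at all. Keep prices and sizes as the original strings: reformatting them through float would change the checksum input.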

Gate.io and KuCoin both use sequence-based depth streams but their reconnect behavior differs: Gate.io sends a full snapshot on reconnect automatically, while KuCoin requires you to re-subscribe and explicitly request a snapshot via their REST API before you can trust the stream again.

Frequently Asked Questions

How often do WebSocket gaps actually occur in practice?
On stable VPS infrastructure with a good connection to exchange servers, gaps are rare — maybe once or twice per day during normal market conditions. During high-volatility events like major liquidation cascades or Fed announcements, gap frequency can spike significantly. Always design your system to handle them, even if they're infrequent.
Should I use the combined stream endpoint or individual streams on Binance?
For order book data specifically, individual streams (e.g. btcusdt@depth) give you cleaner sequence tracking. Combined streams share a single WebSocket but interleave updates from multiple symbols, which can make gap detection logic slightly more complex since you need to track lastUpdateId per symbol independently.
Is it better to reconnect immediately on a gap or wait for the current stream to close?
Break immediately. Once a gap is detected, every subsequent update you apply is being layered on top of a corrupted baseline. Continuing to process updates while waiting for a natural disconnect makes the corruption worse. Close the WebSocket, fetch a fresh REST snapshot, then reconnect.
Can I use Bybit's 200ms snapshot stream instead of tracking gaps manually?
Yes, Bybit offers a snapshot-style stream that sends full order book state at intervals rather than deltas. This sidesteps gap detection entirely but comes with higher bandwidth and latency tradeoffs. For most strategies, delta streams with gap detection are more efficient — but for lower-frequency signals the snapshot stream is a valid shortcut.
How do I handle gaps when running multiple trading pairs simultaneously?
Maintain a separate OrderBookManager instance per symbol. Each instance tracks its own lastUpdateId and handles resync independently. Sharing state across symbols creates race conditions, especially when one symbol triggers a full resync while others are receiving updates at high frequency.
What's the difference between a WebSocket gap and a sequence reset?
A gap means you missed updates between two known sequence points. A sequence reset means the exchange restarted its counter — this happens after maintenance windows and looks like a gap but with a very large apparent jump. Most exchanges signal a reset with a new subscription snapshot; treat it as a full resync trigger, not a recoverable gap.
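
The per-symbol tracking mentioned in the answers above can be sketched as follows. It assumes Binance's combined-stream envelope, where each message wraps the raw payload as {"stream": ..., "data": ...}. The SequenceTracker class, the route helper, and the sample update IDs are hypothetical.

```python
class SequenceTracker:
    """Tracks the last update ID per symbol and reports gaps."""

    def __init__(self):
        self.last_update_id = {}  # symbol -> last seen u

    def observe(self, symbol, first_id, last_id):
        """Return True if this event is contiguous for `symbol`, else False."""
        previous = self.last_update_id.get(symbol)
        self.last_update_id[symbol] = last_id
        return previous is None or first_id == previous + 1

def route(tracker, message):
    """Unwrap a combined-stream message and check its symbol's sequence."""
    symbol = message["stream"].split("@")[0]
    data = message["data"]
    return symbol, tracker.observe(symbol, data["U"], data["u"])

tracker = SequenceTracker()
messages = [
    {"stream": "btcusdt@depth", "data": {"U": 10, "u": 12}},
    {"stream": "ethusdt@depth", "data": {"U": 500, "u": 501}},
    {"stream": "btcusdt@depth", "data": {"U": 13, "u": 15}},
    {"stream": "ethusdt@depth", "data": {"U": 505, "u": 506}},  # 502-504 missing
]
results = [route(tracker, m) for m in messages]
print(results)
# [('btcusdt', True), ('ethusdt', True), ('btcusdt', True), ('ethusdt', False)]
```

Because each symbol has its own entry, the gap on ethusdt triggers a resync for that book only, while btcusdt keeps streaming undisturbed.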

Conclusion

WebSocket gap detection is not an edge case — it's a core reliability requirement for any trading system that depends on real-time order book data. The pattern is consistent across Binance, Bybit, OKX, and other major exchanges: track sequence IDs, detect discontinuities, and trigger a REST snapshot resync when they occur. The Python implementations above give you a working foundation that handles the full lifecycle from initial connection through gap detection and recovery. Pair this with exponential backoff reconnection logic and per-symbol state management, and you'll have an order book pipeline that can run continuously without silently corrupting your trading data.
