Binance API Error 1003: How to Fix Rate Limit Bans
Binance API error 1003 means your bot hit a rate limit and got IP-banned. This guide breaks down what triggers it, how to recover, and how to code around it.
Your trading bot was running fine, and then suddenly — nothing. Every request comes back with {"code": -1003, "msg": "Too many requests, please use the WebSocket for live updates to avoid polling the API."}. Your IP is banned, your positions are stuck, and you're watching the market move without being able to act. Error 1003 is one of the most common roadblocks for anyone building algo trading systems on Binance, and it's almost always avoidable once you understand how rate limiting actually works.
Error -1003 in the Binance API means TOO_MANY_REQUESTS. It's not a soft warning: by the time you see it in the response body, Binance has already decided your IP is misbehaving. The API uses a tiered response. HTTP 429 comes first, returned once you have exceeded a limit, and it is your cue to back off. If your code ignores it and keeps hammering the endpoint, Binance escalates to HTTP 418, an IP ban. Both responses carry a Retry-After header giving the number of seconds to wait: with a 429, to avoid a ban; with a 418, until the ban is lifted.
HTTP 429 is a warning. HTTP 418 is the actual IP ban. If your code retries on 429 without backing off, you will get 418. Error code -1003 in the JSON body accompanies both responses.
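The distinction between the two status codes can be captured in a small helper. The following is a minimal sketch, not part of any Binance SDK: the function name `classify_rate_limit` and the fallback waits (60 and 120 seconds) are assumptions chosen for illustration, used only when the Retry-After header is missing or unparseable.

```python
from typing import Mapping, Tuple

# Fallback waits are assumptions for this sketch, used only when the
# server does not send a usable Retry-After header.
DEFAULT_WAIT_429_S = 60
DEFAULT_WAIT_418_S = 120


def classify_rate_limit(status_code: int, headers: Mapping[str, str]) -> Tuple[str, int]:
    """Return (state, seconds_to_wait) for a response's status code and headers."""
    raw = headers.get("Retry-After")
    retry_after = int(raw) if raw is not None and raw.isdigit() else None

    if status_code == 418:  # IP ban: wait out the full penalty
        return "banned", retry_after if retry_after is not None else DEFAULT_WAIT_418_S
    if status_code == 429:  # limit exceeded: back off now to avoid a ban
        return "throttled", retry_after if retry_after is not None else DEFAULT_WAIT_429_S
    return "ok", 0


print(classify_rate_limit(429, {"Retry-After": "7"}))
```

With `requests`, `response.headers` is case-insensitive, so it can be passed in directly; a plain dict must use the exact key `Retry-After`.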
The ban duration scales with severity. Binance's documentation says IP bans are tracked and scale in duration for repeat offenders, from 2 minutes up to 3 days. A first offense typically lands at the short end, and repeat violations escalate quickly. Automated systems that keep violating rate limits also risk having their API access restricted, which is a far more serious outcome than a temporary IP block.
Binance doesn't use a simple requests-per-second counter. It uses a weight system: every endpoint has a different cost, and the weight you consume is summed per IP address (not per API key) over a 1-minute window. The default Spot limits are 6000 request weight per minute, a raw request cap of 61,000 per 5 minutes, and separate order placement limits. The authoritative values are in the rateLimits array returned by GET /api/v3/exchangeInfo. The weight cost per endpoint varies dramatically.
| Endpoint | Weight Cost | Notes |
|---|---|---|
| GET /api/v3/ping | 1 | Connectivity check |
| GET /api/v3/ticker/price (1 symbol) | 2 | Single symbol price |
| GET /api/v3/ticker/24hr (no symbol) | 80 | Returns all symbols |
| GET /api/v3/depth (limit=500) | 10 | Order book snapshot |
| GET /api/v3/account | 20 | Account info |
| POST /api/v3/order | 1 | Place an order |
If your bot fetches the full 24hr ticker for all symbols every 10 seconds, that's 80 weight per call, or 480 weight per minute from one endpoint alone. Stack that with order book polling across 20 pairs at 10 weight each (200 weight per sweep), and a sweep every two seconds burns 6000 weight per minute by itself, enough to exhaust the entire budget. The X-MBX-USED-WEIGHT-1M header in every API response shows your current consumption. If your code isn't reading this header, you're flying blind.
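It can help to do this arithmetic in code before deploying a polling plan. Here is a rough sketch using the weights from the table above; the `PollingJob` class, the `fanout` field, and the 2-second order book interval are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable

WEIGHT_LIMIT_PER_MINUTE = 6000  # default cap quoted in this guide


@dataclass(frozen=True)
class PollingJob:
    name: str
    weight: int        # weight cost of one call
    interval_s: float  # seconds between sweeps
    fanout: int = 1    # calls per sweep (e.g. number of symbols)

    def weight_per_minute(self) -> float:
        return self.weight * self.fanout * (60.0 / self.interval_s)


def total_weight_per_minute(jobs: Iterable[PollingJob]) -> float:
    return sum(job.weight_per_minute() for job in jobs)


jobs = [
    PollingJob("24hr ticker (all symbols)", weight=80, interval_s=10),
    # Assumed plan: 20 order books, each polled every 2 seconds
    PollingJob("order book depth 500", weight=10, interval_s=2, fanout=20),
]

total = total_weight_per_minute(jobs)
for job in jobs:
    print(f"{job.name}: {job.weight_per_minute():.0f}/min")
print(f"total: {total:.0f} / {WEIGHT_LIMIT_PER_MINUTE}")
```

Under these assumptions the plan needs 6480 weight per minute, which is over the cap before a single order is placed.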
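If your bot is already banned, the immediate priority is to stop making things worse. Halt all REST traffic from the banned IP until the Retry-After period has elapsed. Binance tracks bans and scales their duration for repeat offenders, so requests sent during an active ban can escalate the penalty.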
Changing your IP is technically an option in an emergency, but it doesn't fix the underlying problem. Other exchanges like Bybit and OKX have similar tiered rate limiting with comparable escalation behavior — the same patterns that get you banned on Binance will get you banned elsewhere.
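One way to guarantee that nothing is sent during a ban is a process-wide gate that every request path checks first. The sketch below is illustrative only: the `BanGate` class and its method names are hypothetical, and the clock is injectable purely so the behavior can be tested without sleeping.

```python
import time
from typing import Callable


class BanGate:
    """Blocks outgoing requests until a recorded ban has expired."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._blocked_until = 0.0

    def record_ban(self, retry_after_s: float) -> None:
        # Never shorten an existing ban if a second 418 arrives.
        self._blocked_until = max(self._blocked_until, self._clock() + retry_after_s)

    def seconds_remaining(self) -> float:
        return max(0.0, self._blocked_until - self._clock())

    def allow(self) -> bool:
        return self.seconds_remaining() == 0.0


gate = BanGate()
gate.record_ban(120)  # e.g. the Retry-After value from a 418 response
if not gate.allow():
    print(f"Banned: holding all requests for {gate.seconds_remaining():.0f}s")
```

Sharing a single gate across all threads or tasks means a ban seen by one worker silences the others too, which is the point: the ban applies to the IP, not to the worker that triggered it.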
Start with a properly authenticated request that reads the weight header from the first call:
```python
import hashlib
import hmac
import time
from urllib.parse import urlencode

import requests

BASE_URL = "https://api.binance.com"
API_KEY = "your_api_key_here"
SECRET_KEY = "your_secret_key_here"


def sign_request(params: dict) -> str:
    """Return the HMAC-SHA256 signature of the URL-encoded query string."""
    query_string = urlencode(params)
    return hmac.new(
        SECRET_KEY.encode("utf-8"),
        query_string.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def get_account_info() -> dict:
    endpoint = "/api/v3/account"
    params = {"timestamp": int(time.time() * 1000)}
    # Sign the query string exactly as it will be sent, then append the signature
    params["signature"] = sign_request(params)
    headers = {"X-MBX-APIKEY": API_KEY}
    response = requests.get(
        BASE_URL + endpoint, headers=headers, params=params, timeout=10
    )

    # Always log weight consumption
    used_weight = response.headers.get("X-MBX-USED-WEIGHT-1M", "unknown")
    print(f"Weight used this minute: {used_weight}/6000")
    return response.json()
```
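Reading the header is more useful when it is turned into a number your bot can act on. The helper below is a small sketch with hypothetical function names (`parse_used_weight`, `remaining_weight`); it assumes the header follows the X-MBX-USED-WEIGHT-(interval) naming seen in X-MBX-USED-WEIGHT-1M and takes the 6000 limit as a default.

```python
import re
from typing import Dict, Mapping, Optional

# Matches names such as X-MBX-USED-WEIGHT-1M, case-insensitively
_USED_WEIGHT = re.compile(r"^x-mbx-used-weight-(\d+[smhd])$", re.IGNORECASE)


def parse_used_weight(headers: Mapping[str, str]) -> Dict[str, int]:
    """Return used weight keyed by interval, e.g. {"1m": 5210}."""
    usage: Dict[str, int] = {}
    for name, value in headers.items():
        match = _USED_WEIGHT.match(name)
        if not match:
            continue
        try:
            usage[match.group(1).lower()] = int(value)
        except ValueError:
            continue  # ignore malformed values rather than crash the bot
    return usage


def remaining_weight(
    headers: Mapping[str, str], limit: int = 6000, interval: str = "1m"
) -> Optional[int]:
    """Weight left in the window, or None if the header is absent."""
    used = parse_used_weight(headers).get(interval)
    return None if used is None else max(0, limit - used)


sample = {"X-MBX-USED-WEIGHT-1M": "5210", "Content-Type": "application/json"}
print(parse_used_weight(sample), remaining_weight(sample))
```

Returning None when the header is missing keeps "no information" distinct from "no budget left", so the caller can decide how cautious to be.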
Now add a rate-limit-aware wrapper with proper error 1003 handling and exponential backoff:
```python
import logging
import time

import requests

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

MAX_RETRIES = 3
WEIGHT_THRESHOLD = 5500  # Back off before hitting the 6000 hard limit


def safe_request(method: str, url: str, **kwargs) -> dict:
    # API_KEY is the constant defined in the previous snippet.
    headers = dict(kwargs.pop("headers", None) or {})
    headers["X-MBX-APIKEY"] = API_KEY
    kwargs.setdefault("timeout", 10)

    # Note: a signed request retried after a long sleep needs a fresh
    # timestamp and signature, or it will fall outside recvWindow.
    for attempt in range(MAX_RETRIES):
        try:
            response = requests.request(method, url, headers=headers, **kwargs)

            # 429: warning, back off and retry
            if response.status_code == 429:
                retry_after = int(response.headers.get("Retry-After", 60))
                logger.warning(f"Rate limit warning (429). Waiting {retry_after}s")
                time.sleep(retry_after)
                continue

            # 418: IP ban, must fully wait out the penalty
            if response.status_code == 418:
                retry_after = int(response.headers.get("Retry-After", 300))
                logger.error(f"IP BANNED (418). Waiting {retry_after}s. Do not send any requests")
                time.sleep(retry_after)
                continue

            data = response.json()

            # Handle -1003 in response body with escalating waits
            if isinstance(data, dict) and data.get("code") == -1003:
                wait = 60 * (attempt + 1)  # 60s, 120s, 180s
                logger.error(f"Error -1003. Waiting {wait}s (attempt {attempt + 1}/{MAX_RETRIES})")
                time.sleep(wait)
                continue

            # Proactively cool down when approaching the limit
            used_weight = int(response.headers.get("X-MBX-USED-WEIGHT-1M", 0))
            if used_weight > WEIGHT_THRESHOLD:
                # Assumes the counter resets on wall-clock minute boundaries
                wait_time = 60 - (time.time() % 60)
                logger.warning(f"Approaching rate limit ({used_weight}/6000). Cooling down {wait_time:.1f}s")
                time.sleep(wait_time)

            return data

        except requests.exceptions.RequestException as e:
            logger.error(f"Request failed: {e}")
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s

    raise RuntimeError(f"Max retries ({MAX_RETRIES}) exceeded")
```
The weight threshold check at 5500 is deliberate — it leaves headroom so urgent order placement requests can still go through even when your data polling has been heavy. Now switch your real-time price feeds off REST entirely:
```python
import asyncio
import json
import logging

import websockets

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


async def stream_ticker(symbol: str):
    """Stream real-time ticker data at zero REST API weight cost."""
    uri = f"wss://stream.binance.com:9443/ws/{symbol.lower()}@ticker"

    async with websockets.connect(uri) as ws:
        logger.info(f"WebSocket connected: {symbol}")
        async for raw in ws:
            data = json.loads(raw)
            price = float(data["c"])   # last (close) price
            volume = float(data["v"])  # 24h base asset volume
            bid = float(data["b"])     # best bid
            ask = float(data["a"])     # best ask
            logger.info(f"{symbol}: ${price:,.4f} | Vol: {volume:,.0f} | Spread: {ask - bid:.4f}")

            # Feed this into your signal engine instead of polling REST
            # signal_engine.on_tick(symbol, price, volume)


if __name__ == "__main__":
    asyncio.run(stream_ticker("BTCUSDT"))
```
WebSocket market data streams on Binance consume no REST request weight. Their own limits apply to connections and to the messages you send, not to the data you receive. Polling prices via REST every second burns weight you need for order execution. Any real-time data feed should be a WebSocket stream, not a REST loop.
Most bots that hit error 1003 repeatedly share the same architectural problem: they treat the Binance REST API like a real-time data feed. It isn't. REST is for discrete operations — placing orders, fetching account state, pulling historical data. Real-time data belongs on WebSocket streams.
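When a bot watches many pairs, one connection can carry all of them through Binance's combined stream endpoint instead of opening a socket per symbol. The following sketch only builds the URL and unwraps the message envelope; the helper names `combined_stream_url` and `unwrap` are made up for this example, and the base URL is the same host used in the ticker example above.

```python
from typing import Any, Dict, Iterable, Tuple

STREAM_BASE = "wss://stream.binance.com:9443"


def combined_stream_url(symbols: Iterable[str], channel: str = "ticker") -> str:
    """Build a combined-stream URL such as .../stream?streams=btcusdt@ticker/ethusdt@ticker."""
    names = [f"{symbol.lower()}@{channel}" for symbol in symbols]
    if not names:
        raise ValueError("at least one symbol is required")
    return f"{STREAM_BASE}/stream?streams=" + "/".join(names)


def unwrap(message: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Combined streams wrap each event as {"stream": <name>, "data": <payload>}."""
    return message["stream"], message["data"]


print(combined_stream_url(["BTCUSDT", "ETHUSDT"]))
```

The resulting URL can be passed to `websockets.connect` in place of the single-stream URI; each incoming message then needs `unwrap` before the payload fields such as `"c"` are read.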
Platforms like Bybit and OKX follow the same WebSocket-first design philosophy. On Bybit, the public WebSocket supports order book, trade, and ticker streams with no authentication required. On OKX, the WebSocket API handles real-time market data and private account updates on the same connection. The pattern is the same across all major exchanges: WebSocket for data, REST for actions.
If you're running signal-based strategies, VoiceOfChain provides real-time trading signals that reduce how much independent market scanning your bot needs to do. Instead of your bot polling dozens of indicators across multiple pairs to find setups, it can act on pre-processed signals and focus its API budget entirely on order execution — which cuts weight consumption substantially.
| Use Case | REST Weight/min (polling every 1s) | WebSocket Cost |
|---|---|---|
| Single symbol price | 120 | 0 |
| All symbol prices | 240 | 0 |
| Order book depth 500 | 600 | 0 |
| 24hr ticker (all) | 4800 | 0 |
| Account balance updates | 1200 | 0 via User Data Stream |
| Place order | 1 per order | N/A (REST only) |
Binance API error 1003 follows a predictable pattern: a bot polling too aggressively via REST, ignoring 429 warnings, until Binance drops the 418 hammer. The fix is equally consistent — move real-time data needs to WebSocket streams, monitor the X-MBX-USED-WEIGHT-1M header on every response, and implement backoff logic before rate limits become bans. The code patterns in this guide apply whether you're building on Binance, Bybit, OKX, KuCoin, or any other exchange with a REST API. Build with weight awareness from the start and this is one error you won't see again.