Binance API Historical Data: Complete Trader's Guide

Learn how to fetch, parse, and analyze Binance API historical data with Python. Covers endpoints, limits, futures data, and real trading use cases.

Uncle Solieditor · voc · 20.05.2026
◈   Contents
  1. → Understanding the Binance Klines Endpoint
  2. → Fetching Historical Data with Python
  3. → Rate Limits and How to Stay Under Them
  4. → Binance Futures API Historical Data
  5. → Downloading and Storing Large Datasets
  6. → Frequently Asked Questions
  7. → Putting It All Together

If you're building a trading bot, running a backtest, or just trying to understand how a coin behaved during a specific market event, historical OHLCV data is the starting point. Binance offers one of the most accessible APIs for pulling this data — free, fast, and well-documented. But there are gotchas: rate limits, pagination quirks, endpoint differences between spot and futures, and timestamp handling that will silently break your data if you're not careful. This guide walks through everything you need to pull clean historical data from Binance, with real Python code you can run today.

Understanding the Binance Klines Endpoint

Binance API historical price data lives primarily in the `/api/v3/klines` endpoint for spot markets. Each kline (candlestick) returns open time, open, high, low, close, volume, close time, quote asset volume, number of trades, taker buy base/quote volume, and an ignored field. That's 12 values per candle — more than most traders use, but useful for volume analysis.
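
To make the 12-value layout concrete, here is a minimal sketch that maps one raw kline array onto named fields. The field names are this guide's own shorthand rather than official identifiers, `parse_kline` is a hypothetical helper, and the sample row contains made-up numbers shaped like an API response.

```python
KLINE_FIELDS = (
    "open_time", "open", "high", "low", "close", "volume",
    "close_time", "quote_volume", "trades",
    "taker_buy_base", "taker_buy_quote", "ignore",
)

NUMERIC_FIELDS = (
    "open", "high", "low", "close", "volume",
    "quote_volume", "taker_buy_base", "taker_buy_quote",
)


def parse_kline(row: list) -> dict:
    """Map one raw 12-element kline array to a dict keyed by field name."""
    if len(row) != len(KLINE_FIELDS):
        raise ValueError(f"expected {len(KLINE_FIELDS)} values, got {len(row)}")
    candle = dict(zip(KLINE_FIELDS, row))
    # Prices and volumes arrive as strings; convert them to floats.
    for key in NUMERIC_FIELDS:
        candle[key] = float(candle[key])
    return candle


# Illustrative row with made-up numbers, not real market data.
sample = [
    1704067200000, "42000.0", "42500.0", "41900.0", "42300.0", "120.5",
    1704070799999, "5080000.0", 3500, "60.2", "2540000.0", "0",
]
candle = parse_kline(sample)
print(candle["close"], candle["trades"])
```

Rejecting rows of the wrong length early is a deliberate choice: a silent column shift is much harder to spot later in a backtest.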

The endpoint accepts a `symbol` (like BTCUSDT), an `interval` (1m, 5m, 1h, 1d, etc.), and optional `startTime` and `endTime` in milliseconds. The `limit` parameter controls how many candles you get back — the Binance API historical data limit is 1000 candles per request. That's the hard ceiling. If you want more, you need to paginate by shifting your `startTime` forward using the last candle's close time.

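Timestamps on Binance are Unix milliseconds, not seconds. Passing seconds does not raise an error: the value is read as a moment in January 1970, so the response is either empty or starts at the symbol's earliest candles instead of the range you asked for. Always multiply a Unix timestamp in seconds by 1000, or use `int(dt.timestamp() * 1000)` on a timezone-aware datetime `dt`.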
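
The conversion can be sketched as a pair of small helpers. `to_ms` and `from_ms` are hypothetical names, and treating naive datetimes as UTC is an assumption made here, not something Binance requires.

```python
from datetime import datetime, timezone


def to_ms(dt: datetime) -> int:
    """Convert a datetime to Unix milliseconds, treating naive values as UTC."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are meant as UTC. Without this,
        # .timestamp() would interpret them in the machine's local zone.
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


def from_ms(ms: int) -> datetime:
    """Convert Unix milliseconds back to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(to_ms(start))           # 1704067200000
print(from_ms(to_ms(start)))  # 2024-01-01 00:00:00+00:00
```

Round-tripping a known date like this is a quick sanity check that a pipeline is not mixing seconds with milliseconds or local time with UTC.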

Fetching Historical Data with Python

You don't need an API key to pull Binance historical data with Python, because the klines endpoint is public. Authentication is only required for account-level endpoints like orders and balances. Here's a minimal working example that pulls 30 days of hourly BTC/USDT candles:

import requests
import pandas as pd
from datetime import datetime, timedelta, timezone

BASE_URL = "https://api.binance.com"

def get_klines(symbol: str, interval: str, start_dt: datetime, end_dt: datetime) -> pd.DataFrame:
    """Fetch spot klines between two timezone-aware datetimes, paginating as needed."""
    url = f"{BASE_URL}/api/v3/klines"
    # .timestamp() treats naive datetimes as local time, so pass aware (UTC) values
    start_ms = int(start_dt.timestamp() * 1000)
    end_ms = int(end_dt.timestamp() * 1000)

    all_candles = []
    while start_ms < end_ms:
        params = {
            "symbol": symbol,
            "interval": interval,
            "startTime": start_ms,
            "endTime": end_ms,
            "limit": 1000,
        }
        resp = requests.get(url, params=params, timeout=10)
        resp.raise_for_status()
        data = resp.json()
        if not data:
            break
        all_candles.extend(data)
        # advance past the last candle's close time (index 6)
        start_ms = data[-1][6] + 1

    columns = [
        "open_time", "open", "high", "low", "close", "volume",
        "close_time", "quote_volume", "trades",
        "taker_buy_base", "taker_buy_quote", "ignore"
    ]
    df = pd.DataFrame(all_candles, columns=columns)
    df["open_time"] = pd.to_datetime(df["open_time"], unit="ms")
    # prices and volumes arrive as strings
    df[["open", "high", "low", "close", "volume"]] = df[
        ["open", "high", "low", "close", "volume"]
    ].astype(float)
    return df[["open_time", "open", "high", "low", "close", "volume"]]


if __name__ == "__main__":
    # datetime.utcnow() returns a naive value that .timestamp() would read as local time
    end = datetime.now(timezone.utc)
    start = end - timedelta(days=30)
    df = get_klines("BTCUSDT", "1h", start, end)
    print(df.head())
    print(f"Total candles: {len(df)}")

This handles pagination automatically. Notice we advance `start_ms` using `data[-1][6] + 1` — that's the close time of the last candle plus one millisecond, which prevents duplicate candles at the boundary. Without this, you'll either loop forever or get overlapping data.

Rate Limits and How to Stay Under Them

Binance enforces rate limits based on request weight. The klines endpoint has a weight of 2 per call, and the per-minute budget cited here is 1200 weight. Binance has changed these numbers over time, so confirm the current values in the `rateLimits` section of `/api/v3/exchangeInfo`. At those figures you get 600 paginated kline requests per minute, which is more than enough for most use cases but easy to blow through if you're pulling data for 100 symbols at once.
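
The arithmetic is worth running before starting a large pull. This is a rough sketch with hypothetical helper names; the defaults (weight 2 per request, 1200 weight per minute, 1000 candles per request) are the figures quoted in this guide and should be replaced with whatever the exchange currently reports.

```python
import math


def requests_needed(candles: int, per_request: int = 1000) -> int:
    """Number of paginated calls needed to fetch `candles` candles."""
    return math.ceil(candles / per_request)


def minutes_needed(
    symbols: int,
    candles_per_symbol: int,
    weight_per_request: int = 2,   # figure quoted in this guide
    weight_per_minute: int = 1200,  # figure quoted in this guide
) -> float:
    """Minimum minutes required if the weight budget is the only constraint."""
    total_requests = symbols * requests_needed(candles_per_symbol)
    total_weight = total_requests * weight_per_request
    return total_weight / weight_per_minute


# 100 symbols, one year of hourly candles each (365 * 24 = 8760 candles).
print(requests_needed(8760))      # 9
print(minutes_needed(100, 8760))  # 1.5
```

The result is a lower bound: network latency and any pauses you add between calls come on top of it.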

When you hit the limit, Binance returns HTTP 429. If you keep hammering after that, you'll get a 418 IP ban. Here's a safer fetching pattern with exponential backoff:

import time
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_session() -> requests.Session:
    session = requests.Session()
    retry = Retry(
        total=5,
        backoff_factor=2,
        status_forcelist=[429, 500, 502, 503, 504],
        respect_retry_after_header=True,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    return session

def safe_get(session: requests.Session, url: str, params: dict) -> list:
    for attempt in range(5):
        try:
            resp = session.get(url, params=params, timeout=10)
            # 429 = rate limited, 418 = temporary IP ban; wait before trying again
            if resp.status_code in (418, 429):
                retry_after = int(resp.headers.get("Retry-After", 60))
                print(f"Rate limited ({resp.status_code}). Sleeping {retry_after}s")
                time.sleep(retry_after)
                continue
            resp.raise_for_status()
            return resp.json()
        except requests.exceptions.RequestException as e:
            # exponential backoff: 1s, 2s, 4s, 8s, 16s
            wait = 2 ** attempt
            print(f"Error: {e}. Retrying in {wait}s")
            time.sleep(wait)
    raise RuntimeError("Max retries exceeded")

Always check the `X-MBX-USED-WEIGHT-1M` header in the response. It tells you how much of your per-minute weight budget you've consumed. Log it during development and you'll catch runaway loops before they get you banned.
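
One way to act on that header is to pause once usage crosses a threshold. The sketch below is an assumption-heavy illustration: `used_weight` and `throttle_delay` are hypothetical helpers, and the 1200 budget, 80% threshold, and 10-second pause are arbitrary starting points rather than recommended values.

```python
from typing import Mapping, Optional

WEIGHT_HEADER = "x-mbx-used-weight-1m"


def used_weight(headers: Mapping[str, str]) -> Optional[int]:
    """Return the used per-minute weight from response headers, if present."""
    for name, value in headers.items():
        # Compare case-insensitively so plain dicts work as well as requests headers.
        if name.lower() == WEIGHT_HEADER:
            return int(value)
    return None


def throttle_delay(
    headers: Mapping[str, str],
    budget: int = 1200,      # assumed per-minute budget, as quoted in this guide
    threshold: float = 0.8,  # arbitrary: start pausing at 80% of the budget
    pause: float = 10.0,     # arbitrary pause length in seconds
) -> float:
    """Return how many seconds to pause before the next request (0 if none)."""
    used = used_weight(headers)
    if used is not None and used >= budget * threshold:
        return pause
    return 0.0


print(throttle_delay({"X-MBX-USED-WEIGHT-1M": "1000"}))  # 10.0
print(throttle_delay({"X-MBX-USED-WEIGHT-1M": "40"}))    # 0.0
```

In a fetch loop this would be called as `time.sleep(throttle_delay(resp.headers))` after each response.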

Binance Futures API Historical Data

Spot and futures use different base URLs and different endpoint paths. For Binance futures historical data, the base is `https://fapi.binance.com` for USDT-margined perpetuals and `https://dapi.binance.com` for coin-margined contracts. The klines paths follow the base: `/fapi/v1/klines` for USDⓈ-M futures and `/dapi/v1/klines` for coin-margined. The symbols differ too: use `BTCUSDT` for USDⓈ-M futures and `BTCUSD_PERP` for coin-margined.
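
Keeping the three base URL and path pairs in one table avoids mixing them up. A minimal sketch follows; the market keys `spot`, `usdm`, and `coinm` are hypothetical labels chosen for this example, not Binance identifiers.

```python
KLINE_ENDPOINTS = {
    # market label (hypothetical): (base URL, klines path)
    "spot": ("https://api.binance.com", "/api/v3/klines"),
    "usdm": ("https://fapi.binance.com", "/fapi/v1/klines"),
    "coinm": ("https://dapi.binance.com", "/dapi/v1/klines"),
}


def klines_url(market: str) -> str:
    """Return the full klines URL for a market label."""
    try:
        base, path = KLINE_ENDPOINTS[market]
    except KeyError:
        raise ValueError(f"unknown market {market!r}") from None
    return base + path


print(klines_url("usdm"))   # https://fapi.binance.com/fapi/v1/klines
print(klines_url("coinm"))  # https://dapi.binance.com/dapi/v1/klines
```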

Futures candles carry the same 12 fields as spot, including `taker_buy_base_asset_volume` and `taker_buy_quote_asset_volume`. These two are useful for measuring buying pressure: when taker buy volume is consistently a large share of total volume, it suggests aggressive buying, the kind of directional signal that platforms like VoiceOfChain surface before it becomes obvious on the chart.

import requests
import pandas as pd
from datetime import datetime, timedelta, timezone

FUTURES_URL = "https://fapi.binance.com"

def get_futures_klines(symbol: str, interval: str, days: int = 7) -> pd.DataFrame:
    url = f"{FUTURES_URL}/fapi/v1/klines"
    # use an aware UTC datetime: .timestamp() on a naive utcnow() assumes local time
    now = datetime.now(timezone.utc)
    end_ms = int(now.timestamp() * 1000)
    start_ms = int((now - timedelta(days=days)).timestamp() * 1000)

    all_candles = []
    while start_ms < end_ms:
        params = {
            "symbol": symbol,
            "interval": interval,
            "startTime": start_ms,
            "endTime": end_ms,
            "limit": 1000,
        }
        resp = requests.get(url, params=params, timeout=10)
        resp.raise_for_status()
        data = resp.json()
        if not data:
            break
        all_candles.extend(data)
        start_ms = data[-1][6] + 1

    columns = [
        "open_time", "open", "high", "low", "close", "volume",
        "close_time", "quote_volume", "trades",
        "taker_buy_base", "taker_buy_quote", "ignore"
    ]
    df = pd.DataFrame(all_candles, columns=columns)
    df["open_time"] = pd.to_datetime(df["open_time"], unit="ms")
    numeric_cols = ["open", "high", "low", "close", "volume", "taker_buy_base"]
    df[numeric_cols] = df[numeric_cols].astype(float)

    # compute taker buy ratio as a pressure indicator
    df["buy_pressure"] = df["taker_buy_base"] / df["volume"]
    return df

if __name__ == "__main__":
    df = get_futures_klines("BTCUSDT", "1h", days=14)
    print(df[["open_time", "close", "volume", "buy_pressure"]].tail(10))

Comparing platforms: Bybit and OKX also expose similar perpetual klines APIs. For multi-exchange backtests, it's common to use Binance as the primary data source, largely because of its long futures history, and to validate signal quality against Bybit's equivalent markets.

Downloading and Storing Large Datasets

For backtesting strategies over years of data, calling the API repeatedly is slow and wasteful. Binance offers bulk historical data downloads via its data portal at `data.binance.vision`: monthly and daily zip files of klines, trades, and aggTrades for every symbol. These are the same candles you'd get from the API, just pre-packaged. A year of 1-minute BTC/USDT klines is about 525,600 candles, which means 12 monthly files from the portal versus more than 500 paginated API calls.
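
A rough sketch of working with those archives follows. The URL layout in `bulk_kline_url` is an assumption based on how the portal organizes monthly spot klines and should be checked against the portal's directory listing; both function names are hypothetical. Whether a file carries a header row, and which time unit it uses, can vary between archives, so the reader skips a non-numeric first row and leaves values as strings.

```python
import csv
import io
import zipfile


def bulk_kline_url(symbol: str, interval: str, year: int, month: int) -> str:
    """Build the assumed URL of a monthly spot klines archive."""
    name = f"{symbol}-{interval}-{year}-{month:02d}.zip"
    return (
        "https://data.binance.vision/data/spot/monthly/klines/"
        f"{symbol}/{interval}/{name}"
    )


def read_kline_zip(raw: bytes) -> list[list[str]]:
    """Return the CSV rows from the first file inside a klines zip archive."""
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        csv_name = archive.namelist()[0]
        with archive.open(csv_name) as fh:
            text = io.TextIOWrapper(fh, encoding="utf-8", newline="")
            rows = [row for row in csv.reader(text) if row]
    # Assumption: a header row, if present, starts with a non-numeric cell.
    if rows and not rows[0][0].isdigit():
        rows = rows[1:]
    return rows


print(bulk_kline_url("BTCUSDT", "1m", 2024, 1))
```

With `requests` installed, the bytes would come from something like `requests.get(bulk_kline_url("BTCUSDT", "1m", 2024, 1), timeout=30).content`, and the rows can be passed to `pd.DataFrame` with the same column names used earlier.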

For ongoing data collection, a simple SQLite or Parquet approach works well at small scale. At larger scale, ClickHouse handles time-series OHLCV data extremely well — sub-second queries over billions of rows are normal. VoiceOfChain uses ClickHouse internally to process order flow from multiple exchanges including Binance, OKX, and Gate.io, enabling the kind of real-time signal generation that would choke a Postgres instance.
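
For the small-scale case, a SQLite table keyed on symbol, timeframe, and open time makes repeated downloads idempotent. This is a minimal sketch, not a description of any platform's storage layer; the table layout, column names, and `save_candles` helper are all hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS klines (
    symbol      TEXT    NOT NULL,
    timeframe   TEXT    NOT NULL,
    open_time   INTEGER NOT NULL,
    open_price  REAL,
    high_price  REAL,
    low_price   REAL,
    close_price REAL,
    volume      REAL,
    PRIMARY KEY (symbol, timeframe, open_time)
)
"""


def save_candles(conn: sqlite3.Connection, symbol: str, timeframe: str, candles: list) -> None:
    """Insert raw kline rows; a candle that already exists is replaced."""
    conn.execute(SCHEMA)
    rows = [
        (symbol, timeframe, int(c[0]),
         float(c[1]), float(c[2]), float(c[3]), float(c[4]), float(c[5]))
        for c in candles
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO klines VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
        )


conn = sqlite3.connect(":memory:")
# Illustrative row with made-up numbers.
batch = [[1704067200000, "42000.0", "42500.0", "41900.0", "42300.0", "120.5"]]
save_candles(conn, "BTCUSDT", "1h", batch)
save_candles(conn, "BTCUSDT", "1h", batch)  # same candle again: replaced, not duplicated
print(conn.execute("SELECT COUNT(*) FROM klines").fetchone()[0])  # 1
```

The composite primary key is the design choice that matters: overlapping pagination windows or re-run jobs cannot create duplicate candles.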

Binance API vs Bulk Download comparison

| Method | Max History | Rate Limited | Best For |
| --- | --- | --- | --- |
| REST API `/klines` | Full (varies by symbol) | Yes, 1200 weight/min | Recent data, live feeds |
| `data.binance.vision` bulk | Mid-2017+ (from listing date) | No | Backtests, initial loads |
| WebSocket streams | Real-time only | Connection limit | Live candle updates |

If you're coming from traditional finance and are used to Google Finance or Yahoo Finance CSV workflows, the structure is similar, but crypto markets trade 24/7, so the same interval produces more candles per asset. The Binance REST approach maps directly to what you'd do with yfinance or pandas-datareader, except there are no market holidays and the granularity goes down to 1 second on some endpoints.

Frequently Asked Questions

What is the Binance API historical data limit per request?
The hard limit is 1000 candles per API request. To fetch more, you must paginate by using the `startTime` parameter and advancing it past the last candle's close time on each iteration. There is no way to exceed 1000 in a single call.
Do I need an API key to get historical data from Binance?
No. The `/api/v3/klines` and `/fapi/v1/klines` endpoints are public and require no authentication. API keys are only needed for private endpoints like order placement, account balance, and trade history.
How far back does Binance API historical data go?
For major pairs like BTCUSDT, spot data goes back to mid-2017 when Binance launched. Newer tokens only have data from their listing date. Futures data starts from when the contract was listed, typically 2019-2020 for most perpetuals.
What's the difference between Binance spot and futures historical data?
Spot klines use `https://api.binance.com/api/v3/klines` while USDT-M futures use `https://fapi.binance.com/fapi/v1/klines`. Both return the same 12 fields, including taker buy volume. Prices differ slightly, since futures trade at a premium or discount to spot depending on market sentiment.
Can I use this same approach with Bybit or OKX?
Yes. Bybit exposes `/v5/market/kline` and OKX uses `/api/v5/market/candles` with similar parameters and millisecond timestamps. The pagination loop needs adapting, since both return candles newest-first and OKX pages with its own cursor parameters. The rate limit structure and how far back their free-tier history goes also differ.
How do I handle gaps in historical data?
Binance occasionally has gaps during periods of extreme load or maintenance. After downloading, check for missing timestamps by comparing expected candle count with actual using `pd.date_range`. Forward-fill micro-gaps (1-2 candles) but flag and investigate larger ones — they can indicate exchange outages that would affect backtest validity.
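
The gap check described in the last answer can be sketched with `pd.date_range`. `find_gaps` is a hypothetical helper, and the timestamps below are illustrative rather than real exchange data.

```python
import pandas as pd


def find_gaps(open_times: pd.Series, step: pd.Timedelta) -> pd.DatetimeIndex:
    """Return expected candle open times that are missing from open_times."""
    actual = pd.DatetimeIndex(open_times)
    # Every open time that should exist between the first and last candle.
    expected = pd.date_range(actual.min(), actual.max(), freq=step)
    return expected.difference(actual)


# Hourly candles with the 02:00 candle missing (illustrative timestamps).
times = pd.Series(pd.to_datetime([
    "2024-01-01 00:00", "2024-01-01 01:00",
    "2024-01-01 03:00", "2024-01-01 04:00",
]))
missing = find_gaps(times, pd.Timedelta(hours=1))
print(list(missing))
```

The length of `missing` separates micro-gaps you might forward-fill from longer runs that deserve investigation.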

Putting It All Together

The Binance API is one of the best free data sources available to retail algo traders. The klines endpoint is reliable, the docs are solid, and the rate limits are generous enough for most use cases. The main traps are timestamp handling (always milliseconds), the 1000-candle-per-request ceiling, and the difference between spot and futures base URLs. Get those three things right and you'll have clean historical data flowing into your backtests within an hour.

For live trading signal generation — beyond historical analysis — platforms like VoiceOfChain aggregate real-time order flow from Binance, OKX, Bybit, and other major venues, translating raw market microstructure into actionable signals without requiring you to build and maintain the data pipeline yourself. Historical data tells you what happened; real-time signals tell you what's happening right now.
