Downloader

Use this guide when you want to download Binance historical kline ZIP files, extract them, and load the CSV rows into file.db as DuckDB table binance_candles.

The downloader is incremental: it checks which candles are missing in DuckDB for the configured symbols/date range, downloads the missing daily or full-month ZIP files, then inserts the downloaded CSV data.

Quick Start

  1. Install dependencies with uv sync --dev.
  2. Create or review a downloader config such as configs/spot-1d.yaml.
  3. Validate tradable symbols when using live Binance Spot symbols.
  4. Run a daily or monthly download.
  5. Open the Chart UI after candles are loaded.

For a broad backfill across currently tradable symbols, use the All Tradable Klines workflow.

Requirements

  • Python environment managed by uv.
  • A writable DuckDB database file, currently file.db.
  • A YAML config file describing the market data range.
  • Network access to https://data.binance.vision.

Install project dependencies:

uv sync --dev

Config File

Example: configs/spot-1d.yaml

asset: spot
data_type: klines
interval: 1d
symbols:
  - BTCUSDT
  - ETHUSDT
  - ARBUSDT
  - OPUSDT
  - SOLUSDT
start_date: 2024-01-01
end_date: 2026-04-06
destination_dir: ./data

Fields:

  • asset: Binance data asset group. Currently expected to be spot.
  • data_type: Binance data type. For candles, use klines.
  • interval: Binance kline interval, for example 1d, 1h, 5m, or 1m.
  • symbols: list of symbols to download.
  • start_date: inclusive start date for missing-data checks.
  • end_date: inclusive end date for missing-data checks.
  • destination_dir: local root directory for downloaded and extracted files.

The downloader writes Binance paths under destination_dir. For the config above, BTC daily ZIPs are stored under:

data/spot/daily/klines/BTCUSDT/1d/

Run Daily Downloads

Daily mode downloads one ZIP per missing day:

uv run python examples/run_download.py --file configs/spot-1d.yaml --freq daily

Use daily mode when you are filling recent gaps or downloading short ranges.

Check Tradable Symbols

For the full symbol-listing workflow, see tradable-symbols.md.

Before downloading a new config, check that the requested symbols are currently tradable on Binance Spot:

uv run python examples/check_tradable_symbols.py --file configs/spot-1d.yaml

The command calls Binance Spot GET /api/v3/exchangeInfo and treats symbols with status: TRADING as tradable. If any configured symbol is missing or not in TRADING status, the command exits with status code 1.

List all currently tradable USDT spot symbols:

uv run python examples/check_tradable_symbols.py --quote USDT

Only list pairs that support market orders:

uv run python examples/check_tradable_symbols.py --quote USDT --require-market-order

Limit output while exploring:

uv run python examples/check_tradable_symbols.py --quote USDT --limit 20

Important distinction: exchangeInfo tells you what is currently tradable on the live Binance Spot exchange. Historical files on data.binance.vision can include older symbols, and not every historical symbol is tradable today.

Run Monthly Downloads

Monthly mode downloads one ZIP for each fully missing calendar month:

uv run python examples/run_download.py --file configs/spot-1d.yaml --freq monthly

Use monthly mode for initial backfills. It only downloads months where every day in that month is missing from DuckDB for the symbol and interval.

Download All Tradable Symbols

To resolve all currently tradable Binance Spot symbols and download klines for them, use symbol_source: tradable instead of a fixed symbols list:

asset: spot
data_type: klines
interval: 1d
symbol_source: tradable
quote_asset: USDT
require_market_order: true
margin: none
require_borrowable:
start_date: 2024-01-01
end_date: 2026-04-06
destination_dir: ./data
db_path: file.db
batch_size: 25
download_concurrency: 10
missing_frequency: monthly

Run a dry-run first to inspect symbol and URL counts:

uv run python examples/download_tradable_klines.py --file configs/spot-all-usdt-1d.yaml --dry-run

Run a small smoke test:

uv run python examples/download_tradable_klines.py --file configs/spot-all-usdt-1d.yaml --limit-symbols 3

The command writes a JSON manifest under data/manifests/ with the resolved symbols, batch progress, URL counts, download categories, and final status. Historical 404 responses from data.binance.vision are treated as missing_remote because currently tradable symbols may not have existed for the whole requested date range.

For the full operational manual, see all-tradable-klines.md.

What The Command Does

examples/run_download.py performs these steps:

  1. Reads the YAML config.
  2. Opens file.db.
  3. Computes missing days or fully missing months from binance_candles.
  4. Builds Binance download URLs.
  5. Downloads ZIP files concurrently with retry handling.
  6. Extracts ZIP files safely.
  7. Inserts CSV rows into DuckDB with insert or ignore.

The target DuckDB table is:

binance_candles(
  symbol,
  interval,
  open_time,
  close_time,
  open,
  high,
  low,
  close,
  volume,
  quote_asset_volume,
  number_of_trades,
  taker_buy_base_volume,
  taker_buy_quote_volume
)

The table has a primary key on (symbol, interval, open_time), so rerunning the same download is safe. Existing candles are ignored on insert.

Error Handling

Downloads fail loudly. If any URL still fails after retries, the command raises DownloadError with a summary of failed URLs and stops before insertion.

Common causes:

  • 404: Binance does not have that symbol/interval/date file.
  • Network timeout: retry the command.
  • Permission error: check destination_dir permissions.
  • Bad ZIP path: extraction is blocked if a ZIP contains unsafe paths.

Partial files are written as *.part and then moved into place only after a successful download.

Troubleshooting

No rows appear in the chart UI

Confirm the command inserted rows into file.db and that the backend is pointed at the same database path. Then open the Chart UI and check the backend health endpoint.

Configured symbols fail validation

Run the tradable-symbol check before downloading. Binance historical archives can contain older symbols that are not currently tradable, and live exchangeInfo only describes the current exchange state.

Large backfills are slow

Use monthly mode for initial backfills, reduce the date range while testing, or use the All Tradable Klines dry-run and --limit-symbols options before a full run.

Programmatic Use

You can use the downloader from Python:

import asyncio

from src.downloader import concurrent_download

urls = [
    "https://data.binance.vision/data/spot/daily/klines/BTCUSDT/1d/BTCUSDT-1d-2026-04-06.zip"
]

results = asyncio.run(
    concurrent_download(
        urls,
        n=10,
        destination_dir="./data",
        retries=2,
    )
)

for result in results:
    print(result.path)

Arguments:

  • urls: iterable of Binance data URLs.
  • n: max concurrent downloads.
  • destination_dir: local output root.
  • retries: retry count after the first attempt.

Verify Data

Check available data in DuckDB:

uv run python -c "import duckdb; con=duckdb.connect('file.db'); print(con.execute(\"select symbol, interval, count(*), min(open_time), max(open_time) from binance_candles group by 1,2 order by 1,2\").fetchall())"

Start the API and inspect data from the UI:

uv run uvicorn main:app --host 127.0.0.1 --port 8000
npm run dev

Then open:

http://127.0.0.1:5173/

Notes

  • The downloader currently uses the local database path file.db in examples/run_download.py.
  • Daily and monthly modes both load data into the same binance_candles table.
  • Safe ZIP extraction rejects absolute paths and .. traversal paths before extracting.