This program downloads tick data from public.bitmex.com and stores it in a local SQLite database.
Desired tick-resampling timeframes, given as pandas frequency strings such as "1D", "1H", "1T".
Desired location of the DB file.
Start date for crawling. If the DB already exists, the start date is detected automatically; otherwise it defaults to 20141122. Changing it is not recommended.
Set to True to reset (erase) the DB and build a new one.
Location of your Selenium Chrome driver.
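To illustrate what a resampling timeframe means, here is a minimal sketch that turns ticks into OHLCV candles with pandas. It is not necessarily how this program builds its tables: the function name `ticks_to_candles` is hypothetical, the input is assumed to have a DatetimeIndex plus the `price` and `size` columns, and the `lowFirst` column is not computed.

```python
import pandas as pd


def ticks_to_candles(ticks, timeframe):
    """Resample ticks (DatetimeIndex, 'price', 'size') into OHLCV candles."""
    # ohlc() yields the columns open, high, low, close for each time bucket.
    candles = ticks["price"].resample(timeframe).ohlc()
    # Traded volume is the sum of tick sizes in the same bucket.
    candles["volume"] = ticks["size"].resample(timeframe).sum()
    # Buckets without any tick have NaN prices; drop them.
    return candles.dropna(subset=["open"])
```

The sketch accepts any pandas frequency string. "1T" means one minute; newer pandas releases prefer the spelling "1min" for the same frequency.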
python main.py [options]
-> The public.bitmex.com page shows up.
-> If you see the list of csv.gz files on the web page, click "yes".
-> The download starts.
- Table names are formatted as SYMBOL_TIMEFRAME.
- Available TIMEFRAME values are TICK plus the ones you specified with the --timeframes option.
- Available SYMBOL values are listed in the TICKERS table.
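To check which tables actually exist in a DB file, you can query SQLite's built-in `sqlite_master` catalog. The sketch below is only an illustration: `list_tables` and `split_table_name` are hypothetical helpers, not part of this program, and splitting at the last underscore assumes that TIMEFRAME itself never contains one.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path."""
    db = sqlite3.connect(db_path)
    try:
        rows = db.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        db.close()
    return [name for (name,) in rows]


def split_table_name(table):
    """Split 'SYMBOL_TIMEFRAME' at the last underscore; None if there is none."""
    symbol, sep, timeframe = table.rpartition("_")
    return (symbol, timeframe) if sep else None
```

Tables without an underscore, such as TICKERS, return None from `split_table_name`, so they are easy to filter out.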
import sqlite3
import pandas as pd

def load_bitmex_data(db_path, timeframe, symbol):
    # Tables are named SYMBOL_TIMEFRAME, e.g. XBTUSD_1T
    if timeframe == "TICK":
        columns = ["timestamp", "symbol", "side", "size", "price", "tickDirection", "trdMatchID", "grossValue", "homeNotional", "foreignNotional"]
    else:
        columns = ["timestamp", "open", "high", "low", "close", "volume", "lowFirst"]
    db = sqlite3.connect(db_path)
    try:
        rows = db.execute(f"SELECT * FROM {symbol}_{timeframe}").fetchall()
    finally:
        db.close()  # always release the DB file, even if the query fails
    df = pd.DataFrame(rows, columns=columns)
    df.index = pd.to_datetime(df["timestamp"])
    return df

# Loads the XBTUSD_1T table, which holds 1-minute candlesticks of XBTUSD
df = load_bitmex_data("/home/ych/Storage/bitmex.db", "1T", "XBTUSD")
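If you need a timeframe that was not specified when the DB was built, the loaded candlesticks can be combined into coarser ones. This is a minimal sketch, assuming the DataFrame has a DatetimeIndex and the `open`, `high`, `low`, `close`, `volume` columns shown above; `aggregate_candles` is a hypothetical helper, and the `lowFirst` column is left out because it cannot be carried over by a simple aggregation.

```python
import pandas as pd


def aggregate_candles(df, rule):
    """Combine fine-grained candles into coarser ones, e.g. rule="5min"."""
    agg = df.resample(rule).agg(
        {
            "open": "first",   # first open in the bucket
            "high": "max",     # highest high
            "low": "min",      # lowest low
            "close": "last",   # last close
            "volume": "sum",   # total volume
        }
    )
    # Buckets with no source candles have NaN prices; drop them.
    return agg.dropna(subset=["open"])
```

For example, `aggregate_candles(df, "5min")` would turn the 1-minute candlesticks into 5-minute ones.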
- Initializing the DB can take a long time (roughly 10 to 15 hours).
- Due to crawling restrictions, the download gets slower and slower. This is normal, but it can go faster if you pause (shut down) and resume (rerun) the script within 30 minutes to 1 hour.
- Using the function load_bitmex_data, you can easily query the desired dataset.
- Please enjoy.