Schema
Introduction
This document provides a comprehensive explanation of the data schema for the trade and order book data we provide. It is intended to help you understand the structure and content of the CSV files generated from exchange WebSocket data.
General Format
Our data is stored in Tab-Separated Values (TSV) format. Although this document refers to the files as CSV files, each line represents a single data record, and fields are separated by a tab character (\t), not a comma.
Trade Data Schema
Trade data represents individual trades executed on exchanges. Each line in the trade CSV file corresponds to a single trade event.
CSV Fields
The trade data CSV consists of the following fields:
timestamp
side
price
quantity_base
quantity_quote
quantity_contract
trade_id
json
Below is a detailed explanation of each field.
Field Descriptions
1. timestamp
Description: The time when the trade occurred.
Data Type: Integer
Format: Milliseconds since the Unix epoch (UTC).
Example:
1677628800381
2. side
Description: The side of the trade from the taker's perspective.
Data Type: String
Possible Values:
"buy" or "sell"
Example:
buy
3. price
Description: The price at which the trade was executed.
Data Type: Float
Example:
23131.6
4. quantity_base
Description: The amount of the base currency traded.
Data Type: Float
Example:
0.021615452
5. quantity_quote
Description: The amount of the quote currency traded.
Data Type: Float
Example:
500
6. quantity_contract
Description: The number of contracts traded (applicable for futures or options). This field is optional and may be empty if not applicable.
Data Type: Float (nullable)
Example:
5 or "" (empty string)
7. trade_id
Description: A unique identifier for the trade provided by the exchange.
Data Type: String
Example:
243932386
8. json
Description: The original JSON message received from the exchange, providing additional details about the trade.
Data Type: JSON-formatted String
Example:
Sample Record
Order Book Data Schema
Order book data provides snapshots or updates of the order book, showing current bids and asks at various price levels.
CSV Fields
The order book data CSV consists of the following fields:
timestamp
snapshot
asks
bids
seq_id
prev_seq_id
Below is a detailed explanation of each field.
Field Descriptions
1. timestamp
Description: The time when the order book snapshot or update was captured.
Data Type: Integer
Format: Milliseconds since the Unix epoch (UTC).
Example:
1677628800944
2. snapshot
Description: Indicates whether the message is a full snapshot of the order book.
Data Type: Boolean
Possible Values:
true or false
Example:
true
3. asks
Description: A JSON array of ask orders (sell orders).
Data Type: JSON-formatted String
Structure: An array of arrays. Each inner array represents an ask order with the following elements:
Price
Quantity
Quantity Quote
Example:
4. bids
Description: A JSON array of bid orders (buy orders).
Data Type: JSON-formatted String
Structure: An array of arrays. Each inner array represents a bid order with the following elements:
Price
Quantity
Quantity Quote
Example:
5. seq_id
Description: The sequence ID assigned by the exchange to this order book message. This field is optional and may be empty if not provided by the exchange.
Data Type: Integer (nullable)
Example:
33933104413 or "" (empty string)
6. prev_seq_id
Description: The previous sequence ID, indicating the sequence of order book updates. This field is optional and may be empty if not provided by the exchange.
Data Type: Integer (nullable)
Example:
"" (empty string)
Sample Record
Details of asks and bids Fields
Each entry in the asks and bids arrays is an array containing:
Price
Description: The price level.
Data Type: Float
Example:
23142.06
Quantity
Description: The amount of the base currency available at this price level.
Data Type: Float
Example:
0.00334
Quantity Quote
Description: The total value in the quote currency (Price × Quantity).
Data Type: Float
Example:
77.2944804
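The relationship Quantity Quote = Price × Quantity can be checked with a small helper. This is a minimal sketch: the function name level_is_consistent and the tolerance are illustrative choices, and it assumes each level is a three-element array whose values are numbers or numeric strings.

```python
import math


def level_is_consistent(level, rel_tol=1e-6):
    """Check that quantity_quote matches price * quantity for one level.

    `level` is assumed to be [price, quantity, quantity_quote].
    Values are passed through float() so numeric strings also work.
    """
    price, quantity, quantity_quote = (float(value) for value in level)
    # Compare with a relative tolerance because these are floating-point values.
    return math.isclose(price * quantity, quantity_quote, rel_tol=rel_tol)


# Values taken from the field examples above.
print(level_is_consistent([23142.06, 0.00334, 77.2944804]))  # True
```

A tolerance-based comparison is used instead of == because the product of two floats rarely matches a stored decimal value exactly.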
Notes on Data Fields
Nullable Fields: Fields like quantity_contract, seq_id, and prev_seq_id may be empty strings if the data is not applicable or not provided by the exchange.
Timestamp: All timestamps are in milliseconds since the Unix epoch in UTC. You may need to convert them to your local timezone or desired format.
JSON Field: The json field in the trade data contains the original message from the exchange, which may include additional details not parsed into the CSV fields.
Data Access and Usage
Decompression: The data files are compressed using xz compression. You can use tools like xzcat to decompress and read the files.
Sample Command:
Parsing JSON Fields: When parsing the asks, bids, and json fields, ensure your parser correctly handles JSON strings within a TSV format.
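As an alternative to command-line tools, the files can be read directly in Python with the standard-library lzma module. The sketch below is illustrative only: the function name iter_records is hypothetical, and it assumes the files are UTF-8 encoded and that the JSON fields contain no raw tab characters (inside a valid JSON string a tab is always escaped as \t, so this holds whenever the JSON is written compactly).

```python
import lzma


def iter_records(path):
    """Yield each record of an xz-compressed TSV file as a list of strings."""
    # "rt" opens the compressed file in text mode, decompressing on the fly.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            # Strip only the line terminator; a plain strip() would also
            # remove trailing tabs and lose empty fields at the end of a line.
            line = line.rstrip("\r\n")
            if line:
                yield line.split("\t")
```

Whether the files start with a header row is not stated here, so inspect the first record before converting fields to numbers.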
Example Parsing Logic
For Trade Data
Read each line and split it by the tab character (\t).
Assign fields based on their position.
Parse numeric fields into appropriate data types.
Handle nullable fields like quantity_contract by checking for empty strings.
Parse the JSON field if you need additional data not included in the main fields.
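The steps above can be sketched in Python as follows. The Trade class and parse_trade_line function are hypothetical names, the field order is taken from the CSV Fields list for trade data, and the JSON payload in the usage line is a placeholder rather than a real exchange message.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trade:
    timestamp: int                      # milliseconds since the Unix epoch (UTC)
    side: str                           # "buy" or "sell" (taker's perspective)
    price: float
    quantity_base: float
    quantity_quote: float
    quantity_contract: Optional[float]  # None when the field is empty
    trade_id: str
    raw: object                         # decoded original exchange message


def parse_trade_line(line: str) -> Trade:
    """Parse one tab-separated trade record into a Trade object."""
    # Split at most 7 times so the trailing JSON field stays in one piece.
    fields = line.rstrip("\r\n").split("\t", 7)
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    ts, side, price, q_base, q_quote, q_contract, trade_id, raw = fields
    if side not in ("buy", "sell"):
        raise ValueError(f"unexpected side: {side!r}")
    return Trade(
        timestamp=int(ts),
        side=side,
        price=float(price),
        quantity_base=float(q_base),
        quantity_quote=float(q_quote),
        # Nullable field: an empty string means "not applicable".
        quantity_contract=float(q_contract) if q_contract != "" else None,
        trade_id=trade_id,
        raw=json.loads(raw),
    )


# Field values from the examples above; the JSON payload is a placeholder.
sample = '1677628800381\tbuy\t23131.6\t0.021615452\t500\t\t243932386\t{"placeholder": true}'
print(parse_trade_line(sample))
```

trade_id is kept as a string, as the schema specifies, because exchange identifiers are not guaranteed to be numeric.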
For Order Book Data
Read each line and split it by the tab character (\t).
Assign fields based on their position.
Parse the asks and bids fields as JSON arrays.
Iterate over the arrays to access individual price levels.
Handle sequence IDs appropriately, especially if you're reconstructing the order book over time.
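A corresponding sketch for order book records is shown below. BookMessage and parse_book_line are hypothetical names. The code assumes the snapshot field is written as the literal text true or false and that the asks and bids JSON contains no raw tab characters. The bid level in the usage line is made up for illustration.

```python
import json
from typing import NamedTuple, Optional


class BookMessage(NamedTuple):
    timestamp: int               # milliseconds since the Unix epoch (UTC)
    snapshot: bool               # True for a full snapshot, False for an update
    asks: list                   # [[price, quantity, quantity_quote], ...]
    bids: list
    seq_id: Optional[int]        # None when the exchange provides no sequence ID
    prev_seq_id: Optional[int]


def _optional_int(text: str) -> Optional[int]:
    """Convert a nullable integer field; an empty string becomes None."""
    return int(text) if text != "" else None


def parse_book_line(line: str) -> BookMessage:
    """Parse one tab-separated order book record into a BookMessage."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    ts, snapshot, asks, bids, seq_id, prev_seq_id = fields
    if snapshot not in ("true", "false"):
        raise ValueError(f"unexpected snapshot flag: {snapshot!r}")
    return BookMessage(
        timestamp=int(ts),
        snapshot=(snapshot == "true"),
        asks=json.loads(asks),
        bids=json.loads(bids),
        seq_id=_optional_int(seq_id),
        prev_seq_id=_optional_int(prev_seq_id),
    )


# The ask level uses the example values above; the bid level is illustrative.
sample = ("1677628800944\ttrue\t[[23142.06, 0.00334, 77.2944804]]"
          "\t[[23142.05, 0.001, 23.14205]]\t33933104413\t")
message = parse_book_line(sample)
for price, quantity, quantity_quote in message.asks:
    print(price, quantity, quantity_quote)
```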
Additional Information
Handling Timezones
Since all timestamps are in UTC, you may convert them to your local timezone using your preferred programming language or tools.
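For example, a millisecond timestamp can be converted with Python's standard datetime module. This is a minimal sketch, and ms_to_datetime is a hypothetical helper name. Adding a timedelta to the epoch keeps the arithmetic in integers, which avoids floating-point rounding of the millisecond part.

```python
from datetime import datetime, timedelta, timezone, tzinfo

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_datetime(ms_epoch: int, tz: tzinfo = timezone.utc) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware datetime in `tz`."""
    return (EPOCH + timedelta(milliseconds=ms_epoch)).astimezone(tz)


# Timestamp taken from the trade data example.
print(ms_to_datetime(1677628800381).isoformat())
# 2023-03-01T00:00:00.381000+00:00

# A fixed UTC+9 offset, used here purely as an illustration.
print(ms_to_datetime(1677628800381, timezone(timedelta(hours=9))).isoformat())
# 2023-03-01T09:00:00.381000+09:00
```

For a named zone with daylight-saving rules, a zoneinfo.ZoneInfo object (Python 3.9+) can be passed as tz instead of a fixed offset.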
Data Consistency
Sequence IDs: Use seq_id and prev_seq_id to ensure data consistency and to detect any missing updates when processing order book data.
Data Integrity: Always validate the data types and handle exceptions in your parsing logic to maintain data integrity.
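One possible gap check is sketched below. It rests on an assumption that this document does not confirm: that each update's prev_seq_id equals the seq_id of the message immediately before it, and that a snapshot restarts the chain. Exchanges differ in how they number messages, so verify this against the venue you are processing. The function name find_gaps is hypothetical.

```python
def find_gaps(messages):
    """Yield (index, expected_prev_seq_id, actual_prev_seq_id) for each break.

    `messages` is an iterable of (snapshot, seq_id, prev_seq_id) tuples in
    file order, with None standing in for empty sequence fields.
    ASSUMPTION: an update's prev_seq_id should equal the preceding seq_id.
    """
    last_seq = None
    for index, (snapshot, seq_id, prev_seq_id) in enumerate(messages):
        can_check = (
            not snapshot                  # snapshots restart the chain
            and last_seq is not None      # previous message had a seq_id
            and prev_seq_id is not None   # this message links to a predecessor
        )
        if can_check and prev_seq_id != last_seq:
            yield index, last_seq, prev_seq_id
        last_seq = seq_id


# Illustrative sequence: the third message points at 12, but 11 came before it.
stream = [(True, 10, None), (False, 11, 10), (False, 13, 12), (False, 14, 13)]
print(list(find_gaps(stream)))  # [(2, 11, 12)]
```

When a gap is reported, a common recovery approach is to discard the local book and resume from the next snapshot.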
Conclusion
This guideline provides detailed information about the data schema used in our trade and order book data. Understanding this schema will help you effectively parse, analyze, and utilize the data for your trading strategies, market analysis, or any other applications.
Note: We may update this schema in the future to include additional fields or changes. We will notify you in advance of any significant changes. Please ensure your data processing systems can handle potential updates gracefully.