# Schema

## Introduction

This document provides a comprehensive explanation of the data schema for the trade and order book data we provide. It is intended to help you understand the structure and content of the CSV files generated from exchange WebSocket data.

***

## General Format

Our data is stored in **Tab-Separated Values (TSV)** format. Each line in a file represents a single data record, and fields are separated by a tab character (`\t`). The files are nevertheless referred to as CSV files throughout this document.

***

## Trade Data Schema

Trade data represents individual trades executed on exchanges. Each line in the trade CSV file corresponds to a single trade event.

### CSV Fields

The trade data CSV consists of the following fields:

1. **timestamp**
2. **side**
3. **price**
4. **quantity\_base**
5. **quantity\_quote**
6. **quantity\_contract**
7. **trade\_id**
8. **json**

Below is a detailed explanation of each field.

### Field Descriptions

**1. timestamp**

* **Description**: The time when the trade occurred.
* **Data Type**: Integer
* **Format**: Milliseconds since the Unix epoch (UTC).
* **Example**: `1677628800381`

**2. side**

* **Description**: The side of the trade from the taker's perspective.
* **Data Type**: String
* **Possible Values**: `"buy"` or `"sell"`
* **Example**: `buy`

**3. price**

* **Description**: The price at which the trade was executed.
* **Data Type**: Float
* **Example**: `23131.6`

**4. quantity\_base**

* **Description**: The amount of the base currency traded.
* **Data Type**: Float
* **Example**: `0.021615452`

**5. quantity\_quote**

* **Description**: The amount of the quote currency traded.
* **Data Type**: Float
* **Example**: `500`

**6. quantity\_contract**

* **Description**: The number of contracts traded (applicable for futures or options). This field is optional and may be empty if not applicable.
* **Data Type**: Float (nullable)
* **Example**: `5`, or an empty field (nothing between the surrounding tab characters)

**7. trade\_id**

* **Description**: A unique identifier for the trade provided by the exchange.
* **Data Type**: String
* **Example**: `243932386`

**8. json**

* **Description**: The original JSON message received from the exchange, providing additional details about the trade.
* **Data Type**: JSON-formatted String
* **Example**:

  ```json
  {"stream":"btcusd_perp@aggTrade","data":{"e":"aggTrade","E":1677628800381,"a":243932386,"s":"BTCUSD_PERP","p":"23131.6","q":"5","f":593068770,"l":593068770,"T":1677628800275,"m":false}}
  ```
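
As a minimal sketch, the raw message can be loaded with Python's standard `json` module. The key names used here (`stream`, `data`, `p`, `q`, `a`) are taken from the example above; they are defined by the exchange and will differ for other exchanges and streams, so treat them as an assumption rather than part of this schema.

```python
import json

# The raw message exactly as it appears in the example above.
raw = '{"stream":"btcusd_perp@aggTrade","data":{"e":"aggTrade","E":1677628800381,"a":243932386,"s":"BTCUSD_PERP","p":"23131.6","q":"5","f":593068770,"l":593068770,"T":1677628800275,"m":false}}'

message = json.loads(raw)
data = message["data"]

# In this particular message the exchange sends price and quantity as strings,
# so they need an explicit conversion before doing arithmetic.
price = float(data["p"])
quantity = float(data["q"])

print(message["stream"], price, quantity)
```

Note that numeric values inside the raw message may arrive as strings, as they do here, unlike the already-parsed numeric columns of the record.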

### Sample Record

```tsv
1677628800381	buy	23131.6	0.021615452	500	5	243932386	{"stream":"btcusd_perp@aggTrade","data":{"e":"aggTrade","E":1677628800381,"a":243932386,"s":"BTCUSD_PERP","p":"23131.6","q":"5","f":593068770,"l":593068770,"T":1677628800275,"m":false}}
```

***

## Order Book Data Schema

Order book data provides snapshots or updates of the order book, showing current bids and asks at various price levels.

### CSV Fields

The order book data CSV consists of the following fields:

1. **timestamp**
2. **snapshot**
3. **asks**
4. **bids**
5. **seq\_id**
6. **prev\_seq\_id**

Below is a detailed explanation of each field.

### Field Descriptions

**1. timestamp**

* **Description**: The time when the order book snapshot or update was captured.
* **Data Type**: Integer
* **Format**: Milliseconds since the Unix epoch (UTC).
* **Example**: `1677628800944`

**2. snapshot**

* **Description**: Indicates whether the message is a full snapshot of the order book.
* **Data Type**: Boolean
* **Possible Values**: `true` or `false`
* **Example**: `true`

**3. asks**

* **Description**: A JSON array of ask orders (sell orders).
* **Data Type**: JSON-formatted String
* **Structure**: An array of arrays. Each inner array represents an ask order with the following elements, in this order:
  * **Price**
  * **Quantity**
  * **Quantity Quote**
* **Example**:

  ```json
  [[23142.06, 0.00334, 77.2944804], [23142.08, 0.004, 92.56832], ...]
  ```

**4. bids**

* **Description**: A JSON array of bid orders (buy orders).
* **Data Type**: JSON-formatted String
* **Structure**: An array of arrays. Each inner array represents a bid order with the following elements, in this order:
  * **Price**
  * **Quantity**
  * **Quantity Quote**
* **Example**:

  ```json
  [[23141.1, 0.00047, 10.876317], [23141.09, 0.11935, 2761.8890915], ...]
  ```

**5. seq\_id**

* **Description**: The sequence ID assigned by the exchange to this order book message. This field is optional and may be empty if not provided by the exchange.
* **Data Type**: Integer (nullable)
* **Example**: `33933104413`, or an empty field (nothing between the surrounding tab characters)

**6. prev\_seq\_id**

* **Description**: The sequence ID of the preceding order book message, which lets you chain updates together in order. This field is optional and may be empty if not provided by the exchange.
* **Data Type**: Integer (nullable)
* **Example**: an empty field (as in the sample record below, where the line ends with a tab character)

### Sample Record

```tsv
1677628800944	true	[[23142.06,0.00334,77.2944804],[23142.08,0.004,92.56832],[23142.31,0.00349,80.7666619],[23142.86,0.01215,281.185749],[23143.23,0.33673,7793.0198379],[23143.24,0.00188,43.5092912],[23143.38,0.03488,807.2410944],[23143.39,0.18239,4221.1229021],[23143.4,0.06482,1500.155188],[23143.43,0.56966,13183.8863338],[23143.44,0.00689,159.4583016],[23143.51,0.06195,1433.7404445],[23143.52,0.13759,3184.3169168],[23143.6,0.00637,147.424732],[23143.62,0.001,23.14362],[23143.64,0.06083,1407.8276212],[23143.68,0.00452,104.6094336],[23143.81,0.09423,2180.8412163],[23143.85,0.002,46.2877],[23144.0,0.001,23.144]]	[[23141.1,0.00047,10.876317],[23141.09,0.11935,2761.8890915],[23141.08,0.00447,103.4406276],[23140.57,0.00197,45.5869229],[23140.5,0.00195,45.123975],[23140.29,0.004,92.56116],[23140.26,0.00189,43.7350914],[23140.03,0.04408,1020.0125224],[23139.98,0.43582,10084.8660836],[23139.97,0.31229,7226.3812313],[23139.96,0.0082,189.747672],[23139.79,0.0085,196.688215],[23139.78,0.08642,1999.7397876],[23139.68,0.0945,2186.69976],[23139.67,0.06364,1472.6085988],[23139.66,0.0007,16.197762],[23139.64,0.10316,2387.0852624],[23139.62,0.0978,2263.054836],[23139.61,0.12874,2978.9933914],[23139.6,0.03,694.188]]	33933104413	
```

### Details of **asks** and **bids** Fields

Each entry in the **asks** and **bids** arrays is an array containing:

1. **Price**
   * **Description**: The price level.
   * **Data Type**: Float
   * **Example**: `23142.06`
2. **Quantity**
   * **Description**: The amount of the base currency available at this price level.
   * **Data Type**: Float
   * **Example**: `0.00334`
3. **Quantity Quote**
   * **Description**: The total value in the quote currency (Price × Quantity).
   * **Data Type**: Float
   * **Example**: `77.2944804`
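
The relationship between the three elements can be checked with a short sketch like the one below. The function name `find_quote_mismatches` and the tolerance value are illustrative assumptions, not part of the schema; a tolerance is used because the values are floats and the product may differ from the stored value in the last few digits.

```python
import json
import math
from typing import List

def find_quote_mismatches(levels_json: str, rel_tol: float = 1e-9) -> List[List[float]]:
    """Return the levels whose third element is not Price x Quantity.

    `levels_json` is the raw text of an asks or bids field.
    `rel_tol` is an assumed tolerance for floating-point rounding.
    """
    mismatches = []
    for price, quantity, quantity_quote in json.loads(levels_json):
        if not math.isclose(price * quantity, quantity_quote, rel_tol=rel_tol):
            mismatches.append([price, quantity, quantity_quote])
    return mismatches

# The first two ask levels from the sample record.
asks = "[[23142.06,0.00334,77.2944804],[23142.08,0.004,92.56832]]"
print(find_quote_mismatches(asks))
```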

***

## Notes on Data Fields

* **Nullable Fields**: Fields like **quantity\_contract**, **seq\_id**, and **prev\_seq\_id** are left empty (nothing between the surrounding tab characters) if the data is not applicable or not provided by the exchange.
* **Timestamp**: All timestamps are in milliseconds since the Unix epoch in UTC. You may need to convert them to your local timezone or desired format.
* **JSON Field**: The **json** field in the trade data contains the original message from the exchange, which may include additional details not parsed into the CSV fields.

***

## Data Access and Usage

* **Decompression**: The data files are compressed with `xz`. You can use tools like `xzcat` to decompress and read them.
* **Sample Command**:

  ```bash
  xzcat /path/to/datafile.csv.xz | head
  ```
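* **Parsing JSON Fields**: When parsing the **asks**, **bids**, and **json** fields, split each line on tab characters first, then parse the individual field as JSON. The JSON is stored unquoted and contains double quotes, so if you use a CSV library, consider disabling its quote handling.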
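
If you would rather read the files directly from code, the following is a minimal sketch using Python's standard `lzma` module. The function name `iter_records` is hypothetical, and UTF-8 encoding is an assumption about the files rather than something this schema guarantees.

```python
import lzma
from typing import Iterator, List

def iter_records(path: str) -> Iterator[List[str]]:
    """Yield each record of an xz-compressed TSV file as a list of raw string fields."""
    # Assumption: the decompressed files are UTF-8 encoded text.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            # Strip only the line terminator: a trailing tab marks an empty
            # last field, so a plain strip() would silently drop that field.
            yield line.rstrip("\r\n").split("\t")
```

A call such as `for fields in iter_records("/path/to/datafile.csv.xz"): ...` streams the file line by line, so the whole file never has to fit in memory.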

***

## Example Parsing Logic

#### For Trade Data

1. **Read each line** and split it by the tab character (`\t`).
2. **Assign fields** based on their position.
3. **Parse numeric fields** into appropriate data types.
4. **Handle nullable fields** like **quantity\_contract** by checking for empty strings.
5. **Parse the JSON field** if you need additional data not included in the main fields.
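
The steps above can be sketched in Python as follows. The `Trade` class and `parse_trade_line` function are illustrative names, not an official client; the sketch assumes the eight fields appear in the order listed in the Trade Data Schema section.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional

@dataclass
class Trade:
    timestamp: int                      # milliseconds since the Unix epoch (UTC)
    side: str                           # "buy" or "sell"
    price: float
    quantity_base: float
    quantity_quote: float
    quantity_contract: Optional[float]  # None when the field is empty
    trade_id: str
    raw: Dict[str, Any]                 # original exchange message

def parse_trade_line(line: str) -> Trade:
    # Steps 1-2: split on tabs and assign by position. maxsplit=7 keeps the
    # JSON field in one piece even if it were to contain a tab.
    fields = line.rstrip("\r\n").split("\t", 7)
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    ts, side, price, qty_base, qty_quote, qty_contract, trade_id, raw = fields
    if side not in ("buy", "sell"):
        raise ValueError(f"unexpected side: {side!r}")
    return Trade(
        timestamp=int(ts),                        # Step 3: numeric conversion
        side=side,
        price=float(price),
        quantity_base=float(qty_base),
        quantity_quote=float(qty_quote),
        # Step 4: an empty string means "not applicable".
        quantity_contract=float(qty_contract) if qty_contract else None,
        trade_id=trade_id,                        # kept as a string, per the schema
        raw=json.loads(raw),                      # Step 5: parse the JSON field
    )

sample = '1677628800381\tbuy\t23131.6\t0.021615452\t500\t5\t243932386\t{"stream":"btcusd_perp@aggTrade","data":{"e":"aggTrade","E":1677628800381,"a":243932386,"s":"BTCUSD_PERP","p":"23131.6","q":"5","f":593068770,"l":593068770,"T":1677628800275,"m":false}}'
trade = parse_trade_line(sample)
print(trade.side, trade.price, trade.quantity_contract)
```

Floats are used here because the schema documents these fields as Float; if exact decimal arithmetic matters for your use case, `decimal.Decimal` is a possible substitute.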

#### For Order Book Data

1. **Read each line** and split it by the tab character (`\t`).
2. **Assign fields** based on their position.
3. **Parse the asks and bids fields** as JSON arrays.
4. **Iterate over the arrays** to access individual price levels.
5. **Handle sequence IDs** appropriately, especially if you're reconstructing the order book over time.
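
A comparable sketch for order book records is shown below. `OrderBookMessage` and `parse_order_book_line` are illustrative names, and the sketch assumes the **snapshot** field is written as the lowercase literals `true` and `false` shown in this document.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class OrderBookMessage:
    timestamp: int               # milliseconds since the Unix epoch (UTC)
    snapshot: bool
    asks: List[List[float]]      # [price, quantity, quantity_quote] per level
    bids: List[List[float]]
    seq_id: Optional[int]        # None when the field is empty
    prev_seq_id: Optional[int]   # None when the field is empty

def _optional_int(text: str) -> Optional[int]:
    return int(text) if text else None

def parse_order_book_line(line: str) -> OrderBookMessage:
    # Strip only the newline: the record may end with a tab followed by an
    # empty prev_seq_id, which a plain strip() would remove.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    ts, snapshot, asks, bids, seq_id, prev_seq_id = fields
    if snapshot not in ("true", "false"):
        raise ValueError(f"unexpected snapshot value: {snapshot!r}")
    return OrderBookMessage(
        timestamp=int(ts),
        snapshot=(snapshot == "true"),
        asks=json.loads(asks),
        bids=json.loads(bids),
        seq_id=_optional_int(seq_id),
        prev_seq_id=_optional_int(prev_seq_id),
    )

# A shortened version of the sample record (one level per side).
sample = "1677628800944\ttrue\t[[23142.06,0.00334,77.2944804]]\t[[23141.1,0.00047,10.876317]]\t33933104413\t"
book = parse_order_book_line(sample)
for price, quantity, quantity_quote in book.asks:
    print(price, quantity, quantity_quote)
```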

***

## Additional Information

#### Handling Timezones

* Since all timestamps are in UTC, you may convert them to your local timezone using your preferred programming language or tools.
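
For example, a millisecond timestamp can be converted with Python's standard `datetime` module, as in the sketch below. The helper name `ms_to_utc` and the UTC+09:00 target offset are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def ms_to_utc(timestamp_ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    # Integer timedelta arithmetic avoids float rounding of the millisecond part.
    return EPOCH + timedelta(milliseconds=timestamp_ms)

utc_time = ms_to_utc(1677628800381)

# Hypothetical target zone for illustration: a fixed UTC+09:00 offset.
local_time = utc_time.astimezone(timezone(timedelta(hours=9)))

print(utc_time.isoformat(timespec="milliseconds"))    # 2023-03-01T00:00:00.381+00:00
print(local_time.isoformat(timespec="milliseconds"))  # 2023-03-01T09:00:00.381+09:00
```

For named time zones with daylight saving rules, the standard `zoneinfo` module (Python 3.9+) can supply the target zone instead of a fixed offset.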

#### Data Consistency

* **Sequence IDs**: Use **seq\_id** and **prev\_seq\_id** to ensure data consistency and to detect any missing updates when processing order book data.
* **Data Integrity**: Always validate the data types and handle exceptions in your parsing logic to maintain data integrity.
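
One possible way to detect missing updates is sketched below. It rests on an assumption that this document does not confirm for every exchange: that each update's **prev\_seq\_id** equals the **seq\_id** of the message immediately before it. It also assumes that a snapshot restarts the chain. The function name `find_sequence_gaps` is hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# (snapshot, seq_id, prev_seq_id), with None for empty fields.
Message = Tuple[bool, Optional[int], Optional[int]]

def find_sequence_gaps(messages: Iterable[Message]) -> List[int]:
    """Return the positions of updates that do not follow on from the previous message.

    Assumption: an update is contiguous when its prev_seq_id equals the
    seq_id of the preceding message. Messages without the needed IDs are
    skipped, because no check is possible for them.
    """
    gaps = []
    last_seq_id: Optional[int] = None
    for index, (snapshot, seq_id, prev_seq_id) in enumerate(messages):
        can_check = last_seq_id is not None and prev_seq_id is not None
        # Assumption: a snapshot starts a fresh chain, so it is never a gap.
        if not snapshot and can_check and prev_seq_id != last_seq_id:
            gaps.append(index)
        last_seq_id = seq_id
    return gaps

stream = [
    (True, 10, None),   # snapshot
    (False, 11, 10),    # follows the snapshot
    (False, 13, 12),    # message 12 was never seen, so this is a gap
    (False, 14, 13),
]
print(find_sequence_gaps(stream))  # [2]
```

When a gap is found, a common recovery approach is to discard the local book and wait for the next record with **snapshot** set to `true`.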

## Conclusion

This guide describes the data schema used in our trade and order book data. Understanding this schema will help you parse, analyze, and use the data for your trading strategies, market analysis, or other applications.

***

**Note**: We may update this schema in the future to add fields or make other changes. We will notify you in advance of any significant changes. Please ensure your data processing systems can handle such updates gracefully.

