Market Making
Algorithms
Jamie Flux
Contents
1  Stochastic Inventory Balancing Algorithm
2  Adaptive Spread Quoting with Deep Q-Learning
3  Market Microstructure Factor Decomposition
4  Volatility-Aware Quote Skewing
5  Reinforcement Momentum Forecasting
6  Neural Bayesian Market Maker
7  Liquidity Pulse Detector
8  Sentiment-Fused Quoting Model
9  Regime-Switching Inventory Rebalancer
10 Multi-Agent Market Maker Orchestration
11 Explanatory Factor Market Maker
12 Term Structure Arbitrage Quoter
13 Hidden Liquidity Capture Mechanism
14 Correlation Cluster Market Maker
15 Adaptive Risk Aversion Tuning
16 Meta-Learning Spread Optimizer
17 RL-Based Inventory Hedging
18 Quantum Annealing Price Discovery
19 Peer Momentum Siphoning
20 Critic-Based Latent Feature Market Making
21 Graph Neural Network Price Flow
22 Volatility-Triggered Reinforcement Bandits
23 Sparse Regression Quoting
24 Liquidity Mining Arbitrage
25 Nonlinear Kalman Filter Price Stabilization
26 Recurrent Profit-Loss Anchoring
27 Long-Short Market Microstructure Transitions
28 Neural PDE Market Maker
29 Advanced Feature Fusion Quoter
30 Self-Supervised High-Frequency Forecaster
31 Regret Minimization Market Making
32 Multivariate Hawkes Inventory Control
33 Cross-Exchange Liquidity Aggregator
Chapter 1
Stochastic Inventory
Balancing Algorithm
This chapter introduces a “Stochastic Inventory Balancing Algo-
rithm,” which focuses on managing a market maker’s inventory risk
by modeling price dynamics and trade executions through stochas-
tic processes. The algorithm uses probabilistic estimates of future
price movements to influence inventory targets, leveraged by a dy-
namic programming (or reinforcement learning) layer to determine
optimal bid-ask spreads. It proceeds by simulating various market
scenarios, training an RL module to optimize a combined objective
of profit and inventory risk, and finally implementing a real-time
rebalancing mechanism that updates orders based on current in-
ventory levels and recent volatility fluctuations.
Fundamentally, this system (1) simulates random fluctuations
in an asset’s price, (2) models how orders might be filled given the
chosen spreads, (3) updates the agent’s inventory accordingly, and
(4) calculates a reward function that balances PnL with an inven-
tory penalty. Through repeated episodes, the algorithm refines the
quoting strategy to optimize expected returns while controlling risk
exposure.
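To make the objective concrete, one plausible per-step reward (using the same inventory-penalty exponent as the code below) is

\[
r_t = \Delta \mathrm{PnL}_t - \lambda \, |q_t|^{1.2},
\]

where \(\Delta \mathrm{PnL}_t\) is the change in mark-to-market profit over the step, \(q_t\) is the current inventory, and \(\lambda\) (0.1 in the example) scales the inventory penalty; the Q-learning layer then learns spread actions that maximize the discounted sum of these rewards.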
Python Code
import numpy as np
import random
import math


class StochasticInventoryEnv:
    """
    A simplified environment modeling price evolution and
    inventory changes for a market maker. The agent chooses
    from a set of possible spreads to quote, then we simulate
    whether orders are filled, how inventory changes, and
    compute the resulting reward.
    """
    def __init__(
        self,
        init_price=100.0,
        init_inventory=0,
        max_inventory=5,
        price_volatility=1.0,
        fill_probability=0.5,
        max_steps=100
    ):
        """
        :param init_price: The initial asset price at start.
        :param init_inventory: Initial inventory (defaults to zero).
        :param max_inventory: Absolute cap on positive/negative inventory.
        :param price_volatility: Standard deviation of price changes per step.
        :param fill_probability: Probability an order will fill given a chosen spread.
        :param max_steps: Maximum steps in an episode.
        """
        self.init_price = init_price
        self.init_inventory = init_inventory
        self.max_inventory = max_inventory
        self.price_volatility = price_volatility
        self.fill_probability = fill_probability
        self.max_steps = max_steps
        self.action_space = [1, 2, 3]  # Discrete set of possible half-spread sizes
        self.reset()

    def reset(self):
        """
        Reset the environment to start a new episode.
        """
        self.price = self.init_price
        self.inventory = self.init_inventory
        self.steps_elapsed = 0
        self.cash = 0.0
        self.prev_pnl = 0.0
        return self._get_state()

    def _get_state(self):
        """
        Return the current environment state as a tuple of
        discretized price and current inventory.
        """
        # In practice, price could be discretized or handled more precisely.
        return (round(self.price), self.inventory)

    def step(self, action):
        """
        Execute one time-step in the environment given an action (spread).
        :param action: The index of the chosen action in self.action_space.
        :return: state, reward, done, info
        """
        spread = self.action_space[action]
        mid_price = self.price
        # Simulate random fill events. For demonstration, if an order is filled,
        # it might be a buy or sell fill depending on random draws.
        # We assume symmetrical probability of buy/sell fill if it happens.
        filled = (random.random() < self.fill_probability)
        # If filled, inventory changes. We assume each fill is 1 unit.
        if filled:
            # 50% chance of filling the ask => you sold 1 unit
            # 50% chance of filling the bid => you bought 1 unit
            if random.random() < 0.5:
                # Sell fill
                fill_price = mid_price + spread
                # Only fill if not exceeding negative capacity
                if abs(self.inventory - 1) <= self.max_inventory:
                    self.cash += fill_price  # Gains from the sale
                    self.inventory -= 1
            else:
                # Buy fill
                fill_price = mid_price - spread
                # Only fill if not exceeding positive capacity
                if abs(self.inventory + 1) <= self.max_inventory:
                    self.cash -= fill_price  # Cost of the purchase
                    self.inventory += 1
        # Simulate random price movement (Gaussian for demonstration)
        self.price += np.random.normal(0, self.price_volatility)
        self.steps_elapsed += 1
        # Reward function: mark-to-market PnL change less an inventory penalty.
        # The inventory penalty grows with the absolute inventory.
        pnl = self.cash + self.inventory * self.price
        inv_penalty = 0.1 * (abs(self.inventory) ** 1.2)
        reward = (pnl - self.prev_pnl) - inv_penalty
        self.prev_pnl = pnl
        # Check stopping condition
        done = (self.steps_elapsed >= self.max_steps)
        return self._get_state(), reward, done, {}
class QLearningAgent:
    """
    A simple Q-learning agent that interacts with StochasticInventoryEnv.
    """
    def __init__(
        self,
        action_size,
        learning_rate=0.1,
        discount_factor=0.95,
        epsilon=1.0,
        epsilon_min=0.01,
        epsilon_decay=0.995
    ):
        """
        :param action_size: Number of discrete actions available in the environment.
        :param learning_rate: Alpha parameter for Q-learning updates.
        :param discount_factor: Gamma parameter for Q-learning.
        :param epsilon: Initial exploration probability.
        :param epsilon_min: Minimum exploration probability.
        :param epsilon_decay: Exponential decay rate for epsilon after each episode.
        """
        self.action_size = action_size
        self.lr = learning_rate
        self.gamma = discount_factor
        self.epsilon = epsilon
        self.epsilon_min = epsilon_min
        self.epsilon_decay = epsilon_decay
        # Q-table dictionary: keys are (price, inv), values are action-value arrays
        self.Q = {}

    def get_Q_values(self, state):
        """
        Return Q-values for the provided state, creating them if missing.
        """
        if state not in self.Q:
            self.Q[state] = np.zeros(self.action_size, dtype=float)
        return self.Q[state]

    def choose_action(self, state):
        """
        Epsilon-greedy action selection.
        """
        if random.random() < self.epsilon:
            return random.randint(0, self.action_size - 1)
        else:
            q_values = self.get_Q_values(state)
            return int(np.argmax(q_values))

    def update_Q(self, state, action, reward, next_state, done):
        """
        Apply the Q-learning update rule:
        Q(s,a) = Q(s,a) + alpha * [r + gamma * max(Q(s',:)) - Q(s,a)]
        """
        q_values = self.get_Q_values(state)
        next_q_values = self.get_Q_values(next_state)
        old_value = q_values[action]
        if done:
            # Terminal state update
            q_values[action] = old_value + self.lr * (reward - old_value)
        else:
            best_next = np.max(next_q_values)
            q_values[action] = old_value + self.lr * (reward + self.gamma * best_next - old_value)

    def decay_epsilon(self):
        """
        Decrease epsilon over time down to epsilon_min.
        """
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay
def train_agent(num_episodes=1000):
    """
    Train the QLearningAgent on the StochasticInventoryEnv.
    :param num_episodes: Number of training episodes.
    :return: Trained agent.
    """
    env = StochasticInventoryEnv()
    agent = QLearningAgent(action_size=len(env.action_space))
    for episode in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            action = agent.choose_action(state)
            next_state, reward, done, _ = env.step(action)
            agent.update_Q(state, action, reward, next_state, done)
            state = next_state
        agent.decay_epsilon()
        if (episode + 1) % 100 == 0:
            print(f"Episode {episode+1}/{num_episodes}, Epsilon: {agent.epsilon:.3f}")
    return agent


if __name__ == "__main__":
    # Example usage:
    trained_agent = train_agent(num_episodes=500)
    env_test = StochasticInventoryEnv()
    state = env_test.reset()
    done = False
    total_reward = 0.0
    while not done:
        action = trained_agent.choose_action(state)
        next_state, reward, done, _ = env_test.step(action)
        total_reward += reward
        state = next_state
    print("Test episode completed with total reward:", total_reward)
• The StochasticInventoryEnv class encapsulates the environ-
ment, simulating random price moves and partial order fills
to reflect real-time inventory changes.
• The QLearningAgent implements a basic Q-learning mecha-
nism to learn how to pick a spread action that balances PnL
with inventory penalties.
• The train_agent function demonstrates how to loop over mul-
tiple episodes, updating the Q-table and decaying the explo-
ration rate.
• Finally, the main block runs a quick training session, then
tests the trained policy on one episode, printing the resulting
cumulative reward.
Chapter 2
Adaptive Spread
Quoting with Deep
Q-Learning
Below is a detailed explanation of how Deep Q-Learning can be
applied to adaptive spread quoting, followed by a complete Python
code snippet demonstrating a toy example of this approach in ac-
tion.
In this approach, we construct a market simulation environ-
ment that tracks key market state variables—such as price veloc-
ity, order flow imbalances, and liquidity signals. An agent learns
to choose among discrete quoting actions (e.g., narrow, medium,
or wide spreads) to maximize a reward function reflecting both fill
probability (profit potential) and the risk of holding undesirable
inventory when volatility is high. The neural network within the
DQN agent estimates Q-values for each action-state pair. Dur-
ing training, experience tuples are gathered (state, action, reward,
next_state, done) to update the Q-network’s parameters, push-
ing the policy toward spread settings that balance risk and reward
under varying market conditions.
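For reference, the update that the DQN agent below approximates is the standard temporal-difference target

\[
y = r + \gamma \max_{a'} Q_{\text{target}}(s', a'),
\]

with the main network trained to minimize the mean-squared error between \(Q(s, a)\) and \(y\); the target network is refreshed only periodically to stabilize learning.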
Python Code
Below is a Python code snippet that encompasses the core computational
elements of the "Adaptive Spread Quoting with Deep Q-Learning" approach,
including the environment definition, neural network agent, replay buffer,
and training loop. This is a self-contained example designed for
demonstration purposes:
import numpy as np
import random
from collections import deque

# If you'd like to install TensorFlow/Keras:
# pip install tensorflow
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers


class MarketEnv:
    """
    A toy market environment to illustrate how Deep Q-Learning might adapt spreads.
    Observations:
        [price_velocity, order_imbalance, liquidity_signal] -> shape (3,)
    Actions (discrete):
        0 -> Narrow spread
        1 -> Medium spread
        2 -> Wide spread
    """
    def __init__(self, max_steps=50):
        self.max_steps = max_steps
        self.step_count = 0
        # Internal state (randomly initialized for illustration)
        # price_velocity: [-5, 5]
        # order_imbalance: [-1, 1]
        # liquidity_signal: [0, 1]
        self.price_velocity = None
        self.order_imbalance = None
        self.liquidity_signal = None

    def reset(self):
        self.step_count = 0
        self.price_velocity = np.random.uniform(-2, 2)
        self.order_imbalance = np.random.uniform(-0.5, 0.5)
        self.liquidity_signal = np.random.uniform(0.1, 0.9)
        return self._get_observation()

    def step(self, action):
        """
        Takes an action (spread choice) and returns next state, reward, done, info.
        Reward logic (simplified):
          - If the market is volatile (|price_velocity| is large) and the agent
            chooses a narrow spread, there's risk of adverse selection -> negative reward.
          - If the agent chooses a wide spread in calm conditions, it might
            miss fills -> negative reward.
          - Otherwise, the agent gets a small positive reward.
        """
        self.step_count += 1
        # Simulate random market changes
        self.price_velocity += np.random.uniform(-0.5, 0.5)
        self.order_imbalance += np.random.uniform(-0.1, 0.1)
        self.liquidity_signal += np.random.uniform(-0.05, 0.05)
        # Clip values to a feasible range
        self.price_velocity = np.clip(self.price_velocity, -5.0, 5.0)
        self.order_imbalance = np.clip(self.order_imbalance, -1.0, 1.0)
        self.liquidity_signal = np.clip(self.liquidity_signal, 0.0, 1.0)
        # Determine reward
        vol_level = abs(self.price_velocity)
        if vol_level > 2.0 and action == 0:
            # High volatility, narrow spread -> risky
            reward = -1.0
        elif vol_level < 1.0 and action == 2:
            # Low volatility, wide spread -> missed opportunities
            reward = -0.5
        else:
            # Medium scenario
            # If order_imbalance is large, choosing a wide spread might avoid inventory risk
            # If liquidity is high, a narrow spread might get better fills
            reward = 0.1 - (0.1 * abs(self.order_imbalance))
            if action == 0 and self.liquidity_signal > 0.5:
                reward += 0.2  # good fill probability in high liquidity
            if action == 2 and abs(self.order_imbalance) > 0.5:
                reward += 0.2  # safer quoting in imbalance
        done = self.step_count >= self.max_steps
        return self._get_observation(), reward, done, {}

    def _get_observation(self):
        return np.array([
            self.price_velocity,
            self.order_imbalance,
            self.liquidity_signal
        ], dtype=np.float32)
class ReplayBuffer:
    """
    A simple FIFO replay buffer for DQN.
    """
    def __init__(self, max_size=2000):
        self.buffer = deque(maxlen=max_size)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = [], [], [], [], []
        for s, a, r, s2, d in batch:
            states.append(s)
            actions.append(a)
            rewards.append(r)
            next_states.append(s2)
            dones.append(d)
        return (np.array(states, dtype=np.float32),
                np.array(actions, dtype=np.int32),
                np.array(rewards, dtype=np.float32),
                np.array(next_states, dtype=np.float32),
                np.array(dones, dtype=np.float32))

    def __len__(self):
        return len(self.buffer)
class DQNAgent:
    """
    Deep Q-Network agent that learns to choose among discrete spreads.
    """
    def __init__(self, state_size, action_size, learning_rate=0.001, gamma=0.99,
                 epsilon=1.0, epsilon_min=0.01, epsilon_decay=0.995, batch_size=32):
        self.state_size = state_size
        self.action_size = action_size
        self.gamma = gamma
        self.epsilon = epsilon
        self.epsilon_min = epsilon_min
        self.epsilon_decay = epsilon_decay
        self.batch_size = batch_size
        # Experience Replay
        self.memory = ReplayBuffer()
        # Main model and target model
        self.model = self._build_model(learning_rate)
        self.target_model = self._build_model(learning_rate)
        self.update_target_model()

    def _build_model(self, learning_rate):
        model = models.Sequential([
            layers.Input(shape=(self.state_size,)),
            layers.Dense(64, activation='relu'),
            layers.Dense(64, activation='relu'),
            layers.Dense(self.action_size, activation='linear')
        ])
        model.compile(loss='mse',
                      optimizer=optimizers.Adam(learning_rate=learning_rate))
        return model

    def update_target_model(self):
        """
        Copies weights from the main model to the target model.
        """
        self.target_model.set_weights(self.model.get_weights())

    def act(self, state):
        """
        Epsilon-greedy action selection.
        """
        if np.random.rand() <= self.epsilon:
            return random.randrange(self.action_size)
        q_values = self.model.predict(state[np.newaxis, :], verbose=0)
        return np.argmax(q_values[0])

    def remember(self, state, action, reward, next_state, done):
        self.memory.add(state, action, reward, next_state, done)

    def replay(self):
        """
        Sample from the replay buffer and train the network.
        """
        if len(self.memory) < self.batch_size:
            return
        (states, actions, rewards, next_states, dones) = self.memory.sample(self.batch_size)
        # Predict Q-values for current states using the main model
        q_values = self.model.predict(states, verbose=0)
        # Predict Q-values for next states using the target model
        q_next = self.target_model.predict(next_states, verbose=0)
        for i in range(self.batch_size):
            target = q_values[i]
            if dones[i]:
                target[actions[i]] = rewards[i]
            else:
                target[actions[i]] = rewards[i] + self.gamma * np.max(q_next[i])
        # Train the model
        self.model.fit(states, q_values, epochs=1, verbose=0)
        # Epsilon decay
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay
def train_dqn(episodes=200):
    """
    Main DQN training loop for the toy MarketEnv.
    """
    env = MarketEnv(max_steps=50)
    state_size = 3   # [price_velocity, order_imbalance, liquidity_signal]
    action_size = 3  # [narrow=0, medium=1, wide=2]
    agent = DQNAgent(state_size, action_size)
    update_target_frequency = 10  # how often to update the target network
    for e in range(episodes):
        state = env.reset()
        total_reward = 0
        done = False
        while not done:
            action = agent.act(state)
            next_state, reward, done, _ = env.step(action)
            agent.remember(state, action, reward, next_state, done)
            agent.replay()
            state = next_state
            total_reward += reward
        # Update target model every few episodes
        if e % update_target_frequency == 0:
            agent.update_target_model()
        print(f"Episode: {e}/{episodes}, Reward: {total_reward:.2f}, Epsilon: {agent.epsilon:.3f}")
    return agent


if __name__ == "__main__":
    trained_agent = train_dqn(episodes=50)
Below is a brief summary of the core components in this snippet:
• MarketEnv: A toy environment simulating market condi-
tions (price velocity, order flow imbalance, liquidity). The
agent’s action changes the spread (narrow, medium, or wide),
which affects the reward depending on volatility and liquidity.
• ReplayBuffer: Stores experience tuples (state, action, re-
ward, next_state, done) for training stability.
• DQNAgent: Implements the Deep Q-Learning logic with
two neural network models (main and target), an epsilon-
greedy policy, and a learning routine that updates Q-values
to balance immediate profit vs. risk.
• train_dqn: The main training loop that interacts with the
environment, trains the agent over multiple episodes, and
periodically updates the target network for stable learning.
By observing how the rewards vary with market volatility and
action choices, the trained agent learns to adaptively widen or nar-
row spreads, balancing the risk of adverse selection against the
desire for more frequent order fills.
Chapter 3
Market Microstructure
Factor Decomposition
Below is an elaborate explanation of how the factor decomposition
concept can be implemented for real-time market making.
The core steps of this approach are:
1. Data Ingestion: Collect high-frequency market data containing
order book states (e.g., best bid/ask prices, queue lengths, volume
profiles).
2. Feature Engineering: Extract domain-specific microstructure
features such as short-term trend, order flow momentum, liquidity
pockets, etc.
3. Factor Decomposition: Apply techniques like PCA to reduce di-
mensionality and reveal latent factors that drive market microstruc-
ture.
4. Model Training: Feed both the PCA-transformed factors and
raw domain-specific features into a predictive model (linear or non-
linear).
5. Quoting Logic: Use model outputs (e.g., predicted short-horizon
price movement or liquidity stress) to adjust quote prices and spreads
dynamically.
This pipeline ensures that hidden market dynamics—exposed through
factor decomposition—are integrated into the quoting strategy, im-
proving fill rates and mitigating slippage.
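In equation form, the decomposition step can be sketched as standardizing each feature and projecting onto the leading principal directions: with \(\mu\) and \(\sigma\) the per-feature mean and standard deviation and \(W_k\) the matrix of the top \(k\) eigenvectors of the resulting correlation matrix,

\[
\tilde{X} = \frac{X - \mu}{\sigma}, \qquad F = \tilde{X} \, W_k,
\]

so the columns of \(F\) are the latent microstructure factors fed to the predictive model.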
Python Code
Below is a Python code snippet demonstrating how to build a
simple end-to-end pipeline for microstructure factor decomposition
and applying it to a basic quoting function.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def generate_synthetic_orderbook_data(num_samples=10000, seed=42):
    """
    Generate synthetic high-frequency order book data for demonstration.
    Each row simulates an instantaneous snapshot of market conditions.
    :param num_samples: Number of synthetic data samples to generate.
    :param seed: Random seed for reproducibility.
    :return: A Pandas DataFrame with columns representing market features.
    """
    np.random.seed(seed)
    # Simulate best bid and ask prices around a drifting mid price
    drift = np.cumsum(np.random.randn(num_samples) * 0.01)
    mid_prices = 100 + drift
    spread = np.random.rand(num_samples) * 0.05  # random half-spread
    best_bids = mid_prices - spread
    best_asks = mid_prices + spread
    # Simulate queue lengths at each level
    bid_queue = np.random.randint(1, 200, size=num_samples)
    ask_queue = np.random.randint(1, 200, size=num_samples)
    # Order flow momentum: random short-window aggregated trades
    order_flow = np.random.randn(num_samples) * 10
    # Volume profile: for demonstration, random bursts
    volume_profile = np.random.randint(1, 1000, size=num_samples)
    # Short-term trend: difference of mid prices
    short_term_trend = np.append([0], np.diff(mid_prices))
    data_dict = {
        'best_bid': best_bids,
        'best_ask': best_asks,
        'bid_queue_length': bid_queue,
        'ask_queue_length': ask_queue,
        'order_flow_momentum': order_flow,
        'volume_profile': volume_profile,
        'short_term_trend': short_term_trend
    }
    df = pd.DataFrame(data_dict)
    return df
def extract_microstructure_features(df):
    """
    Extract domain-specific features indicating market microstructure behaviors.
    :param df: Original order book dataframe.
    :return: DataFrame of engineered features.
    """
    # Compute mid_price as a primary metric
    df['mid_price'] = (df['best_bid'] + df['best_ask']) / 2
    # Compute liquidity pockets: a naive measure combining queue lengths
    df['liquidity_pockets'] = (df['bid_queue_length'] + df['ask_queue_length']) / (
        1 + abs(df['best_ask'] - df['best_bid'])
    )
    # Compute a simplified measure of slope or momentum in the order book.
    # We'll consider order_flow_momentum as is, but you could transform it.
    df['abs_order_flow'] = np.abs(df['order_flow_momentum'])
    # Additional placeholder for micro-lulls or bursts (dummy logic)
    df['liquidity_burst_flag'] = (df['volume_profile'] > 800).astype(int)
    return df
def apply_factor_decomposition(feature_df, n_components=3):
    """
    Apply PCA to reveal latent factors from market microstructure features.
    :param feature_df: The dataframe containing engineered features.
    :param n_components: Number of principal components to extract.
    :return: pca_object, transformed_features (DataFrame of principal components).
    """
    # Select numerical columns for PCA
    numeric_cols = feature_df.select_dtypes(include=[np.number]).columns.tolist()
    # Standardize by mean and std
    X = feature_df[numeric_cols].copy()
    X_mean = X.mean()
    X_std = X.std().replace(0, 1e-6)
    X_scaled = (X - X_mean) / X_std
    # Apply PCA
    pca = PCA(n_components=n_components, random_state=42)
    pca_factors = pca.fit_transform(X_scaled)
    # Create a DataFrame with principal components
    factor_cols = [f'factor_{i+1}' for i in range(n_components)]
    factor_df = pd.DataFrame(pca_factors, columns=factor_cols, index=feature_df.index)
    return pca, factor_df
def train_predictive_model(factor_df, target_series, use_linear_model=True):
    """
    Train a basic predictive model (Linear Regression or a more complex model)
    to forecast short-horizon price movements or mid-price changes.
    :param factor_df: DataFrame containing principal components or factors.
    :param target_series: The target variable to predict (e.g., next price change).
    :param use_linear_model: Whether to use a simple linear model or not.
    :return: Trained model, X_test, y_test
    """
    X_train, X_test, y_train, y_test = train_test_split(
        factor_df,
        target_series,
        test_size=0.2,
        random_state=42
    )
    if use_linear_model:
        model = LinearRegression()
    else:
        # Placeholder for a more sophisticated model (e.g., MLP, Recurrent NN).
        # In practice, we might import a neural network library here.
        model = LinearRegression()
    model.fit(X_train, y_train)
    return model, X_test, y_test
def quoting_logic(current_mid_price, predicted_change, risk_aversion=0.5):
    """
    Example quoting logic that adjusts the quote spread based on the predicted change.
    If we expect a positive move, we shift both the bid and the ask upward,
    tightening or widening based on risk aversion.
    :param current_mid_price: The current mid price from the order book.
    :param predicted_change: Predicted short-horizon change in the mid price.
    :param risk_aversion: A parameter controlling how aggressively we quote.
    :return: A tuple (new_bid, new_ask) representing the updated quotes.
    """
    # Base spread is a fraction of mid price scaled by risk aversion
    base_spread = current_mid_price * 0.001 * (1 + risk_aversion)
    # If predicted_change > 0, expecting a price increase, we can shift quotes up
    shift = predicted_change * 0.5
    # Example logic: new bid = mid_price + shift - half_spread
    #                new ask = mid_price + shift + half_spread
    half_spread = base_spread / 2
    new_bid = current_mid_price + shift - half_spread
    new_ask = current_mid_price + shift + half_spread
    return max(new_bid, 0), max(new_ask, new_bid)  # ensure ask >= bid
def main():
    # 1. Generate synthetic data
    df = generate_synthetic_orderbook_data(num_samples=5000)
    # 2. Extract domain-specific features
    df = extract_microstructure_features(df)
    # The target we try to predict: forward mid_price change over the next few steps.
    # For demonstration, shift the 'mid_price' by -1 to create a naive target.
    df['next_mid_change'] = df['mid_price'].shift(-1) - df['mid_price']
    df.dropna(inplace=True)
    # 3. Apply factor decomposition (PCA) to the set of microstructure features.
    #    We'll just apply it to the entire numeric feature set for demonstration.
    numeric_part = df.drop(columns=['next_mid_change'])
    pca, factor_df = apply_factor_decomposition(numeric_part)
    # 4. Train a predictive model on the PCA factors
    model, X_test, y_test = train_predictive_model(
        factor_df,
        df['next_mid_change'],
        use_linear_model=True
    )
    # 5. Predict on the test set for demonstration
    y_pred = model.predict(X_test)
    # Example scenario: pick an index from the test set and compute new quotes
    random_index = np.random.choice(X_test.index)
    predicted_change = model.predict(X_test.loc[random_index].values.reshape(1, -1))[0]
    current_mid_price = df.loc[random_index, 'mid_price']
    new_bid, new_ask = quoting_logic(current_mid_price, predicted_change)
    # Print out results
    print(f"Random Index: {random_index}")
    print(f"Current Mid Price: {current_mid_price:.4f}")
    print(f"Predicted Short-Horizon Change: {predicted_change:.4f}")
    print(f"New Bid Quote: {new_bid:.4f}, New Ask Quote: {new_ask:.4f}")
    print(f"Model R^2 on Test Set: {model.score(X_test, y_test):.4f}")


if __name__ == "__main__":
    main()
The code above demonstrates the following key steps:
• Generation of synthetic order book data, including best bid/ask
prices, queue lengths, and pseudo order flow momentum.
• Extraction of domain-specific microstructure features (liquid-
ity pockets, short-term trend, etc.).
• Application of PCA to decompose features into latent factors
that capture hidden dynamics.
• Training a simple linear model to predict short-horizon mid-
price changes from these factors.
• A basic quoting logic that shifts bid and ask quotes based
on the predicted price movement and a tunable risk aversion
parameter.
By focusing the model on both factor-based signals (from PCA)
and domain-specific features (order flow momentum, queue lengths,
volume bursts), market makers can produce quotes that dynami-
cally adapt to evolving market microstructure, thus improving fill
rates and controlling slippage.
Chapter 4
Volatility-Aware Quote
Skewing
Designed to handle sudden spikes in market volatility, this algo-
rithm calculates a real-time vol-adjusted skew by incorporating
both historical patterns and short-term volatility estimates. The
objective is to dynamically position quotes so that during turbu-
lence, the spread widens to compensate for risk, and during stable
periods, the spread compresses to encourage fills. Below is an out-
line of the approach:
• Volatility Forecasting: Use a high-frequency estimator (for
example, EWMA or GARCH) to continuously update current volatil-
ity in near-real-time. This forecast can blend intraday price fluc-
tuations (short-term) with a rolling historical observation (long-
term).
• Adjusted Spread Calculation: Map the predicted volatility to
an output spread. For instance, when projected volatility is low,
quotes can be placed closer to the best bid/ask. When volatility
spikes, the algorithm widens the spread to protect against adverse
movements.
• Real-Time Optimization: Integrate a quick feedback loop that
recalculates the spread based on recent volatility changes. The
system may also factor in inventory positions if it needs to avoid
heavy imbalance.
By combining these steps, a market maker can better handle
sudden market upheavals while still participating in stable environ-
ments.
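The EWMA estimator used in the code below updates the variance recursively,

\[
\sigma_t^2 = \alpha \, \sigma_{t-1}^2 + (1 - \alpha) \, r_t^2,
\]

and one simple way to map it to a quote width (the linear scheme sketched in the code) is

\[
\text{spread}_t = s_{\min} + (s_{\max} - s_{\min}) \cdot \min\!\left(1, \frac{\sigma_t}{\sigma_{\text{ref}}}\right) + w \, |q_t|,
\]

where \(s_{\min}\) and \(s_{\max}\) bound the spread, \(\sigma_{\text{ref}}\) is a reference volatility level (0.05 in the example), \(w\) is the inventory weight, and \(q_t\) is the current inventory.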
Python Code
Below is a Python code snippet demonstrating how one might im-
plement a simplified version of the volatility-aware quote skewing
strategy using an EWMA volatility model:
import numpy as np


class VolatilityForecaster:
    """
    A class to forecast volatility using an exponentially weighted
    moving average (EWMA).
    """
    def __init__(self, alpha=0.94, initial_vol_guess=0.01):
        """
        :param alpha: The smoothing parameter for EWMA.
                      Typically around 0.94 to 0.97 for intraday data.
        :param initial_vol_guess: A default initial volatility guess.
        """
        self.alpha = alpha
        self.current_vol = initial_vol_guess

    def update_volatility(self, return_value):
        """
        Updates the EWMA volatility estimate based on the observed return.
        :param return_value: The most recent price return.
        :return: Updated volatility estimate.
        """
        # EWMA volatility update rule:
        # new_vol^2 = alpha * old_vol^2 + (1 - alpha) * return_value^2
        old_vol_sq = self.current_vol ** 2
        new_vol_sq = self.alpha * old_vol_sq + (1 - self.alpha) * (return_value ** 2)
        self.current_vol = np.sqrt(new_vol_sq)
        return self.current_vol
class VolatilityAwareQuoter:
    """
    A class to represent a quoting mechanism that adjusts spreads based on volatility.
    """
    def __init__(self, base_spread=0.01, max_spread=0.05, inventory_weight=0.001):
        """
        :param base_spread: A nominal (minimum) spread for stable conditions.
        :param max_spread: The maximum allowed spread in high-volatility conditions.
        :param inventory_weight: Weight factor to optionally adjust the spread
                                 based on inventory risk.
        """
        self.base_spread = base_spread
        self.max_spread = max_spread
        self.inventory_weight = inventory_weight

    def compute_spread(self, current_vol, inventory_level):
        """
        Compute the final spread given the current volatility and
        the market maker's inventory level.
        :param current_vol: Current volatility estimate from the forecaster.
        :param inventory_level: Current inventory position (could be long or short).
        :return: A spread width to apply around mid-price.
        """
        # Example approach: scale the spread linearly between base_spread and
        # max_spread using volatility, then incorporate an inventory adjustment.
        spread_range = self.max_spread - self.base_spread
        # Map volatility into [0, 1] for demonstration (you could use a different scaling).
        vol_factor = min(1.0, max(0.0, current_vol / 0.05))
        # Basic linear interpolation
        raw_spread = self.base_spread + spread_range * vol_factor
        # Optional inventory-based adjustment:
        # the bigger the inventory imbalance, the wider you might want to quote.
        inventory_adjustment = self.inventory_weight * abs(inventory_level)
        final_spread = raw_spread + inventory_adjustment
        return final_spread
def simulate_market_data(num_steps=1000, drift=0.0, volatility=0.02, seed=42):
    """
    Generates random price steps simulating intraday price evolution.
    For demonstration, we assume a simple geometric Brownian motion style model.
    :param num_steps: Number of time steps to simulate.
    :param drift: The drift term in the simulated process.
    :param volatility: The volatility term in the simulated process.
    :param seed: Random seed for reproducibility.
    :return: A NumPy array of simulated prices.
    """
    np.random.seed(seed)
    prices = [100.0]  # Start from a price of 100
    dt = 1.0  # notional time step; might represent 1 minute or a smaller chunk in real usage
    for _ in range(num_steps):
        # random shock in price
        shock = np.random.normal(loc=(drift * dt), scale=(volatility * np.sqrt(dt)))
        new_price = prices[-1] * (1 + shock)
        prices.append(new_price)
    return np.array(prices)
def run_volatility_aware_strategy(prices):
    """
    Demonstrates running the volatility-aware quoting strategy given a series of prices.
    :param prices: Array of simulated or real intraday prices.
    """
    # Initialize our Forecaster and Quoter
    forecaster = VolatilityForecaster(alpha=0.94, initial_vol_guess=0.01)
    quoter = VolatilityAwareQuoter(base_spread=0.01, max_spread=0.05, inventory_weight=0.0005)
    inventory_level = 0  # Starting with a neutral inventory
    mid_prices = []
    spreads = []
    inventory_trajectory = []
    # We'll assume a naive approach: we "buy" on a down move and "sell" on an up move,
    # just to illustrate how inventory might change. In practice, you'd have more
    # sophisticated logic.
    for i in range(1, len(prices)):
        # Compute the return from the previous price
        ret = (prices[i] - prices[i - 1]) / prices[i - 1]
        # Update volatility
        current_vol = forecaster.update_volatility(ret)
        # Decide on spread
        current_spread = quoter.compute_spread(current_vol, inventory_level)
        # For demonstration: if the price moves up, assume we sold some, reducing inventory.
        # If the price moves down, assume we bought some, increasing inventory.
        if ret > 0:
            inventory_level -= 1
        elif ret < 0:
            inventory_level += 1
        # Record data
        mid_prices.append((prices[i] + prices[i - 1]) / 2.0)
        spreads.append(current_spread)
        inventory_trajectory.append(inventory_level)
    return mid_prices, spreads, inventory_trajectory
def main():
    # 1. Simulate market data
    prices = simulate_market_data(num_steps=500, drift=0.0001, volatility=0.01)
    # 2. Run the volatility-aware quoting strategy
    mid_prices, spreads, inventory_trajectory = run_volatility_aware_strategy(prices)
    # 3. Simple demonstration output
    # Print the last few results to show how spreads adapt
    print("Final mid-price:", mid_prices[-1])
    print("Final spread:", spreads[-1])
    print("Final inventory level:", inventory_trajectory[-1])
    # We could do more elaborate analysis or plotting here if desired.


if __name__ == "__main__":
    main()
Below is an outline of the key functional segments in the code:
• VolatilityForecaster: Maintains and updates an EWMA-based
volatility measure, factoring in recent price returns.
• VolatilityAwareQuoter: Computes the appropriate spread based
on the current volatility estimate and an optional inventory
adjustment.
• simulate_market_data: Generates synthetic intraday prices
for demonstration (using a simplified random process). This
can be replaced or extended to real-world data feeds.
• run_volatility_aware_strategy: Puts the forecaster and quoter
together in a simulation loop, updating volatility and spreads
at each step, and optionally adjusting inventory in response
to price moves.
• main: Provides a minimal working example that simulates
data, runs the strategy, and outputs final statistics.
This conceptual framework illustrates how to integrate high-
frequency volatility estimates into an automated quoting mecha-
nism, ensuring that in calm markets, spreads remain tight, while
in turbulent markets, spreads widen to reduce adverse selection
and mitigate risk.
Chapter 5
Reinforcement
Momentum Forecasting
Below is an illustrative approach to implementing a short-horizon
momentum predictor infused into a reinforcement learning frame-
work for market making. The primary idea is to detect brief, ex-
ploitable price trends (“price waves”) using a Temporal Convolu-
tional Network (TCN), then feed this momentum signal—alongside
the current inventory and recent volatility—into a policy network
that determines optimal spread widths. The agent’s reward func-
tion balances immediate profitability (from favorable fills) with pe-
nalizing large or adverse inventory positions, ensuring that risk is
kept in check.
In a production setting, one would replace the simplified feed
data and reward calculation logic here with more comprehensive
market microstructure details, order book updates, and fill simu-
lations.
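The training loop below uses a plain policy-gradient (REINFORCE) objective: for an episode with log-probabilities \(\log \pi_\theta(a_t \mid s_t)\) and discounted returns \(G_t = \sum_{k \ge t} \gamma^{k-t} r_k\), the loss backpropagated through both the TCN and the policy head is

\[
\mathcal{L}(\theta) = -\sum_t G_t \, \log \pi_\theta(a_t \mid s_t),
\]

with the returns normalized within each episode to reduce gradient variance.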
Python Code
Below is a Python code snippet that accomplishes the following:
1. Simulates streaming price updates and maintains an agent in-
ventory.
2. Utilizes a simple TCN model to detect short-term momentum
based on recent price changes.
3. Employs a Policy Network (reinforcement learning style) that
determines spread actions (narrow, medium, wide) based on TCN-
derived momentum signals, inventory, and volatility.
4. Implements a training loop where the agent refines its policy to
maximize reward (weighted by profit and inventory risk).
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import random
from collections import deque

# -----------------------------
# 1. ENVIRONMENT SIMULATION
# -----------------------------
class MarketEnvironment:
    """
    A simplified market-making environment that simulates price moves,
    calculates PnL, and updates agent inventory.
    """
    def __init__(self,
                 max_steps=2000,
                 initial_price=100.0,
                 volatility=0.5,
                 inventory_penalty=0.01):
        """
        :param max_steps: The number of time steps to simulate.
        :param initial_price: Starting price of the asset.
        :param volatility: Standard deviation for per-step price changes.
        :param inventory_penalty: Penalty multiplier for holding inventory.
        """
        self.max_steps = max_steps
        self.initial_price = initial_price
        self.volatility = volatility
        self.inventory_penalty = inventory_penalty
        self.reset()

    def reset(self):
        """
        Resets the environment to initial conditions.
        """
        self.current_step = 0
        self.price = self.initial_price
        self.done = False
        self.inventory = 0.0
        self.price_history = [self.price]  # store price history for TCN input
        self.total_reward = 0.0
        return self._get_state()

    def _get_state(self):
        """
        Returns the state representation:
          - recent price changes (for TCN)
          - current inventory
          - approximate volatility
        """
        # We define a fixed window for TCN input, e.g. the last 16 changes.
        # If fewer than 16 steps exist, we pad with the first price so the
        # leading differences are zero.
        history_window = 16
        if len(self.price_history) < history_window + 1:
            padded_prices = ([self.price_history[0]] *
                             (history_window + 1 - len(self.price_history)) +
                             self.price_history)
        else:
            padded_prices = self.price_history[-(history_window + 1):]
        # price changes (differences)
        price_diffs = []
        for i in range(len(padded_prices) - 1):
            price_diffs.append(padded_prices[i + 1] - padded_prices[i])
        # inventory & volatility as single values
        # (you could also track multiple frames of volatility)
        state = {
            'price_diffs': price_diffs,
            'inventory': self.inventory,
            'volatility': self.volatility
        }
        return state
    def step(self, action):
        """
        Environment step given the chosen spread action.
        :param action: integer representing the spread setting,
                       e.g. 0 -> narrow, 1 -> medium, 2 -> wide
        :return: next_state, reward, done, info_dict
        """
        # Sample the next price change from a normal distribution for simplicity
        price_change = np.random.normal(0, self.volatility)
        old_price = self.price
        self.price += price_change
        self.price_history.append(self.price)
        # Simulate whether the agent gets filled or not (very simplified).
        # If the agent sets a narrower spread, the fill probability is higher.
        if action == 0:
            fill_prob = 0.9
            spread_cost = 0.01
        elif action == 1:
            fill_prob = 0.6
            spread_cost = 0.02
        else:
            fill_prob = 0.3
            spread_cost = 0.05
        filled = (random.random() < fill_prob)
        # If the order is filled, assume an immediate buy or sell based on
        # momentum direction (this is a simplification).
        # For demonstration, if price_change > 0 we *sell* to capitalize on
        # upward momentum, else we *buy* to catch the downward momentum.
        fill_pnl = 0.0
        if filled:
            if price_change > 0:
                # agent sells at slightly above the old price
                exec_price = old_price + (spread_cost / 2.0)
                # Realized PnL for that trade
                fill_pnl = exec_price - self.price  # sold high, buy back at the new price
                self.inventory -= 1.0
            else:
                # agent buys at slightly below the old price
                exec_price = old_price - (spread_cost / 2.0)
                # Realized PnL for that trade
                fill_pnl = self.price - exec_price  # bought low, new price is higher/same
                self.inventory += 1.0
        # Inventory penalty to discourage large open positions:
        inv_penalty = -self.inventory_penalty * abs(self.inventory)
        # Final reward
        reward = fill_pnl + inv_penalty
        self.total_reward += reward
        self.current_step += 1
        if self.current_step >= self.max_steps:
            self.done = True
        next_state = self._get_state()
        info = {"filled": filled,
                "fill_pnl": fill_pnl,
                "inventory": self.inventory}
        return next_state, reward, self.done, info
# --------------------------------------------------
# 2. TEMPORAL CONVOLUTIONAL NETWORK FOR MOMENTUM
# --------------------------------------------------
class TemporalBlock(nn.Module):
    """
    Single block for a simplified TCN layer.
    """
    def __init__(self, in_channels, out_channels, kernel_size=3,
                 stride=1, dilation=1, padding=1):
        super(TemporalBlock, self).__init__()
        self.conv1 = nn.Conv1d(in_channels, out_channels, kernel_size,
                               stride=stride, padding=padding, dilation=dilation)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv1d(out_channels, out_channels, kernel_size,
                               stride=stride, padding=padding, dilation=dilation)
        self.net = nn.Sequential(
            self.conv1,
            nn.BatchNorm1d(out_channels),
            self.relu,
            self.conv2,
            nn.BatchNorm1d(out_channels)
        )
        self.downsample = nn.Conv1d(in_channels, out_channels, 1) if in_channels != out_channels else None
        self.init_weights()

    def init_weights(self):
        nn.init.kaiming_normal_(self.conv1.weight, nonlinearity='relu')
        nn.init.kaiming_normal_(self.conv2.weight, nonlinearity='relu')
        if self.downsample is not None:
            nn.init.kaiming_normal_(self.downsample.weight, nonlinearity='relu')

    def forward(self, x):
        out = self.net(x)
        res = x if self.downsample is None else self.downsample(x)
        return nn.ReLU()(out + res)


class TemporalConvNet(nn.Module):
    """
    A simple Temporal Convolutional Network for short-term momentum detection.
    Input shape: (batch_size, channels=1, seq_len=16)
    """
    def __init__(self, num_channels=[16, 32], kernel_size=3):
        super(TemporalConvNet, self).__init__()
        layers = []
        in_channels = 1
        for out_channels in num_channels:
            layers += [TemporalBlock(in_channels,
                                     out_channels,
                                     kernel_size=kernel_size,
                                     dilation=1,
                                     padding=(kernel_size // 2))]
            in_channels = out_channels
        self.network = nn.Sequential(*layers)

    def forward(self, x):
        """
        :param x: shape (batch_size, 1, seq_len)
        :return: last layer feature, shape (batch_size, out_channels, seq_len)
        """
        return self.network(x)
# --------------------------------------------------
# 3. POLICY NETWORK (RL AGENT)
# --------------------------------------------------
class PolicyNetwork(nn.Module):
    """
    Simple MLP that takes TCN features + inventory + volatility
    and outputs action probabilities for discrete spread settings.
    """
    def __init__(self, tcn_output_channels=32, num_actions=3):
        super(PolicyNetwork, self).__init__()
        self.tcn_output_channels = tcn_output_channels
        self.fc1 = nn.Linear(tcn_output_channels + 2, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, num_actions)
        self.relu = nn.ReLU()

    def forward(self, tcn_output, inventory, volatility):
        """
        :param tcn_output: shape (batch_size, out_channels, seq_len)
        :param inventory: shape (batch_size,)
        :param volatility: shape (batch_size,)
        :return: logits for discrete actions
        """
        # We can take the last time step or average pool across the time dimension.
        # For simplicity, let's do an average pool across seq_len.
        tcn_features = torch.mean(tcn_output, dim=2)  # shape: (batch_size, out_channels)
        # Concatenate additional features
        inv_vol = torch.stack([inventory, volatility], dim=1)  # shape: (batch_size, 2)
        x = torch.cat([tcn_features, inv_vol], dim=1)
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        logits = self.fc3(x)
        return logits
# -----------------------------
# 4. AGENT TRAINING LOOP
# -----------------------------
class MarketMakerAgent:
    """
    A reinforcement learning agent that uses the TCN for momentum detection
    and a policy network for the spread decision.
    """
    def __init__(self,
                 env,
                 tcn,
                 policy_net,
                 gamma=0.99,
                 learning_rate=0.001,
                 device='cpu'):
        self.env = env
        self.tcn = tcn.to(device)
        self.policy_net = policy_net.to(device)
        self.gamma = gamma
        self.optimizer = optim.Adam(list(tcn.parameters()) + list(policy_net.parameters()),
                                    lr=learning_rate)
        self.device = device

    def select_action(self, state):
        """
        Sample an action given the current state using the policy network.
        """
        price_diffs = torch.tensor(state['price_diffs'], dtype=torch.float32,
                                   device=self.device).unsqueeze(0).unsqueeze(0)
        # shape after unsqueeze: (1, 1, seq_len)
        tcn_out = self.tcn(price_diffs)  # (1, tcn_output_channels, seq_len)
        inventory_tensor = torch.tensor([state['inventory']], dtype=torch.float32, device=self.device)
        vol_tensor = torch.tensor([state['volatility']], dtype=torch.float32, device=self.device)
        logits = self.policy_net(tcn_out, inventory_tensor, vol_tensor)
        probs = torch.softmax(logits, dim=1)
        # Sample from the distribution
        dist = torch.distributions.Categorical(probs=probs)
        action = dist.sample()
        return action.item(), dist.log_prob(action)

    def train_one_episode(self):
        log_probs = []
        rewards = []
        state = self.env.reset()
        while True:
            action, log_prob = self.select_action(state)
            next_state, reward, done, _ = self.env.step(action)
            log_probs.append(log_prob)
            rewards.append(reward)
            state = next_state
            if done:
                break
        # Compute discounted returns
        discounted_returns = []
        R = 0
        for r in reversed(rewards):
            R = r + self.gamma * R
            discounted_returns.insert(0, R)
        discounted_returns = torch.tensor(discounted_returns, dtype=torch.float32, device=self.device)
        # Normalize returns (optional)
        discounted_returns = (discounted_returns - discounted_returns.mean()) / (discounted_returns.std() + 1e-5)
        loss = 0
        for log_prob, Gt in zip(log_probs, discounted_returns):
            loss += -log_prob * Gt
        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()
        return sum(rewards)

    def train_agent(self, episodes=10):
        for ep in range(episodes):
            ep_reward = self.train_one_episode()
            print(f"Episode {ep+1}/{episodes}, Reward: {ep_reward:.2f}")
# -----------------------------
# 5. MAIN EXECUTION
# -----------------------------
if __name__ == "__main__":
    # Initialize environment
    env = MarketEnvironment(
        max_steps=200,
        initial_price=100.0,
        volatility=0.5,
        inventory_penalty=0.01
    )
    # Initialize TCN and Policy
    tcn = TemporalConvNet(num_channels=[16, 32], kernel_size=3)
    policy_net = PolicyNetwork(tcn_output_channels=32, num_actions=3)
    # Create Agent
    agent = MarketMakerAgent(env, tcn, policy_net, gamma=0.99, learning_rate=0.001, device='cpu')
    # Train for some episodes
    agent.train_agent(episodes=5)
    # After training, the agent can be tested or integrated into a live system.
This code performs the following major steps:
• Defines a simplified market environment (MarketEnvironment)
where each step updates price, calculates a fill probability
based on the chosen spread width, and applies a reward that
includes both realized profit and inventory penalties.
• Implements a Temporal Convolutional Network (Temporal-
ConvNet) to extract local momentum patterns from recent
price changes.
• Creates a Policy Network (PolicyNetwork) that merges the
TCN output with the agent’s inventory and market volatil-
ity, producing discrete action logits (e.g., three spread width
choices).
• Trains the agent (MarketMakerAgent) using a standard pol-
icy gradient approach, computing discounted returns and ad-
justing both the TCN and policy network weights to maxi-
mize the reward.
Although this example is simplified, extending it to handle real
order book data, partial fills, more complex inventory policies, or
additional signals (like micro-lot analysis, advanced volatility mod-
els, etc.) would follow the same architectural pattern.
Chapter 6
Neural Bayesian
Market Maker
This approach combines Bayesian inference with neural network
approximations to update beliefs about return distributions on-
the-fly. By treating the neural network’s weights in a Bayesian
manner, the system generates confidence intervals around price es-
timates, enabling sophisticated risk management. The core steps
involve sampling from approximate posterior distributions, gen-
erating multiple spread proposals, and selecting the variant that
optimally balances expected profit and downside risk based on the
network’s uncertainty measures.
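Formally, the quantity the sampler approximates is the posterior predictive distribution of returns,

\[
p(y^\ast \mid x^\ast, \mathcal{D}) = \int p(y^\ast \mid x^\ast, w)\, p(w \mid \mathcal{D})\, dw,
\]

where \(w\) collects the network weights and biases; drawing \(w^{(i)} \sim p(w \mid \mathcal{D})\) via MCMC and pushing \(x^\ast\) through the network for each draw yields the samples used to evaluate the spread proposals below.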
Python Code
Below is a Python code snippet that demonstrates a Bayesian neu-
ral network approach for market making. The code includes:
1. A dummy dataset generator simulating market returns.
2. Construction and sampling of a Bayesian neural network (using
PyMC) to approximate return distributions based on input fea-
tures.
3. Functions for generating multiple spread proposals, evaluating
profitability and risk, and selecting the best spread under uncer-
tainty.
import numpy as np
import pymc as pm
import arviz as az
import aesara.tensor as at  # PyMC v4 backend; for PyMC v5 use pytensor.tensor instead


def generate_synthetic_market_data(n_samples=200):
    """
    Generates synthetic market data (transaction features and subsequent returns).
    For demonstration, we create a simple relationship between X and y.
    :param n_samples: Number of data points to generate.
    :return: (X, y) where X is the input features and y is the simulated returns.
    """
    np.random.seed(42)
    # Two-dimensional features
    X = np.random.randn(n_samples, 2)
    # Ground truth relationship (linear + noise for demonstration)
    w_true = np.array([0.8, -0.4])
    b_true = 0.1
    # Return series generation
    y = X @ w_true + b_true + 0.1 * np.random.randn(n_samples)
    return X, y


def build_bayesian_neural_network(X, y):
    """
    Constructs a Bayesian neural network in PyMC with one hidden layer.
    :param X: Input features (numpy array).
    :param y: Observed market returns (numpy array).
    :return: A PyMC model object.
    """
    with pm.Model() as bnn_model:
        # Priors for the first layer weights and bias
        weights_in_1 = pm.Normal('weights_in_1', mu=0, sigma=1, shape=(X.shape[1], 5))
        bias_in_1 = pm.Normal('bias_in_1', mu=0, sigma=1, shape=(5,))
        # Hidden layer computation (ReLU activation)
        layer_1 = at.maximum(0, at.dot(X, weights_in_1) + bias_in_1)
        # Priors for the second layer weights and bias
        weights_2 = pm.Normal('weights_2', mu=0, sigma=1, shape=(5, 1))
        bias_2 = pm.Normal('bias_2', mu=0, sigma=1, shape=(1,))
        # Output layer (no activation)
        output = at.dot(layer_1, weights_2) + bias_2
        # Prior for the observation noise
        sigma = pm.HalfNormal('sigma', sigma=1.0)
        # Likelihood (observed returns)
        likelihood = pm.Normal('likelihood', mu=output.flatten(),
                               sigma=sigma, observed=y)
    return bnn_model
def sample_bnn_posterior(bnn_model, draws=1000, tune=1000):
    """
    Runs MCMC sampling to approximate the posterior of the Bayesian neural network.
    :param bnn_model: The PyMC model object.
    :param draws: Number of MCMC draws.
    :param tune: Number of tuning steps.
    :return: The posterior trace.
    """
    with bnn_model:
        trace = pm.sample(draws=draws, tune=tune, target_accept=0.9, chains=2)
    return trace
def predict_returns_with_bnn(X_new, bnn_model, trace, n_samples=200):
    """
    Predict return distributions for unseen features X_new using the sampled BNN posterior.
    :param X_new: A 2D numpy array representing new input features.
    :param bnn_model: The PyMC model object with the neural network definitions.
    :param trace: The MCMC trace from the trained BNN.
    :param n_samples: Maximum number of posterior samples to keep for predictions.
    :return: A numpy array of shape (n_samples, X_new.shape[0]) containing
             predicted returns for each retained posterior sample.
    """
    # Build the symbolic expressions for the network output
    weights_in_1 = bnn_model['weights_in_1']
    bias_in_1 = bnn_model['bias_in_1']
    weights_2 = bnn_model['weights_2']
    bias_2 = bnn_model['bias_2']
    X_new_t = at.as_tensor_variable(X_new, name='X_new_t')
    layer_1_expr = at.maximum(0, at.dot(X_new_t, weights_in_1) + bias_in_1)
    output_expr = at.dot(layer_1_expr, weights_2) + bias_2
    # Register the prediction as a Deterministic node and evaluate it
    # over the posterior samples.
    with bnn_model:
        pred_func = pm.Deterministic('pred_func', output_expr)
        ppc_trace = pm.sample_posterior_predictive(trace,
                                                   var_names=['pred_func'],
                                                   progressbar=False)
    # Collapse the (chain, draw) dimensions into a single sample dimension:
    # the raw shape is (chains, draws, X_new.shape[0], 1).
    raw = ppc_trace.posterior_predictive['pred_func'].values
    predictions = raw.reshape(-1, X_new.shape[0])
    # Keep at most n_samples posterior draws
    return predictions[:n_samples]
def generate_spread_proposals(num_proposals=5):
    """
    Generate multiple spread proposals. For demonstration, we produce
    random spread widths in a plausible range.
    :param num_proposals: How many different spread values to propose.
    :return: An array of proposed spreads.
    """
    # E.g., we propose small to relatively larger spreads
    np.random.seed(0)
    spreads = np.random.uniform(0.1, 1.0, size=num_proposals)
    return spreads
def evaluate_spread_profit_and_risk(spread, predicted_returns):
    """
    Evaluates the expected profit and downside risk for a given spread.
    This function is heavily simplified for demonstration.
    :param spread: Proposed spread level.
    :param predicted_returns: An array of posterior predictions for returns.
    :return: (expected_profit, downside_risk)
    """
    # Suppose we define a simplistic measure of profit ~ spread * average returns
    # and risk ~ standard deviation * spread
    expected_profit = spread * np.mean(predicted_returns)
    downside_risk = spread * np.std(predicted_returns)
    return expected_profit, downside_risk
def select_optimal_spread(spreads, predicted_returns, risk_aversion=0.5):
    """
    Select the variant that optimally balances expected profit and downside risk.
    :param spreads: List or array of possible spread values.
    :param predicted_returns: Posterior predictions for returns (or their average).
    :param risk_aversion: Controls how strongly we penalize risk in the objective.
    :return: The best spread under our simplistic objective.
    """
    best_score = -np.inf
    best_spread = None
    for s in spreads:
        prof, risk = evaluate_spread_profit_and_risk(s, predicted_returns)
        score = prof - risk_aversion * risk  # A rudimentary risk-adjusted measure
        if score > best_score:
            best_score = score
            best_spread = s
    return best_spread
if __name__ == "__main__":
    # ---------------------------------------------------------
    # 1. Generate synthetic data and build the BNN model
    X, y = generate_synthetic_market_data(n_samples=200)
    bnn_model = build_bayesian_neural_network(X, y)

    # ---------------------------------------------------------
    # 2. Sample from the BNN posterior distribution
    print("Sampling from the BNN posterior...")
    trace = sample_bnn_posterior(bnn_model, draws=500, tune=500)

    # ---------------------------------------------------------
    # 3. Predict returns for new data (simulate the 'current market state')
    X_new = np.array([[0.5, -0.3], [1.0, 0.2]])  # Example new feature set
    posterior_pred = predict_returns_with_bnn(X_new, bnn_model, trace, n_samples=200)
    # Here, posterior_pred has shape (number_of_draws, number_of_sample_points).
    # For demonstration, take the mean over draws and sample points
    # to get a single numeric approximation for a base strategy.
    approximate_future_return = posterior_pred.mean()

    # ---------------------------------------------------------
    # 4. Generate multiple spread proposals
    spread_candidates = generate_spread_proposals(num_proposals=5)

    # ---------------------------------------------------------
    # 5. Evaluate each spread based on the predicted returns.
    #    We'll just pass the approximate_future_return for demonstration.
    print("Evaluating spread proposals...")
    best_spread = select_optimal_spread(spread_candidates,
                                        predicted_returns=[approximate_future_return],
                                        risk_aversion=0.5)

    # ---------------------------------------------------------
    # 6. Output results
    print("Posterior Mean Predictions (X_new):", posterior_pred)
    print("Approximate Future Return:", approximate_future_return)
    print("Spread Candidates:", spread_candidates)
    print("Best Spread:", best_spread)

    # (Optional) Inspect the posterior summary
    summary = az.summary(trace)
    print("\nPosterior Summary:")
    print(summary)
Below is an outline of how each part of the code contributes to
the Bayesian neural network market making concept:
• generate_synthetic_market_data simulates feature vectors
(like order flow signals) and corresponding returns, serving as
an example dataset.
• build_bayesian_neural_network defines a PyMC model with
priors for all network weights/biases, capturing uncertainty
in parameter estimates.
• sample_bnn_posterior performs Markov Chain Monte Carlo
(MCMC) to sample from the posterior distribution over the
neural network’s weights.
• predict_returns_with_bnn uses the sampled posterior to
produce predicted return distributions for unseen data points,
aiding real-time spread decisions.
• generate_spread_proposals enumerates candidate spreads
(tight vs. wide quotes).
• evaluate_spread_profit_and_risk calculates a simplified
profit and risk measure for a given spread based on the pre-
dicted returns.
• select_optimal_spread chooses the best spread by striking
a balance between expected profit and downside risk.
By sampling from the approximate posterior, the algorithm nat-
urally incorporates model uncertainty into quoting decisions, en-
abling more robust inventory and risk management.
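As a small extension of the selection step, the candidate spreads can also be scored against the full set of posterior draws rather than a single averaged return, so that predictive uncertainty feeds directly into the risk term. The sketch below assumes posterior_pred is the (draws, points) array returned by predict_returns_with_bnn; the 5% left-tail penalty is an illustrative choice, not part of the model above.
import numpy as np
def select_spread_from_posterior(spreads, posterior_pred, risk_aversion=0.5):
    """Score each candidate spread against all posterior draws (illustrative sketch)."""
    draws = np.asarray(posterior_pred).ravel()       # flatten (draws, points) into one vector
    exp_ret = draws.mean()                           # posterior-mean return
    tail_loss = max(-np.quantile(draws, 0.05), 0.0)  # 5% left-tail loss proxy
    scores = [s * exp_ret - risk_aversion * s * tail_loss for s in spreads]
    return spreads[int(np.argmax(scores))]
# Example usage with dummy posterior draws:
# fake_draws = np.random.normal(0.001, 0.01, size=(500, 2))
# print(select_spread_from_posterior([0.2, 0.5, 0.8], fake_draws))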
Chapter 7
Liquidity Pulse
Detector
Below is an elaboration of the core approach to detecting and
exploiting transient liquidity pulses, followed by a comprehensive
Python implementation. The concept uses wavelet transforms (or
a simple high-pass filter as an alternative) to identify sudden fluc-
tuations in order book depth. Once detected, the algorithm dy-
namically adjusts order sizes and spreads based on the amplitude
and duration of these pulses.
In broad terms, the approach consists of the following steps:
1. Data Acquisition: Continuously capture order book data (e.g.,
bid depth, ask depth, best bid, best ask, etc.) in real time.
2. Signal Processing (Wavelet/High-Pass Filter): Apply a trans-
form (e.g., discrete wavelet transform “DWT” or a numerical high-
pass filter) to highlight high-frequency fluctuations. A smoothed version of the
series can be subtracted from the raw data to uncover short-lived pulses (a
minimal high-pass sketch follows this list).
3. Pulse Detection: Detect pulse boundaries by thresholding the
transformed signal. For instance, large peaks above a certain stan-
dard deviation threshold may indicate a liquidity surge or lull.
4. Quoting Adjustments: When a surge is detected:
• Increase or decrease order size proportionally to the amplitude.
• Narrow or widen spreads to capture opportunities or mitigate
risk.
5. Execution and Monitoring: Place and cancel orders in real time,
maintaining a log of fills, partial fills, or missed opportunities. Up-
date signals as new data arrives to continue capturing subsequent
pulses.
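For readers who prefer the high-pass filter alternative mentioned in step 2, the following minimal sketch subtracts a moving-average (low-pass) component from the raw depth series and applies the same threshold rule to the residual; the window length and threshold factor are illustrative assumptions.
import numpy as np
def highpass_pulse_flags(depth, window=16, threshold_factor=2.0):
    """Flag liquidity pulses by subtracting a moving average from raw depth."""
    depth = np.asarray(depth, dtype=float)
    kernel = np.ones(window) / window
    smooth = np.convolve(depth, kernel, mode='same')   # low-pass component
    residual = depth - smooth                           # high-frequency residual
    threshold = threshold_factor * residual.std()
    return np.abs(residual - residual.mean()) > threshold
# Example: a flat book with one injected spike
# series = np.r_[np.full(50, 1000.0), 1300.0, np.full(50, 1000.0)]
# print(np.where(highpass_pulse_flags(series))[0])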
Python Code
import numpy as np
import pywt  # For wavelet transforms; ensure "pip install PyWavelets"
from collections import deque
import time
class LiquidityPulseDetector:
    """
    A class to detect transient liquidity pulses in order book depth data
    using wavelet transforms or a high-pass numerical filter. Once a pulse
    is detected, the algorithm adjusts the order size and spreads accordingly.
    """
    def __init__(self, window_size=128, wavelet='db4', threshold_factor=2.0):
        """
        Initialize the LiquidityPulseDetector.
        :param window_size: Number of data points in each segment to transform.
        :param wavelet: Choice of wavelet for the transform, e.g., 'db4', 'haar'.
        :param threshold_factor: Factor to multiply by the standard deviation
                                 to define the threshold for pulse detection.
        """
        self.window_size = window_size
        self.wavelet = wavelet
        self.threshold_factor = threshold_factor
        # Circular buffer to hold the latest window_size data
        self.depth_data_buffer = deque([], maxlen=self.window_size)
        # For holding the time-series of wavelet-coefficient-based signals
        self.transformed_signal_buffer = deque([], maxlen=self.window_size)
        self.order_id_counter = 1
    def _wavelet_transform(self, data):
        """
        Perform a discrete wavelet transform focusing on detail coefficients
        to isolate high-frequency components (liquidity pulses).
        :param data: Array of depth values (e.g., aggregated order book depth).
        :return: Array of detail coefficients that capture high-frequency pulses.
        """
        # We only retain the detail coefficients at the first level of decomposition.
        coeffs = pywt.wavedec(data, self.wavelet, mode='symmetric', level=1)
        # coeffs is typically [cA, cD], where cA is the approximation and cD the detail at level=1
        cA, cD = coeffs
        return cD  # Return detail coefficients as the pulse signal
    def _detect_pulse(self, transformed_signal):
        """
        Check if a pulse is detected by comparing the wavelet detail coefficients
        against a dynamic threshold.
        :param transformed_signal: The wavelet detail array from the last segment.
        :return: Boolean indicating the presence of a liquidity pulse.
        """
        if len(transformed_signal) == 0:
            return False
        # Compute threshold using standard deviation of detail coefficients
        std_val = np.std(transformed_signal)
        mean_val = np.mean(transformed_signal)
        threshold = self.threshold_factor * std_val
        # If the absolute max deviation from the mean exceeds the threshold, we have a pulse
        max_dev = np.max(np.abs(transformed_signal - mean_val))
        return max_dev > threshold
    def update_depth(self, current_depth):
        """
        Update the detector with new depth data. Once window_size data points
        are accumulated, perform the wavelet transform and check for pulses.
        :param current_depth: Current aggregated depth or a measure of order book liquidity.
        :return: Boolean indicating if a pulse was detected at this update step.
        """
        self.depth_data_buffer.append(current_depth)
        # If we haven't filled the buffer yet, no transform can be done
        if len(self.depth_data_buffer) < self.window_size:
            return False
        # Convert buffer to numpy array for processing
        data = np.array(self.depth_data_buffer)
        # Wavelet transform to isolate short-term fluctuations
        detail_coeffs = self._wavelet_transform(data)
        self.transformed_signal_buffer.append(detail_coeffs)
        # Check for pulses
        return self._detect_pulse(detail_coeffs)
    def generate_quotes(self, base_order_size, base_spread, pulse_detected):
        """
        Generate or adjust quotes based on whether a pulse is detected.
        :param base_order_size: The normal or baseline order size to place.
        :param base_spread: The baseline spread (difference between bid and ask).
        :param pulse_detected: Boolean indicating if a pulse is currently active.
        :return: Dictionary specifying the updated quoting parameters.
        """
        if pulse_detected:
            # If a liquidity pulse is detected, increase spread to manage risk
            # but also potentially increase the order size to capture profits.
            new_order_size = int(base_order_size * 1.5)
            new_spread = base_spread * 1.3
        else:
            # In normal conditions, keep default quoting parameters
            new_order_size = base_order_size
            new_spread = base_spread
        return {
            'order_id': self.order_id_counter,
            'order_size': new_order_size,
            'spread': new_spread
        }
    def place_order(self, order_params):
        """
        Simulate placing an order in the market.
        :param order_params: Dictionary containing order_id, order_size, and spread.
        """
        print(f"Placing order {order_params['order_id']} "
              f"with size {order_params['order_size']} "
              f"and spread {order_params['spread']:.2f}")
        # In practice, we would interface with an exchange API here.
        self.order_id_counter += 1
def simulate_market_and_detect_pulses():
    """
    This function simulates a live market by generating random depth data,
    updating the LiquidityPulseDetector, and placing orders based on pulses.
    """
    detector = LiquidityPulseDetector(window_size=16, wavelet='db4', threshold_factor=2.0)
    # Simulate baseline market depth around 1000 with random noise.
    # We occasionally inject spikes to simulate pulses.
    base_depth = 1000
    base_order_size = 10
    base_spread = 0.01  # e.g., 1% of price might be a baseline spread
    for t in range(100):
        # Introduce artificially random short-lived pulses
        if 20 <= t <= 25 or 60 <= t <= 62:
            current_depth = base_depth + np.random.randint(100, 300)
        else:
            current_depth = base_depth + np.random.randint(-30, 30)
        # Update the detector with the simulated current depth
        pulse_detected = detector.update_depth(current_depth)
        # Generate quotes based on whether a pulse is detected
        quote_params = detector.generate_quotes(base_order_size, base_spread, pulse_detected)
        # Place the order
        detector.place_order(quote_params)
        # Sleep to simulate time passing in a real trading environment.
        # In a production system, this might be replaced by actual event-driven waits.
        time.sleep(0.05)
if __name__ == "__main__":
    simulate_market_and_detect_pulses()
This code snippet demonstrates how to:
• Continuously capture simulated order book depth values and
accumulate them in a rolling buffer.
• Perform a discrete wavelet transform (using PyWavelets) to
isolate high-frequency “detail” coefficients.
• Detect pulses by thresholding on the transformed coefficients’
standard deviation.
• Dynamically adjust quote size and spread in response to pulse
detection.
• Place mock orders (in practice, you would call an exchange
API, use an OMS/EMS, or another execution framework).
Developers can build on this framework by implementing:
• More advanced filtering methods (e.g., multi-level wavelet detail
coefficients, windowed FFT, or advanced high-pass filters); a multi-level example is sketched below.
• Real-time connectivity and APIs to live markets for actual order
placement, cancellations, and rebalancing.
• Additional logic for risk constraints (inventory limits, maximum
daily drawdown, etc.) to ensure robust, safe trading.
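As an example of the first extension (multi-level wavelet detail coefficients), the sketch below decomposes the depth series into several scales and reports the energy in each detail band; a pulse confined to the finest scales then shows up as energy concentrated in the first one or two levels. The level count and the energy measure are illustrative choices, not part of the detector above.
import numpy as np
import pywt
def multilevel_detail_energy(depth, wavelet='db4', level=3):
    """Per-scale detail-coefficient energy from a multi-level DWT (finest scale first)."""
    coeffs = pywt.wavedec(np.asarray(depth, dtype=float), wavelet, level=level)
    details = coeffs[:0:-1]                 # [cD1, cD2, ...]; coeffs[0] is the approximation
    return {f"detail_level_{i+1}": float(np.sum(d ** 2))
            for i, d in enumerate(details)}
# A short, sharp pulse concentrates energy in detail_level_1/detail_level_2:
# noisy = 1000 + np.random.randn(128); noisy[60:63] += 250
# print(multilevel_detail_energy(noisy))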
Chapter 8
Sentiment-Fused
Quoting Model
Below is an outline of how one might integrate sentiment from vari-
ous text sources with short-term volatility measures to dynamically
adjust spreads. The approach begins by fetching textual data (e.g.,
tweets, forum posts, news headlines), classifying each data point
into bullish, bearish, or neutral sentiment, then aggregating those
results into a rolling sentiment score. In parallel, we calculate
short-term volatility using recent market prices. Both the senti-
ment score and volatility estimates feed into a quoting engine that
tightens or widens spreads. This helps stabilize profits and control
risk in rapidly shifting markets.
Python Code
import requests
import time
import numpy as np
import nltk
from typing import List, Dict
from nltk.sentiment.vader import SentimentIntensityAnalyzer
# Ensure that NLTK data (e.g., for VADER) is downloaded:
# nltk.download('vader_lexicon')
######################################################################
# Step 1: Fetch and Parse Text Data from Social Media/News
######################################################################
def fetch_text_data(api_url: str) -> List[str]:
    """
    Fetch text-based data from a social media or news endpoint.
    In practice, this function would interface with actual APIs (Twitter, news sources, etc.).
    For demonstration, we simulate with placeholders.
    :param api_url: URL to fetch data from (placeholder usage).
    :return: A list of text documents, e.g. tweets, headlines, forum comments.
    """
    # Placeholder data (in a real scenario, you'd parse JSON/XML from the API)
    simulated_data = [
        "Stocks are looking great! I'm bullish on tech.",
        "Market meltdown coming soon, beware!",
        "No significant changes, I'm neutral this week.",
        "Overvalued sector might crash, definitely bearish signals here.",
        "I believe the rally will continue, huge opportunity ahead."
    ]
    return simulated_data
######################################################################
# Step 2: NLP Processing and Sentiment Analysis
######################################################################
def compute_sentiment_label(text: str, analyzer: SentimentIntensityAnalyzer) -> str:
    """
    Classify text sentiment as 'bullish', 'bearish', or 'neutral' using VADER sentiment analysis.
    :param text: The input text string to be analyzed.
    :param analyzer: NLTK's SentimentIntensityAnalyzer instance.
    :return: One of 'bullish', 'bearish', or 'neutral'.
    """
    scores = analyzer.polarity_scores(text)
    compound_score = scores['compound']
    # Simple heuristic thresholds for demonstration:
    if compound_score > 0.05:
        return 'bullish'
    elif compound_score < -0.05:
        return 'bearish'
    else:
        return 'neutral'
def aggregate_sentiment(sentiment_labels: List[str]) -> float:
    """
    Aggregate individual sentiment labels into a rolling numeric score.
    For instance, bullish = +1, neutral = 0, bearish = -1.
    :param sentiment_labels: List of sentiment labels.
    :return: Aggregated sentiment score (averaged across data points).
             Positive indicates bullish, negative indicates bearish, close to zero is neutral.
    """
    mapping = {'bullish': 1, 'neutral': 0, 'bearish': -1}
    numeric_values = [mapping[label] for label in sentiment_labels]
    if len(numeric_values) == 0:
        return 0.0
    return float(np.mean(numeric_values))
######################################################################
# Step 3: Calculating Short-Term Volatility
######################################################################
def compute_short_term_volatility(price_history: List[float]) -> float:
    """
    Compute a short-term volatility measure using the rolling standard deviation
    of recent price changes.
    :param price_history: A list of recent prices.
    :return: Estimated volatility (standard deviation of returns).
    """
    if len(price_history) < 2:
        return 0.0
    returns = []
    for i in range(1, len(price_history)):
        if price_history[i-1] != 0:
            ret = (price_history[i] - price_history[i-1]) / price_history[i-1]
            returns.append(ret)
        else:
            returns.append(0.0)
    # Annualize or scale as needed. Here we just return the std dev of the returns.
    return float(np.std(returns))
######################################################################
# Step 4: Quoting Logic
######################################################################
def dynamic_spread_decision(sentiment_score: float, volatility_estimate: float) -> Dict[str, float]:
    """
    Determine how much to widen or narrow the bid-ask spread based on sentiment
    and volatility. A larger absolute sentiment score and high volatility typically lead
    to more protective (wider) spreads, while neutral sentiment and low volatility
    allow tighter spreads.
    :param sentiment_score: Aggregate sentiment from text data.
    :param volatility_estimate: Calculated short-term volatility.
    :return: A dictionary with recommended bid_spread and ask_spread.
    """
    # Base spread in ticks or basis points
    base_spread = 1.0
    # Increase spread if absolute sentiment is large;
    # increase further if volatility is high.
    # This is a simplistic linear model:
    spread_adjustment = 1.0 + abs(sentiment_score) * 0.5 + volatility_estimate * 10.0
    # Example: if sentiment is bullish, we might place a narrower ask spread, a bit wider bid;
    # if sentiment is bearish, do the opposite;
    # if neutral, keep them fairly equal.
    if sentiment_score > 0.1:
        # Bullish
        bid_spread = base_spread * spread_adjustment * 1.1
        ask_spread = base_spread * spread_adjustment * 0.9
    elif sentiment_score < -0.1:
        # Bearish
        bid_spread = base_spread * spread_adjustment * 0.9
        ask_spread = base_spread * spread_adjustment * 1.1
    else:
        # Neutral
        bid_spread = base_spread * spread_adjustment
        ask_spread = base_spread * spread_adjustment
    return {
        'bid_spread': bid_spread,
        'ask_spread': ask_spread
    }
######################################################################
# Step 5: Main Execution Loop (Simulation Example)
######################################################################
def main_simulation():
    """
    Main function to simulate the entire pipeline of data fetching, sentiment calculation,
    volatility estimation, and quoting strategy. In a live system, this might run indefinitely
    or on a scheduled basis, adjusting quotes in real time.
    """
    # Initialize sentiment analyzer
    analyzer = SentimentIntensityAnalyzer()
    # Simulated URL (not actually used by the placeholder fetcher)
    placeholder_api_url = "https://example.com/social-feed"
    # Step 5.1: Fetch textual data
    text_data_batch = fetch_text_data(placeholder_api_url)
    # Step 5.2: Analyze sentiment for each text snippet
    sentiment_labels = [compute_sentiment_label(text, analyzer) for text in text_data_batch]
    # Step 5.3: Aggregate sentiment
    aggregated_score = aggregate_sentiment(sentiment_labels)
    print("Aggregated Sentiment Score:", aggregated_score)
    # Step 5.4: Mock a price history to compute volatility
    # (In reality, you'd fetch from a market data feed or your broker API.)
    mock_price_history = [100, 101.5, 99, 102, 101.8, 103, 102.5, 104]
    short_term_vol = compute_short_term_volatility(mock_price_history)
    print("Short-Term Volatility Estimate:", short_term_vol)
    # Step 5.5: Determine dynamic spreads
    quote_decision = dynamic_spread_decision(aggregated_score, short_term_vol)
    print("Recommended Bid Spread:", quote_decision["bid_spread"])
    print("Recommended Ask Spread:", quote_decision["ask_spread"])
    # In an actual trading system, you'd now place these quotes on the order book
    # through your exchange/broker's API. This is only a demonstration.
if __name__ == "__main__":
    # Run the simulation exactly once for illustration
    main_simulation()
Below is a brief explanation of each major component:
• fetch_text_data: Simulates fetching text data (e.g., tweets,
headlines) from an API.
• compute_sentiment_label: Uses NLTK’s VADER sentiment
analysis to classify text into bullish, bearish, or neutral.
• aggregate_sentiment: Converts individual sentiments to
numeric values and computes an average score (a rolling variant is sketched below).
• compute_short_term_volatility: Demonstrates a simple
rolling standard deviation of price returns as a volatility mea-
sure.
• dynamic_spread_decision: Takes the sentiment score and
volatility estimate, then produces recommended bid/ask spreads.
• main_simulation: Ties all elements together, simulating how
quotes might be adjusted in real time.
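Because the batch average above is recomputed from scratch on every fetch, one simple way to obtain the rolling sentiment score described earlier is an exponentially weighted running score, so that older batches decay gradually rather than being forgotten all at once. The class below is an illustrative sketch; the decay constant is an assumption.
class RollingSentiment:
    """Exponentially weighted running sentiment score (illustrative sketch)."""
    def __init__(self, decay=0.9):
        self.decay = decay        # weight kept from the previous score
        self.score = 0.0
    def update(self, labels):
        mapping = {'bullish': 1.0, 'neutral': 0.0, 'bearish': -1.0}
        if labels:
            batch = sum(mapping[l] for l in labels) / len(labels)
            self.score = self.decay * self.score + (1.0 - self.decay) * batch
        return self.score
# roller = RollingSentiment(decay=0.8)
# print(roller.update(['bullish', 'bullish', 'neutral']))   # moves above zero
# print(roller.update(['bearish'] * 4))                      # pulled back down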
Chapter 9
Regime-Switching
Inventory Rebalancer
Below is an illustrative approach to implementing the “Regime-
Switching Inventory Rebalancer” using a Hidden Markov Model
(HMM) to detect distinct regimes (such as trending, mean-reverting,
or high-volatility) and subsequently adjust inventory targets and
quoting styles. The code trains an HMM on historical market data,
infers the real-time probability of each regime, and makes quoting
decisions based on the most likely regime. This helps mitigate the
risk of unexpected regime shifts and capital drawdowns.
Python Code
import numpy as np
import pandas as pd
from hmmlearn import hmm
import warnings
warnings.filterwarnings("ignore")
class RegimeSwitchingInventoryRebalancer:
    """
    This class implements a regime-switching inventory rebalancer that uses
    a hidden Markov model (HMM) to detect distinct market regimes. Based on
    the inferred regime, the algorithm sets inventory targets and quoting styles
    to manage risk and profit potential.
    """
    def __init__(self,
                 n_components=3,
                 regime_labels=('Trending', 'MeanReverting', 'HighVolatility'),
                 random_state=42):
        """
        :param n_components: Number of hidden regimes to model.
        :param regime_labels: A tuple of string labels that describe each regime.
        :param random_state: Seed for reproducibility.
        """
        self.n_components = n_components
        self.regime_labels = regime_labels
        self.random_state = random_state
        self.model = hmm.GaussianHMM(n_components=self.n_components,
                                     covariance_type="full",
                                     random_state=self.random_state)
        self.fitted = False
    def fit_hmm(self, features):
        """
        Fit the HMM on historical data.
        :param features: A 2D NumPy array or DataFrame of features used for training,
                         shape = (n_samples, n_features).
        :return: None
        """
        # Convert to numpy if a DataFrame is supplied
        if isinstance(features, pd.DataFrame):
            features = features.values
        self.model.fit(features)
        self.fitted = True
    def predict_regimes(self, features):
        """
        Predict the hidden regimes for given features using the fitted HMM.
        :param features: A 2D NumPy array or DataFrame of features,
                         shape = (n_samples, n_features).
        :return: Array of most likely regime labels, shape = (n_samples,).
        """
        if not self.fitted:
            raise ValueError("HMM has not been fitted. Call fit_hmm() first.")
        if isinstance(features, pd.DataFrame):
            features = features.values
        hidden_states = self.model.predict(features)
        labeled_states = [self.regime_labels[s] for s in hidden_states]
        return np.array(labeled_states)
    def predict_regime_probabilities(self, feature):
        """
        Calculate the probability of each regime for a single feature vector
        using the fitted HMM.
        :param feature: A 1D feature vector, shape = (n_features,).
        :return: Dictionary mapping regime label -> probability.
        """
        if not self.fitted:
            raise ValueError("HMM has not been fitted. Call fit_hmm() first.")
        # We need to reshape the feature to (1, n_features)
        feature_reshaped = np.array(feature).reshape(1, -1)
        # hmmlearn's predict_proba returns posterior probabilities directly
        prob_matrix = self.model.predict_proba(feature_reshaped)
        # prob_matrix has shape (1, n_components)
        prob_array = prob_matrix[0]
        regime_probabilities = {
            label: prob_array[idx] for idx, label in enumerate(self.regime_labels)
        }
        return regime_probabilities
    def set_quotes_based_on_regime(self, current_regime, current_inventory, price):
        """
        Example rule-based quoting logic that sets target inventory and quotes
        based on the inferred regime.
        :param current_regime: String label of current regime.
        :param current_inventory: Current inventory level.
        :param price: Current market price (for reference).
        :return: A dictionary with the proposed quotes and inventory target.
        """
        # Example settings for each regime
        if current_regime == 'Trending':
            # Expect strong directional moves, so keep moderate inventory
            # and place narrower spreads to stay competitive and capture the trend
            target_inventory = 50
            bid_spread = price * 0.001  # narrower
            ask_spread = price * 0.0015
        elif current_regime == 'MeanReverting':
            # Market is swinging around a mean; tighten inventory to trade reversions
            target_inventory = 30
            bid_spread = price * 0.0008
            ask_spread = price * 0.0012
        else:  # HighVolatility
            # Widen spreads to reduce fill risk and limit large adverse moves
            target_inventory = 20
            bid_spread = price * 0.002
            ask_spread = price * 0.003
        # Adjust target inventory slightly if we are heavily over or under
        delta_inventory = target_inventory - current_inventory
        # Example quoting structure:
        # place quotes around the current price adjusted by spreads
        bid_quote = price - bid_spread
        ask_quote = price + ask_spread
        return {
            'regime': current_regime,
            'target_inventory': target_inventory,
            'delta_inventory': delta_inventory,
            'bid_quote': bid_quote,
            'ask_quote': ask_quote
        }
def generate_synthetic_market_data(n=1000, seed=42):
    """
    Simple function to generate synthetic market data for demonstration:
    - We create a feature set that might reflect daily returns and realized volatility.
    :param n: Number of data points to generate.
    :param seed: Random seed.
    :return: A pandas DataFrame with columns ['return', 'volatility'].
    """
    np.random.seed(seed)
    # Synthetic "return" drawn from a random normal distribution
    returns = np.random.randn(n) * 0.01
    # Synthetic "volatility" drawn from a random absolute normal distribution
    volatility = np.abs(np.random.randn(n)) * 0.1
    data = pd.DataFrame({
        'return': returns,
        'volatility': volatility
    })
    return data
def main_demo():
    """
    Demonstration function that:
    1) Generates synthetic data
    2) Fits an HMM to detect distinct regimes
    3) Simulates a stream of new data points,
       infers the regime, and sets quotes accordingly.
    """
    # 1) Generate synthetic data
    historical_data = generate_synthetic_market_data(n=500)
    # 2) Fit HMM on the historical data
    rebalancer = RegimeSwitchingInventoryRebalancer(
        n_components=3,
        regime_labels=('Trending', 'MeanReverting', 'HighVolatility'),
        random_state=42
    )
    rebalancer.fit_hmm(historical_data)
    # Let's see how the HMM labels the historical data
    predicted_regimes = rebalancer.predict_regimes(historical_data)
    print("Sample of predicted regimes on historical data:", predicted_regimes[:10])
    # 3) Simulate a new stream of data points with random prices
    current_inventory = 0
    simulated_prices = np.linspace(100, 110, 10)  # for example, 10 price steps
    print("\nReal-time quoting decisions:")
    for idx, price in enumerate(simulated_prices):
        # Suppose we get new 'return' and 'volatility' data in real time
        new_return = np.random.randn() * 0.01
        new_vol = abs(np.random.randn()) * 0.1
        # Infer regime probabilities
        regime_probs = rebalancer.predict_regime_probabilities([new_return, new_vol])
        # Pick the regime with highest probability
        current_regime = max(regime_probs, key=regime_probs.get)
        # Decide on quotes
        quoting_info = rebalancer.set_quotes_based_on_regime(
            current_regime=current_regime,
            current_inventory=current_inventory,
            price=price
        )
        # Update inventory (naive example: assume we move halfway to target)
        current_inventory += int(0.5 * quoting_info['delta_inventory'])
        # Print decision
        print(f"Step {idx+1}: Regime Probs={regime_probs}, Decision={quoting_info}")
if __name__ == "__main__":
    main_demo()
Below is a brief explanation of the main code components:
• RegimeSwitchingInventoryRebalancer: A class responsi-
ble for modeling the market regimes using a hidden Markov
model (HMM). It also manages quoting rules based on the
inferred regime.
• fit_hmm: Trains the HMM on historical feature data (e.g.,
returns, volatility). This method sets up the model to distin-
guish between different latent market conditions.
• predict_regimes: Given new feature data, it returns the
most likely regimes for each data point.
• predict_regime_probabilities: For a single feature vec-
tor, it computes the probability that the market is in each of
the available regimes.
• set_quotes_based_on_regime: Implements a simple rule-
based quoting logic that sets target inventory and spreads. In
practice, traders can refine these rules for more sophisticated
behaviors.
• generate_synthetic_market_data: Creates synthetic data
for demonstration with columns for returns and volatility.
• main_demo: Demonstrates how historical data are used to
train the HMM and how real-time data may be processed to
adapt inventory and quoting decisions.
By conditioning inventory targets and quoting styles on regime
probabilities, this approach reduces the risk associated with sud-
den market structure changes and supports more stable long-term
performance.
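One practical caveat, not addressed in the code above, is that GaussianHMM state indices are arbitrary: state 0 is not guaranteed to correspond to the 'Trending' regime. A common refinement is to relabel states after fitting, for example by ordering them on their fitted volatility means. The sketch below assumes the two-column feature layout ['return', 'volatility'] used by generate_synthetic_market_data and the model attribute from the class above; the label ordering is an illustrative convention.
import numpy as np
def label_states_by_volatility(fitted_hmm,
                               labels=('MeanReverting', 'Trending', 'HighVolatility')):
    """Map raw HMM state indices to labels ordered by fitted volatility mean."""
    vol_means = fitted_hmm.means_[:, 1]      # column 1 = 'volatility' feature
    order = np.argsort(vol_means)            # lowest-volatility state first
    return {int(state): labels[rank] for rank, state in enumerate(order)}
# After rebalancer.fit_hmm(historical_data):
# state_to_label = label_states_by_volatility(rebalancer.model)
# hidden = rebalancer.model.predict(historical_data.values)
# labeled = [state_to_label[s] for s in hidden]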
Chapter 10
Multi-Agent Market
Maker Orchestration
Below is an illustrative description of how a multi-agent market
making orchestration might be implemented with multi-agent re-
inforcement learning. Multiple specialized agents each focus on
a unique aspect of market making (e.g., quote optimization, in-
ventory balancing, volatility hedging), coordinated by a central
mechanism that merges their actions into a final quote. The over-
all reward structure is designed to encourage synergy rather than
conflict.
Key ideas in this approach include:
• Defining a market environment simulation that tracks price changes,
order flow, and inventory levels.
• Creating individual agents—each with its own Q-table or policy
function—trained to optimize different facets (spreads, inventory
hedging, volatility scaling).
• Incorporating a coordinator (or aggregator) that collects pro-
posed actions from each agent, then settles on cohesive final quotes.
• Maintaining a reward signal that balances individual agent objec-
tives (e.g., risk minimization) with overall profitability and capital
preservation.
By combining multiple specialized skill sets, the multi-agent
system can adapt more robustly to market fluctuations than a sin-
gle monolithic agent.
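Each specialized agent below relies on the standard tabular Q-learning update, which is exactly the rule implemented in BaseAgent.update further down:

    Q(s, a) ← Q(s, a) + α [ r + γ max_{a'} Q(s', a') − Q(s, a) ],

where α is the learning rate, γ the discount factor, r the agent-specific reward returned by the environment, and s' the next shared state.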
Python Code
import numpy as np
import random
class MarketEnvironment:
    """
    A simplified market environment to illustrate multi-agent market making.
    Simulates price, inventory changes, and fundamental order flow statistics.
    """
    def __init__(self, initial_price=100.0, max_steps=1000,
                 volatility=0.02, initial_inventory=0, seed=42):
        self.initial_price = initial_price
        self.price = initial_price
        self.step_count = 0
        self.max_steps = max_steps
        self.volatility = volatility
        self.inventory = initial_inventory
        self.done = False
        np.random.seed(seed)
        random.seed(seed)
    def reset(self):
        """
        Reset environment to initial states.
        """
        self.price = self.initial_price
        self.step_count = 0
        self.inventory = 0
        self.done = False
        return self._get_state()
    def step(self, action_dict):
        """
        Environment step based on a dictionary of actions from each agent.
        :param action_dict: {'quote_opt_agent': <action>, 'inv_bal_agent': <action>,
                             'vol_hedge_agent': <action>}
        :return: next_state, rewards_dict, done, info
        """
        # Combine actions from the specialized agents
        # to create final combined quote and hedge decisions.
        final_quote = action_dict['quote_opt_agent']
        inv_adjust = action_dict['inv_bal_agent']
        vol_adjust = action_dict['vol_hedge_agent']
        # For demonstration, interpret these as:
        # - final_quote: discrete spread decision (-1 = tighten, 0 = neutral, +1 = widen)
        # - inv_adjust: how to nudge inventory (-1 = reduce, 0 = hold, +1 = increase)
        # - vol_adjust: volatility stance (-1 = short vol, 0 = do nothing, +1 = long vol)
        # Simulate price change: price evolves stochastically; "vol_adjust" acts as an
        # incremental factor that might reduce or amplify volatility.
        effective_vol = self.volatility * (1 + 0.1 * vol_adjust)
        price_change = np.random.normal(0, effective_vol)
        self.price += price_change * self.price  # percentage return
        self.price = max(0.1, self.price)  # clamp to avoid negative prices
        # Inventory change - interpret "inv_adjust" as adjusting position.
        # The environment reacts to quote changes with random fills.
        fill_probability = 0.3 + 0.1 * (final_quote * -1)  # tighter quotes => more fills
        if fill_probability > 0:
            if random.random() < fill_probability:
                # random fill direction: buy or sell
                trade_dir = random.choice([-1, 1])
                trade_size = 1 + abs(inv_adjust)  # position size influenced by "inv_adjust"
                self.inventory += trade_size * trade_dir
        # Compute rewards. Partial rewards for each agent:
        # 1) QuoteOpt: reward for achieving PnL from short-term trades
        # 2) InvBal: reward that penalizes large inventory
        # 3) VolHedge: reward for stabilizing price changes (less variance)
        # Realized PnL approximation if a trade occurred
        # (using last known price change as a proxy for fill outcome).
        realized_pnl = (price_change * self.price) * self.inventory
        quote_optimum_reward = realized_pnl - abs(final_quote) * 0.01  # penalize wide spreads with cost
        # Large inventory penalty
        inv_penalty = -0.001 * (self.inventory ** 2)
        inv_bal_reward = inv_penalty
        # Vol hedge reward: if vol_adjust is nonzero but price changes are large,
        # we penalize; if volatility is stable, the penalty stays small.
        # This is a simplistic proxy for "vol hedging performance".
        volatility_observed = abs(price_change)
        vol_hedge_reward = -(volatility_observed) if vol_adjust == 0 \
            else -abs(vol_adjust - volatility_observed)
        # Package final agent rewards
        rewards_dict = {
            'quote_opt_agent': quote_optimum_reward,
            'inv_bal_agent': inv_bal_reward,
            'vol_hedge_agent': vol_hedge_reward
        }
        self.step_count += 1
        if self.step_count >= self.max_steps:
            self.done = True
        return self._get_state(), rewards_dict, self.done, {}
    def _get_state(self):
        """
        Returns the current state representation that each agent sees.
        For multi-agent systems, you might filter or embed the state
        differently for each agent. For simplicity, we return a single universal state.
        """
        return {
            'price': self.price,
            'inventory': self.inventory,
            'step_count': self.step_count
        }
class BaseAgent:
    """
    A base class for specialized agents. Contains
    minimal structure for Q-learning.
    """
    def __init__(self, name, possible_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.name = name
        self.possible_actions = possible_actions
        self.q_table = {}  # key: (state), value: q-values for each action
        self.alpha = alpha
        self.gamma = gamma
        self.epsilon = epsilon
    def _state_to_key(self, state):
        """
        Convert a state dictionary to a tuple to act as a key.
        Round floats to reduce uniqueness of states for tabular Q-learning.
        """
        return (
            round(state['price'], 2),
            state['inventory'],
            state['step_count']
        )
    def choose_action(self, state):
        """
        Epsilon-greedy policy for choosing an action.
        """
        state_key = self._state_to_key(state)
        if state_key not in self.q_table:
            self.q_table[state_key] = np.zeros(len(self.possible_actions))
        if random.random() < self.epsilon:
            return random.choice(self.possible_actions)
        else:
            q_values = self.q_table[state_key]
            return self.possible_actions[np.argmax(q_values)]
    def update(self, state, action, reward, next_state):
        """
        Update Q-values based on the observed transition.
        """
        state_key = self._state_to_key(state)
        next_state_key = self._state_to_key(next_state)
        if state_key not in self.q_table:
            self.q_table[state_key] = np.zeros(len(self.possible_actions))
        if next_state_key not in self.q_table:
            self.q_table[next_state_key] = np.zeros(len(self.possible_actions))
        action_index = self.possible_actions.index(action)
        old_value = self.q_table[state_key][action_index]
        next_max = np.max(self.q_table[next_state_key])
        new_value = old_value + self.alpha * (reward + self.gamma * next_max - old_value)
        self.q_table[state_key][action_index] = new_value
class QuoteOptimizationAgent(BaseAgent):
    """
    Specializes in choosing the best quote spreads.
    Possible actions are discrete spread adjustments (tighten, neutral, widen).
    """
    def __init__(self, name='quote_opt_agent'):
        possible_actions = [-1, 0, 1]  # e.g. -1: tighten, 0: neutral, +1: widen
        super().__init__(name=name, possible_actions=possible_actions)
class InventoryBalancingAgent(BaseAgent):
    """
    Specializes in balancing inventory around a target (e.g. 0).
    Possible actions might be decreasing, holding, or increasing inventory stance.
    """
    def __init__(self, name='inv_bal_agent'):
        possible_actions = [-1, 0, 1]  # -1: reduce, 0: hold, +1: accumulate
        super().__init__(name=name, possible_actions=possible_actions)
class VolatilityHedgingAgent(BaseAgent):
    """
    Specializes in adjusting actions for volatility hedging
    (short vol, do nothing, long vol).
    """
    def __init__(self, name='vol_hedge_agent'):
        possible_actions = [-1, 0, 1]  # -1: short vol, 0: do nothing, +1: long vol
        super().__init__(name=name, possible_actions=possible_actions)
class Coordinator:
    """
    Orchestrates actions from multiple agents and interfaces with the environment.
    In a more elaborate system, it might adjust synergy, or run synergy-based
    logic. For simplicity, we simply collect each agent's action
    and pass them to the environment.
    """
    def __init__(self, agents):
        self.agents = agents
    def decide_actions(self, shared_state):
        """
        Query each specialized agent for an action,
        then combine them into a single action dict.
        """
        action_dict = {}
        for agent in self.agents:
            action_dict[agent.name] = agent.choose_action(shared_state)
        return action_dict
    def update_agents(self, prev_state, actions_dict, rewards_dict, next_state):
        """
        Update the Q-tables of each agent with their individual reward.
        """
        for agent in self.agents:
            agent.update(prev_state,
                         actions_dict[agent.name],
                         rewards_dict[agent.name],
                         next_state)
def train_multi_agent_system(num_episodes=50, max_steps=200):
    """
    Train the multi-agent system in the environment, using Q-learning for each agent.
    """
    env = MarketEnvironment(initial_price=100.0, max_steps=max_steps)
    quote_agent = QuoteOptimizationAgent()
    inv_agent = InventoryBalancingAgent()
    vol_agent = VolatilityHedgingAgent()
    coord = Coordinator(agents=[quote_agent, inv_agent, vol_agent])
    for episode in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            # Coordinator retrieves each agent's action
            actions_dict = coord.decide_actions(state)
            # Step environment
            next_state, rewards_dict, done, info = env.step(actions_dict)
            # Update the agents
            coord.update_agents(state, actions_dict, rewards_dict, next_state)
            state = next_state
    return quote_agent, inv_agent, vol_agent
def run_demo_episode(env, coordinator):
    """
    Run a single episode with trained agents for demonstration.
    """
    state = env.reset()
    done = False
    total_rewards = {'quote_opt_agent': 0.0, 'inv_bal_agent': 0.0, 'vol_hedge_agent': 0.0}
    while not done:
        actions_dict = coordinator.decide_actions(state)
        next_state, rewards_dict, done, _ = env.step(actions_dict)
        for k in total_rewards:
            total_rewards[k] += rewards_dict[k]
        state = next_state
    return total_rewards
if __name__ == "__main__":
    # Train the multi-agent system
    quote_agent, inv_agent, vol_agent = train_multi_agent_system(num_episodes=200, max_steps=500)
    # Initialize a coordinated system with the trained agents
    coordinator = Coordinator([quote_agent, inv_agent, vol_agent])
    test_env = MarketEnvironment(initial_price=100.0, max_steps=100)
    # Run a demo episode
    final_rewards = run_demo_episode(test_env, coordinator)
    print("Demo Episode Rewards:", final_rewards)
    print("Final Price:", test_env.price)
    print("Final Inventory:", test_env.inventory)
    print("Demo completed.")
Below is a brief breakdown of the code:
• MarketEnvironment simulates the evolving price, inventory
changes, and handles the step transitions.
• Each specialized agent (QuoteOptimizationAgent, Inventory-
BalancingAgent, VolatilityHedgingAgent) inherits from
BaseAgent, which implements tabular Q-learning.
• The Coordinator collects actions from each specialized agent
and provides them to the environment, then updates each
agent’s Q-table with its individual reward.
• train_multi_agent_system orchestrates a full training loop
across multiple episodes.
• run_demo_episode runs a validation or demonstration episode
using the trained agents to show how the multi-agent system
behaves.
By assigning distinct responsibilities to each agent and com-
bining their decisions with a coordinator, this multi-agent rein-
forcement learning approach can yield an agile and holistic market
making strategy that adapts to changing market conditions.
Chapter 11
Explanatory Factor
Market Maker
This chapter focuses on interpretability by explicitly modeling ex-
planatory factors—such as supply-demand imbalances, news-driven
price triggers, and intraday cyclical patterns—and correlating these
factors with price changes through a custom linear or boosted tree
model. Below is a comprehensive approach that highlights how one
might:
• Identify and engineer features (e.g., supply-demand imbal-
ance, news sentiment, cyclical variables).
• Train an interpretable regression model (linear or gradient boosted)
to predict short-term price changes.
• Use the model’s predictions and feature importances to guide
quoting decisions.
• Implement a dynamic rebalancing loop that adjusts spreads
based on real-time factor values and model forecasts.
The interpretability arises from the model’s ability (whether
linear with coefficients or tree-based with feature importances) to
show the contribution of each factor to predicted price changes.
This transparent structure helps risk managers understand why
quotes shift, linking decisions back to easily identifiable market
drivers.
Python Code
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
def generate_synthetic_data(num_samples=2000,
                            seed=42,
                            imbalance_scale=5.0,
                            news_scale=3.0):
    """
    Generate synthetic market data that includes:
    1. supply_demand_imbalance: A factor that influences price changes.
    2. news_sentiment: Encodes external triggers from news releases.
    3. cyclical_hour: Representative of intraday cyclical patterns.
    4. price_change: The target variable we want to predict for quoting decisions.
    :param num_samples: Number of data points to generate.
    :param seed: Random seed for reproducibility.
    :param imbalance_scale: Amplitude for the supply-demand imbalance factor.
    :param news_scale: Amplitude for the news sentiment factor.
    :return: A pandas DataFrame with features and the target variable.
    """
    np.random.seed(seed)
    # Simulate cyclical intraday features: hours in a day from 0 to 24
    cyclical_hour = np.random.uniform(0, 24, num_samples)
    # Supply-demand imbalance factor
    supply_demand_imbalance = imbalance_scale * (2 * np.random.rand(num_samples) - 1)
    # News sentiment factor
    news_sentiment = news_scale * (2 * np.random.rand(num_samples) - 1)
    # Price change is influenced by these factors; add some noise.
    # Example relationship:
    # price_change ~ 0.3 * imbalance + 0.2 * news - cyclical effect + random noise
    random_noise = np.random.normal(0, 0.5, num_samples)
    # Introduce cyclical (intraday) effect: e.g., certain hours might have more impact
    cyclical_effect = 0.1 * np.sin(cyclical_hour / 24.0 * 2 * np.pi)
    price_change = 0.3 * supply_demand_imbalance + \
                   0.2 * news_sentiment - \
                   0.3 * cyclical_effect + \
                   random_noise
    data = pd.DataFrame({
        'supply_demand_imbalance': supply_demand_imbalance,
        'news_sentiment': news_sentiment,
        'cyclical_hour': cyclical_hour,
        'price_change': price_change
    })
    return data
def feature_engineering(df):
    """
    Perform basic feature engineering on the DataFrame:
    1. Transform cyclical_hour into sine/cosine components to capture continuous cyclical patterns.
    2. (Optional) Scale or encode other features if desired.
    :param df: Original DataFrame with supply_demand_imbalance, news_sentiment,
               cyclical_hour, and price_change.
    :return: Transformed DataFrame ready for modeling.
    """
    # Convert cyclical_hour into sine/cosine to reflect circular time
    df['hour_sin'] = np.sin(2 * np.pi * df['cyclical_hour'] / 24)
    df['hour_cos'] = np.cos(2 * np.pi * df['cyclical_hour'] / 24)
    # Drop the raw cyclical hour if not needed
    df = df.drop(columns=['cyclical_hour'])
    return df
def train_interpretable_model(df, use_linear=False):
    """
    Train either a linear regression or a gradient boosting model
    to predict short-term price changes based on explanatory factors.
    :param df: DataFrame with features and a 'price_change' column.
    :param use_linear: If True, train a linear model for interpretability with coefficients.
                       If False, train a gradient boosting model for feature importances.
    :return: Trained model, feature names, and training/test sets.
    """
    # Separate features and target
    X = df.drop(columns=['price_change'])
    y = df['price_change']
    feature_names = X.columns.tolist()
    # Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                        random_state=42)
    # Train model
    if use_linear:
        model = LinearRegression()
    else:
        model = GradientBoostingRegressor(n_estimators=100,
                                          learning_rate=0.1,
                                          max_depth=3,
                                          random_state=42)
    model.fit(X_train, y_train)
    return model, feature_names, (X_train, X_test, y_train, y_test)
def interpret_linear_model(model, feature_names):
    """
    Print out the coefficients of a trained LinearRegression model
    for interpretability.
    :param model: Trained LinearRegression model.
    :param feature_names: List of feature names in the same order as the model coefficients.
    """
    coeffs = model.coef_
    intercept = model.intercept_
    print("Linear Model Intercept:", intercept)
    for name, c in zip(feature_names, coeffs):
        print(f"Coefficient for {name}: {c:.4f}")
def interpret_gb_model(model, feature_names):
    """
    Plot feature importances for a trained GradientBoostingRegressor
    to understand which factors influence price changes the most.
    :param model: Trained GradientBoostingRegressor model.
    :param feature_names: List of feature names in the same order used by the model.
    """
    importances = model.feature_importances_
    sorted_idx = np.argsort(importances)
    plt.barh(np.array(feature_names)[sorted_idx], importances[sorted_idx])
    plt.title("Gradient Boosting Feature Importances")
    plt.show()
def quote_decision_logic(predicted_price_change,
                         current_mid_price,
                         base_spread=0.01,
                         imbalance_factor=1.0,
                         inventory_level=0,
                         risk_aversion=1.0):
    """
    Determine quoting spreads (bid/ask) based on the predicted price change,
    current price, supply-demand imbalance factor, and risk aversion.
    :param predicted_price_change: Model's forecast of short-term price move.
    :param current_mid_price: The current mid-market price.
    :param base_spread: Base spread size (e.g., 1% of price) in absolute terms or as a fraction.
    :param imbalance_factor: Factor capturing supply-demand imbalance (amplifies spread).
    :param inventory_level: Current inventory of the market maker (could adjust if inventory is large).
    :param risk_aversion: Higher means wider spreads to reduce risk.
    :return: (bid_quote, ask_quote) - the quotes for bid and ask prices.
    """
    # Determine direction
    spread_adjustment = 1.0 + abs(imbalance_factor) * 0.1 + abs(inventory_level) * 0.01
    risk_adjustment = 1.0 + risk_aversion * 0.1
    if predicted_price_change > 0:
        # Expect upward move => set bid a bit closer to current mid, ask a bit further
        bid_price = current_mid_price * (1 - base_spread * 0.5 / spread_adjustment / risk_adjustment)
        ask_price = current_mid_price * (1 + base_spread * 1.5 / spread_adjustment / risk_adjustment)
    else:
        # Expect downward move => set bid further, ask closer
        bid_price = current_mid_price * (1 - base_spread * 1.5 / spread_adjustment / risk_adjustment)
        ask_price = current_mid_price * (1 + base_spread * 0.5 / spread_adjustment / risk_adjustment)
    return bid_price, ask_price
def dynamic_rebalancing_loop(model,
                             df,
                             current_mid_price=100.0,
                             inventory=0.0,
                             steps=20,
                             risk_aversion=1.0,
                             use_linear=False):
    """
    Simulate a dynamic market making process over a series of steps.
    Each step:
    1. Takes a row of data and infers a short-term price change from the model.
    2. Sets quotes based on the predicted price movement, supply-demand imbalance,
       and the current inventory level.
    3. Simulates a hypothetical fill that updates inventory and price.
    4. Returns the history of decisions for analysis.
    :param model: The trained model (linear or gradient boosting).
    :param df: Feature DataFrame with columns for supply_demand_imbalance,
               news_sentiment, hour_sin, hour_cos.
    :param current_mid_price: Initial mid price.
    :param inventory: Initial inventory level.
    :param steps: Number of steps to simulate.
    :param risk_aversion: Parameter to widen or tighten spreads based on risk appetite.
    :param use_linear: Indicates the type of model for inference (linear or gradient boosting).
    :return: A list of records containing step data, quotes, and inventory changes.
    """
    history = []
    feature_cols = [col for col in df.columns if col != 'price_change']
    sample_size = len(df)
    for i in range(steps):
        # Randomly select a sample from df
        row_idx = np.random.randint(0, sample_size)
        row_features = df.iloc[row_idx][feature_cols].to_frame().T
        if use_linear:
            predicted_change = model.predict(row_features)[0]
        else:
            predicted_change = model.predict(row_features)[0]
        # Use supply_demand_imbalance from the current row if available
        imbalance = row_features['supply_demand_imbalance'].values[0]
        # Calculate quotes
        bid, ask = quote_decision_logic(predicted_price_change=predicted_change,
                                        current_mid_price=current_mid_price,
                                        base_spread=0.01,
                                        imbalance_factor=imbalance,
                                        inventory_level=inventory,
                                        risk_aversion=risk_aversion)
        # Simulate a hypothetical fill:
        # If predicted_change > 0, assume the ask gets lifted (we sell, decreasing inventory).
        # If predicted_change < 0, assume the bid gets hit (we buy, increasing inventory).
        fill_size = np.random.randint(1, 5)  # random fill size
        if predicted_change > 0:
            # Price might move upward, so assume the ask gets lifted
            fill_price = ask
            inventory -= fill_size  # we sold to a buyer
        else:
            # Price might move downward, so assume the bid gets hit
            fill_price = bid
            inventory += fill_size  # we bought from a seller
        # Update mid price a bit
        current_mid_price = (bid + ask) / 2.0
        record = {
            'step': i,
            'selected_row_index': row_idx,
            'predicted_change': predicted_change,
            'bid_quote': bid,
            'ask_quote': ask,
            'fill_price': fill_price,
            'inventory': inventory,
            'imbalance_factor': imbalance,
            'updated_mid_price': current_mid_price
        }
        history.append(record)
    return history
def main():
    # Generate synthetic data
    raw_data = generate_synthetic_data(num_samples=3000)
    # Feature engineering
    processed_data = feature_engineering(raw_data)
    # Train a linear model for interpretability
    linear_model, linear_feats, _ = train_interpretable_model(processed_data, use_linear=True)
    print("=== Linear Model Interpretation ===")
    interpret_linear_model(linear_model, linear_feats)
    # Train a gradient boosting model and visualize feature importances
    gb_model, gb_feats, _ = train_interpretable_model(processed_data, use_linear=False)
    print("\n=== Gradient Boosting Feature Importance ===")
    interpret_gb_model(gb_model, gb_feats)
    # Demonstrate dynamic rebalancing with the linear model
    print("\n=== Dynamic Rebalancing with Linear Model ===")
    lin_history = dynamic_rebalancing_loop(linear_model, processed_data,
                                           current_mid_price=100.0,
                                           inventory=0.0,
                                           steps=5,
                                           risk_aversion=1.0,
                                           use_linear=True)
    for rec in lin_history:
        print(rec)
    # Demonstrate dynamic rebalancing with the gradient boosting model
    print("\n=== Dynamic Rebalancing with Gradient Boosting Model ===")
    gb_history = dynamic_rebalancing_loop(gb_model, processed_data,
                                          current_mid_price=100.0,
                                          inventory=0.0,
                                          steps=5,
                                          risk_aversion=1.5,
                                          use_linear=False)
    for rec in gb_history:
        print(rec)
if __name__ == "__main__":
    main()
Below is a brief outline of the key elements in the code:
• generate_synthetic_data: Creates a dataset with inter-
pretable factors (supply-demand imbalance, news sentiment,
cyclical hour) and a synthetic price-change target.
• feature_engineering: Transforms cyclical data (hours) into
sine/cosine components for better capturing time-of-day cy-
cles.
• train_interpretable_model: Either fits a linear regression
(direct coefficients for interpretability) or a gradient boosting
model (feature importances).
• interpret_linear_model / interpret_gb_model: Demon-
strations of how to extract and visualize model interpretabil-
ity, via linear coefficients or feature importances.
• quote_decision_logic: Generates bid/ask quotes based on
the model’s predicted short-term price change, current mid-
price, inventory level, and a risk aversion parameter.
• dynamic_rebalancing_loop: Simulates how a market maker
might use these quotes in real time, updating inventory and
mid-price based on hypothetical fills.
• main: Ties everything together—generates data, trains mod-
els, interprets them, and runs a basic dynamic rebalancing
simulation.
By tracing how factors like supply-demand imbalance and news
sentiment affect the model’s predictions—and in turn the quot-
ing spreads—risk managers and traders can better understand and
justify each quoting decision, ensuring that the strategy remains
transparent and aligned with observed market drivers.
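To make the quoting rule concrete, here is a small worked example with illustrative inputs, assuming the quote_decision_logic function defined above is in scope:
# Worked example of quote_decision_logic (values are illustrative):
bid, ask = quote_decision_logic(predicted_price_change=0.4,
                                current_mid_price=100.0,
                                base_spread=0.01,
                                imbalance_factor=2.0,
                                inventory_level=0,
                                risk_aversion=1.0)
# spread_adjustment = 1 + 0.2 + 0 = 1.2, risk_adjustment = 1.1
# bid = 100 * (1 - 0.01 * 0.5 / 1.2 / 1.1) ~= 99.62   (kept close to mid)
# ask = 100 * (1 + 0.01 * 1.5 / 1.2 / 1.1) ~= 101.14  (pushed further away)
print(round(bid, 2), round(ask, 2))
The asymmetric treatment of the two sides is what lets the factor-driven forecast tilt the quotes toward the expected move while the imbalance, inventory, and risk-aversion terms jointly decide how wide the overall spread should be.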
Chapter 12
Term Structure
Arbitrage Quoter
Below is an extended description of how to implement a Term
Structure Arbitrage Quoter in Python, followed by the complete
code snippet. The approach involves monitoring price or implied
volatility misalignments across different maturities (e.g., spot, near-
month, and far-month futures), forecasting near-future changes,
and placing dynamic offsetting orders for each maturity to exploit
these small but consistent arbitrage opportunities. Core steps in-
clude:
• Gathering real-time data for each maturity (spot, near-month,
far-month).
• Using a (potentially advanced) volatility forecasting module that
predicts short-term implied vol (a simple EWMA sketch follows this list).
• Detecting relative mispricing across maturities by comparing
current quotes to a “fair value” model.
• Placing or modifying quotes to capture short-lived inefficiencies,
and synchronizing inventory across maturities to manage overall
exposure.
• Updating a profit-and-loss (PnL) metric to measure performance
and adjusting the strategy if a threshold of risk is exceeded.
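The volatility forecasting step in the engine below is deliberately a placeholder random walk. A still-simple but slightly more realistic drop-in, sketched here under the assumption that a history of recent returns is available, is a RiskMetrics-style EWMA forecast; the decay value of 0.94 is the conventional daily setting, not a calibrated choice.
import numpy as np
def ewma_vol_forecast(returns, lam=0.94):
    """One-step EWMA (RiskMetrics-style) volatility forecast; lam is illustrative."""
    returns = np.asarray(returns, dtype=float)
    var = returns[0] ** 2
    for r in returns[1:]:
        var = lam * var + (1.0 - lam) * r ** 2   # recursive variance update
    return float(np.sqrt(var))
# daily_returns = np.random.normal(0, 0.01, 250)
# print(ewma_vol_forecast(daily_returns))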
Below is the complete Python code that illustrates these con-
cepts in a self-contained manner:
Python Code
import numpy as np
import time
class TermStructureArbEngine:
    """
    A class to handle term structure arbitrage logic by forecasting implied volatility,
    detecting mispricing, placing dynamic quotes, and managing multi-maturity inventory.
    """
    def __init__(self, maturities, initial_prices, initial_vols, inventory_limits):
        """
        Initialize the term structure arbitrage engine.
        :param maturities: List of maturity labels (e.g., ['spot', '1M', '3M', '6M']).
        :param initial_prices: Dictionary of each maturity's current price or future price.
        :param initial_vols: Dictionary of each maturity's implied volatility estimate.
        :param inventory_limits: Dictionary specifying max inventory allowed for each maturity.
        """
        self.maturities = maturities
        self.current_prices = initial_prices.copy()  # Could be real-time updates from an API
        self.implied_vols = initial_vols.copy()
        # Track inventory: how many contracts or units are held for each maturity
        self.inventory = {m: 0 for m in maturities}
        self.inventory_limits = inventory_limits
        self.position_value = 0.0  # Aggregated PnL measure
        # A placeholder for forecasting model parameters.
        # In practice, you might load a trained model or calibrate GARCH/EWMA.
        self.model_params = {m: {'alpha': 0.05, 'beta': 0.1} for m in maturities}
    def update_market_data(self, new_prices, new_vols):
        """
        Update the market data for each maturity.
        :param new_prices: Dict of updated prices.
        :param new_vols: Dict of updated implied vols.
        """
        for m in self.maturities:
            if m in new_prices:
                self.current_prices[m] = new_prices[m]
            if m in new_vols:
                self.implied_vols[m] = new_vols[m]
    def forecast_implied_vols(self):
        """
        A placeholder function for forecasting implied volatility for each maturity,
        e.g., using GARCH/EWMA or a neural net. Here we use a simple random walk approach.
        """
        for m in self.maturities:
            alpha = self.model_params[m]['alpha']
            beta = self.model_params[m]['beta']
            # Example of a naive random walk with drift
            self.implied_vols[m] = (1 - alpha) * self.implied_vols[m] + alpha * (
                self.implied_vols[m] + beta * np.random.randn()
            )
    def fair_value_model(self, maturity):
        """
        A hypothetical function that calculates the 'fair value' for each maturity
        based on the implied volatility and some parametric pricing approach.
        For demonstration, we use a simple linear function of implied vol.
        :param maturity: The maturity for which to compute a fair price.
        :return: Estimated fair price for that maturity.
        """
        vol = self.implied_vols[maturity]
        base_price = self.current_prices[maturity]
        # Example: fair price as a function of current price plus a premium
        # that depends on the implied volatility level
        return base_price + (0.5 * vol * base_price)
    def detect_mispricing(self):
        """
        Compare current market prices to our fair value estimates across maturities.
        Return a dictionary that indicates whether each maturity is under/over priced
        relative to fair value.
        :return: Dictionary {maturity: mispricing_value},
                 where mispricing_value > 0 => current price < fair value (potential buy),
                       mispricing_value < 0 => current price > fair value (potential sell).
        """
        signals = {}
        for m in self.maturities:
            fv = self.fair_value_model(m)
            market_price = self.current_prices[m]
            signals[m] = fv - market_price
        return signals
    def place_quotes_for_arbitrage(self, signals):
        """
        Place or adjust quotes depending on the mispricing signals. If the signal
        is positive, we want to buy that maturity; if negative, we want to sell.
        :param signals: Dictionary of mispricing signals for each maturity.
        """
        for m, signal in signals.items():
            if signal > 0:
                # Potential buy signal if we have capacity
                if self.inventory[m] < self.inventory_limits[m]:
                    self.execute_order(m, order_type="BUY", size=1)
            elif signal < 0:
                # Potential sell signal if we are not short beyond the limit
                if self.inventory[m] > -self.inventory_limits[m]:
                    self.execute_order(m, order_type="SELL", size=1)
    def execute_order(self, maturity, order_type, size):
        """
        Execute an order by modifying the inventory and updating approximate PnL.
        Usually, you'd interface with an exchange/broker to track real fill prices.
        :param maturity: Maturity to trade.
        :param order_type: "BUY" or "SELL".
        :param size: Number of contracts to buy or sell.
        """
        fill_price = self.current_prices[maturity]
        if order_type == "BUY":
            self.inventory[maturity] += size
            cost = fill_price * size
            self.position_value -= cost  # Deduct the cost from PnL
        elif order_type == "SELL":
            self.inventory[maturity] -= size
            proceeds = fill_price * size
            self.position_value += proceeds  # Add the proceeds to PnL
def synchronize_inventory(self):
"""
Adjust positions to keep total exposure manageable across
,→ maturities.
This might involve offsetting overbought or oversold
,→ conditions.
"""
        net_position = sum(self.inventory[m] for m in self.maturities)
        max_net_allowed = sum(self.inventory_limits[m] for m in self.maturities)
        if abs(net_position) > max_net_allowed:
            # Example: if net_position is too large, try offsetting from the farthest maturity
            # For demonstration, we pick the last maturity to offset
            largest_maturity = self.maturities[-1]
            if net_position > 0 and self.inventory[largest_maturity] > 0:
                # Sell some portion
                to_sell = min(net_position, self.inventory[largest_maturity])
                self.execute_order(largest_maturity, "SELL", to_sell)
            elif net_position < 0 and self.inventory[largest_maturity] < 0:
                # Buy to offset the short position
                to_buy = min(-net_position, -self.inventory[largest_maturity])
                self.execute_order(largest_maturity, "BUY", to_buy)
def mark_to_market(self):
"""
Revalue the open inventory positions at current market
,→ prices to update PnL.
"""
        mtm_value = 0.0
        for m in self.maturities:
            mtm_value += self.inventory[m] * self.current_prices[m]
        # The difference from the 'position_value' is the current PnL
        total_pnl = self.position_value + mtm_value
return total_pnl
def run_term_structure_arbitrage_demo():
"""
Demonstration function that simulates running the term structure
,→ arbitrage
for a few iterations with random data updates. In real usage,
,→ data would be
continuously updated from an API or order book feed.
"""
maturities = ['spot', '1M', '3M']
initial_prices = {'spot': 100.0, '1M': 102.0, '3M': 104.0}
initial_vols = {'spot': 0.20, '1M': 0.22, '3M': 0.25}
inventory_limits = {'spot': 10, '1M': 10, '3M': 10}
arb_engine = TermStructureArbEngine(maturities, initial_prices,
,→ initial_vols, inventory_limits)
for step in range(5):
print(f"\n=== Iteration {step + 1} ===")
# Step 1: Forecast implied volatilities
arb_engine.forecast_implied_vols()
# Step 2: Generate random updates to simulate new market
,→ data
price_updates = {}
vol_updates = {}
        for m in maturities:
            price_updates[m] = arb_engine.current_prices[m] + np.random.randn() * 0.5
            vol_updates[m] = arb_engine.implied_vols[m] + np.random.randn() * 0.01
# Step 3: Update market data in the engine
arb_engine.update_market_data(price_updates, vol_updates)
# Step 4: Detect mispricing signals
signals = arb_engine.detect_mispricing()
# Step 5: Place quotes to exploit mispricing
arb_engine.place_quotes_for_arbitrage(signals)
# Step 6: Synchronize inventory to manage risk
arb_engine.synchronize_inventory()
# Step 7: Compute current PnL
current_pnl = arb_engine.mark_to_market()
print("Updated Prices:", arb_engine.current_prices)
print("Inventory:", arb_engine.inventory)
print("Implied Vols:", arb_engine.implied_vols)
print("Signals:", signals)
print("Current PnL:", current_pnl)
        time.sleep(1)  # Sleep to simulate time passage; remove or adjust in production
# Entry point for the module
if __name__ == "__main__":
run_term_structure_arbitrage_demo()
The above code snippet defines the following key elements:
• A TermStructureArbEngine class that manages forecasting
implied volatilities, detecting mispricing across maturities,
placing trades (mimicked by updating a local inventory struc-
ture), and marking the portfolio to market to calculate run-
ning PnL.
• A demonstration function (run_term_structure_arbitrage_demo)
that simulates random updates to prices and volatilities, con-
tinuously calling the engine’s methods to illustrate an itera-
tive trading cycle.
• Simplified placeholders for forecasting, model parameters, and
real exchange order interface. In a real environment, these
would be replaced by GARCH models, neural networks, or
application-specific APIs.
This comprehensive example serves as a foundation for imple-
menting and experimenting with term structure arbitrage by dy-
namically managing quotes and hedging positions across multiple
futures maturities.
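To make the forecasting placeholder slightly more concrete, the sketch below shows one way an EWMA-style volatility estimate could stand in for the naive random-walk update; the function name, the 0.94 decay factor, and the use of squared log returns are illustrative assumptions rather than part of the engine above.
import numpy as np

def ewma_vol_forecast(log_returns, lam=0.94, initial_var=None):
    """Minimal EWMA recursion: var_t = lam * var_{t-1} + (1 - lam) * r_{t-1}**2.
    :param log_returns: recent log returns for one maturity.
    :param lam: decay factor (0.94 is the classic RiskMetrics daily setting).
    :param initial_var: optional seed variance; defaults to the sample variance.
    :return: forecast of next-period volatility (standard deviation).
    """
    log_returns = np.asarray(log_returns, dtype=float)
    var = np.var(log_returns) if initial_var is None else initial_var
    for r in log_returns:
        var = lam * var + (1.0 - lam) * r ** 2
    return np.sqrt(var)

# Example: forecast a volatility proxy from a short price history, then
# assign the result inside forecast_implied_vols instead of the random walk.
recent_returns = np.diff(np.log([102.0, 102.4, 101.9, 102.8, 103.1]))
print(ewma_vol_forecast(recent_returns))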
Chapter 13
Hidden Liquidity
Capture Mechanism
This algorithm aims to uncover hidden liquidity in the order book
by aggregating partial fill signals and micro price movements. The
core idea is to detect the footprints left by iceberg orders (large
orders that reveal only a portion of their true size) or dark pool
patterns that cause partial fills at seemingly stable price levels.
By parsing the order book data in real time, recognizing recur-
ring partial fill footprints, and correlating them with short bursts
of volatility, the system infers the presence of undisclosed depth.
Once such hidden blocks are suspected, the algorithm places op-
portunistic quotes just around those price points to capture better
fills as the hidden liquidity reveals itself. Crucially, robust fail-safes
ensure that if the suspected liquidity pocket is illusory or quickly
consumed, the system reduces or removes the exposure to avoid
adverse selection.
Key highlights of this approach include:
• Advanced order book parsing to monitor partial fill events and
subtle volume shifts.
• Real-time detection influenced by short-horizon price moves, in-
dicating possible iceberg behavior.
• Opportunistic quoting logic that adjusts bid/ask offsets around
suspected hidden liquidity.
• Fail-safes to prevent the strategy from overcommitting when sig-
nals are inconclusive or if market conditions abruptly change.
By intelligently positioning orders near detected liquidity pock-
ets, market makers can improve fill quality and reduce transaction
costs, all while maintaining conservative risk management.
Python Code
Below is a Python code snippet that demonstrates a simplified
proof-of-concept implementation for uncovering hidden liquidity in
an order book. It models data ingestion, partial fill aggregation,
iceberg detection, and a quoting engine equipped with fail-safes:
import time
import random
import numpy as np
from collections import deque
class OrderBook:
"""
A simplified OrderBook structure holding price levels and
,→ volumes.
For demonstration, we keep minimal data representations.
"""
def __init__(self):
# Each side is a dict: price -> volume
        self.bids = {}
        self.asks = {}
# Store partial fills in a deque for quick processing
self.partial_fills = deque()
def update_order_book(self, updates):
"""
Update order book with new quotes or trades.
:param updates: list of (side, price, volume) or partial
,→ fill indicators
"""
for side, price, volume in updates:
            if side == 'BID':
                if volume <= 0:
                    # Remove the bid if volume is non-positive
                    if price in self.bids:
                        del self.bids[price]
                else:
                    self.bids[price] = volume
            elif side == 'ASK':
                if volume <= 0:
                    # Remove the ask if volume is non-positive
                    if price in self.asks:
                        del self.asks[price]
                else:
                    self.asks[price] = volume
elif side == 'PARTIAL_FILL':
# Partial fill announcements go into the partial
,→ fills queue
# volume argument here might represent the partial
,→ fill size
self.partial_fills.append((price, volume))
def get_best_bid(self):
"""Return the best (highest) bid price and volume."""
        if not self.bids:
            return None, 0
        best_price = max(self.bids.keys())
        return best_price, self.bids[best_price]
def get_best_ask(self):
"""Return the best (lowest) ask price and volume."""
        if not self.asks:
            return None, 0
        best_price = min(self.asks.keys())
        return best_price, self.asks[best_price]
class HiddenLiquidityDetector:
"""
Detect hidden liquidity by aggregating partial fill signals and
checking if repeated fill footprints occur at specific price
,→ levels.
"""
def __init__(self, max_samples=50, fill_threshold=5):
"""
:param max_samples: Number of partial fill samples to keep.
:param fill_threshold: Count of repeated partial fills to
,→ suspect hidden liquidity.
"""
self.max_samples = max_samples
self.fill_threshold = fill_threshold
# We store partial fills in a rolling structure
self.recent_fills = deque()
def update_fills(self, partial_fills):
"""
Append new partial fills and maintain a rolling history.
:param partial_fills: A list or deque of (price, volume)
,→ partial fills.
"""
for fill in partial_fills:
self.recent_fills.append(fill)
while len(self.recent_fills) > self.max_samples:
self.recent_fills.popleft()
def detect_iceberg_levels(self):
"""
Analyze recent fills to see if there's a price level with
,→ repeated partial fills,
suggesting an iceberg order. Returns a set of suspected
,→ iceberg price levels.
"""
price_count = {}
for price, vol in self.recent_fills:
price_count[price] = price_count.get(price, 0) + 1
# Identify price levels with fill counts above threshold
suspected_levels = {
price for price, count in price_count.items() if count
,→ >= self.fill_threshold
}
return suspected_levels
class QuotingEngine:
"""
Places quotes near suspected hidden liquidity levels. Uses
,→ real-time heuristics
on price movement to adjust offset. Also contains fail-safes for
,→ over-commitment.
"""
def __init__(self, inventory_limit=100, base_offset=1,
,→ max_quotes=5):
"""
:param inventory_limit: Maximum net inventory we allow.
:param base_offset: Basic offset (in ticks) from the
,→ suspected level.
:param max_quotes: Maximum number of opportunistic quotes
,→ simultaneously.
"""
        self.inventory = 0
self.inventory_limit = inventory_limit
self.base_offset = base_offset
self.max_quotes = max_quotes
self.active_quotes = []
def place_opportunistic_quotes(self, order_book,
,→ iceberg_levels):
"""
For each suspected iceberg level, place quotes around it
,→ (bid if lower than
current midpoint, ask if higher).
"""
best_bid, bid_vol = order_book.get_best_bid()
best_ask, ask_vol = order_book.get_best_ask()
if best_bid is None or best_ask is None:
# No valid inside market data
return []
mid_price = (best_bid + best_ask) / 2.0
new_quotes = []
# Limit the number of new quotes placed each cycle
allowed_quotes = self.max_quotes - len(self.active_quotes)
if allowed_quotes <= 0:
return []
        # Randomly filter suspected levels to avoid flooding
        iceberg_levels_sampled = list(iceberg_levels)
        random.shuffle(iceberg_levels_sampled)
        iceberg_levels_sampled = iceberg_levels_sampled[:allowed_quotes]
        for level in iceberg_levels_sampled:
            # Decide if we want to place a buy or sell quote
            if level < mid_price and self.inventory < self.inventory_limit:
                # Place a buy quote near the iceberg level
                quote_price = max(level, best_bid)  # place at or slightly above best bid
                offset_quote_price = quote_price + self.base_offset * 0.5  # add half offset
                new_quotes.append(('BID', offset_quote_price, 1))  # quantity 1 for simplicity
                # Mark as active
                self.active_quotes.append(('BID', offset_quote_price, 1))
            elif level > mid_price and self.inventory > -self.inventory_limit:
                # Place a sell quote near the iceberg level
                quote_price = min(level, best_ask)  # place at or slightly below best ask
                offset_quote_price = quote_price - self.base_offset * 0.5  # subtract half offset
                new_quotes.append(('ASK', offset_quote_price, 1))
                # Mark as active
                self.active_quotes.append(('ASK', offset_quote_price, 1))
return new_quotes
def handle_fills_and_risk(self, fill_events):
"""
Update inventory based on fill events. Each fill event
,→ includes side, price, volume.
Implement fail-safes if the inventory surpasses or threatens
,→ our limit.
"""
        for side, price, vol in fill_events:
            if side == 'BID':
                # We bought vol
                self.inventory += vol
            elif side == 'ASK':
                # We sold vol
                self.inventory -= vol
        # Fail-safe: if inventory is too large or too short, remove or reduce some quotes
        if abs(self.inventory) >= self.inventory_limit:
            # Clear active quotes to reduce further fills
            self.active_quotes = []
def simulate_order_book_events():
"""
Simulates incoming order book data, partial fills, and returns
,→ random updates.
In a live system, these would come from market data feeds.
"""
    sides = ['BID', 'ASK']
    updates = []
    # Randomly generate new quotes
    for _ in range(3):
        side = random.choice(sides)
        price = random.randint(95, 105)
        volume = random.randint(1, 10)
        updates.append((side, price, volume))
    # Chance to generate a partial fill
    if random.random() < 0.7:  # 70% chance
        fill_price = random.randint(95, 105)
        fill_volume = 1  # partial fill
        updates.append(('PARTIAL_FILL', fill_price, fill_volume))
return updates
def run_hidden_liquidity_detection():
"""
Main function to tie everything together. Continuously reads
,→ simulated
order book events, updates the detector, places quotes, and
,→ checks risk.
"""
order_book = OrderBook()
detector = HiddenLiquidityDetector(max_samples=50,
,→ fill_threshold=3)
quoting_engine = QuotingEngine(inventory_limit=10,
,→ base_offset=1, max_quotes=5)
for _ in range(20): # 20 event cycles for demonstration
# Simulate incoming data
updates = simulate_order_book_events()
# Update the order book with new quotes or partial fills
order_book.update_order_book(updates)
# Process partial fills for detection
if order_book.partial_fills:
partial_fills_list = list(order_book.partial_fills)
order_book.partial_fills.clear() # clear after reading
detector.update_fills(partial_fills_list)
# Detect potential iceberg levels
suspected_levels = detector.detect_iceberg_levels()
# Place opportunistic quotes
new_quotes =
,→ quoting_engine.place_opportunistic_quotes(order_book,
,→ suspected_levels)
# In a real system, we would send these new_quotes to the
,→ exchange
# For simulation, let's assume some of them get partially
,→ filled
        fill_events = []
        for sq in new_quotes:
            side, price, volume = sq
            fill_prob = random.random()
            if fill_prob < 0.2:  # 20% chance of immediate partial fill
                fill_events.append((side, price, 1))  # fill entire quantity for simplicity
# Handle fill events and manage risk
quoting_engine.handle_fills_and_risk(fill_events)
# Print debug info
print(f"Detected iceberg levels: {suspected_levels}")
print(f"New quotes placed: {new_quotes}")
print(f"Fill events: {fill_events}")
print(f"Current Inventory: {quoting_engine.inventory}")
print("-" * 50)
        # Sleep to mimic time intervals; not needed in real HFT contexts
        time.sleep(0.2)
if __name__ == "__main__":
run_hidden_liquidity_detection()
Outside of a real production environment, this demonstration
shows the core operational flow for uncovering hidden liquidity:
• The “OrderBook” class stores current bid/ask updates and
queues partial fill indicators.
• The “HiddenLiquidityDetector” collects and analyzes partial
fill data, looking for repeated fills at the same price.
• The “QuotingEngine” places opportunistic quotes around price
points identified as likely iceberg levels, while a fail-safe mech-
anism clears or reduces quotes when inventory reaches defined
limits.
• The “simulate_order_book_events” function mimics incom-
ing market data and partial fills (in reality, this data would
be streamed from an exchange).
• The main “run_hidden_liquidity_detection” loop orchestrates
the process, integrating new data, detecting hidden blocks,
placing quotes, processing fills, and printing debug state-
ments.
This code is a simplified template and would need refinement
and rigorous testing under real market conditions, emphasizing la-
tency sensitivities, order execution logic, and robust risk controls.
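One simple refinement in that direction, assuming the same (price, volume) format for partial-fill tuples, is to weight repeated fills by cumulative volume rather than raw count, so a few large refills at one level matter more than many one-lot prints; the threshold value below is purely illustrative.
from collections import defaultdict

def detect_iceberg_levels_by_volume(recent_fills, volume_threshold=10):
    """Flag price levels whose cumulative partial-fill volume exceeds a threshold.
    :param recent_fills: iterable of (price, volume) partial-fill tuples.
    :param volume_threshold: cumulative volume needed to suspect hidden depth.
    :return: set of suspected iceberg price levels.
    """
    volume_at_price = defaultdict(float)
    for price, vol in recent_fills:
        volume_at_price[price] += vol
    return {p for p, total in volume_at_price.items() if total >= volume_threshold}

# Example usage with synthetic fills clustered at price 100:
fills = [(100, 4), (101, 1), (100, 5), (100, 3)]
print(detect_iceberg_levels_by_volume(fills, volume_threshold=10))  # {100}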
Chapter 14
Correlation Cluster
Market Maker
Below is a comprehensive overview of the Correlation Cluster Mar-
ket Maker approach, which groups together correlated assets to
spread risk, exploit price co-movements, and dynamically adjust
quoting strategies based on real-time correlation estimates. The
core idea is to:
• Continuously monitor price data for a set of assets.
• Calculate (and re-calculate) their correlation matrix or hierar-
chical clustering assignments intraday.
• Divide assets into clusters based on correlation.
• Provide liquidity (quotes) for each asset in a way that accounts
for both individual inventory targets and overall cluster-level ex-
posure.
• Automatically rebalance positions if shifts in correlation or mar-
ket conditions occur, with an optimization layer ensuring that risk
is managed efficiently across related assets.
Python Code
Below is a Python code snippet demonstrating how one might im-
plement a proof-of-concept for a cluster-based market making sys-
tem. This includes: 1) Data fetching or simulation. 2) Periodic
correlation matrix calculation and cluster assignment. 3) A com-
bined quoting engine that adapts spreads for each asset based on
cluster analysis. 4) Real-time inventory monitoring and rebalanc-
ing within clusters. 5) A simplified run loop simulating order fills
and dynamic updates.
import numpy as np
import pandas as pd
from sklearn.cluster import AgglomerativeClustering
import datetime
import random
def simulate_market_data(assets, num_points=300, seed=42):
"""
Simulate random intraday market data for multiple assets.
Returns a dictionary mapping each asset to a pandas DataFrame
,→ with timestamp and price columns.
:param assets: list of asset tickers (e.g. ['AAPL', 'GOOG',
,→ 'MSFT'])
:param num_points: number of simulated data points
:param seed: random number generator seed
"""
    np.random.seed(seed)
    market_data = {}
    # Create a mock time index at 1-minute intervals
    start = datetime.datetime.now().replace(hour=9, minute=30, second=0, microsecond=0)
    time_index = [start + datetime.timedelta(minutes=i) for i in range(num_points)]
    for asset in assets:
        # We assume each asset starts at some random base price around 100, then evolves
        base_price = 100 + np.random.uniform(-10, 10)
        # Generate random returns
        returns = np.random.normal(0, 0.02, size=num_points).cumsum()
        # Simulated price path
        prices = base_price * (1 + returns)
        df = pd.DataFrame({
            'timestamp': time_index,
            'price': prices
        })
        market_data[asset] = df
return market_data
def compute_log_returns(market_data, assets):
"""
Compute log returns for each asset over the provided market
,→ data.
Returns a DataFrame of log returns indexed by timestamp.
:param market_data: dictionary of asset->DataFrame with columns
,→ [timestamp, price]
:param assets: list of asset names
"""
    aligned_data = []
    for asset in assets:
        df = market_data[asset].copy()
        df['log_return'] = np.log(df['price']).diff()
        df.set_index('timestamp', inplace=True)
        aligned_data.append(df[['log_return']].rename(columns={'log_return': asset}))
    # Merge all assets' log returns into one DataFrame
    merged_df = pd.concat(aligned_data, axis=1)
return merged_df
def dynamic_correlation_clustering(returns_df, n_clusters=2):
"""
Perform hierarchical clustering based on the correlation matrix
,→ of asset returns.
:param returns_df: DataFrame with columns corresponding to
,→ assets and rows to log returns in time
:param n_clusters: number of clusters for
,→ AgglomerativeClustering
:return: dictionary of asset->cluster, correlation_matrix
"""
# Drop any rows with NaN (e.g., first row)
returns_df = returns_df.dropna()
if returns_df.empty:
# Edge case: if not enough data, return everything in one
,→ cluster
return {col: 0 for col in returns_df.columns}, None
corr_matrix = returns_df.corr()
# Convert correlation to distance
dist_matrix = 1 - corr_matrix
    # Agglomerative clustering can consume the precomputed distance matrix
    # (1 - correlation) directly, so each asset is grouped by how far its
    # return profile sits from every other asset's.
    asset_names = list(dist_matrix.columns)
    # Note: in scikit-learn >= 1.2 the 'affinity' argument is named 'metric'.
    clustering_model = AgglomerativeClustering(n_clusters=n_clusters,
                                               affinity='precomputed',
                                               linkage='average')
    labels = clustering_model.fit_predict(dist_matrix)
    asset_cluster_map = {asset_names[i]: labels[i] for i in range(len(asset_names))}
return asset_cluster_map, corr_matrix
class CorrelationClusterMarketMaker:
def __init__(self, assets, initial_capital=100000,
,→ n_clusters=2):
"""
Initialize the cluster-based market maker.
:param assets: list of asset names
:param initial_capital: total starting capital
:param n_clusters: number of clusters to use for
,→ hierarchical clustering
"""
        self.assets = assets
        self.n_clusters = n_clusters
        self.capital = initial_capital
        # Dictionary to track current positions and average cost for each asset
        self.positions = {asset: 0 for asset in assets}
        self.avg_costs = {asset: 0.0 for asset in assets}
# Store cluster assignments
self.cluster_assignments = {asset: 0 for asset in assets}
# Spread widths (can be modified dynamically)
# Key: cluster index, Value: (base_spread,
,→ volatility_factor)
self.cluster_spread_policy = {c: (0.01, 0.05) for c in
,→ range(n_clusters)}
# Record realized PnL
self.realized_pnl = 0.0
def update_clusters(self, asset_cluster_map):
""" Update the cluster assignments. """
self.cluster_assignments = asset_cluster_map
def quote_prices(self, asset, price):
"""
Determine bid and ask prices for a given asset based on
,→ cluster spread policy.
:param asset: asset name (string)
:param price: current market price (float)
:return: (bid_price, ask_price)
"""
cluster_idx = self.cluster_assignments[asset]
base_spread, vol_factor =
,→ self.cluster_spread_policy[cluster_idx]
        # For demonstration, treat the cluster_id as an indicator for how wide to set spreads.
        # A more sophisticated approach might incorporate real-time volatility or inventory.
        # Example: spread = base_spread * (1 + vol_factor * random_value)
        # We'll do a random multiplier to simulate real-time conditions.
        random_multiplier = 1 + random.random() * vol_factor
        spread = price * base_spread * random_multiplier
        bid_price = price - spread / 2
        ask_price = price + spread / 2
return bid_price, ask_price
def handle_fill(self, asset, fill_price, volume):
"""
Update inventory and realized PnL when a fill occurs.
:param asset: asset name
:param fill_price: fill price
:param volume: number of shares/contracts traded
"""
        current_position = self.positions[asset]
        current_avg_cost = self.avg_costs[asset]
        # If volume > 0, it is a buy; if volume < 0, it is a sell.
        # Realized PnL occurs if the direction changes or the position is partially offset.
        if np.sign(current_position) == np.sign(volume) or current_position == 0:
# We are adding to existing position or starting new
,→ position
new_position = current_position + volume
# Weighted average cost update
if new_position != 0:
new_avg_cost = (current_position * current_avg_cost
,→ + volume * fill_price) / new_position
else:
new_avg_cost = 0.0
            self.positions[asset] = new_position
            self.avg_costs[asset] = new_avg_cost
        else:
            # Some or all of the existing position is offset
            if abs(volume) <= abs(current_position):
                # Position is partially or fully closed; the closed quantity is
                # -volume, which carries the sign of the existing position.
                realized = -volume * (fill_price - current_avg_cost)
                self.realized_pnl += realized
                self.positions[asset] += volume
                # If position goes to zero, reset avg cost
                if self.positions[asset] == 0:
                    self.avg_costs[asset] = 0.0
            else:
                # volume is larger in magnitude than the current position => net direction changes.
                # First realize PnL on closing the entire existing position.
                realized = current_position * (fill_price - current_avg_cost)
                self.realized_pnl += realized
                # Start a new position with the leftover volume (carries the sign of volume)
                new_vol = volume + current_position
                self.positions[asset] = new_vol
                self.avg_costs[asset] = fill_price
def optimize_cluster_exposure(self):
"""
Dummy optimization step that attempts to reduce risk if a
,→ single asset's position
dominates its cluster exposure.
"""
cluster_positions = {}
        for asset in self.assets:
            c_id = self.cluster_assignments[asset]
            cluster_positions.setdefault(c_id, 0)
            cluster_positions[c_id] += self.positions[asset]
# If any cluster has too large an exposure, attempt to
,→ reduce it
# We'll just log it for demonstration. In a real system,
,→ we'd place offsetting trades.
for c_id, total_pos in cluster_positions.items():
if abs(total_pos) > 50: # Arbitrary threshold
print(f"WARNING: Cluster {c_id} total position =
,→ {total_pos}. Consider rebalancing.")
def run_correlation_cluster_strategy(assets, steps=300,
,→ n_clusters=2):
"""
Main function to run the Correlation Cluster Market Maker
,→ strategy on simulated or real data.
:param assets: list of asset names
:param steps: number of data points to simulate
:param n_clusters: number of clusters
"""
# Initialize the market data
data_dict = simulate_market_data(assets, num_points=steps,
,→ seed=42)
# Initialize the market maker
mm = CorrelationClusterMarketMaker(assets,
,→ initial_capital=100000, n_clusters=n_clusters)
# We'll store the returns in a rolling window to recalc
,→ correlation intraday
rolling_window = 30
all_timestamps = data_dict[assets[0]]['timestamp'].values
# Simulate a run loop
for i in range(1, steps):
current_time = all_timestamps[i]
# Build a DataFrame of newest data up to index i
slice_data = {}
for asset in assets:
slice_data[asset] = data_dict[asset].iloc[:i+1]
returns_df = compute_log_returns(slice_data, assets)
# Recompute clusters periodically (for example, every 20
,→ steps)
if i % 20 == 0:
# Use a rolling window for correlation
if i >= rolling_window:
recent_returns = returns_df.iloc[-rolling_window:]
else:
recent_returns = returns_df.copy()
asset_cluster_map, corr_matrix =
,→ dynamic_correlation_clustering(recent_returns,
,→ n_clusters=n_clusters)
mm.update_clusters(asset_cluster_map)
# For each asset, we get the latest price, quote, and
,→ simulate a random fill
for asset in assets:
latest_price = slice_data[asset].iloc[-1]['price']
bid, ask = mm.quote_prices(asset, latest_price)
# Simulate the chance of an incoming market order
# Typically determined by real order flow. For
,→ demonstration, random logic here:
            if random.random() < 0.05:  # 5% chance of an incoming buy
                fill_price = ask
                volume = random.randint(1, 10)  # random buy size
                mm.handle_fill(asset, fill_price, volume)
            elif random.random() < 0.05:  # 5% chance of an incoming sell
                fill_price = bid
                volume = -random.randint(1, 10)  # random sell size
                mm.handle_fill(asset, fill_price, volume)
# Periodically run a cluster-level exposure check
mm.optimize_cluster_exposure()
print("===== Final Results =====")
    print("Positions:", mm.positions)
print("Average Costs:", mm.avg_costs)
print("Realized PnL:", mm.realized_pnl)
if __name__ == "__main__":
# Example usage
asset_list = ["AAPL", "GOOG", "MSFT", "AMZN", "TSLA"]
run_correlation_cluster_strategy(asset_list, steps=200,
,→ n_clusters=2)
Above, we walk through:
• Simulating (or hypothetically fetching) price data for several
assets.
• Calculating log returns and performing hierarchical clustering
on the correlation matrix.
• Dynamically updating cluster assignments in the Correlation-
ClusterMarketMaker class.
• Sending out quotes based on each asset’s retrieved cluster
policy (spread, volatility factor, etc.).
• Handling fills in real time, updating positions, realized PnL,
and average costs.
• Providing a basic “optimization” step to warn if a cluster’s
overall position grows too large, suggesting rebalancing.
This skeleton illustrates an end-to-end pipeline for a cluster-
based market maker. In practical deployment, you would integrate
real market data streams, robust concurrency controls, advanced
optimization layers, and more sophisticated volatility and correla-
tion modeling to handle real-world complexities.
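For readers who prefer SciPy's hierarchical tools, a minimal sketch of the same correlation-to-cluster step using a condensed distance matrix might look as follows; the 1 - correlation distance and average linkage mirror the choices above, while the maxclust cut is an illustrative alternative to fixing n_clusters in AgglomerativeClustering.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_by_correlation(returns_df, n_clusters=2):
    """Hierarchically cluster assets on the 1 - correlation distance."""
    corr = returns_df.corr().values
    dist = 1.0 - corr
    np.fill_diagonal(dist, 0.0)                  # exact zeros on the diagonal
    condensed = squareform(dist, checks=False)   # condensed upper-triangular form
    link = linkage(condensed, method='average')
    labels = fcluster(link, t=n_clusters, criterion='maxclust')
    return dict(zip(returns_df.columns, labels))
The returned dictionary has the same asset-to-cluster shape as dynamic_correlation_clustering, so it could be passed directly to update_clusters in the class above.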
Chapter 15
Adaptive Risk Aversion
Tuning
Approach Explanation
This section explains an approach for implementing a dynamic,
real-time adjustment of the risk aversion parameter in a market
making environment. Traditionally, market makers fix their risk
aversion for an entire trading session, but this method repeat-
edly recalibrates it based on updated market conditions, effectively
balancing the maker’s appetite for risk against sudden changes in
volatility, volume, and sentiment.
Core ideas and methodology:
• Real-Time Market Signals: The algorithm periodically
captures real-time indicators such as volatility (e.g., from an
EWMA or GARCH model), trading volume levels, and sen-
timent data (optional).
• PnL Tracking: It keeps track of historical PnL (both re-
alized and unrealized) to gauge how profitable or risky the
current quoting strategy is.
• Risk Aversion Update: Using a gradient-based method,
the algorithm computes the deviation of current realized PnL
variance from a user-defined target. If the realized variance is
consistently above target, the algorithm increases risk aver-
sion (i.e., tightens quoting, widening spreads). Conversely, if
variance is below target, the algorithm decreases risk aversion
and quotes more aggressively to seek higher profits.
• Feedback Loop: By iterating this process, the system con-
verges to a dynamic equilibrium of quoting aggressiveness
suited to prevailing market conditions. Thus, risk aversion
“spikes” when volatility surges and relaxes in calmer mar-
kets.
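Written compactly, the recalibration step sketched in the code below treats the gap between realized PnL variance and its target as the quantity to descend on; the symbols here (risk aversion γ, learning rate η, target variance σ²) are notational shorthand for the parameters exposed in the implementation, not standard market conventions:
\[
\gamma_{t+1} \;=\; \min\!\Big(1,\; \max\!\big(0,\; \gamma_t + \eta\,\big(\widehat{\mathrm{Var}}(\mathrm{PnL}_t) - \sigma^2_{\mathrm{target}}\big)\big)\Big),
\]
with an additional upward nudge applied whenever observed volatility crosses a preset threshold.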
Python Code
Below is a Python code snippet that demonstrates how to imple-
ment a dynamic risk aversion mechanism for a market making
strategy. It includes functions to retrieve (mock) market data,
track realized PnL variance, update the risk aversion parameter,
and compute new quotes accordingly.
import numpy as np
import random
from collections import deque
class MarketDataFeed:
"""
Mock data feed that provides market data in real-time.
In an actual implementation, this would interface with live APIs
,→ or
a data stream capturing price, volatility, etc.
"""
def __init__(self, price=100.0, volatility=0.2, sentiment=0.0,
,→ volume=1000):
        self.price = price
        self.volatility = volatility
        self.sentiment = sentiment
        self.volume = volume
def update_market_state(self):
"""
Simulate random market updates. In a real scenario, you'd
pull data from an exchange or a real-time feed.
"""
        # Random fluctuations in price
        price_change = random.gauss(0, self.volatility)
        self.price += price_change
        # In reality, volatility might also shift slightly with volume or sentiment
        self.volatility = max(0.01, self.volatility + random.gauss(0, 0.01))
        # Sentiment could be random in this mock. Real implementation might parse news or social media.
        self.sentiment += random.gauss(0, 0.05)
        self.sentiment = max(-1.0, min(1.0, self.sentiment))  # clamp between -1 and 1
        # Volume might also fluctuate
        self.volume += random.randint(-10, 10)
        self.volume = max(1, self.volume)
        return {
            "price": self.price,
            "volatility": self.volatility,
            "sentiment": self.sentiment,
            "volume": self.volume
        }
class DynamicRiskAversionMarketMaker:
"""
Implements a market making strategy with dynamic, real-time
,→ adjustment
of the risk aversion parameter via gradient-based updates.
"""
def __init__(self,
initial_risk_aversion=0.5,
target_pnl_variance=1.0,
learning_rate=0.1,
history_length=100):
"""
:param initial_risk_aversion: Starting value of the risk
,→ aversion parameter (0 < aversion < 1).
:param target_pnl_variance: Desired upper bound for PnL
,→ variance.
:param learning_rate: Step size for gradient-based updates
,→ of the risk aversion parameter.
:param history_length: Number of previous PnL data points to
,→ store for variance calculation.
"""
self.risk_aversion = initial_risk_aversion
self.target_pnl_variance = target_pnl_variance
self.learning_rate = learning_rate
self.pnl_history = deque(maxlen=history_length)
        # For illustration, suppose the market maker keeps track of a simple inventory
        # and positions are updated when quotes fill; we keep it simple here.
        self.inventory = 0
        self.last_price = 100.0
def update_pnl(self, current_price):
"""
Compute the approximate realized PnL based on simple
,→ inventory changes and
store it for variance tracking. A more sophisticated
,→ approach might track each trade.
"""
        # Approximate daily PnL = (current_price - last_price) * inventory
        realized_pnl = (current_price - self.last_price) * self.inventory
        self.pnl_history.append(realized_pnl)
        self.last_price = current_price
def compute_pnl_variance(self):
"""
Calculate the rolling variance of PnL using the stored
,→ history.
"""
        if len(self.pnl_history) < 2:
            return 0.0
        mean_pnl = np.mean(self.pnl_history)
        var_pnl = np.mean([(p - mean_pnl) ** 2 for p in self.pnl_history])
return var_pnl
def update_risk_aversion(self, current_volatility):
"""
Gradient-based update of the risk aversion parameter. Here
,→ we treat the difference
between actual PnL variance and target_pnl_variance as a
,→ 'loss' to be minimized.
The partial derivative is simplified for demonstration; in
,→ practice, you might add
smoothing or multiple terms to reflect volatility, volume,
,→ and sentiment signals.
"""
current_variance = self.compute_pnl_variance()
# 'loss' function measuring how far we are above/below the
,→ target variance
loss = current_variance - self.target_pnl_variance
# Basic gradient update: increase risk aversion if variance
,→ is above target, and vice versa
self.risk_aversion = self.risk_aversion + self.learning_rate
,→ * loss
# Constrain risk_aversion between 0 and 1 (just a typical
,→ simplified bound)
self.risk_aversion = max(0.0, min(1.0, self.risk_aversion))
# Extra volatility-based dynamic: if volatility is extremely
,→ high, push risk aversion up
if current_volatility > 0.5:
self.risk_aversion = min(1.0, self.risk_aversion + 0.05)
def compute_quotes(self, market_state):
"""
Example function that produces a bid and ask price based on
,→ the current risk aversion.
- The more risk averse, the wider the quote.
- The less risk averse, the narrower the quote.
"""
base_price = market_state["price"]
vol = market_state["volatility"]
# Spread is a function of risk aversion and volatility. For
,→ instance, we scale
# by a base factor times (1 + risk_aversion + vol).
spread = 0.01 * (1 + self.risk_aversion + vol)
bid_price = base_price * (1 - spread)
ask_price = base_price * (1 + spread)
return bid_price, ask_price
def execute_quotes(self, bid_price, ask_price, market_state):
"""
Execution simulation: random fill model. If the real price
is below our ask, we might sell; if above our bid, we might
,→ buy.
This is a simplified placeholder.
"""
current_price = market_state["price"]
# If current price is lower than ask, we might get filled on
,→ the ask (sell).
# If it's higher than bid, fill on the bid (buy).
# Probability of fill might also depend on volume,
,→ sentiment, etc.
fill_probability = 0.3 # for demonstration
        if random.random() < fill_probability:
            if current_price >= ask_price:
                # Market maker sells 1 unit
                self.inventory -= 1
            elif current_price <= bid_price:
                # Market maker buys 1 unit
                self.inventory += 1
def main_simulation():
"""
Main loop simulating market updates and dynamic risk aversion
,→ market making.
"""
# Initialize a market data feed and the dynamic risk aversion
,→ maker
feed = MarketDataFeed(price=100.0, volatility=0.2,
,→ sentiment=0.0, volume=1000)
maker =
,→ DynamicRiskAversionMarketMaker(initial_risk_aversion=0.5,
target_pnl_variance=2.0,
learning_rate=0.05,
history_length=50)
# Run a simple loop simulating intraday ticks
for timestep in range(200):
market_state = feed.update_market_state()
maker.update_pnl(market_state["price"]) # update PnL
,→ tracking
maker.update_risk_aversion(market_state["volatility"]) #
,→ dynamic aversion update
bid, ask = maker.compute_quotes(market_state)
maker.execute_quotes(bid, ask, market_state)
# Print out occasionally
if timestep % 50 == 0:
print(f"Time: {timestep}")
print(f" Current Price: {market_state['price']:.2f}")
print(f" Volatility: {market_state['volatility']:.2f}")
print(f" Risk Aversion: {maker.risk_aversion:.2f}")
            print(f" Inventory: {maker.inventory}")
print(f" Rolling PnL Variance:
,→ {maker.compute_pnl_variance():.4f}")
print("-"*50)
if __name__ == "__main__":
main_simulation()
• MarketDataFeed supplies randomized price, volatility, vol-
ume, and sentiment updates, simulating live market condi-
tions.
• DynamicRiskAversionMarketMaker implements both PnL vari-
ance tracking and a gradient-based risk aversion update.
• update_risk_aversion raises (or lowers) the risk aversion
based on the deviation of actual PnL variance from the tar-
get, and nudges it further upward if volatility surpasses a
threshold.
• compute_quotes and execute_quotes illustrate how wider or
narrower spreads and basic inventory changes could be incor-
porated.
• main_simulation runs a simple loop of 200 time steps, show-
ing how risk aversion evolves along with market conditions.
This end-to-end snippet demonstrates the essential mechanics
behind a dynamic, real-time risk aversion algorithm for market
making. In production, one would replace the mock data feed with
a real exchange or aggregator, introduce a more robust fill model,
and refine the gradient-based update to incorporate additional risk,
sentiment, or strategy-specific constraints.
Chapter 16
Meta-Learning Spread
Optimizer
Below is a detailed explanation of how a Meta-Learning Spread
Optimizer can be implemented for market making scenarios. The
approach outlined here begins by training a base model (either
through supervised learning or reinforcement learning) on histori-
cal data to learn initial quoting strategies. A meta-learner is then
introduced, observing the base model’s performance across various
instruments and market conditions, iteratively refining quoting pa-
rameters like spread size and refresh frequency to maximize profit
and manage risk.
Key stages include:
• Data Preparation: Gather and preprocess historical data for
multiple instruments spanning diverse market conditions.
• Base Model Training: Use a supervised or RL approach to learn
a primary policy for market making (e.g., fixing an initial risk aver-
sion and quoting style).
• Meta-Learner Initialization: Observe how the base policy per-
forms across different contexts, recording performance statistics.
• Meta-Learning Loops: Adjust quoting parameters in real time or
in mini-batches, refining the strategy with gradient-based updates
or attention-based re-weighting of learned features.
• Evaluation: Apply the refined quoting policy on out-of-sample
data or simulated forward environments to confirm improvements.
Python Code
Below is a Python code snippet that demonstrates the foundational
elements of a Meta-Learning Spread Optimizer, including environ-
ment simulation, base model training, meta-learner updates, and
a simplified quoting logic.
import numpy as np
import random
from typing import List, Dict, Tuple
class MarketEnvironment:
"""
A simple environment simulating multiple instruments under
,→ varying market conditions.
Each instrument has price dynamics, volatility, and trading
,→ volume features.
"""
def __init__(self, num_instruments: int = 3, max_steps: int =
,→ 1000, seed: int = 42):
        np.random.seed(seed)
        random.seed(seed)
self.num_instruments = num_instruments
self.max_steps = max_steps
self.current_step = 0
        # Randomly generate initial prices and volatilities for each instrument
        self.prices = [100 + np.random.normal(0, 1) for _ in range(num_instruments)]
        self.vols = [np.random.uniform(0.1, 0.5) for _ in range(num_instruments)]
        self.step_data = []
def reset(self):
"""
Reset the environment to its initial state.
"""
        self.current_step = 0
        self.prices = [100 + np.random.normal(0, 1) for _ in range(self.num_instruments)]
        self.vols = [np.random.uniform(0.1, 0.5) for _ in range(self.num_instruments)]
        self.step_data = []
return self._get_observation()
def step(self, actions: List[Dict[str, float]]) ->
,→ Tuple[List[float], float, bool]:
"""
Simulate a single time step with the given quoting actions
,→ for each instrument.
:param actions: A list of dicts specifying 'spread_size' and
,→ 'refresh_freq' for each instrument.
:return: (observation, reward, done)
"""
        # Basic price update using random walk + mild trend
        for i in range(self.num_instruments):
            trend = np.random.normal(0, 0.1)  # random drift
            self.prices[i] += trend + np.random.normal(0, self.vols[i])
            # Volatility could also evolve
            self.vols[i] = max(0.01, self.vols[i] + np.random.normal(0, 0.01))
        # Compute reward from simplistic fill probability vs risk notion
        reward = 0.0
        for i, action in enumerate(actions):
            spread = action.get('spread_size', 1.0)
            freq = action.get('refresh_freq', 1.0)
# The fill probability might inversely scale with spread
fill_probability = max(0, 1.0 - 0.05 * spread)
# Reward can incorporate a scaled 'spread' minus some
,→ cost for large freq
reward_instrument = fill_probability * spread - 0.01 *
,→ freq
reward += reward_instrument
self.current_step += 1
done = (self.current_step >= self.max_steps)
obs = self._get_observation()
        self.step_data.append({
            'prices': self.prices.copy(),
            'spreads': [act['spread_size'] for act in actions],
            'freqs': [act['refresh_freq'] for act in actions],
            'reward': reward
        })
return obs, reward, done
def _get_observation(self) -> List[Dict[str, float]]:
"""
Return a list of dicts containing price and volatility info
,→ for each instrument.
"""
        obs = []
        for i in range(self.num_instruments):
            obs.append({
                'price': self.prices[i],
                'vol': self.vols[i]
            })
        return obs
class BaseModel:
"""
A base quoting strategy that could be trained via supervised
,→ learning or RL.
Here, we simply store learned parameters for spread and refresh
,→ frequency.
"""
def __init__(self, num_instruments: int):
# Example: one param for spread and one for refresh
,→ frequency per instrument
self.spread_params = [1.0 for _ in range(num_instruments)]
self.freq_params = [1.0 for _ in range(num_instruments)]
def predict_actions(self, obs: List[Dict[str, float]]) ->
,→ List[Dict[str, float]]:
"""
Given the observation, predict quoting parameters
,→ (spread_size, refresh_freq).
"""
        actions = []
        for i, instrument_obs in enumerate(obs):
            actions.append({
                'spread_size': max(0.1, self.spread_params[i]),
                'refresh_freq': max(0.1, self.freq_params[i])
            })
return actions
def train_base_model(env: MarketEnvironment, base_model: BaseModel,
,→ steps: int = 200) -> None:
"""
Train the base model on the environment. This is a placeholder
,→ to mimic training.
In practice, reinforcement learning or regression on historical
,→ data would be applied.
"""
    for _ in range(steps):
        obs = env.reset()
        done = False
        while not done:
            actions = base_model.predict_actions(obs)
            obs, reward, done = env.step(actions)
# Toy update: shift spreads/frequency based on reward
,→ gradient approximation
# This is a placeholder gradient update.
for i in range(env.num_instruments):
base_model.spread_params[i] += 0.0001 * reward
base_model.freq_params[i] -= 0.00005 * reward
class MetaLearner:
"""
Observes the base model's performance across different contexts
,→ (instruments and market states).
Refines the base model's quoting parameters for better
,→ generalization.
"""
def __init__(self, base_model: BaseModel, meta_lr: float =
,→ 0.001):
self.base_model = base_model
self.meta_lr = meta_lr
def meta_update(self, performance_data: List[float]) -> None:
"""
Updates the base model parameters based on aggregated
,→ performance across multiple markets.
:param performance_data: List of total rewards or
,→ performance metrics from each market environment.
"""
        # Example meta-update: shift parameters proportionally to the performance average
        avg_performance = np.mean(performance_data)
        for i in range(len(self.base_model.spread_params)):
            self.base_model.spread_params[i] += self.meta_lr * avg_performance
            self.base_model.freq_params[i] -= self.meta_lr * avg_performance * 0.5
def meta_learning_training(
base_model: BaseModel,
meta_learner: MetaLearner,
num_instruments: int = 3,
num_meta_epochs: int = 3,
env_steps_per_epoch: int = 200
):
"""
Conduct meta-learning over multiple synthetic market
,→ environments to refine the base model.
:param base_model: The pre-trained base model.
:param meta_learner: The meta learner responsible for refining
,→ quoting parameters.
:param num_instruments: Number of instruments to generate for
,→ each environment.
:param num_meta_epochs: Number of meta-learning epochs.
:param env_steps_per_epoch: Steps to train in each environment
,→ in each meta epoch.
"""
for epoch in range(num_meta_epochs):
performance_across_envs = []
# Repeat for multiple random environments to gather broad
,→ performance data
for _ in range(5): # number of distinct markets we want to
,→ test
            # Create a new environment with a different random seed to simulate diverse context
            env_seed = random.randint(1, 10000)
            env = MarketEnvironment(num_instruments=num_instruments, max_steps=50, seed=env_seed)
            total_reward = 0.0
            train_base_model(env, base_model, steps=env_steps_per_epoch)
            # Evaluate final performance in the environment
            obs = env.reset()
            done = False
            while not done:
                actions = base_model.predict_actions(obs)
                obs, reward, done = env.step(actions)
total_reward += reward
performance_across_envs.append(total_reward)
# Meta update based on performance across multiple market
,→ environments
meta_learner.meta_update(performance_across_envs)
        print(f"[Meta Epoch {epoch+1}] Average Performance = {np.mean(performance_across_envs):.4f}")
def example_usage():
"""
Demonstrate usage of the Meta-Learning Spread Optimizer.
"""
# Initialize environment, base model, and meta-learner
env = MarketEnvironment(num_instruments=3, max_steps=50,
,→ seed=42)
base_model = BaseModel(num_instruments=3)
train_base_model(env, base_model, steps=100) # initial base
,→ model training
meta_learner = MetaLearner(base_model=base_model, meta_lr=0.01)
meta_learning_training(
base_model=base_model,
meta_learner=meta_learner,
num_instruments=3,
num_meta_epochs=3,
env_steps_per_epoch=50
)
# Final demonstration of quoting after meta-learning
    final_obs = env.reset()
final_actions = base_model.predict_actions(final_obs)
print("Final Quoting Parameters:", final_actions)
if __name__ == "__main__":
example_usage()
This Python code outlines a simplified example of how to imple-
ment a Meta-Learning Spread Optimizer for market making. The
stages of the code can be summarized as follows:
• MarketEnvironment class simulates multiple instruments with
prices and volatilities evolving over time.
• BaseModel represents a basic quoting strategy that predicts
actions (spread size, refresh frequency) given observations.
• train_base_model function demonstrates a placeholder ap-
proach to train the base model.
• MetaLearner refines the base model’s parameters based on
aggregated performance measures across multiple market con-
texts.
• meta_learning_training function coordinates the meta-learning
loop, testing the base model on several environments and us-
ing the meta_update method for improvement.
• example_usage provides an end-to-end demonstration of how
these components fit together, resulting in progressively im-
proved quoting parameters.
By adjusting the implementation details—for instance, replac-
ing placeholder updates with actual gradient-based (supervised or
RL) computations—this framework can be extended to real-world
meta-learning scenarios, where a single quoting strategy needs to
adapt to multiple instruments and rapidly changing market condi-
tions.
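As one concrete way to replace the placeholder update with an actual gradient-based computation, the sketch below estimates the reward gradient with respect to a single spread parameter by finite differences; the evaluate_policy helper, the perturbation size, and the episode count are illustrative assumptions, not part of the framework above.
import copy

def finite_difference_spread_gradient(env, base_model, instrument_idx, eps=0.05, episodes=3):
    """Estimate d(mean episode reward) / d(spread_param) for one instrument."""
    def evaluate_policy(model):
        total = 0.0
        for _ in range(episodes):
            obs, done = env.reset(), False
            while not done:
                obs, reward, done = env.step(model.predict_actions(obs))
                total += reward
        return total / episodes

    plus = copy.deepcopy(base_model)
    plus.spread_params[instrument_idx] += eps
    minus = copy.deepcopy(base_model)
    minus.spread_params[instrument_idx] -= eps
    # Central-difference estimate of the reward sensitivity to the spread parameter
    return (evaluate_policy(plus) - evaluate_policy(minus)) / (2 * eps)

# The estimate could then drive an ascent step, for example:
# base_model.spread_params[i] += learning_rate * finite_difference_spread_gradient(env, base_model, i)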
Chapter 17
RL-Based Inventory
Hedging
Below is an outline of how one could implement a Reinforcement
Learning (RL)–based inventory hedging algorithm for market mak-
ing, consistent with the concept described in this chapter. The
approach involves simultaneously trading a primary instrument
and its hedging asset (for instance, correlated futures or options),
thereby managing net exposure more effectively. The agent penal-
izes high net positions through its reward function, encouraging
balanced inventory states while still seeking to capture profitable
opportunities.
In this illustrative example, we:
• Create a multi-instrument trading environment simulating two
asset prices.
• Track the agent’s inventory in both the main instrument and the
hedging asset.
• Define a reward function that combines immediate PnL with a
penalty for large net or unbalanced positions.
• Implement a Deep Q-Network (DQN)–style training loop using
PyTorch.
• Demonstrate how to place offsetting trades and manage an in-
ventory ratio.
Though simplified, it demonstrates the core building blocks for
dynamic hedging in a multi-instrument setting.
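In symbols, the per-step reward used in the environment below reads as the mark-to-market change of both legs minus a quadratic penalty on gross inventory, where the penalty weight κ corresponds to the penalty_scale constant in the code:
\[
r_t \;=\; q^{\mathrm{main}}_t\,\Delta p^{\mathrm{main}}_t \;+\; q^{\mathrm{hedge}}_t\,\Delta p^{\mathrm{hedge}}_t \;-\; \kappa\,\big(\lvert q^{\mathrm{main}}_t\rvert + \lvert q^{\mathrm{hedge}}_t\rvert\big)^{2}.
\]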
Python Code
Below is a Python code snippet that encompasses a self-contained
RL-based inventory hedging strategy, including environment def-
inition, neural network agent, training loop, and reward function
capturing PnL and inventory penalties.
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import random
from collections import deque
class MultiInstrumentEnv:
"""
A simplified environment for RL-based inventory hedging. It
,→ simulates two assets:
1) A main traded instrument (e.g., a stock or crypto).
2) A hedging instrument (e.g., a correlated future or option).
The agent takes actions to adjust its position in either
,→ instrument, striving to
manage overall net exposure while capturing profit.
"""
def __init__(self,
initial_main_price=100.0,
initial_hedge_price=50.0,
max_steps=200,
inventory_limit=10,
seed=42):
        np.random.seed(seed)
        random.seed(seed)
self.initial_main_price = initial_main_price
self.initial_hedge_price = initial_hedge_price
self.max_steps = max_steps
self.inventory_limit = inventory_limit
# Action space: 0=Hold, 1=Buy main, 2=Sell main, 3=Buy
,→ hedge, 4=Sell hedge
self.action_space_size = 5
        # For stochastic price updates
        # Prices are updated via random shocks with mild correlation
        self.correlation = 0.3
        self.vol_main = 1.0
        self.vol_hedge = 0.5
        self.reset()
def reset(self):
"""
Reset environment state at the beginning of each episode.
"""
self.step_count = 0
self.main_price = self.initial_main_price
self.hedge_price = self.initial_hedge_price
self.inventory_main = 0
self.inventory_hedge = 0
# Tally realized PnL if we choose to incorporate, though
,→ we'll compute reward each step
self.realized_pnl = 0.0
return self._get_state()
def _get_state(self):
"""
Return the current state representation.
State includes: main_price, hedge_price,
inventory_main, inventory_hedge,
ratio of main to hedge (for demonstration).
"""
        ratio = 0.0
        if abs(self.inventory_hedge) > 0:
            ratio = self.inventory_main / float(self.inventory_hedge)
        return np.array([
            self.main_price,
            self.hedge_price,
            self.inventory_main,
            self.inventory_hedge,
            ratio
        ], dtype=np.float32)
def step(self, action):
"""
Execute the agent's action, update the environment, compute
,→ reward, and return new state.
:param action: Discrete integer from 0 to 4 indicating how
,→ to adjust positions.
"""
# Record previous prices for PnL calculations
old_main_price = self.main_price
old_hedge_price = self.hedge_price
# Execute the chosen action
if action == 1: # Buy main
if self.inventory_main < self.inventory_limit:
self.inventory_main += 1
elif action == 2: # Sell main
if self.inventory_main > -self.inventory_limit:
self.inventory_main -= 1
elif action == 3: # Buy hedge
if self.inventory_hedge < self.inventory_limit:
self.inventory_hedge += 1
elif action == 4: # Sell hedge
if self.inventory_hedge > -self.inventory_limit:
self.inventory_hedge -= 1
# action == 0 means hold (no position change)
# Update prices
self._update_prices()
# Compute reward
reward = self._calculate_reward(old_main_price,
,→ old_hedge_price)
self.step_count += 1
done = (self.step_count >= self.max_steps)
next_state = self._get_state()
return next_state, reward, done, {}
def _update_prices(self):
"""
Update both asset prices with random shocks, introducing
,→ mild correlation.
"""
        # Normally distributed random shocks
        eps1 = np.random.normal(0, self.vol_main)
        eps2 = np.random.normal(0, self.vol_hedge)
        # Introduce correlation into eps2
        eps2 = self.correlation * eps1 + np.sqrt(1 - self.correlation ** 2) * eps2
self.main_price += eps1
self.hedge_price += eps2
# Ensure prices don't go negative in the simplified
,→ environment
self.main_price = max(0.1, self.main_price)
self.hedge_price = max(0.1, self.hedge_price)
def _calculate_reward(self, old_main_price, old_hedge_price):
"""
Reward = Delta PnL on main + Delta PnL on hedge - penalty *
,→ net_position^2
This encourages the agent to avoid carrying large positions
,→ while capturing profit from
favorable price movements. The net_position^2 penalty can be
,→ scaled to emphasize or
reduce inventory balancing behavior as needed.
"""
# Unrealized PnL changes from holding inventory across a
,→ price update
pnl_main = self.inventory_main * (self.main_price -
,→ old_main_price)
pnl_hedge = self.inventory_hedge * (self.hedge_price -
,→ old_hedge_price)
pnl_unrealized = pnl_main + pnl_hedge
# Inventory penalty factor
penalty_scale = 0.1
inv_penalty = penalty_scale * ((abs(self.inventory_main) +
,→ abs(self.inventory_hedge)) ** 2)
# Final reward
reward = pnl_unrealized - inv_penalty
return reward
def render(self):
"""
Optional rendering for debugging or demonstration.
Prints out current state details.
"""
print(f"Step: {self.step_count}, Main Price:
,→ {self.main_price:.2f}, "
f"Hedge Price: {self.hedge_price:.2f}, Inv Main:
,→ {self.inventory_main}, "
f"Inv Hedge: {self.inventory_hedge}")
class QNetwork(nn.Module):
"""
A simple feed-forward network for approximating the Q-function.
Given the current environment state, it outputs Q-values for
,→ each possible action.
"""
def __init__(self, state_dim, action_dim):
super(QNetwork, self).__init__()
        self.fc1 = nn.Linear(state_dim, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, action_dim)
        self.relu = nn.ReLU()
    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        return self.fc3(x)
class DQNAgent:
"""
A DQN-style agent that learns to pick actions (buy, sell, hold)
,→ for both the main
and hedging instrument to optimize the inventory risk-reward
,→ mechanics.
"""
def __init__(self, state_dim, action_dim, gamma=0.99, lr=1e-3,
batch_size=32, buffer_size=5000, epsilon_start=1.0,
epsilon_end=0.01, epsilon_decay=0.995):
        self.state_dim = state_dim
        self.action_dim = action_dim
        self.gamma = gamma
        self.lr = lr
        self.batch_size = batch_size
        self.replay_buffer = deque(maxlen=buffer_size)
        self.epsilon = epsilon_start
        self.epsilon_min = epsilon_end
        self.epsilon_decay = epsilon_decay
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.q_net = QNetwork(state_dim, action_dim).to(self.device)
        self.target_net = QNetwork(state_dim, action_dim).to(self.device)
        self.target_net.load_state_dict(self.q_net.state_dict())
        self.target_net.eval()
        self.optimizer = optim.Adam(self.q_net.parameters(), lr=self.lr)
def select_action(self, state):
"""
Epsilon-greedy action selection. The agent either picks a
,→ random action or
uses its Q-network to pick the best action given the current
,→ state.
"""
        if random.random() < self.epsilon:
            return random.randrange(0, self.action_dim)
        else:
            state_t = torch.FloatTensor(state).unsqueeze(0).to(self.device)
            with torch.no_grad():
                q_values = self.q_net(state_t)
            return q_values.argmax().item()
def store_transition(self, state, action, reward, next_state,
,→ done):
self.replay_buffer.append((state, action, reward,
,→ next_state, done))
def sample_batch(self):
        batch = random.sample(self.replay_buffer, self.batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        states = torch.FloatTensor(np.array(states)).to(self.device)
        actions = torch.LongTensor(actions).to(self.device)
        rewards = torch.FloatTensor(rewards).to(self.device)
        next_states = torch.FloatTensor(np.array(next_states)).to(self.device)
        dones = torch.FloatTensor(dones).to(self.device)
return states, actions, rewards, next_states, dones
def update_target_network(self):
self.target_net.load_state_dict(self.q_net.state_dict())
def train_step(self):
"""
Runs a single training update step on a sampled mini-batch
,→ of transitions.
"""
if len(self.replay_buffer) < self.batch_size:
return
states, actions, rewards, next_states, dones =
,→ self.sample_batch()
        # Current Q estimates
        q_values = self.q_net(states)
        # Index the Q-values corresponding to the actions taken
        q_values = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)
        # Next Q values from target network
        with torch.no_grad():
            next_q_values = self.target_net(next_states).max(1)[0]
        # Bellman backup
        target_q_values = rewards + self.gamma * next_q_values * (1 - dones)
        loss = nn.MSELoss()(q_values, target_q_values)
        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()
        # Decay epsilon
        if self.epsilon > self.epsilon_min:
            self.epsilon *= self.epsilon_decay
def main():
# Hyperparameters
num_episodes = 200
update_target_every = 20
env = MultiInstrumentEnv()
agent = DQNAgent(state_dim=5, action_dim=env.action_space_size)
all_rewards = []
for episode in range(num_episodes):
        state = env.reset()
episode_reward = 0
done = False
while not done:
action = agent.select_action(state)
            next_state, reward, done, _ = env.step(action)
agent.store_transition(state, action, reward,
,→ next_state, float(done))
agent.train_step()
state = next_state
episode_reward += reward
# Update target network occasionally
if episode % update_target_every == 0:
agent.update_target_network()
all_rewards.append(episode_reward)
# Print out progress
        if (episode + 1) % 10 == 0:
            print(f"Episode {episode+1}, Reward: {episode_reward:.2f}, Epsilon: {agent.epsilon:.2f}")
print("Training complete. Sample final rewards:",
,→ all_rewards[-10:])
if __name__ == "__main__":
main()
This full Python code example demonstrates one way to set up
an RL-based inventory hedging scheme:
• The MultiInstrumentEnv tracks two asset prices (a primary
instrument and a hedging instrument) and the agent’s inven-
tory in each.
• The reward function incentivizes the agent to capture PnL
from favorable price moves while penalizing large or unbal-
anced positions.
• The DQNAgent implements a Deep Q-Network that learns a
policy for how to buy, sell, or hold each instrument.
• Epsilon-greedy exploration ensures the agent continues to dis-
cover better hedging strategies throughout training.
• The environment and agent design can be extended or modi-
fied to incorporate more realistic market models, transaction
costs, or more sophisticated hedging instrument mechanics
(e.g., options Greek exposures).
This completes a minimal but functional demonstration of how
reinforcement learning may be used to manage inventory risk in a
multi-instrument market-making context.
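As a first step toward the extensions mentioned above, one could fold a simple proportional transaction cost into the environment's reward. The sketch below is purely illustrative; the names raw_pnl, traded_volume, and cost_per_unit are hypothetical and not part of the MultiInstrumentEnv defined earlier.

def reward_with_costs(raw_pnl, traded_volume, cost_per_unit=0.01):
    """Penalize turnover: subtract a linear transaction cost from the raw PnL reward."""
    return raw_pnl - cost_per_unit * abs(traded_volume)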
Chapter 18
Quantum Annealing
Price Discovery
In this approach, we introduce a simplified yet illustrative exam-
ple of using quantum annealing (or classical approximate solvers)
to tackle the combinatorial problem of optimal quoting in real
time. We model the decision to select one among multiple bid-ask
spread widths as a Quadratic Unconstrained Binary Optimization
(QUBO) problem. The guiding objective function incorporates:
(1) Inventory risk (penalizing large or imbalanced positions),
(2) Fill probability (encouraging actively traded spreads),
(3) Price uncertainty (widening spreads to mitigate risk).
By minimizing the energy function that sums these terms, we ef-
fectively encode the trade-off a market maker faces when choosing
spreads. In a real-world system, additional constraints (like a bud-
get for risk utilization or real-time updates to partial fills) would
be integrated similarly into the QUBO formulation.
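Written out, with binary variables $x_i \in \{0,1\}$ indicating selection of the $i$-th candidate spread, the energy minimized by the code below takes the form (a sketch matching what build_qubo constructs, with a unit-weight one-hot penalty):

E(x) = \sum_i \Big( w_{\mathrm{inv}} \,\frac{(I - I^{*})^{2}}{s_i} \;-\; w_{\mathrm{fill}}\, p_i \;+\; w_{\mathrm{price}}\, \sigma_i \Big)\, x_i \;+\; \Big( \sum_i x_i - 1 \Big)^{2},

where $I$ and $I^{*}$ are the current and target inventory, $s_i$ the $i$-th spread width, $p_i$ its estimated fill probability, and $\sigma_i$ its price-uncertainty estimate.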
Python Code
Below is a Python code snippet that demonstrates how to:
• Construct a QUBO dictionary capturing inventory risk, fill prob-
ability, and price uncertainty.
• Solve this QUBO using a classical solver from the dimod library.
• Illustrate how one might extend the solver call to quantum an-
nealers for more complex or larger-scale problems.
import numpy as np
import dimod
def build_qubo(spreads, target_inventory, current_inventory,
               fill_prob_estimates, price_uncertainty_estimates,
               w_inventory=1.0, w_fill=1.0, w_price=1.0):
    """
    Builds the QUBO dictionary representing the cost function for each possible
    spread selection. The QUBO aims to minimize:
        w_inventory * InventoryCost + w_fill * FillCost + w_price * PriceUncertaintyCost
    subject to exactly one spread being chosen.

    :param spreads: List of possible spread widths (e.g., [1, 2, 3]).
    :param target_inventory: Desired target inventory level.
    :param current_inventory: Current inventory level.
    :param fill_prob_estimates: Estimated fill probabilities for each spread.
    :param price_uncertainty_estimates: Estimated price uncertainty for each spread.
    :param w_inventory: Weight of inventory risk in cost function.
    :param w_fill: Weight (negative) of fill probability in cost function (since we
        want to maximize fill, we treat fill as negative cost).
    :param w_price: Weight of price uncertainty in cost function.
    :return: A dictionary representing the QUBO and a list of variable names.
    """
    # Number of possible spreads
    n_spreads = len(spreads)
    # Variable naming convention: x_i for i-th spread
    variables = [f'x_{i}' for i in range(n_spreads)]
    # Initialize QUBO as a dictionary for dimod.
    # QUBO keys are (var_i, var_j), and values are their associated coefficients.
    # For linear terms, j == i. For quadratic terms, j > i.
    qubo = {}
    # Penalty for picking more than one spread: (sum x_i - 1)^2
    # Expand (sum x_i - 1)^2 = sum x_i^2 + sum_{i<j} 2*x_i*x_j - 2*sum x_i + 1
    # We'll incorporate linear terms (x_i^2) and quadratic cross terms (x_i*x_j)
    # to ensure exactly one spread is chosen.
    for i in range(n_spreads):
        # x_i^2 term
        qubo[(variables[i], variables[i])] = qubo.get((variables[i], variables[i]), 0.0) + 1.0
        for j in range(i + 1, n_spreads):
            # 2 * x_i * x_j cross term
            qubo[(variables[i], variables[j])] = qubo.get((variables[i], variables[j]), 0.0) + 2.0
        # We'll also add the cost function terms here:
        # Inventory cost: (current_inventory - target_inventory)^2 if we pick spread i.
        # For demonstration, interpret a bigger spread as less risk to inventory.
        inventory_cost = (current_inventory - target_inventory)**2 / (spreads[i] + 1e-9)
        # Fill probability cost: we want to maximize fill probability, so treat it as negative cost.
        # fill_prob_estimates[i] is a fraction, e.g., 0.8 means 80% fill.
        fill_cost = -fill_prob_estimates[i]
        # Price uncertainty cost: higher uncertainty, bigger cost
        price_cost = price_uncertainty_estimates[i]
        # Summation of weighted cost for picking spread i:
        #   cost_i = w_inventory * inventory_cost + w_fill * fill_cost + w_price * price_cost
        cost_i = w_inventory * inventory_cost + w_fill * fill_cost + w_price * price_cost
        # Add cost_i to the diagonal (linear term) for x_i
        qubo[(variables[i], variables[i])] += cost_i
    # Now incorporate the linear adjustments for the penalty term:
    # from expanding (sum x_i - 1)^2, we get -2 for each x_i.
    for i in range(n_spreads):
        qubo[(variables[i], variables[i])] = qubo.get((variables[i], variables[i]), 0.0) - 2.0
    # Lastly, there's a constant +1 from (-1)^2, but that doesn't affect the QUBO solution,
    # so we can omit adding a constant term (or we can track it if needed).
    return qubo, variables
def solve_qubo(qubo_dict, variables, use_quantum_solver=False):
    """
    Solves the QUBO using either a classical solver from dimod or,
    if use_quantum_solver is True, a placeholder call for a quantum annealer.

    :param qubo_dict: Dictionary representing QUBO.
    :param variables: List of variable names.
    :param use_quantum_solver: Whether to attempt quantum annealing approach.
    :return: A dictionary with the best solution and the corresponding energy.
    """
    # Convert dictionary to a dimod BinaryQuadraticModel
    bqm = dimod.BinaryQuadraticModel.from_qubo(qubo_dict)
    if not use_quantum_solver:
        # Use a classical solver (ExactSolver) for demonstration
        solver = dimod.ExactSolver()
        solutions = solver.sample(bqm)
        best_solution = solutions.first.sample
        best_energy = solutions.first.energy
        return best_solution, best_energy
    else:
        # In an actual system, you'd interface with a quantum annealer, e.g.:
        #
        #   from dwave.system import DWaveSampler, EmbeddingComposite
        #   sampler = EmbeddingComposite(DWaveSampler())  # Connect to quantum hardware
        #   solutions = sampler.sample(bqm, num_reads=100)
        #
        # For demonstration, we will just use a placeholder approach here:
        solver = dimod.SimulatedAnnealingSampler()
        solutions = solver.sample(bqm, num_reads=1000)
        best_solution = solutions.first.sample
        best_energy = solutions.first.energy
        return best_solution, best_energy
def simulate_market_environment():
    """
    In a real system, these parameters would come from real-time data feeds
    or predictive models:
      - current inventory
      - possible spreads
      - fill probability estimates
      - price uncertainty estimates
    """
    # Example scenario:
    spreads = [1, 2, 3]  # in ticks
    current_inventory = 15
    target_inventory = 10
    # Example fill probabilities for each spread (higher for tighter spreads)
    fill_prob_estimates = [0.8, 0.5, 0.3]
    # Example price uncertainty estimates for each spread (lower for wide spreads)
    price_uncertainty_estimates = [3.0, 2.0, 1.0]
    return spreads, current_inventory, target_inventory, fill_prob_estimates, price_uncertainty_estimates
def main():
    # Simulate environment parameters for demonstration
    spreads, current_inventory, target_inventory, fill_probs, price_uncertainties = \
        simulate_market_environment()
    # Build QUBO using the approach described
    qubo_dict, var_names = build_qubo(spreads,
                                      target_inventory=target_inventory,
                                      current_inventory=current_inventory,
                                      fill_prob_estimates=fill_probs,
                                      price_uncertainty_estimates=price_uncertainties,
                                      w_inventory=1.0,  # weight for inventory cost
                                      w_fill=1.0,       # weight for fill probability
                                      w_price=1.0)      # weight for price uncertainty
    # Solve QUBO with classical solver (ExactSolver) for demonstration
    best_solution, best_energy = solve_qubo(qubo_dict, var_names, use_quantum_solver=False)
    print("Best solution (spread selection):", best_solution)
    print("Minimum energy (objective value):", best_energy)
    # Optionally, demonstrate how the quantum annealing call might look:
    # best_solution_q, best_energy_q = solve_qubo(qubo_dict, var_names, use_quantum_solver=True)
    # print("Best Q-annealer solution:", best_solution_q)
    # print("Minimum energy (Q-annealer):", best_energy_q)


if __name__ == "__main__":
    main()
This code defines several key components necessary for a quan-
tum annealing market-making framework:
• build_qubo function constructs the QUBO dictionary. It
encodes:
1. Penalties to ensure exactly one spread is chosen.
2. Weighted costs for inventory risk, fill probability, and
price uncertainty.
• solve_qubo uses dimod’s solvers to minimize the QUBO. A
classical solver is demonstrated, but quantum annealing hard-
ware could be used if configured.
• simulate_market_environment provides a mock scenario of
possible spreads, current inventory, fill estimates, and price
uncertainty estimates—parameters that would typically orig-
inate from real-time market data feeds or advanced predictive
models.
• main orchestrates the demonstration by simulating market
conditions, building the QUBO, solving it, and printing the
best solution.
By leveraging quantum annealing (or classical approximate meth-
ods), market makers can rapidly search a vast combinatorial space
of spread decisions, balancing risk, fill rates, and price fluctuations
to converge on near-optimal quoting strategies in real time.
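For reference, the sampler returns the solution as a dictionary of binary variables, so a small amount of post-processing recovers the chosen spread. The helper below is a minimal sketch, assuming the x_i naming convention used in build_qubo above.

def decode_solution(best_solution, spreads):
    """Map the QUBO's binary variables back to the selected spread width."""
    chosen = [i for i, _ in enumerate(spreads) if best_solution.get(f'x_{i}', 0) == 1]
    # The one-hot penalty should leave exactly one variable set; fall back gracefully otherwise.
    return spreads[chosen[0]] if chosen else None

# Example usage after solve_qubo:
#   spread_to_quote = decode_solution(best_solution, spreads)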
Chapter 19
Peer Momentum
Siphoning
Below is a detailed explanation of how “Peer Momentum Siphon-
ing” can be implemented. The strategy monitors real-time order
flow, searching for large-lot trades that may signal institutional ac-
tivity. By detecting these blocks and measuring subsequent price
movement, the algorithm adaptively skews quotes—either front-
running or following potential momentum. It also includes a sim-
plified risk management layer to help balance position exposure.
Python Code
import numpy as np
import random
from collections import deque
class Trade:
    """
    A basic structure to store a trade event.

    Attributes:
        time (float): Timestamp (in seconds or any chosen unit).
        price (float): Execution price of the trade.
        volume (int): Number of shares/contracts traded.
        side (str): 'BUY' or 'SELL' depending on which side of the trade initiated.
    """
    def __init__(self, time, price, volume, side):
        self.time = time
        self.price = price
        self.volume = volume
        self.side = side
class PeerMomentumSiphoning:
"""
This class encapsulates the logic for detecting large-lot
,→ trades, measuring momentum,
and adjusting quotes based on institutional footprints.
"""
def __init__(self,
large_lot_threshold=1000,
momentum_window=10,
price_move_threshold=0.2,
max_position=500,
quote_spacing=0.01):
"""
Initialize parameters for the Peer Momentum Siphoning
,→ strategy.
:param large_lot_threshold: The minimum trade volume to flag
,→ as 'institutional'.
:param momentum_window: The number of trades (or ticks) to
,→ look back for price momentum.
:param price_move_threshold: Minimum price move needed to
,→ confirm momentum.
:param max_position: Maximum inventory exposure allowed.
:param quote_spacing: Base spread (fraction of price) used
,→ to set bid and ask quotes.
"""
self.large_lot_threshold = large_lot_threshold
self.momentum_window = momentum_window
self.price_move_threshold = price_move_threshold
self.max_position = max_position
self.quote_spacing = quote_spacing
        # Internal state
        self.trade_tape = deque(maxlen=1000)  # rolling buffer of trades
        self.position = 0            # current net position (positive = long, negative = short)
        self.last_price = None       # last traded price
        self.bid_quote = None        # current bid price
        self.ask_quote = None        # current ask price
        self.momentum_signal = 0     # +1 means up momentum, -1 means down momentum, 0 means neutral
    def update_tape(self, trade):
        """
        Processes a new trade, updates the internal rolling buffer, and triggers momentum detection.

        :param trade: A Trade object with time, price, volume, side.
        """
        self.trade_tape.append(trade)
        self.last_price = trade.price
        # Detect large-lot trades
        if trade.volume >= self.large_lot_threshold:
            # We interpret this as a strong potential push in the direction of trade.side
            self._analyze_large_lot_trade(trade)
    def _analyze_large_lot_trade(self, trade):
        """
        Internal method that analyzes a large-lot trade and updates momentum signals.
        We consider the effect on price and look for follow-through in subsequent trades.

        :param trade: A Trade (large-lot) that might trigger momentum.
        """
        # If the side is 'BUY', we suspect upward momentum, else downward
        suspected_direction = +1 if trade.side.upper() == 'BUY' else -1
        self.update_momentum_signal(suspected_direction)
    def update_momentum_signal(self, suspected_direction):
        """
        Updates the momentum signal based on the suspected direction and recent price action.

        :param suspected_direction: +1 if we suspect upward momentum, -1 if downward.
        """
        # Gather recent trades within the momentum_window
        if len(self.trade_tape) < self.momentum_window:
            # Not enough data for a meaningful momentum measure
            return
        recent_prices = [t.price for t in tuple(self.trade_tape)[-self.momentum_window:]]
        price_change = recent_prices[-1] - recent_prices[0]
        # Check if price movement aligns with suspected_direction
        if suspected_direction == +1 and price_change >= self.price_move_threshold:
            self.momentum_signal = +1
        elif suspected_direction == -1 and price_change <= -self.price_move_threshold:
            self.momentum_signal = -1
        else:
            self.momentum_signal = 0
def adjust_quotes(self):
"""
Adjusts current bid_quote and ask_quote based on the
,→ momentum_signal.
If momentum_signal is +1 (up), we will move our quotes
,→ slightly higher (front-running).
If momentum_signal is -1 (down), we will move quotes
,→ slightly lower.
"""
if self.last_price is None:
return
base_spread = self.last_price * self.quote_spacing
# The skew logic: we shift quotes in direction of the
,→ momentum
if self.momentum_signal == +1:
# We want to front-run upward momentum with a higher
,→ quote
self.bid_quote = self.last_price + base_spread * 0.5
self.ask_quote = self.last_price + base_spread * 1.5
elif self.momentum_signal == -1:
# We want to front-run downward momentum with a lower
,→ quote
self.bid_quote = self.last_price - base_spread * 1.5
self.ask_quote = self.last_price - base_spread * 0.5
else:
# Neutral quoting
self.bid_quote = self.last_price - base_spread
self.ask_quote = self.last_price + base_spread
    def execute_quotes(self, market_price):
        """
        This simulates the matching/execution of our quotes if the market trades
        through our bid or ask.

        :param market_price: The current matched price from the market feed.
        """
        if self.bid_quote is not None and market_price <= self.bid_quote:
            # We buy at bid_quote
            fill_size = random.randint(1, 10)  # random fill size for the example
            self.position += fill_size
            print(f"BID filled at {self.bid_quote} for size={fill_size}, new position={self.position}")
        if self.ask_quote is not None and market_price >= self.ask_quote:
            # We sell at ask_quote
            fill_size = random.randint(1, 10)  # random fill size for the example
            self.position -= fill_size
            print(f"ASK filled at {self.ask_quote} for size={fill_size}, new position={self.position}")
        # Check simple risk management: if position > max_position, we flatten some
        if abs(self.position) > self.max_position:
            self._flatten_position(market_price)
    def _flatten_position(self, market_price):
        """
        Simple risk management method that flattens the position if it exceeds max_position.
        We'll assume we can instantly offset with a market order for demonstration.

        :param market_price: The current market price used for the offset.
        """
        needed_offset = abs(self.position) - self.max_position
        # If position is too large, offset the difference
        if self.position > 0:
            self.position -= needed_offset
            print(f"Flattener sells {needed_offset} units at {market_price}, new position={self.position}")
        else:
            self.position += needed_offset
            print(f"Flattener buys {needed_offset} units at {market_price}, new position={self.position}")
def simulate_market_data(ps_engine, total_trades=50):
    """
    Simulate a stream of trades with random volumes, prices, and direction,
    occasionally inserting large-lot trades to trigger the momentum logic.

    :param ps_engine: An instance of PeerMomentumSiphoning.
    :param total_trades: Number of trades to simulate.
    """
    current_time = 0.0
    price = 100.0  # starting price
    for i in range(total_trades):
        current_time += 1.0
        # small random walk on price
        price += random.uniform(-0.5, 0.5)
        volume = random.randint(50, 1200)  # can produce large-lot trades above threshold
        side = 'BUY' if random.random() < 0.5 else 'SELL'
        trade = Trade(time=current_time, price=price, volume=volume, side=side)
        # Update tape and adjust quotes
        ps_engine.update_tape(trade)
        ps_engine.adjust_quotes()
        # Here we simulate a random 'market_price' that might fill our quotes.
        # We pick a +/- 0.2 range around the last price.
        market_price = price + random.uniform(-0.2, 0.2)
        ps_engine.execute_quotes(market_price)
def main():
"""
Main entry point for simulating the Peer Momentum Siphoning
,→ strategy.
"""
ps_engine = PeerMomentumSiphoning(
large_lot_threshold=800,
momentum_window=5,
price_move_threshold=0.2,
max_position=20,
quote_spacing=0.02
)
simulate_market_data(ps_engine, total_trades=100)
if __name__ == "__main__":
main()
• The Trade class encapsulates incoming trade data (e.g., time,
price, volume, side).
• The PeerMomentumSiphoning class handles core strategy logic:
– update_tape() ingests new trades and checks for large-
lot events.
– _analyze_large_lot_trade() and update_momentum_signal()
detect potential directional bias.
– adjust_quotes() skews the quote prices (bid and ask)
based on detected momentum.
– execute_quotes() simulates order fills when market prices
cross bid or ask levels.
– A simple risk management routine _flatten_position()
offsets positions that exceed the tolerance.
• The simulate_market_data() function generates a random
trade flow, including large-lot trades, to exercise the strategy.
• The main() function initializes and runs the whole simula-
tion, demonstrating how the strategy would function in a
streaming context.
Chapter 20
Critic-Based Latent
Feature Market
Making
Inspired by actor-critic methodologies, this algorithm uses a critic
network to evaluate the latent features of market states—such
as hidden volume accumulation, short-term reversals, or subtle
correlation shifts—while the actor proposes quoting actions (e.g.,
spreads and volumes). The critic’s feedback refines the actor’s pol-
icy, improving order placement and reducing misfills over time. A
dimension-reduction module (e.g., autoencoder or PCA) helps dis-
till raw market data into concise latent spaces, capturing essential
signals without excessive noise. The actor then bases its decisions
on these latent representations, allowing for more interpretable and
targeted market making. Below is a reference Python code snippet
that illustrates this workflow.
Python Code
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import random

# Set a random seed for reproducibility
torch.manual_seed(42)
np.random.seed(42)
random.seed(42)
# -----------------------------
# Environment Definition (Toy)
# -----------------------------
class MarketEnvironment:
"""
A simple, toy environment simulating market states and rewards
,→ for
demonstration purposes. Each step returns:
- state: random vector with shape [state_dim]
- reward: a simple function of the difference between chosen
,→ and 'ideal' spread
- done: whether the episode is finished (size-limited steps)
- info: extra info dictionary (unused here)
"""
    def __init__(self, state_dim=10, max_steps=50):
        self.state_dim = state_dim
        self.max_steps = max_steps
        self.current_step = 0
        self.ideal_spread = 1.0  # Some fixed 'ideal' spread for reward shaping
        self.reset()

    def reset(self):
        self.current_step = 0
        # Simulate an initial random state
        self.state = np.random.normal(0, 1, self.state_dim)
        return self.state
def step(self, action):
"""
action: [spread, volume] (both scalars for demonstration)
Return next_state, reward, done, info
"""
self.current_step += 1
# Reward is artificially designed for demonstration:
# The closer the 'spread' to self.ideal_spread, the better
chosen_spread, chosen_volume = action
spread_diff = abs(chosen_spread - self.ideal_spread)
reward = -spread_diff # negative cost if deviates from
,→ ideal
# Generate next state
        next_state = np.random.normal(0, 1, self.state_dim)
        self.state = next_state
# Check if episode is done
done = (self.current_step >= self.max_steps)
info = {}
return next_state, reward, done, info
# ---------------------------
# Dimension Reduction Module
# ---------------------------
class MarketAutoencoder(nn.Module):
    """
    A simple autoencoder that acts as a feature extractor.
    Input: state_dim
    Hidden: 2 or 3 layers dimension
    Output: Compressed latent representation
    """
    def __init__(self, state_dim=10, latent_dim=3):
        super(MarketAutoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 16),
            nn.ReLU(),
            nn.Linear(16, latent_dim),
            nn.ReLU()
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 16),
            nn.ReLU(),
            nn.Linear(16, state_dim),
        )

    def forward(self, x):
        latent = self.encoder(x)
        reconstructed = self.decoder(latent)
        return reconstructed, latent
# ---------------------------
# Critic Network Definition
# ---------------------------
class CriticNetwork(nn.Module):
    """
    Critic takes in the latent state (and possibly the action),
    then outputs a value estimate (state-value or action-value).
    For simplicity, use (latent + action_dim) -> value.
    """
    def __init__(self, latent_dim=3, action_dim=2):
        super(CriticNetwork, self).__init__()
        self.network = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 32),
            nn.ReLU(),
            nn.Linear(32, 1)  # outputs single value
        )

    def forward(self, latent_features, action):
        # action is shape [batch, action_dim]
        # latent_features is shape [batch, latent_dim]
        x = torch.cat([latent_features, action], dim=1)
        value = self.network(x)
        return value
# --------------------------
# Actor Network Definition
# --------------------------
class ActorNetwork(nn.Module):
    """
    Actor takes the latent state and outputs continuous factors for
    spread and volume quotes. We keep them as real numbers for demonstration.
    """
    def __init__(self, latent_dim=3, action_dim=2):
        super(ActorNetwork, self).__init__()
        self.network = nn.Sequential(
            nn.Linear(latent_dim, 32),
            nn.ReLU(),
            nn.Linear(32, action_dim),
        )

    def forward(self, latent_features):
        # Output raw action.
        # We can clamp or transform for real environment usage,
        # but keep it direct for demonstration.
        action = self.network(latent_features)
        return action
# ------------------------------------------------
# Combined Critic-Based Latent Feature Market Maker
# ------------------------------------------------
class CriticLatentFeatureMarketMaker:
"""
This class ties together:
- MarketAutoencoder (for dimension reduction)
- CriticNetwork (for value estimation)
- ActorNetwork (for quoting decisions)
- A training loop that updates both Actor and Critic
using an advantage-based or simple policy gradient approach.
"""
    def __init__(self, state_dim=10, latent_dim=3, action_dim=2, lr=1e-3):
        self.state_dim = state_dim
        self.latent_dim = latent_dim
        self.action_dim = action_dim
        # Dimension reduction
        self.autoencoder = MarketAutoencoder(state_dim, latent_dim)
        # Networks
        self.critic = CriticNetwork(latent_dim, action_dim)
        self.actor = ActorNetwork(latent_dim, action_dim)
        # Optimizers
        self.ae_optimizer = optim.Adam(self.autoencoder.parameters(), lr=lr)
        self.critic_optimizer = optim.Adam(self.critic.parameters(), lr=lr)
        self.actor_optimizer = optim.Adam(self.actor.parameters(), lr=lr)
        self.mse_loss = nn.MSELoss()
    def train_autoencoder(self, batch_states, epochs=5):
        """
        Train the autoencoder on a batch of states for dimension reduction.
        Typically done offline or in parallel with policy learning.
        """
        for _ in range(epochs):
            reconstructed, _ = self.autoencoder(batch_states)
            loss = self.mse_loss(reconstructed, batch_states)
            self.ae_optimizer.zero_grad()
            loss.backward()
            self.ae_optimizer.step()

    def get_latent(self, state):
        """
        Pass the state through the autoencoder's encoder part to get the latent representation.
        """
        with torch.no_grad():
            _, latent = self.autoencoder(state)
        return latent

    def select_action(self, latent_features):
        """
        Actor forward pass to get an action from latent features.
        """
        action = self.actor(latent_features)
        return action

    def evaluate_critic(self, latent_features, action):
        """
        Critic forward pass to get a value.
        """
        value = self.critic(latent_features, action)
        return value
    def update(self, transitions, gamma=0.99):
        """
        transitions: list of (state, action, reward, next_state, done)
        Use a simple advantage-based approach:
            advantage = reward + gamma * V(next_state) - V(state)
        then actor_loss = -log_prob(action) * advantage
        (for demonstration, we skip the distribution approach and use MSE or a direct gradient).
        """
        # Convert transitions to Tensors
        states = torch.FloatTensor([t[0] for t in transitions])
        actions = torch.FloatTensor([t[1] for t in transitions])
        rewards = torch.FloatTensor([t[2] for t in transitions])
        next_states = torch.FloatTensor([t[3] for t in transitions])
        dones = torch.FloatTensor([float(t[4]) for t in transitions])
        # First update autoencoder on current states (simple approach)
        self.train_autoencoder(states)
        # Get latents for current and next states.
        # Keep them detached (no_grad) so the critic and actor backward passes
        # do not both try to traverse the autoencoder graph.
        with torch.no_grad():
            _, latent_s = self.autoencoder(states)
            _, latent_ns = self.autoencoder(next_states)
        # Critic values
        values_s = self.critic(latent_s, actions).squeeze(-1)
        # Next state value for a "greedy" action assumption:
        # we'll pick next_actions from the actor.
        next_actions = self.actor(latent_ns)
        values_ns = self.critic(latent_ns, next_actions).squeeze(-1)
        # Calculate target
        targets = rewards + gamma * values_ns * (1.0 - dones)
        # Critic loss: MSE between predicted values_s and targets
        critic_loss = self.mse_loss(values_s, targets.detach())
        self.critic_optimizer.zero_grad()
        critic_loss.backward()
        self.critic_optimizer.step()
        # Actor update: we interpret advantage = (targets - values_s)
        advantage = (targets - values_s).detach()
        # The actor tries to maximize advantage -> minimize negative advantage.
        # We'll do a simplistic approach with MSE to the advantage;
        # in a real setting, you'd do policy gradient with log probabilities.
        pred_actions = self.actor(latent_s)
        actor_values = self.critic(latent_s, pred_actions).squeeze(-1)
        # Minimizing -(advantage) ~ negative advantage or the difference with advantage
        # as a regression, but let's keep it simple: MSE of actor_values and advantage.
        actor_loss = self.mse_loss(actor_values, advantage)
        self.actor_optimizer.zero_grad()
        actor_loss.backward()
        self.actor_optimizer.step()
        return critic_loss.item(), actor_loss.item()
# ---------------
# Training Script
# ---------------
def main():
    # Hyperparams
    episodes = 10
    max_steps = 20
    state_dim = 10
    latent_dim = 3
    action_dim = 2

    env = MarketEnvironment(state_dim=state_dim, max_steps=max_steps)
    agent = CriticLatentFeatureMarketMaker(state_dim=state_dim,
                                           latent_dim=latent_dim,
                                           action_dim=action_dim,
                                           lr=1e-3)
    all_rewards = []

    for ep in range(episodes):
        state = env.reset()
        ep_rewards = 0
        transitions = []
        for step_i in range(max_steps):
            # Convert state to torch and get latent
            state_t = torch.FloatTensor(state).unsqueeze(0)
            with torch.no_grad():
                _, latent_s = agent.autoencoder(state_t)
                action_t = agent.actor(latent_s)
            action = action_t.squeeze(0).numpy()
            # Step environment
            next_state, reward, done, _ = env.step(action)
            ep_rewards += reward
            transitions.append((state, action, reward, next_state, done))
            state = next_state
            if done:
                break
        # Update networks after episode
        critic_loss, actor_loss = agent.update(transitions)
        all_rewards.append(ep_rewards)
        print(f"Episode {ep+1}/{episodes}, "
              f"Total Reward: {ep_rewards:.3f}, "
              f"Critic Loss: {critic_loss:.3f}, "
              f"Actor Loss: {actor_loss:.3f}")

    print("Training complete.")
    print("Final average reward:", np.mean(all_rewards))


if __name__ == "__main__":
    main()
Below is an outline of how the solution is organized:
• MarketEnvironment: A toy simulation environment re-
turning random states and a reward that penalizes deviation
from an ideal spread.
• MarketAutoencoder: A simple autoencoder to extract la-
tent features from the raw state.
• CriticNetwork: Feeds on latent state + action to output a
single value, which estimates future reward.
• ActorNetwork: Produces quoting actions (spread, volume)
from the latent feature vector.
• CriticLatentFeatureMarketMaker: Coordinates the au-
toencoder, actor, and critic. It contains:
1. Methods for autoencoder training (to refine dimension
reduction).
2. Methods for critic updates (value estimation).
3. Methods for actor updates (policy improvement using
advantage-like signals).
• main(): Runs a small training loop over a number of episodes,
collecting transitions and updating networks at the end of
each episode.
This toy code demonstrates how an actor-critic approach can
harness latent feature extraction—via a simple autoencoder—to
guide quoting decisions. In real applications, the environment, re-
ward structure, dimensionality reduction, and optimization details
would be far more sophisticated to handle the complexities of mar-
ket microstructure.
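In particular, the actor update above regresses critic outputs onto the advantage purely for brevity. As the comments in update note, a real implementation would typically use a stochastic policy and log-probability-weighted advantages. The snippet below is a hypothetical sketch of such a Gaussian-policy loss and is not part of the code above; the fixed log_std and the actor_mean argument (any network mapping latents to action means, such as the ActorNetwork) are assumptions for illustration.

import torch

def policy_gradient_actor_loss(actor_mean, latent_s, actions, advantage, log_std=-0.5):
    """REINFORCE-style loss: maximize advantage-weighted log-probability of taken actions."""
    dist = torch.distributions.Normal(actor_mean(latent_s), torch.exp(torch.tensor(log_std)))
    log_probs = dist.log_prob(actions).sum(dim=-1)   # joint log-prob over action dimensions
    return -(log_probs * advantage).mean()           # negative sign: optimizer minimizes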
Chapter 21
Graph Neural Network
Price Flow
Below is a detailed explanation of how one could implement a
dynamic-graph-based market maker using Graph Neural Networks
(GNNs) to process order book data in real time. The core idea is
to represent the order book and trades as a continually updating
graph structure. Nodes correspond to price levels (with associated
features such as volume, best bid/ask data, etc.), and edges cap-
ture the interactions (trades, quote changes, or significant liquidity
links) among these levels. A GNN then processes this dynamic
graph, producing node-level or graph-level embeddings that can
inform quoting decisions. By analyzing these embeddings, the algo-
rithm identifies supply-demand imbalances or “price flow hotspots”
and adaptively adjusts its quotes.
Key steps in the approach:
• Continuously ingest or simulate order book events.
• Build/update a graph where each node corresponds to an order
book level.
• Define edges to represent immediate liquidity interactions or
trade connections.
• Pass this graph data to a GNN to learn node embeddings.
• Extract the embeddings to infer local supply-demand pressure
and compute quoting actions accordingly.
Many libraries (e.g., PyTorch Geometric or DGL) can facili-
tate GNN-based modeling. Below is an illustrative Python code
snippet using PyTorch Geometric to demonstrate this concept in
a simplified environment.
Python Code
Below is a Python code snippet that implements a demonstration
of the core components of a graph-based approach for processing
order book data, training a GNN, and making quoting decisions:
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.data import Data, Batch
import numpy as np
import random
import time

#####################################################################
# Simulated Data Generation
#####################################################################

def simulate_order_book_stream(num_steps=50, max_price_levels=5):
    """
    Generates a simulated stream of order book data over `num_steps`.
    Each 'tick' of data includes:
      - A list of price levels (bid or ask).
      - Volume at each level.
      - Trade flow or other connectivity info indicating edges among levels.
    """
    # For simplicity, randomly create scaffolding data:
    # we'll assume each "price level" is a node,
    # and we'll create random edges to simulate trade flow.
    for step in range(num_steps):
        # Random number of price levels
        num_levels = random.randint(2, max_price_levels)
        # Node features: price offset from some mid-price, volume, is_bid (0 or 1)
        node_features = []
        for _ in range(num_levels):
            price_offset = np.random.uniform(-1, 1)  # e.g., from mid-price
            volume = np.random.uniform(0.1, 1.0)
            is_bid = random.randint(0, 1)
            node_features.append([price_offset, volume, is_bid])
        node_features = torch.tensor(node_features, dtype=torch.float)
        # Edges: random adjacency among these levels.
        # We'll define edges with random connectivity to indicate liquidity/trade flow.
        if num_levels > 1:
            edge_index_pairs = []
            for src in range(num_levels):
                for dst in range(num_levels):
                    if src != dst and random.random() > 0.5:
                        edge_index_pairs.append([src, dst])
            if not edge_index_pairs:
                # ensure at least one edge if we end up empty
                edge_index_pairs.append([0, 1])
            edge_index = torch.tensor(edge_index_pairs, dtype=torch.long).t().contiguous()
        else:
            # if only 1 level, no edges to define
            edge_index = torch.tensor([], dtype=torch.long).reshape(2, 0)
        yield step, node_features, edge_index
#####################################################################
# GNN Model Definition
#####################################################################
class OrderBookGNN(torch.nn.Module):
    def __init__(self, in_channels=3, hidden_dim=8, out_channels=4):
        """
        Simple GNN with two GCNConv layers.

        :param in_channels: Dimension of node features (price_offset, volume, is_bid).
        :param hidden_dim: Dimension of the hidden layer.
        :param out_channels: Output embedding dimension.
        """
        super(OrderBookGNN, self).__init__()
        self.conv1 = GCNConv(in_channels, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_channels)

    def forward(self, x, edge_index):
        """
        Forward pass of the GNN.

        :param x: Node feature matrix [num_nodes, in_channels].
        :param edge_index: Graph connectivity [2, num_edges].
        :return: Node embeddings [num_nodes, out_channels].
        """
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = self.conv2(x, edge_index)
        return x
#####################################################################
# Training Procedure
#####################################################################
def train_gnn_model(model, data_list, epochs=10, lr=0.01):
    """
    Example training procedure for a GNN using a batch of historical snapshots.
    For demonstration, we assume a trivial label (e.g., random or dummy),
    but in practice you'd have a real label (like realized profit/imbalance).

    :param model: OrderBookGNN instance.
    :param data_list: List of PyTorch Geometric Data objects.
    :param epochs: Number of training epochs.
    :param lr: Learning rate.
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    # For demonstration, we create random labels for node-level classification/regression.
    # In a real scenario, labels might correspond to future price movement or fill rates.
    for epoch in range(epochs):
        random.shuffle(data_list)
        total_loss = 0
        for data in data_list:
            # Create random labels for each node.
            # Suppose we do a node-level regression to a 4D target.
            y = torch.rand((data.x.size(0), 4))
            # Move data and labels to your device if a GPU is used
            optimizer.zero_grad()
            out = model(data.x, data.edge_index)
            # Mean squared error loss
            loss = F.mse_loss(out, y)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        avg_loss = total_loss / len(data_list)
        print(f"Epoch {epoch+1}/{epochs}, Loss: {avg_loss:.4f}")
#####################################################################
# Quoting Logic
#####################################################################
def generate_quoting_decisions(node_embeddings):
    """
    Given node embeddings from the GNN, decide how to place quotes.
    Each node embedding can indicate local supply-demand pressure, etc.
    Here we do a simple demonstration that picks a spread/mid offset
    based on embedding magnitude.

    :param node_embeddings: [num_nodes, out_channels].
    :return: List of (spread, quote_offset) decisions or any custom logic.
    """
    decisions = []
    for emb in node_embeddings:
        # Simple heuristic: higher embedding norm => widen spread.
        magnitude = torch.norm(emb).item()
        spread = 0.01 + 0.01 * magnitude
        offset = 0.0
        decisions.append((spread, offset))
    return decisions
#####################################################################
# Main Execution Flow Demonstration
#####################################################################
if __name__ == "__main__":
    # Initialize the GNN model
    model = OrderBookGNN(in_channels=3, hidden_dim=8, out_channels=4)

    # Step 1: Collect a small dataset of historical (simulated) snapshots
    data_list = []
    for step, node_features, edge_index in simulate_order_book_stream(num_steps=30):
        # Convert to a PyG Data object
        data = Data(x=node_features, edge_index=edge_index)
        data_list.append(data)

    # Step 2: Train the GNN model on historical data
    train_gnn_model(model, data_list, epochs=5, lr=0.01)

    # Step 3: Demonstration of real-time inference
    model.eval()
    with torch.no_grad():
        # We'll simulate a new stream for 'live' data
        for step, node_features, edge_index in simulate_order_book_stream(num_steps=5):
            # Build one snapshot
            data = Data(x=node_features, edge_index=edge_index)
            # Generate embeddings from the GNN
            embeddings = model(data.x, data.edge_index)
            # Make quoting decisions based on embeddings
            decisions = generate_quoting_decisions(embeddings)
            # For demonstration, we just print them
            print(f"\n--- Real-Time Step {step} ---")
            print("Node embeddings:")
            print(embeddings)
            print("Quoting decisions (spread, offset) for each node:")
            print(decisions)
            # Sleep to simulate the time gap in streaming
            time.sleep(1)
This code defines the following key components necessary for a
graph-based order book analysis:
• A simulated “live” order book stream with random node fea-
tures and edges (in a real system, this would be replaced by
actual exchange market data).
• A GNN model (OrderBookGNN) composed of two GCNConv
layers from the PyTorch Geometric library.
• A demonstration of a training routine (train_gnn_model)
that, in practice, would use real labels reflecting predictive
targets such as future price changes or fill probabilities.
• A quoting decision function (generate_quoting_decisions) that
uses the learned node embeddings to determine how tightly
or widely to quote around each price level.
• A main execution flow that simulates data collection and pro-
cesses a “live” order book stream, outputting real-time quot-
ing decisions.
While the above code is simplified for educational purposes,
it showcases how dynamic graphs and GNNs can be leveraged to
detect cross-level liquidity pressure, integrate local supply-demand
signals, and produce context-aware market making strategies in
real time.
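Since the training routine above uses random targets purely as a placeholder, a natural next step hinted at in its comments is to label each snapshot with a realized quantity such as the next-step price move. The helper below is a hypothetical sketch of that idea; the mid_prices series and the scalar per-node target are illustrative assumptions, not part of the code above.

import torch
from torch_geometric.data import Data

def label_snapshots(snapshots, mid_prices):
    """Attach the next-step mid-price change as a node-level regression target.

    snapshots: list of (node_features, edge_index) tuples, one per time step.
    mid_prices: list of reference mid-prices, len(snapshots) + 1 entries long.
    """
    labeled = []
    for t, (x, edge_index) in enumerate(snapshots):
        future_move = mid_prices[t + 1] - mid_prices[t]
        # Broadcast the same scalar target to every node in the snapshot
        y = torch.full((x.size(0), 1), float(future_move))
        labeled.append(Data(x=x, edge_index=edge_index, y=y))
    return labeled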
Chapter 22
Volatility-Triggered
Reinforcement Bandits
In this chapter, we explore a “Volatility-Triggered Reinforcement
Bandits” approach that fuses multi-armed bandit strategies with
reinforcement learning elements. The main idea is to dynamically
select from a set of quoting “arms,” such as different spread sizes
or inventory adjustments, while leveraging a real-time volatility
signal that helps determine which arms may yield higher expected
rewards under current market conditions.
Over time, the algorithm learns to allocate more quoting actions
to arms that perform well during specific volatility regimes—tight
spreads in low-volatility environments, wider spreads when volatil-
ity picks up, or adaptive inventory actions based on realized profit
and risk constraints. By continually updating its estimates of each
arm’s performance, the system converges on quoting strategies ro-
bust to both stable and chaotic market phases.
This approach can be broken down into the following core com-
ponents:
• A set of arms (spread sets or inventory adjustments).
• A volatility estimator or switch that identifies current market
conditions.
• A reward function that measures profit and/or risk-adjusted per-
formance for each chosen arm.
• A bandit algorithm (ε-greedy, UCB, Thompson Sampling, etc.) combined with
reinforcement-learning style updates to continuously improve action selection,
as sketched below.
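Concretely, the learning rule used in the code below keeps one Q-table per volatility regime and applies the standard incremental bandit update

Q_{t+1}(s, a) = Q_t(s, a) + \alpha \big( r_t - Q_t(s, a) \big), \qquad s \in \{\text{low-vol}, \text{high-vol}\},

with the arm $a$ chosen greedily with probability $1 - \epsilon$ and uniformly at random otherwise.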
Python Code
Below is a Python code snippet that demonstrates a simplified ver-
sion of the Volatility-Triggered Reinforcement Bandits algorithm,
including: 1) A simulation environment that randomly generates
price movements and volatility levels. 2) A multi-armed bandit
mechanism that selects an optimal action (quoting spread size)
conditioned on volatility. 3) A reward function that reflects PnL
outcomes when choosing different spreads in different volatility
regimes.
import numpy as np
import random
class MarketEnvironment:
"""
A simplified environment that simulates price movements and
estimates volatility. Rewards depend on the chosen spread
and the current volatility.
"""
def __init__(self,
initial_price=100.0,
initial_volatility=0.02,
volatility_threshold=0.03,
seed=42):
        np.random.seed(seed)
        random.seed(seed)
        self.price = initial_price
        self.volatility = initial_volatility
        self.vol_threshold = volatility_threshold
        self.step_count = 0
    def evolve_market(self):
        """
        Randomly changes the price and volatility to simulate
        different market regimes over time.
        """
        # Random walk for price
        price_change = np.random.normal(0, self.volatility * 2)
        self.price += price_change
        # Random drift for volatility
        vol_change = np.random.normal(0, 0.001)
        self.volatility = max(0.0001, self.volatility + vol_change)
        self.step_count += 1
    def get_current_state(self):
        """
        Returns the current price and volatility.
        In a more advanced setup, you might include
        order book depth, time of day, etc.
        """
        return self.price, self.volatility

    def is_high_vol(self):
        """
        Determines whether the current volatility
        exceeds a threshold, acting as our volatility switch.
        """
        return self.volatility > self.vol_threshold
def get_reward(self, spread):
"""
Computes a reward (PnL) for choosing a given spread.
Higher spreads might reduce fill probability or PnL in calm
markets, but protect better when volatility is high.
This is purely illustrative; real logic can be more complex.
"""
# Basic heuristic:
# - If volatility is high, bigger spread can yield higher
,→ net reward
# - If volatility is low, smaller spread can yield higher
,→ net reward
price, vol = self.get_current_state()
# The fill probability is modeled as inversely related to
,→ spread
fill_probability = max(0.0, 1.0 - spread * 10)
# Potential profit is bigger with a larger spread if
,→ volatility is high
if vol > self.vol_threshold:
expected_profit = (spread * 100) * fill_probability
else:
# Lower volatility scenario might benefit from tighter
,→ spread, but
# we use a smaller multiplier for large spreads in calm
,→ markets.
factor = max(5.0, 20.0 * (1.0 - spread))
expected_profit = spread * factor * fill_probability
        # We add some random noise to simulate unpredictable market outcomes
        noise = np.random.normal(0, vol * 50)
reward = expected_profit + noise
return reward
class VolatilityTriggeredBandit:
    """
    A multi-armed bandit system with a volatility trigger. It keeps
    separate performance estimates (Q-values) for each arm under
    low-volatility and high-volatility conditions.
    """
    def __init__(self, arms, alpha=0.1, epsilon=0.1):
        """
        :param arms: List of possible 'spread' values or actions.
        :param alpha: Learning rate for Q-value updates.
        :param epsilon: Exploration rate for the epsilon-greedy policy.
        """
        self.arms = arms
        self.alpha = alpha
        self.epsilon = epsilon
        # Q-values separated by volatility regime: 0 -> low-vol, 1 -> high-vol
        self.q_values = {
            0: np.zeros(len(arms)),  # For low volatility
            1: np.zeros(len(arms))   # For high volatility
        }
        # Keep counts for each arm in each regime, used for optional exploration stats
        self.arm_counts = {
            0: np.zeros(len(arms), dtype=int),
            1: np.zeros(len(arms), dtype=int)
        }
    def select_arm(self, vol_is_high):
        """
        Selects an arm using an epsilon-greedy approach.

        :param vol_is_high: Boolean indicating if current vol is high.
        :return: Index of chosen arm, chosen spread.
        """
        regime = 1 if vol_is_high else 0
        # Epsilon-greedy
        if random.random() < self.epsilon:
            arm_idx = random.randrange(len(self.arms))
        else:
            arm_idx = np.argmax(self.q_values[regime])
        return arm_idx, self.arms[arm_idx]

    def update_q_values(self, vol_is_high, arm_idx, reward):
        """
        Updates Q-values (bandit learning rule).

        :param vol_is_high: Boolean indicating high or low volatility.
        :param arm_idx: The index of the chosen arm.
        :param reward: Observed reward for that choice.
        """
        regime = 1 if vol_is_high else 0
        self.arm_counts[regime][arm_idx] += 1
        # Standard bandit Q-value update
        old_q = self.q_values[regime][arm_idx]
        self.q_values[regime][arm_idx] = old_q + self.alpha * (reward - old_q)
def run_simulation(num_steps=500,
arms=[0.01, 0.02, 0.05, 0.08, 0.1],
alpha=0.1,
epsilon=0.1,
verbose=False):
"""
Runs a simulation of the volatility-triggered bandit over a
,→ number of steps,
returning history and final Q-values for analysis.
"""
# Instantiate environment and bandit
env = MarketEnvironment()
bandit = VolatilityTriggeredBandit(arms, alpha, epsilon)
reward_history = []
chosen_spreads = []
volatility_history = []
for t in range(num_steps):
# Evolve the market environment
env.evolve_market()
current_price, current_vol = env.get_current_state()
# Check the volatility regime
high_vol = env.is_high_vol()
# Bandit selects an arm (spread) based on the current
,→ volatility regime
arm_idx, spread = bandit.select_arm(high_vol)
# Environment returns a reward for that chosen spread
reward = env.get_reward(spread)
# Bandit updates Q-values based on observed reward
bandit.update_q_values(high_vol, arm_idx, reward)
# Record data for analysis
reward_history.append(reward)
chosen_spreads.append(spread)
volatility_history.append(current_vol)
if verbose and (t % 50 == 0):
print(f"Step {t}, Price: {current_price:.2f}, Vol:
,→ {current_vol:.4f}, "
f"ChosenSpread: {spread:.2f}, Reward:
,→ {reward:.2f}")
return {
"reward_history": reward_history,
"chosen_spreads": chosen_spreads,
"volatility_history": volatility_history,
"q_values": bandit.q_values,
"arm_counts": bandit.arm_counts
}
if __name__ == "__main__":
# Run the simulation with verbose output
results = run_simulation(num_steps=500, verbose=True)
print("\nFinal Q-values for low-vol regime:\n",
,→ results['q_values'][0])
print("Final Q-values for high-vol regime:\n",
,→ results['q_values'][1])
print("\nArm selection counts (low-vol regime):",
,→ results['arm_counts'][0])
print("Arm selection counts (high-vol regime):",
,→ results['arm_counts'][1])
Outside the code, the key steps of this design are:
• MarketEnvironment simulates evolving price and volatil-
ity, generating the observations and rewards for each spread
choice.
• VolatilityTriggeredBandit manages two sets of Q-values (low volatility vs.
high volatility) and uses an ε-greedy policy to select spreads.
• get_reward(spread) in the environment models how re-
ward (or PnL) depends on both the chosen spread and the
current volatility.
• run_simulation puts everything together, stepping through
the market over multiple iterations, selecting spreads, and
updating performance estimates.
The result is a streamlined demonstration of how multi-armed
bandit logic can adapt quoting decisions to current market volatil-
ity, converging on behaviors that handle both calm and turbulent
phases effectively.
Chapter 23
Sparse Regression
Quoting
Below is a comprehensive explanation of how our specialized quot-
ing mechanism leverages L1-regularized regression (Lasso) to iso-
late critical features for immediate price changes. The procedure
continuously updates the model on a rolling (“adaptive”) data win-
dow, performs robust standardization, and fits a sparse model that
highlights high-impact signals. Once Lasso identifies significant
features, the quoting logic places tight, targeted orders aiming for
quick fills. A custom cost function implicitly balances profitability
(through signals that generate profitable quotes) and interpretabil-
ity (by minimizing the number of non-zero coefficients).
Key steps include:
1. Collecting or simulating incoming market data (e.g., price changes,
volume, order book features).
2. Maintaining an adaptive rolling window that is re-centered or
shifted over time.
3. Performing robust scaling to mitigate outliers.
4. Fitting a Lasso model that zeroes out insignificant features, re-
taining only those most predictive of immediate price moves.
5. Generating quotes guided by the extracted signals—those sig-
nals that remain after the sparse regression phase.
6. Periodically computing a simple “profitability vs. interpretabil-
ity” measure to guide the choice of regularization hyperparameters
and adaptive quoting thresholds.
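In symbols, the rolling fit in the code below solves the usual Lasso problem on robustly scaled features, and the measure in step 6 mirrors the cost function implemented later (a sketch, with $\lVert \hat{\beta} \rVert_0$ counting the retained non-zero coefficients):

\hat{\beta} = \arg\min_{\beta}\; \frac{1}{2n}\, \lVert y - X\beta \rVert_2^{2} + \alpha\, \lVert \beta \rVert_1,
\qquad
\text{Cost} = -\,\text{RealizedProfit} + 0.5\, \lVert \hat{\beta} \rVert_0 .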
Python Code
Below is a Python code snippet that demonstrates these concepts
end-to-end, including data simulation, rolling-window updates, ro-
bust normalization, L1 regression, signal extraction, and quoting
decisions:
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.preprocessing import RobustScaler
class L1MarketMaker:
"""
Implements a specialized quoting mechanism using L1-regularized
,→ regression
(Lasso) to isolate critical signals for immediate price changes.
"""
def __init__(self,
window_size=50,
alpha=0.01,
quote_multiplier=1.0):
"""
:param window_size: Number of latest observations to include
,→ in the rolling window.
:param alpha: Regularization strength for Lasso.
:param quote_multiplier: Scales the generated quote
,→ magnitude from extracted signals.
"""
self.window_size = window_size
        self.alpha = alpha
self.quote_multiplier = quote_multiplier
        # Placeholder for storing recent observations in a rolling window
        self.feature_window = pd.DataFrame()
        self.target_window = pd.Series(dtype=float)
        # RobustScaler for outlier-insensitive normalization
        self.scaler = RobustScaler()
        # Lasso model initialization
        self.model = Lasso(alpha=self.alpha, fit_intercept=True)
    def update_data_window(self, features: pd.Series, target: float):
        """
        Updates the rolling window with new data points.

        :param features: A pd.Series of feature values for this timestep.
        :param target: The next price change or immediate one-step-ahead target.
        """
        # Append new observation
        new_row = features.to_frame().T  # Convert series to DataFrame
        self.feature_window = pd.concat([self.feature_window, new_row], ignore_index=True)
        self.target_window = pd.concat([self.target_window, pd.Series([target])],
                                       ignore_index=True)
        # Ensure we do not exceed the defined window size
        if len(self.feature_window) > self.window_size:
            self.feature_window = self.feature_window.iloc[-self.window_size:]
            self.target_window = self.target_window.iloc[-self.window_size:]
    def fit_l1_model(self):
        """
        Fits the L1-regularized (Lasso) model on the current rolling window.
        Applies robust scaling before training to reduce the impact of outliers.
        """
        if len(self.feature_window) < 2:
            # Not enough data to fit
            return
        # Scale features
        scaled_features = self.scaler.fit_transform(self.feature_window.values)
        # Fit Lasso
        self.model.fit(scaled_features, self.target_window.values)

    def current_signals(self) -> np.ndarray:
        """
        Returns the non-zero signals (coefficients) from the Lasso model.
        If coefficients are zeroed out, those features are considered irrelevant.

        :return: Array of Lasso coefficients (signals).
        """
        return self.model.coef_
    def generate_quotes(self, latest_features: pd.Series):
        """
        Generates a quote based on the Lasso-predicted signals.

        :param latest_features: A pd.Series of the most recent feature set.
        :return: The recommended quote price shift or offset.
        """
        if len(self.feature_window) < 2 or not hasattr(self.model, "coef_"):
            # Not enough data (or no fitted model yet) to generate a meaningful quote
            return 0.0
        # Scale the incoming features using the same scaler
        scaled_input = self.scaler.transform([latest_features.values])
        predicted_change = self.model.predict(scaled_input)[0]
        # Example quoting logic: multiply the predicted price change by a user-defined factor.
        # Positive predicted_change => place quotes slightly above current mid-price, etc.
        # Negative => place quotes slightly below mid-price to accelerate fills.
        return predicted_change * self.quote_multiplier
    def calculate_cost_function(self, realized_profit: float) -> float:
        """
        An example cost function that balances realized profitability and interpretability.

        :param realized_profit: Example measure of profit from the strategy over the last window.
        :return: Cost value (lower is better).
        """
        # Interpretability measure: number of nonzero coefficients
        nonzero_coefs = np.sum(np.abs(self.model.coef_) > 1e-8)
        # Cost: negative profit + some penalty for complexity
        return -realized_profit + 0.5 * nonzero_coefs
def simulate_market_data(num_steps=200, seed=42):
    """
    Simulates random market data for demonstration:
      - 4 random features (e.g., volume change, order book imbalance, etc.)
      - A target that is slightly correlated with one or two of the features.

    :param num_steps: Number of time steps to simulate.
    :param seed: Random seed for reproducibility.
    :return: A DataFrame with columns [f1, f2, f3, f4], and a Series of slightly
        correlated targets.
    """
    np.random.seed(seed)
    # Example random features
    f1 = np.random.normal(0, 1, num_steps)
    f2 = np.random.normal(0, 1, num_steps)
    f3 = np.random.normal(0, 1, num_steps)
    f4 = np.random.normal(0, 1, num_steps)
    # Suppose the true underlying model is:
    #   target = 0.5*f1 - 1.0*f3 + some noise
    noise = np.random.normal(0, 0.5, num_steps)
    target = 0.5 * f1 - 1.0 * f3 + noise
    df_features = pd.DataFrame({
        'f1': f1,
        'f2': f2,
        'f3': f3,
        'f4': f4
    })
    return df_features, pd.Series(target)
def run_l1_quoting_simulation(num_steps=200):
"""
Main loop demonstrating how the L1MarketMaker might be used in
,→ practice:
1. Simulates market data
2. Feeds data into an L1MarketMaker with a rolling window
3. Periodically fits the model and generates quotes
4. Computes an example measure of realized profit and cost
"""
# Simulate data
df_features, target = simulate_market_data(num_steps=num_steps)
# Initialize market maker
mm = L1MarketMaker(window_size=20, alpha=0.05,
,→ quote_multiplier=1.5)
# Variables to simulate "realized profit"
realized_profit = 0.0
profit_history = []
# For demonstration, assume we have a mid-price we track
current_price = 100.0
    for i in range(num_steps):
        # Step 1: Update rolling window with new data
        mm.update_data_window(df_features.iloc[i], target.iloc[i])
        # Step 2: Periodically refit the model
        if i % 5 == 0 and i > 5:
            mm.fit_l1_model()
        # Step 3: Generate a quote based on the newest data
        recommended_quote_offset = mm.generate_quotes(df_features.iloc[i])
# Example profit calculation:
# If recommended_quote_offset is positive, we "sell"
,→ slightly above current_price
# If negative, we "buy" below current_price. We simulate a
,→ small fill for demonstration.
# In a real scenario, you'd have actual fills, inventory
,→ tracking, etc.
        fill_size = 1.0
        if recommended_quote_offset > 0:
            # We "sell" at current_price + offset
            fill_price = current_price + recommended_quote_offset
            # Suppose the next price is current_price + next target;
            # profit from the short if the price goes down.
            next_price = current_price + target.iloc[i]
            # simplistic PnL
            trade_pnl = fill_size * (fill_price - next_price)
        else:
            # We "buy" at current_price + offset (offset is negative => buy below current price)
            fill_price = current_price + recommended_quote_offset
            next_price = current_price + target.iloc[i]
            trade_pnl = fill_size * (next_price - fill_price)
        realized_profit += trade_pnl
profit_history.append(realized_profit)
# Update current price for next iteration's reference
# This is a simplified approach for illustrative purposes.
current_price = next_price
# Finally, compute cost function
cost = mm.calculate_cost_function(realized_profit)
# Present findings
print("Final Realized Profit:", realized_profit)
print("Cost Function Value:", cost)
print("Non-zero Coefficients in Lasso Model:",
,→ [Link]([Link]([Link].coef_) > 1e-8))
print("Profit History (last 10 steps):", profit_history[-10:])
return mm, profit_history
if __name__ == "__main__":
# Run the simulation to demonstrate functionality
strategy, pnl_history = run_l1_quoting_simulation(num_steps=100)
• The “L1MarketMaker” class manages the rolling window of
feature/target data, fits a Lasso model, and uses its coeffi-
cients to drive quote generation.
• The “fit_l1_model” method applies robust scaling and in-
vokes Lasso, discarding less informative features automati-
cally via coefficient shrinkage.
• The “current_signals” method extracts the active (non-zero)
signals from the Lasso coefficients.
• The “generate_quotes” function returns a suggested price off-
set based on the predicted price move—positive offsets typ-
ically correspond to quoting at higher ask prices, negative
offsets at lower bid prices, etc.
• The “calculate_cost_function” illustrates how one might com-
bine an interpretability penalty (quantity of non-zero coeffi-
cients) with realized profit into a single metric.
• The “simulate_market_data” method produces synthetic fea-
tures and a hidden true model to drive target values.
• The “run_l1_quoting_simulation” function brings everything
together, feeding data step-by-step into the system, peri-
odically retraining the Lasso model, generating quotes, and
tracking a simple realized profit measure.
This end-to-end example demonstrates how an L1-regularized
quoting mechanism can adaptively refit to market dynamics, focus-
ing on high-impact features while minimizing complexity to main-
tain interpretability.
Chapter 24
Liquidity Mining
Arbitrage
Below is a comprehensive explanation and Python code snippet
illustrating how one might implement a liquidity mining arbitrage
strategy. This strategy aims to systematically place orders that
unlock fee rebates or reward tokens offered by exchanges in their
liquidity mining programs, while carefully monitoring partial fill
risk, adverse selection, and net benefit.
In essence, the approach involves:
• Connecting to exchange APIs to gather real-time data about
filled volumes, reward rates, and current order book conditions.
• Continuously calculating the net benefit of placing large-limit
orders, factoring in fees, partial fills, reward tokens, and potential
slippage.
• Dynamically adjusting the trading volume and quoted spreads
in response to changing market conditions and updated reward
incentives.
The key steps include:
1. Fetching exchange or broker data (order book, filled trades,
liquidity mining parameters).
2. Estimating partial fill probabilities and potential adverse selec-
tion costs (where the market moves unfavorably against your large
orders).
3. Calculating the real-time net benefit of providing large volumes
(fee rebates + reward tokens - slippage cost - partial fill risk).
4. Updating or canceling orders if the real-time net benefit falls
below a threshold, or scaling up if conditions are favorable.
This loop continues until a strategic stop condition is met (e.g., a daily
target volume, a time cut-off, or a diminishing reward threshold), thereby
systematically capturing exchange-provided incentives.
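As a quick sanity check of that arithmetic, take the illustrative defaults
used in the snippet below: an order of 5 units, an expected fill ratio of
0.4, a reward rate of 0.0002 tokens per unit (valued at 1.5 per token), a
fee rebate of 0.0001 of notional, a mid price of 100.0, and a slippage
factor of 0.001. The reward value is 5 × 0.4 × 0.0002 × 1.5 = 0.0006, the
rebate is 5 × 0.4 × 100 × 0.0001 = 0.02, and the slippage estimate is
5 × 0.4 × 100 × 0.001 = 0.20, so the net benefit is roughly -0.18. That is
below the 0.1 threshold, so the strategy would cancel and reprice rather
than leave the order resting.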
Python Code
import time
import random
import numpy as np
class ExchangeAPIPlaceholder:
"""
Placeholder class simulating interactions with an exchange.
In a real deployment scenario, this would call the exchange's
,→ REST or WebSocket APIs
to manage orders, retrieve market data, and track reward
,→ programs.
"""
def __init__(self):
# Simulate an internal order book and reward parameters
self.best_bid = 99.5
self.best_ask = 100.5
self.order_book_depth = [(99.4, 100.6)] # A simplified
,→ single-level mock
self.mining_reward_rate = 0.0002 # Reward tokens per
,→ unit volume
self.mining_fee_rebate = 0.0001 # Fee rebate
,→ fraction
def get_best_bid_ask(self):
"""
Returns tuple (best_bid, best_ask).
"""
# In real code, this method would query the live order book
,→ from the exchange
return (self.best_bid, self.best_ask)
def get_liquidity_mining_info(self):
"""
Returns parameters related to the liquidity mining program
,→ (token reward rate, fee rebates, etc.).
"""
return {
"reward_rate": self.mining_reward_rate,
"fee_rebate": self.mining_fee_rebate
}
def place_limit_order(self, price, volume, side):
"""
Simulates placing a limit order on the exchange.
:param price: The limit price for the order.
:param volume: The size (quantity) of the order.
:param side: 'buy' or 'sell'.
:return: A simulated order ID (string).
"""
# In a real implementation, this would send an order to the
,→ exchange,
# returning an order ID that we can track for fills.
        order_id = f"order_{random.randint(1000, 9999)}"
        print(f"[DEBUG] Placed {side} limit order with ID {order_id}, "
              f"price: {price}, volume: {volume}")
return order_id
def cancel_order(self, order_id):
"""
Simulates canceling an existing limit order by order_id.
"""
# Real implementation would call the exchange's cancel
,→ endpoint.
print(f"[DEBUG] Canceled order with ID {order_id}")
def get_filled_volume_and_price(self, order_id):
"""
Retrieves the volume filled and average fill price for an
,→ order.
In real code, this would poll the exchange to see if the
,→ order was partially or fully filled.
For this placeholder, we'll simulate random fills.
:return: (filled_volume, fill_price)
"""
        filled_volume = random.uniform(0, 1)
        fill_price = random.uniform(self.best_bid, self.best_ask)
return filled_volume, fill_price
def estimate_partial_fill_cost(volume, virtual_fill_ratio=0.5,
,→ slippage_factor=0.001):
"""
Estimate the cost resulting from partial fills and potential
,→ market slippage.
:param volume: The intended order volume.
:param virtual_fill_ratio: Estimated ratio that actually gets
,→ filled based on historical data.
:param slippage_factor: Fraction representing typical slippage
,→ encountered on partial fills.
:return: Estimated slippage cost in monetary units.
"""
# In actual scenario, these parameters would be gleaned from
,→ historical fill data and real-time market conditions
expected_filled_volume = volume * virtual_fill_ratio
# Slippage cost is approximated as volume * price move *
,→ slippage_factor for demonstration
# We'll treat the 'price move' as an abstract fraction of an
,→ average mid price. e.g. 0.001 * mid price
base_price = 100.0 # Example mid price
estimated_slippage = expected_filled_volume * (base_price *
,→ slippage_factor)
return estimated_slippage
def compute_reward_tokens(volume, reward_rate):
"""
Compute how many reward tokens are awarded based on volume and
,→ the exchange's reward rate.
:param volume: The volume contributed to the order book.
:param reward_rate: Amount of tokens awarded per volume unit.
:return: Number of reward tokens earned.
"""
return volume * reward_rate
def compute_fee_rebate(volume, price, fee_rebate):
"""
Compute the total fee rebate in monetary terms for providing
,→ liquidity.
:param volume: Order volume.
:param price: Execution price or mid-price approximation.
:param fee_rebate: Fraction of notional returned as a rebate.
:return: Rebate in monetary terms.
"""
notional = volume * price
return notional * fee_rebate
def net_benefit_of_order(volume, side, reward_rate, fee_rebate,
,→ partial_fill_ratio=0.5):
"""
Calculate the net benefit of placing a large order, factoring in
,→ reward tokens, fee rebates,
partial fills, and slippage cost.
:param volume: The order volume.
:param side: 'buy' or 'sell'.
:param reward_rate: Liquidity mining reward rate (tokens per
,→ volume).
:param fee_rebate: Fraction of notional returned as a rebate.
:param partial_fill_ratio: The fraction of the volume likely to
,→ be filled.
:return: Net benefit (positive or negative) in some monetary
,→ unit.
"""
# Suppose the average fill price is approximated by 100 for
,→ demonstration
# In practice, you'd use the expected fill price or best bid/ask
approximate_price = 100.0
# Calculate reward tokens earned if the order is partially or
,→ fully filled
reward_tokens = compute_reward_tokens(volume *
,→ partial_fill_ratio, reward_rate)
# Suppose each reward token is worth $1.5 in the open market
,→ (example assumption)
token_market_value = 1.5
reward_value = reward_tokens * token_market_value
# Calculate fee rebate
fee_rebate_value = compute_fee_rebate(volume *
,→ partial_fill_ratio, approximate_price, fee_rebate)
# Estimate partial fill cost or slippage
slippage_cost = estimate_partial_fill_cost(volume,
,→ virtual_fill_ratio=partial_fill_ratio)
# If side == 'buy', we might account for potential negative cost
,→ if price moves down, etc.
# For this simplified approach, we just incorporate slippage as
,→ a cost for either side.
net_benefit = reward_value + fee_rebate_value - slippage_cost
return net_benefit
def run_liquidity_mining_strategy():
"""
Main strategy function that repeatedly places or updates orders
,→ to capture
liquidity mining incentives. For demonstration, it runs a small
,→ loop of order placements
and checks net benefit each iteration.
"""
api = ExchangeAPIPlaceholder()
reward_info = api.get_liquidity_mining_info()
# Example strategy parameters
target_volume = 5.0 # Our desired order size
side = 'sell' # We'll place a sell limit order,
,→ for instance
partial_fill_ratio = 0.4 # Calibrated from historical data
quote_update_interval = 2.0 # Seconds between checks/updates
net_benefit_threshold = 0.1 # Minimal net benefit threshold to
,→ keep the order active
# Place an initial order at best ask + small offset to remain
,→ near top of book
best_bid, best_ask = api.get_best_bid_ask()
initial_quote_price = best_ask + 0.01
order_id = api.place_limit_order(initial_quote_price,
,→ target_volume, side)
try:
for iteration in range(5):
# Simulate waiting for partial fills over a short time
            time.sleep(quote_update_interval)
# Check how much got filled
filled_volume, fill_price =
,→ api.get_filled_volume_and_price(order_id)
print(f"[INFO] Iteration {iteration}, Order {order_id},
,→ Filled Volume: {filled_volume:.4f}, Fill Price:
,→ {fill_price:.4f}")
# Re-compute the net benefit with the updated reward and
,→ partial fill ratio
# (We assume partial fill ratio might dynamically change
,→ each iteration)
            partial_fill_ratio = 0.3 + 0.2 * random.random()  # example dynamic update
net_benefit = net_benefit_of_order(
volume=target_volume,
side=side,
reward_rate=reward_info["reward_rate"],
fee_rebate=reward_info["fee_rebate"],
partial_fill_ratio=partial_fill_ratio
)
print(f"[INFO] Current Net Benefit Estimate:
,→ {net_benefit:.3f}")
# If net benefit falls below threshold, we cancel or
,→ adjust the order
if net_benefit < net_benefit_threshold:
api.cancel_order(order_id)
# Recalculate a new quote price (e.g., move closer
,→ to best ask if conditions improved)
best_bid, best_ask = api.get_best_bid_ask()
                adjusted_quote_price = best_ask + random.uniform(0.0, 0.02)
                order_id = api.place_limit_order(adjusted_quote_price,
                                                 target_volume, side)
else:
print("[DEBUG] Keeping current order as net benefit
,→ remains above threshold.")
# End strategy, final cancellation of leftover order
api.cancel_order(order_id)
print("[INFO] Finished liquidity mining strategy.")
except KeyboardInterrupt:
# In production, handle graceful shutdown
api.cancel_order(order_id)
print("[INFO] Strategy interrupted and order canceled.")
except Exception as e:
api.cancel_order(order_id)
print(f"[ERROR] Unexpected error: {e}. Order canceled.")
if __name__ == "__main__":
run_liquidity_mining_strategy()
Outside of this snippet, you would typically integrate:
• A live Exchange API client (via REST or WebSocket) to
replace the “ExchangeAPIPlaceholder”.
• Detailed logic for precisely monitoring partial fills and cumu-
lative daily volume.
• Real-time slippage estimations using actual market depth
data (level 2 or level 3).
• Mechanisms for storing and withdrawing any reward tokens
earned, as well as up-to-date market prices for those tokens
if you wish to monetize the rewards periodically.
• Refined risk management rules to halt or slow down trading if the
market turns illiquid or if the net benefit consistently drops
below threshold levels.
In production, you may also enhance the code to track historical
net performance, measure reward efficiency, and integrate advanced
compensation for adverse selection and tail-risk events.
Chapter 25
Nonlinear Kalman
Filter Price
Stabilization
Below is an extended explanation of how one can implement an
Unscented (or Extended) Kalman Filter–based market making ap-
proach to stabilize quotes in the presence of potentially nonlinear
price transitions. The algorithm tracks an evolving belief state of
(price, volatility), then adjusts the market maker’s quotes—bid and
ask prices—based on those estimates to stay ahead of rapid mar-
ket moves (e.g., sudden jumps or squeezes). By updating the filter
frequently across micro-lookback intervals, the strategy aims for
near-instant adaptation, thus reducing the risk of large inventory
imbalances during unpredictable swings.
• State Representation: We define a two-dimensional state
(price, volatility). The price is the estimated fair value, and the
volatility dimension captures short-term uncertainty or turbulence.
• Transition Function: Driven by a diffusion-like process with po-
tential nonlinear effects (jumps or regime shifts).
• Measurement Function: Observes noisy price ticks or mid-prices
to update our belief state via the Unscented Kalman Filter (UKF).
• Quoting Logic: The quoted bid and ask revolve around the es-
timated fair price, expanded or contracted based on the filter’s
volatility reading.
• Real-Time Operation: We re-run the UKF-based update each
time fresh market data arrives. A micro-lookback buffer can store
the last few seconds (or even milliseconds) of data, ensuring near-
instant reactivity.
By combining these elements, the market maker continuously
refines its assessment of fair value. As volatility flares, spreads
widen to limit risk exposure; when volatility recedes, the spreads
tighten to capture more fills.
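Written as a discrete state-space model, a minimal version of this setup
(the one mirrored by the snippet below) is
p_{t+1} = p_t + mu * p_t * dt + eps_p,   sigma_{t+1} = |sigma_t + eps_sigma|,   z_t = p_t + eta_t,
where (p_t, sigma_t) is the hidden state, z_t is the noisy observed price,
and eps_p, eps_sigma, eta_t are the process and measurement noises. The
quotes are then bid = p_hat - s/2 and ask = p_hat + s/2 with spread
s = max(s_min, 2 * sigma_hat), so the quoted spread breathes directly with
the filtered volatility estimate.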
Python Code
Below is a Python code snippet demonstrating a simplified Un-
scented Kalman Filter–driven market maker. For brevity, it simu-
lates market data in a loop, updates filter estimates, and calculates
corresponding bid-ask quotes. In a production environment, real
market feeds and concurrency control would replace the simulated
data:
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints
class UnscentedKalmanMarketMaker:
"""
A simplified market making model using an Unscented Kalman
,→ Filter to track
price and volatility. The quoting logic is adjusted dynamically
,→ based on
the filter's real-time estimates.
"""
def __init__(self,
initial_price=100.0,
initial_vol=0.05,
measurement_noise=0.5,
process_noise_price=0.1,
process_noise_vol=0.01,
micro_lookback=5):
"""
Initialize the Unscented Kalman Filter and
related market making parameters.
:param initial_price: Starting estimate of the asset price.
:param initial_vol: Starting estimate of the volatility.
:param measurement_noise: Standard deviation of measurement
,→ noise.
:param process_noise_price: Process noise for the price
,→ state.
:param process_noise_vol: Process noise for the volatility
,→ state.
:param micro_lookback: Number of recent price updates to
,→ store
(micro-lookback).
"""
self.state_dim = 2 # [price, volatility]
self.measurement_noise = measurement_noise
self.process_noise_price = process_noise_price
self.process_noise_vol = process_noise_vol
self.micro_lookback = micro_lookback
# Initialize buffer for micro-lookback intervals
self.recent_prices = []
# Define sigma points for the Unscented Kalman Filter
self.sigma_points = MerweScaledSigmaPoints(
n=self.state_dim,
alpha=0.1,
beta=2.0,
kappa=0.0
)
        # Create and initialize the UKF
        self.ukf = UnscentedKalmanFilter(
            dim_x=self.state_dim,
            dim_z=1,
            dt=1.0,  # time step required by filterpy's UKF constructor
            fx=self.state_transition,
            hx=self.measurement_function,
            points=self.sigma_points
        )
        # Initial state and covariance
        self.ukf.x = np.array([initial_price, initial_vol])
        self.ukf.P = np.eye(self.state_dim) * 1.0
        # Measurement noise R and process noise Q
        self.ukf.R = np.array([[measurement_noise**2]])
        self.ukf.Q = np.diag([
            process_noise_price**2,
            process_noise_vol**2
        ])
        # Inventory and quote placeholders
        self.inventory = 0
        self.bid_price = 0.0
        self.ask_price = 0.0
def state_transition(self, state, dt=1.0):
"""
UKF state transition function. Models how the underlying
price and volatility evolve over each time step.
:param state: [price, volatility] at the previous time step.
:param dt: Time increment.
:return: Updated [price, volatility] state.
"""
price, vol = state
# Nonlinear drift example for price
drift = 0.01 * price # e.g., small drift proportional to
,→ current price
# Potential jump or random walk element
noise_price = [Link](0, self.process_noise_price)
noise_vol = [Link](0, self.process_noise_vol)
# Update equations
new_price = price + drift * dt + noise_price
new_vol = abs(vol + noise_vol) # keep volatility positive
return [Link]([new_price, new_vol])
def measurement_function(self, state):
"""
UKF measurement function. We observe the price dimension
in a noisy fashion; no direct volatility measurement.
:param state: [price, volatility].
:return: Measured price.
"""
price, _ = state
        return np.array([price])
def record_price(self, observed_price):
"""
Records the observed price in the micro-lookback buffer.
:param observed_price: Latest observed market price.
"""
self.recent_prices.append(observed_price)
if len(self.recent_prices) > self.micro_lookback:
self.recent_prices.pop(0)
def update_filter(self, observed_price):
"""
Perform a UKF prediction and update step using the observed
,→ price.
:param observed_price: Latest observed market price.
"""
        # UKF Predict step
        self.ukf.predict()
        # UKF Update step
        self.ukf.update(np.array([observed_price]))
    def update_quotes(self):
        """
        Adjust the bid and ask prices based on the filter's current
        price and volatility estimates.
        """
        est_price = self.ukf.x[0]
        est_vol = self.ukf.x[1]
# Simple logic: the spread is a function of short-term
,→ volatility
spread = max(0.01, 2.0 * est_vol)
self.bid_price = est_price - (spread / 2.0)
self.ask_price = est_price + (spread / 2.0)
def handle_fills(self, true_price):
"""
A toy mechanism to simulate how our posted quotes might be
,→ filled.
If true market price is above our ask, we assume we get
,→ fully filled
on the ask (we sold).
If true market price is below our bid, we assume we get
,→ filled on the bid (we bought).
:param true_price: The 'actual' underlying market price
in the simulation.
"""
        # If ask is below actual price, we sold at ask
        if true_price > self.ask_price:
            # Sell 1 unit
            self.inventory -= 1
        # If bid is above actual price, we bought at bid
        elif true_price < self.bid_price:
            # Buy 1 unit
            self.inventory += 1
def simulate_market(self,
num_steps=50,
true_initial=100.0,
true_vol=0.05):
"""
Runs a simple market simulation with random influences,
updating the UKF and quotes step by step.
:param num_steps: Number of simulation time steps.
:param true_initial: Initial true price for the simulation
,→ world.
:param true_vol: Base volatility for the true price
,→ evolution.
"""
        np.random.seed(42)
        true_price = true_initial
        sim_data = []
        for t in range(num_steps):
            # Generate a next 'true' price with some jump possibility
            jump_noise = np.random.normal(0, true_vol)
            true_price = true_price + 0.1 * true_price + jump_noise
            # Create a synthetic noisy measurement
            measurement_noise = np.random.normal(0, self.measurement_noise)
observed_price = true_price + measurement_noise
# Record and update UKF
self.record_price(observed_price)
self.update_filter(observed_price)
self.update_quotes()
# Simulate possible fills
self.handle_fills(true_price)
# Store step data
step_info = {
'step': t,
'true_price': true_price,
'observed_price': observed_price,
                'ukf_est_price': self.ukf.x[0],
                'ukf_est_vol': self.ukf.x[1],
                'bid_price': self.bid_price,
                'ask_price': self.ask_price,
                'inventory': self.inventory
}
sim_data.append(step_info)
return sim_data
# ----------------
# Example Usage
# ----------------
if __name__ == "__main__":
# Create an instance of the UKF-based market maker
ukf_mm = UnscentedKalmanMarketMaker(
initial_price=100.0,
initial_vol=0.05,
measurement_noise=0.5,
process_noise_price=0.1,
process_noise_vol=0.01,
micro_lookback=5
)
# Run a simulation of 30 steps
    results = ukf_mm.simulate_market(num_steps=30,
                                     true_initial=100.0,
                                     true_vol=0.02)
    # Print out a few lines of results for inspection
    for i, res in enumerate(results[:5]):
        print(
            f"Step: {res['step']} | True Price: {res['true_price']:.2f}, "
            f"Observed: {res['observed_price']:.2f}, "
            f"UKF Price: {res['ukf_est_price']:.2f}, "
            f"Bid: {res['bid_price']:.2f}, Ask: {res['ask_price']:.2f}, "
            f"Inventory: {res['inventory']}"
        )
Here is a brief outline of the code’s core components:
• UnscentedKalmanMarketMaker class encapsulates the entire
logic for initializing, updating, and utilizing the Unscented
Kalman Filter to track (price, volatility).
• state_transition models how the true price might drift
over each time step, with added process noise.
• measurement_function reflects that the market maker only
directly measures the price (no direct volatility measurement).
• update_quotes calculates the current bid-ask spread based
on the estimated volatility.
• simulate_market demonstrates running a simplified live mar-
ket scenario, where random shocks to the true price are gen-
erated, and the filter is updated step by step.
• handle_fills simulates how the posted quotes might be
filled depending on the comparison between true price and
quoted prices.
This basic framework can be expanded with more advanced
models of price dynamics, concurrency, real-time data handling,
and stricter risk controls. However, even in this simple form, it il-
lustrates how the Unscented Kalman Filter can help manage inven-
tory and stabilize quoting by reacting instantly to updated beliefs
about fair value and volatility.
Chapter 26
Recurrent Profit-Loss
Anchoring
Below is an illustrative approach demonstrating how a recurrent
network (LSTM) can be used to track a running perspective on re-
alized and unrealized PnL (profit and loss). By repeatedly feeding
PnL states back into the model, the algorithm can generate quot-
ing/position decisions that adjust risk based on a trailing window
of profitability. The model “anchors” its decisions to a cumulative
PnL timeline, systematically becoming more conservative if the
profit is high (protecting gains) or if it is dropping below certain
thresholds (limiting losses).
This example is intentionally simplified to convey the core ideas.
In practice, you would replace the random market simulation and
simplistic reward function with a real trading environment, and the
LSTM network might be extended or combined with other mod-
ules (e.g., attention, parallel feature extraction) for more robust
performance.
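Before turning to the learned version, the anchoring idea can be stated as
a plain deterministic rule. The sketch below is illustrative only (the
function name and thresholds are not part of the chapter's model): risk is
scaled down once trailing PnL rises past a lock-in level or falls toward a
stop level.
def pnl_anchored_risk_scale(realized_pnl, unrealized_pnl,
                            lock_in_level=50.0, stop_level=-20.0):
    """Map trailing PnL to a position-size multiplier in (0, 1]."""
    net_pnl = realized_pnl + unrealized_pnl
    if net_pnl >= lock_in_level:
        # Protect accumulated gains: shrink size as profit grows past the lock-in.
        return max(0.25, lock_in_level / net_pnl)
    if net_pnl <= stop_level:
        # Near the stop level, trade at minimum size to limit further losses.
        return 0.1
    return 1.0
The LSTM below learns a soft version of this behavior from the reward
signal instead of hard-coding it.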
Python Code
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import random
class MarketSimulator:
"""
A toy market simulator that generates random price movements
and calculates realized and unrealized PnL. Each time step:
- An action is received (ranging from -1 to +1, representing
how aggressively to hold a position).
- The price changes randomly.
- Realized and unrealized PnL are derived based on position
change and price moves.
"""
    def __init__(self, initial_price=100.0, max_steps=100):
        self.initial_price = initial_price
        self.price = initial_price
        self.max_steps = max_steps
        self.current_step = 0
        self.position = 0.0  # track position (e.g., number of shares/contracts)
        self.realized_pnl = 0.0
        self.unrealized_pnl = 0.0
        self.done = False
    def reset(self):
        self.price = self.initial_price
        self.current_step = 0
        self.position = 0.0
        self.realized_pnl = 0.0
        self.unrealized_pnl = 0.0
        self.done = False
return self._get_observation()
def _get_observation(self):
"""
Returns the simulator state consisting of:
- current price
- current position
- realized PnL
- unrealized PnL
"""
        return np.array([self.price, self.position,
                         self.realized_pnl, self.unrealized_pnl],
                        dtype=np.float32)
def step(self, action):
"""
Action modifies the current position: if action is 0.5,
it shifts the position by +0.5 units. If -0.8, it reduces
or inverts the position by 0.8 units, etc.
"""
        if self.done:
            return self._get_observation(), 0.0, True, {}
        # Adjust position based on action
        old_position = self.position
        self.position += action
        # Random price movement (e.g., normal distribution around 0)
        price_change = np.random.randn() * 0.5
        new_price = self.price + price_change
        # Realized PnL is updated if we effectively closed or partially
        # closed some portion of the old position. For a real strategy,
        # the logic here would be more nuanced.
        position_change = self.position - old_position
        # If position decreased, we assume there's a partial close that
        # yields realized gains/losses from the difference in price.
        realized_from_closing = -position_change * self.price
        # Update realized PnL
        self.realized_pnl += realized_from_closing
        # Update price
        self.price = new_price
        # Compute new unrealized PnL
        self.unrealized_pnl = self.position * (self.price - self.initial_price)
        # Progress time
        self.current_step += 1
        if self.current_step >= self.max_steps:
            self.done = True
        # Reward is a simple function of the net PnL changes
        # (could be more sophisticated in real usage)
        net_pnl = self.realized_pnl + self.unrealized_pnl
        reward = net_pnl
        return self._get_observation(), reward, self.done, {}
class PnLAnchoringLSTM(nn.Module):
"""
A simple LSTM-based network that takes in the market state
[price, position, realized PnL, unrealized PnL] and outputs
an action in [-1, 1] that adjusts the position. By remembering
past states, it can anchor decisions to a running perspective
on cumulative profitability.
"""
def __init__(self, input_size=4, hidden_size=16, num_layers=1):
        super(PnLAnchoringLSTM, self).__init__()
        # LSTM module
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers,
                            batch_first=True)
        # Final fully connected layer to produce action
        self.fc = nn.Linear(hidden_size, 1)
        # We clamp this final output to [-1, 1]
        self.tanh = nn.Tanh()
def forward(self, x, hidden=None):
"""
x is of shape (batch_size, seq_length, input_size).
hidden is the (h, c) tuple for LSTM hidden state.
"""
        lstm_out, hidden = self.lstm(x, hidden)  # shape: (B, seq, hidden_size)
        # We take only the last time step's output for the action
        last_out = lstm_out[:, -1, :]   # shape: (B, hidden_size)
        out = self.fc(last_out)         # shape: (B, 1)
        return self.tanh(out), hidden   # shape: (B, 1), hidden state
,→ hidden state
def generate_episode(env, model, max_steps=100, training=False,
,→ optimizer=None, criterion=None):
"""
Runs one episode in the environment, optionally training the
,→ model.
Returns total reward for logging.
"""
    obs = env.reset()
hidden = None
states = []
actions = []
rewards = []
for t in range(max_steps):
# Convert obs to a batch of size 1, seq_length=1
        state_tensor = torch.tensor(obs, dtype=torch.float32).view(1, 1, -1)
with torch.set_grad_enabled(training):
action_tensor, hidden = model(state_tensor, hidden)
# Action is in [-1, 1]
action_value = action_tensor.item()
# Step the environment
        next_obs, reward, done, _ = env.step(action_value)
        states.append(state_tensor)
        actions.append(action_tensor)
        rewards.append(reward)
        obs = next_obs
        if done:
            break
    # Optionally train after generating the episode
    if training and optimizer and criterion:
        # Convert rewards to discounted returns; our simplistic "loss" ties
        # the actions to these returns, so that maximizing total reward
        # corresponds to minimizing the loss.
        discounted_reward = 0.0
        gamma = 0.99
        returns = []
        # Compute discounted returns (simple RL approach)
        for r in reversed(rewards):
            discounted_reward = r + gamma * discounted_reward
            returns.insert(0, discounted_reward)
        returns = torch.tensor(returns, dtype=torch.float32)
        # We'll compute a simple MSE loss between actions and returns
        # purely as a demonstration. Real RL approaches would differ.
        losses = []
        for i in range(len(states)):
            # We can interpret a larger return as meaning
            # we want the action to keep or increase position.
            # This is a superficial mapping just for example.
            pred = actions[i]
            ret = returns[i].view(1, 1)  # shape [1, 1]
            # scaling ret down for demonstration
            loss = criterion(pred, torch.tanh(ret / 100.0))
            losses.append(loss)
        total_loss = torch.stack(losses).mean()
        optimizer.zero_grad()
        total_loss.backward()
        optimizer.step()
return sum(rewards), len(rewards)
def main():
# Hyperparameters
max_steps_per_episode = 50
num_episodes = 30
learning_rate = 0.01
# Create environment and model
env = MarketSimulator(initial_price=100.0,
,→ max_steps=max_steps_per_episode)
model = PnLAnchoringLSTM(input_size=4, hidden_size=16)
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    criterion = nn.MSELoss()
# Training loop
for episode in range(num_episodes):
        episode_reward, steps = generate_episode(env, model,
                                                 max_steps=max_steps_per_episode,
                                                 training=True,
                                                 optimizer=optimizer,
                                                 criterion=criterion)
        print(f"Episode {episode+1}/{num_episodes}, Steps: {steps}, "
              f"Reward: {episode_reward:.3f}")
# After training, run a test episode with no gradient updates
    test_reward, _ = generate_episode(env, model,
                                      max_steps=max_steps_per_episode,
                                      training=False)
print(f"Test Episode Reward: {test_reward:.3f}")
if __name__ == "__main__":
main()
• The “MarketSimulator” class simulates a basic environment
by randomizing price movements and computing realized/un-
realized PnL.
• “PnLAnchoringLSTM” is an LSTM-based PyTorch model
that processes each new state (price, position, realized PnL,
unrealized PnL) and generates an action in the range [-1, 1].
• “generate_episode” runs one simulation episode, collects states,
actions, and rewards, and optionally trains the model—demonstrating
a rudimentary reinforcement-style loop.
• In this toy loss function, we tie the model’s actions to the
discounted returns for illustration. In a more advanced ap-
proach, you could incorporate policy gradients (REINFORCE,
A2C, PPO, etc.) or advanced dynamic hedging logic.
• Finally, “main” orchestrates the training over multiple episodes
and prints out the cumulative rewards for monitoring.
This end-to-end code sample demonstrates how to incorporate a
recurrent element (LSTM) to track and respond to evolving profit-
and-loss conditions over time. The approach “anchors” quoting or
position decisions to both recent and cumulative PnL, systemati-
cally adjusting risk aversion.
Chapter 27
Long-Short Market
Microstructure
Transitions
Below is a detailed explanation of the approach, followed by a com-
prehensive Python code example. The code showcases how to im-
plement a dual-purpose market making algorithm that switches
between long-biased and short-biased stances based on microstruc-
ture cues and fundamental signals, employing a simple state ma-
chine and real-time signal aggregation.
The core idea revolves around:
• Maintaining two distinct modes: “LONG_BIASED” and
“SHORT_BIASED”.
• A state machine reviews order book metrics (e.g., order stack im-
balances) and optionally fundamental signals to dynamically switch
the quoting stance.
• Quoting logic updates spreads and inventory targets based on
the detected mode, aiming to capture upward drift when in
“LONG_BIASED” mode and capture downward movements when
in “SHORT_BIASED” mode.
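Concretely, the switching rule used in the snippet below is a thresholded
linear combination: c = imbalance + 0.1 * fundamental, with the stance set
to LONG_BIASED when c > theta, to SHORT_BIASED when c < -theta, and left
unchanged otherwise, where theta is the switch_threshold parameter.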
Python Code
import numpy as np
import random
from enum import Enum
class MarketMode(Enum):
"""
Enumeration for the market making stance.
"""
LONG_BIASED = 1
SHORT_BIASED = 2
class LongShortMarketMaker:
"""
A dual-purpose market making system that switches between
a long-biased stance and a short-biased stance based on
microstructure cues such as rising or falling order stack
,→ imbalances.
"""
def __init__(self,
initial_inventory=0,
initial_mode=MarketMode.LONG_BIASED,
long_spread=0.01,
short_spread=0.01,
switch_threshold=0.5,
max_inventory=100):
"""
:param initial_inventory: Starting inventory position.
:param initial_mode: The initial stance of the market maker.
:param long_spread: Base spread to quote when long-biased
,→ (in %, e.g. 0.01 = 1%).
:param short_spread: Base spread to quote when short-biased
,→ (in %, e.g. 0.01 = 1%).
:param switch_threshold: Threshold for microstructure signal
,→ that triggers mode switch.
:param max_inventory: Maximum absolute units of inventory to
,→ hold.
"""
        self.inventory = initial_inventory
        self.mode = initial_mode
self.long_spread = long_spread
self.short_spread = short_spread
self.switch_threshold = switch_threshold
self.max_inventory = max_inventory
# Internally track best bid/ask from the order book
self.best_bid = None
self.best_ask = None
# Soft parameters for adjusting quotes
self.long_quote_offset = 0.001 # Extra offset for going
,→ aggressively long
self.short_quote_offset = 0.002 # Extra offset for going
,→ aggressively short
def update_order_book(self, best_bid, best_ask):
"""
Update the best bid and best ask from the real-time order
,→ book.
:param best_bid: Current best bid price.
:param best_ask: Current best ask price.
"""
self.best_bid = best_bid
self.best_ask = best_ask
def decide_mode(self, microstructure_signal,
,→ fundamental_signal):
"""
This function decides whether the system remains in the
,→ current mode
or switches to the opposite mode, based on microstructure
,→ and fundamental signals.
:param microstructure_signal: A numeric value indicating
,→ order flow imbalance, etc.
:param fundamental_signal: Additional numeric or categorical
,→ input to refine decisions.
"""
        # Combine signals in a simplistic manner,
        # e.g., a weighted sum of microstructure and fundamental signals
        combined_signal = microstructure_signal + 0.1 * fundamental_signal
        # If above threshold, stay or switch to LONG_BIASED
        # If below negative threshold, switch to SHORT_BIASED
        # Otherwise, remain in the current mode
        if combined_signal > self.switch_threshold:
            self.mode = MarketMode.LONG_BIASED
        elif combined_signal < -self.switch_threshold:
            self.mode = MarketMode.SHORT_BIASED
# If the signal is in between, we do not change mode
def generate_quotes(self):
"""
Generates current quotes (bid and ask) given the mode, best
,→ bid, and best ask.
Adjusts spreads and offsets if the inventory is nearing max
,→ levels.
"""
if self.best_bid is None or self.best_ask is None:
# If we do not have valid book data, return no quotes
return None, None
# Basic protection if inventory is too high or too low
        if abs(self.inventory) >= self.max_inventory:
            # If we are at max capacity, we might quote only on the side
            # that helps reduce inventory.
            return self._generate_reduction_quotes()
        else:
            # Normal quoting logic
            if self.mode == MarketMode.LONG_BIASED:
return self._generate_long_biased_quotes()
else:
return self._generate_short_biased_quotes()
def _generate_long_biased_quotes(self):
"""
For long-biased stance, place a relatively more aggressive
,→ bid
and a slightly wide ask to capture upward drift.
"""
mid_price = (self.best_bid + self.best_ask) / 2.0
# More aggressive bid (slightly above typical spread-based
,→ discount)
quote_bid = mid_price * (1 - self.long_spread) +
,→ self.long_quote_offset
# More relaxed ask
quote_ask = mid_price * (1 + self.long_spread) +
,→ self.long_quote_offset
return quote_bid, quote_ask
def _generate_short_biased_quotes(self):
"""
For short-biased stance, place a relatively more aggressive
,→ ask
and a slightly wide bid to capture downward liquidity.
"""
mid_price = (self.best_bid + self.best_ask) / 2.0
# More relaxed bid
quote_bid = mid_price * (1 - self.short_spread) -
,→ self.short_quote_offset
# More aggressive ask
quote_ask = mid_price * (1 + self.short_spread) -
,→ self.short_quote_offset
return quote_bid, quote_ask
def _generate_reduction_quotes(self):
"""
If the inventory is too large in either direction, place
,→ quotes that
help reduce inventory. For instance, if we have a large
,→ positive inventory,
place an aggressive ask to sell, and keep the bid
,→ conservative.
"""
mid_price = (self.best_bid + self.best_ask) / 2.0
        if self.inventory > 0:
            # We want to reduce positive inventory -> more aggressive ask
            quote_bid = mid_price * (1 - 0.02)   # wide
            quote_ask = mid_price * (1 + 0.005)  # narrower to offload
else:
# We want to reduce negative inventory -> more
,→ aggressive bid
quote_bid = mid_price * (1 - 0.005)
quote_ask = mid_price * (1 + 0.02)
return quote_bid, quote_ask
def fill_order(self, fill_price, fill_size):
"""
Update inventory on order fill events.
Positive fill_size indicates a buy fill (inventory goes up).
Negative fill_size indicates a sell fill (inventory goes
,→ down).
:param fill_price: The price at which the fill occurred.
:param fill_size: The size (quantity) of the trade.
"""
        self.inventory += fill_size
class MicrostructureAggregator:
"""
Aggregates real-time order book and fundamental signals to
,→ produce
    numeric cues for switching regimes.
"""
def __init__(self):
pass
@staticmethod
def compute_microstructure_signal(order_book_data):
"""
Example function to compute a simple order flow imbalance
or net stacking metric.
:param order_book_data: A dict including 'bids' and 'asks'
,→ arrays.
:return: A float representing microstructure signal.
"""
bids_volume = sum([lvl['volume'] for lvl in
,→ order_book_data.get('bids', [])])
asks_volume = sum([lvl['volume'] for lvl in
,→ order_book_data.get('asks', [])])
imbalance = (bids_volume - asks_volume) / max(1,
,→ (bids_volume + asks_volume))
return imbalance
@staticmethod
def compute_fundamental_signal(fundamental_data):
"""
Example function to interpret some fundamental data
(could be sentiment, macro data, etc.) and convert it into
a numeric score.
:param fundamental_data: A dict or any structure with
,→ fundamental info.
:return: A float representing fundamental signal.
"""
# Example: scale a simple 'sentiment' measure
sentiment = fundamental_data.get('sentiment_score', 0)
return sentiment * 0.1
def main_simulation(num_steps=10):
"""
Simulation driver that emulates some order book updates and
,→ fundamental signals
to demonstrate how the market maker toggles between long-biased
,→ and short-biased stances.
"""
# Initialize the market maker
market_maker = LongShortMarketMaker(
initial_inventory=0,
initial_mode=MarketMode.LONG_BIASED,
long_spread=0.01,
short_spread=0.01,
switch_threshold=0.3,
max_inventory=5
)
aggregator = MicrostructureAggregator()
# Simulate a series of steps with random order book data
for step in range(num_steps):
# Random best bid/ask
        best_bid = 100 + random.uniform(-1, 1)
        best_ask = best_bid + random.uniform(0.5, 1.5)
market_maker.update_order_book(best_bid, best_ask)
# Random order book volumes
        order_book_data = {
            'bids': [{'price': best_bid - i * 0.02,
                      'volume': random.uniform(1, 10)} for i in range(3)],
            'asks': [{'price': best_ask + i * 0.02,
                      'volume': random.uniform(1, 10)} for i in range(3)]
        }
# Hypothetical fundamental data
fundamental_data = {
            'sentiment_score': random.uniform(-1, 1)
}
# Compute signals
        micro_sig = aggregator.compute_microstructure_signal(order_book_data)
        fund_sig = aggregator.compute_fundamental_signal(fundamental_data)
# Decide mode
market_maker.decide_mode(micro_sig, fund_sig)
quote_bid, quote_ask = market_maker.generate_quotes()
# Randomly emulate a fill
if quote_bid is not None and quote_ask is not None:
# Suppose half the time we get a partial fill
            if random.random() > 0.5:
                fill_size = random.choice([1, -1])  # random buy (1) or sell (-1)
                # Use a random fill price near the quoted range
                fill_price = quote_bid if fill_size < 0 else quote_ask
market_maker.fill_order(fill_price, fill_size)
        # Print step info
        print(f"\nStep {step+1}")
        print(f"Best Bid/Ask: {market_maker.best_bid:.2f} / "
              f"{market_maker.best_ask:.2f}")
        print(f"Microstructure Signal: {micro_sig:.2f}, "
              f"Fundamental Signal: {fund_sig:.2f}")
        print(f"Mode: {market_maker.mode}")
        if quote_bid and quote_ask:
            print(f"Quoted Bid/Ask: {quote_bid:.2f} / {quote_ask:.2f}")
print(f"Current Inventory: {market_maker.inventory}")
if __name__ == "__main__":
# Run the simulation driver
main_simulation(num_steps=15)
• The “LongShortMarketMaker” class holds the inventory, cur-
rent stance (long-biased or short-biased), and relevant param-
eters.
• The “decide_mode” method fuses microstructure and funda-
mental signals, switching the stance if the signals surpass a
threshold.
• The “generate_quotes” method produces quotes based on the
stance, best bid/ask, and inventory constraints.
• The “MicrostructureAggregator” provides static utility func-
tions: one to compute net order flow imbalance (microstruc-
ture signal), the other to transform a fundamental measure
(e.g., sentiment) into a numerical score.
• The “main_simulation” function demonstrates how the algo-
rithm might run in practice, simulating random best bid/ask,
random fundamental sentiment, and partial fills on each it-
eration.
This example highlights the state machine approach (LONG_BIASED
vs. SHORT_BIASED), how signals are combined to trigger switch-
ing, and how a quoting engine is updated accordingly. The inven-
tory management logic further refines quotes whenever the position
size approaches the maximum allowable inventory.
Chapter 28
Neural PDE Market
Maker
Below is a detailed explanation of how the "Neural PDE Market
Maker" concept can be implemented. The approach leverages par-
tial differential equations (PDEs) derived from diffusion (or jump-
diffusion) processes to capture price evolution. A neural network
then approximates the PDE solutions in a fast, real-time manner,
which helps the market maker dynamically adjust quoted spreads
while anticipating directional and volatility shifts:
1. PDE Modeling:
• Define a suitable diffusion or jump-diffusion model (e.g., Black-
Scholes-type PDE or a more specialized jump-diffusion PDE).
• Translate the model into a PDE of the form ∂V/∂t + L(V) = 0, where L(V)
collects the spatial-derivative terms (e.g., ∂²V/∂S² for the volatility
term, ∂V/∂S for the drift term, etc.).
2. Neural Approximation:
• Construct a feed-forward neural network (or other advanced ar-
chitectures) to approximate the solution V(s, t).
• Use automatic differentiation to compute partial derivatives of
V with respect to s and t, enabling a direct calculation of the PDE
residual (PDE mismatch).
• Train the network by minimizing a loss function that includes
the PDE residual and boundary/terminal condition mismatches.
3. Real-Time Quoting:
• Once trained, query the neural network for any (s, t).
• This yields a near-instantaneous estimate of “fair price” or “smooth
value” V(s, t).
• Integrate these estimates into a quoting subsystem that adjusts
bid-ask spreads based on anticipated movement and volatility, ef-
fectively merging PDE-based risk assessments with high-frequency
constraints.
4. Implementation Details:
• Generate simulated training data (e.g., random draws of (s, t)
within the domain of interest).
• Provide boundary/terminal conditions relevant to your market
making perspective (e.g., specific constraints at s=0, large s, or at
final time t=T).
• Continuously retrain or fine-tune the model as new high-frequency
data arrives, ensuring robust real-time adaptation.
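Putting steps 1-3 together, the network parameters theta are fit by
minimizing a composite loss over sampled domain points (s_i, t_i),
L(theta) = mean_i [ dV/dt(s_i, t_i) + mu * s_i * dV/ds(s_i, t_i)
           + 0.5 * sigma^2 * s_i^2 * d2V/ds2(s_i, t_i) ]^2 + L_boundary(theta),
which is exactly the structure of the training loop below: a mean squared
PDE residual plus the boundary and terminal condition mismatches.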
Python Code
Below is a Python code snippet that demonstrates a simplified
prototype of solving a diffusion PDE with a neural network (PINN-
style) and using the solution to inform a market maker’s real-time
quoting decisions.
import torch
import torch.nn as nn
import numpy as np
# ==============================
# 1. Neural Network Definition
# ==============================
class NeuralPDE(nn.Module):
"""
A feed-forward neural network that approximates the solution
,→ V(s, t).
In practice, deeper/larger architectures may be used.
"""
def __init__(self, input_dim=2, hidden_dim=64, output_dim=1,
,→ num_hidden_layers=3):
super(NeuralPDE, self).__init__()
        layers = []
        layers.append(nn.Linear(input_dim, hidden_dim))
        layers.append(nn.Tanh())
        for _ in range(num_hidden_layers - 1):
            layers.append(nn.Linear(hidden_dim, hidden_dim))
            layers.append(nn.Tanh())
        layers.append(nn.Linear(hidden_dim, output_dim))
        self.net = nn.Sequential(*layers)
def forward(self, x):
"""
:param x: Tensor of shape (N, 2), where each row is (s, t).
:return: Tensor of shape (N, 1), the approximate PDE
,→ solution V(s, t).
"""
        return self.net(x)
# =================================
# 2. PDE Residual and Boundaries
# =================================
def pde_residual(model, s, t, mu, sigma):
"""
Compute the PDE residual for a diffusion model:
dS = mu * S * dt + sigma * S * dW
PDE (risk-neutral or simplified) can be approximated as:
        dV/dt + mu * s * dV/ds + 0.5 * sigma^2 * s^2 * d2V/ds2 = 0
:param model: Neural network that approximates V(s, t).
:param s: Torch tensor for underlying price dimension.
:param t: Torch tensor for time dimension.
:param mu: Float, drift term.
:param sigma: Float, volatility term.
:return: Tensor representing the PDE residual at each point.
"""
# Ensure requires_grad is True for automatic differentiation
s.requires_grad = True
t.requires_grad = True
# Forward pass
    inputs = torch.cat((s, t), dim=1)
V = model(inputs)
    # First-order derivatives
    dV_ds = torch.autograd.grad(
        V, s,
        grad_outputs=torch.ones_like(V),
        create_graph=True,
        retain_graph=True
    )[0]
    dV_dt = torch.autograd.grad(
        V, t,
        grad_outputs=torch.ones_like(V),
        create_graph=True,
        retain_graph=True
    )[0]
# Second-order derivatives
    d2V_ds2 = torch.autograd.grad(
dV_ds, s,
grad_outputs=torch.ones_like(dV_ds),
create_graph=True,
retain_graph=True
)[0]
    # PDE: dV/dt + mu*s*dV/ds + 0.5*sigma^2*s^2*d2V/ds2 = 0
pde = dV_dt + mu * s * dV_ds + 0.5 * sigma**2 * s**2 * d2V_ds2
return pde
def boundary_conditions(model, s_min, s_max, t_min, t_max):
"""
Boundary/terminal conditions for demonstration.
For real use cases, adjust these as per market making
,→ constraints.
Example:
1) V(0, t) = 0 (if price is 0, value is 0)
2) V(s, t_max) = payoff(s) (terminal condition at maturity or
,→ final time)
3) V(s_max, t) ~ some constant or formula
:param model: Neural network approximation.
:param s_min: Minimum price in domain.
:param s_max: Maximum price in domain.
:param t_min: Start time in domain.
:param t_max: End time in domain.
:return: aggregated boundary loss
"""
    # Condition 1: V(s_min, t) = 0
    t_lin = torch.linspace(t_min, t_max, steps=20).unsqueeze(1)
    s_min_vec = torch.full_like(t_lin, s_min)
    inputs_min = torch.cat((s_min_vec, t_lin), dim=1)
    V_min = model(inputs_min)
    loss_b1 = torch.mean((V_min - 0.0) ** 2)
    # Condition 2: Terminal condition, e.g. V(s, t_max) = payoff(s)
    s_lin = torch.linspace(s_min, s_max, steps=20).unsqueeze(1)
    t_max_vec = torch.full_like(s_lin, t_max)
    inputs_tmax = torch.cat((s_lin, t_max_vec), dim=1)
    # Example payoff: V(s, t_max) = s (for demonstration)
    V_tmax_target = s_lin
    V_tmax_pred = model(inputs_tmax)
    loss_b2 = torch.mean((V_tmax_pred - V_tmax_target) ** 2)
    # Condition 3: V(s_max, t) ~ s_max as an example
    s_max_vec = torch.full_like(t_lin, s_max)
    inputs_max = torch.cat((s_max_vec, t_lin), dim=1)
    V_max = model(inputs_max)
    loss_b3 = torch.mean((V_max - s_max_vec) ** 2)
    return loss_b1 + loss_b2 + loss_b3
# =========================================
# 3. Data Sampler for Training the Network
# =========================================
def sample_domain(s_min, s_max, t_min, t_max, n_samples=256):
"""
Uniformly sample points (s, t) in the domain for PDE residual
,→ training.
:param s_min: Minimum of price domain.
:param s_max: Maximum of price domain.
:param t_min: Earliest time in the domain.
:param t_max: Latest time in the domain.
:param n_samples: Number of training points per iteration.
:return: (s, t) as torch tensors
"""
    # Column vectors of shape (n_samples, 1) so they concatenate along dim=1
    s_sample = torch.rand(n_samples, 1) * (s_max - s_min) + s_min
    t_sample = torch.rand(n_samples, 1) * (t_max - t_min) + t_min
return s_sample, t_sample
# ================================
# 4. Training Loop
# ================================
def train_pde_model(model, mu, sigma,
s_min=0.0, s_max=200.0, t_min=0.0, t_max=1.0,
lr=1e-3, epochs=2000, n_samples=128):
"""
Train the neural network to satisfy the PDE and boundary
,→ conditions.
:param model: NeuralPDE model.
:param mu: Drift.
:param sigma: Volatility.
:param s_min: Minimum price in domain.
:param s_max: Maximum price in domain.
:param t_min: Start time.
:param t_max: End time.
:param lr: Learning rate.
:param epochs: Number of training epochs.
:param n_samples: Number of PDE sample points per epoch.
:return: None, model is trained in-place.
"""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        optimizer.zero_grad()
        # Sample domain points
        s_batch, t_batch = sample_domain(s_min, s_max, t_min, t_max, n_samples)
        # PDE residual
        pde_r = pde_residual(model, s_batch, t_batch, mu, sigma)
        loss_pde = torch.mean(pde_r**2)
# Boundary conditions
loss_bc = boundary_conditions(model, s_min, s_max, t_min,
,→ t_max)
# Total loss
loss = loss_pde + loss_bc
        loss.backward()
        optimizer.step()
        if (epoch + 1) % 200 == 0:
            print(f"Epoch {epoch+1}/{epochs}, Total Loss: {loss.item():.6f}, "
                  f"PDE: {loss_pde.item():.6f}, BC: {loss_bc.item():.6f}")
# =====================================================
# 5. Real-Time Quoting Subsystem Using the Trained PDE
# =====================================================
def quote_spread(model, current_price, current_time,
,→ base_spread=0.01, vol_factor=0.5):
"""
Generate a bid-ask spread based on PDE solution (fair value)
,→ plus
adjustments for volatility factor and real-time signals.
:param model: Trained NeuralPDE model approximating V(s,t).
:param current_price: Current underlying price.
:param current_time: Current time relative to [0,1].
:param base_spread: Base spread as a fraction of the price or
,→ absolute value.
:param vol_factor: Weight factor for PDE-based 'volatility'
,→ adjustments.
:return: (bid, ask) quotes
"""
# Convert to torch
    inp = torch.tensor([[current_price, current_time]], dtype=torch.float32)
fair_val = model(inp).item()
# A simplistic approach: spread size grows with difference from
,→ price
# and a volatility factor that can scale the spread.
# Another approach is to compute PDE-based Greeks to refine the
,→ spread.
    spread_size = base_spread * (1.0 + vol_factor * abs(fair_val - current_price))
    bid = current_price - 0.5 * spread_size
ask = current_price + 0.5 * spread_size
return bid, ask
# ============================
# 6. Demo / Main Execution
# ============================
if __name__ == "__main__":
# Hyperparameters and domain
s_min_val, s_max_val = 0.0, 200.0
t_min_val, t_max_val = 0.0, 1.0
mu_val, sigma_val = 0.1, 0.2 # example drift & volatility
# Instantiate the neural PDE model
model_pde = NeuralPDE(input_dim=2, hidden_dim=64, output_dim=1,
,→ num_hidden_layers=3)
# Train the model to satisfy PDE and boundary conditions
train_pde_model(model_pde, mu_val, sigma_val,
s_min=s_min_val, s_max=s_max_val,
t_min=t_min_val, t_max=t_max_val,
lr=1e-3, epochs=1000, n_samples=128)
# Simulate a real-time scenario
# Suppose current time is 0.3 in normalized scale, price is
,→ around 105
current_s = 105.0
current_t = 0.3
    bid_quote, ask_quote = quote_spread(model_pde, current_s, current_t)
    pde_value = model_pde(torch.tensor([[current_s, current_t]]))[0].item()
    print(f"Real-time Quoting Example:\n"
          f"  Current Price: {current_s}\n"
          f"  PDE Approx Value: {pde_value:.4f}\n"
          f"  Generated Bid: {bid_quote:.4f}, Generated Ask: {ask_quote:.4f}\n")
This code demonstrates:
• A feed-forward neural network (NeuralPDE) that approxi-
mates the PDE solution V(s, t).
• Automatic differentiation to compute partial derivatives and
form a PDE residual (pde_residual).
• Simple boundary/terminal conditions through boundary_conditions,
which can be adapted to specific market making or payoff tar-
gets.
• A training process (train_pde_model) that optimizes the
neural network to satisfy both the PDE and boundary con-
ditions.
• A quoting subsystem (quote_spread) that derives market
maker bid-ask levels from the neural PDE solution. This
shows how PDE-based fair values and volatility estimates can
dynamically inform spreads.
By embedding these ideas into a robust market-making infras-
tructure—with actual real-time data feeds and continuous retrain-
ing or updating—the strategy can proactively anticipate market
swings via the PDE-driven neural network approximations, thereby
supporting more adaptive and sophisticated quoting behavior.
Chapter 29
Advanced Feature
Fusion Quoter
Description
The “Advanced Feature Fusion Quoter” algorithm is designed
to combine heterogeneous data sets—on-chain crypto metrics, macroe-
conomic indicators, and standard order book data—into a single
pipeline for robust short-horizon price predictions. First, data in-
gestion must ensure that timestamps from all sources align accu-
rately. Next, each data set undergoes normalization or standard-
ization to remove scale disparities. A feature fusion layer merges
these distinct data sets into unified input vectors (or embeddings)
that capture comprehensive market views.
Once merged, an ensemble learning approach, such as combin-
ing a neural network with a gradient-boosted machine or random
forest, is trained to predict short-horizon price changes. The en-
semble predictions are blended, for instance via averaging, poten-
tially weighted by validation performance. Finally, the quoting
strategy adjusts spreads according to the confidence derived from
fused signals—high-confidence predictions might prompt narrower
spreads to capitalize on anticipated moves, whereas volatile or un-
certain outlooks lead to more conservative (wider) quoting.
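The code in this chapter blends the models with a plain average; the
validation-weighted variant mentioned above could be sketched as follows
(the function name and the inverse-MSE weighting are illustrative choices,
not part of the listing below):
import numpy as np

def weighted_ensemble_prediction(X, models, val_mses):
    """Blend predictions with weights proportional to 1 / validation MSE."""
    weights = np.array([1.0 / max(m, 1e-12) for m in val_mses])
    weights = weights / weights.sum()
    preds = np.array([model.predict(X) for model in models])
    return np.average(preds, axis=0, weights=weights)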
Python Code
Below is a Python code snippet that demonstrates a simplified,
end-to-end pipeline for the Advanced Feature Fusion Quoter:
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
from sklearn.ensemble import RandomForestRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error
def simulate_onchain_data(num_points=200, start_time=datetime(2023,
,→ 1, 1)):
"""
Simulates on-chain data including metrics like transaction
,→ volume,
active addresses, and hash rate, each aligned by timestamp.
"""
times = [start_time + timedelta(minutes=i) for i in
,→ range(num_points)]
    data = {
        "time": times,
        "activeAddresses": np.random.randint(3000, 10000, num_points),
        "transactionVolume": np.random.randint(1000, 9000, num_points),
        "hashRate": np.random.uniform(50, 150, num_points)
    }
    return pd.DataFrame(data)
def simulate_macro_data(num_points=200, start_time=datetime(2023, 1,
,→ 1)):
"""
Simulates macroeconomic data, such as interest rates, CPI, and
,→ unemployment.
"""
times = [start_time + timedelta(minutes=i) for i in
,→ range(num_points)]
    data = {
        "time": times,
        "interestRates": np.random.uniform(0, 5, num_points),
        "cpi": np.random.uniform(110, 130, num_points),
        "unemploymentRate": np.random.uniform(3, 8, num_points)
    }
    return pd.DataFrame(data)
def simulate_orderbook_data(num_points=200,
,→ start_time=datetime(2023, 1, 1)):
"""
Simulates order book data including best bid, best ask, and
,→ volume at each side.
Also includes a synthetic 'midPrice' that we wish to predict for
,→ the next interval.
"""
times = [start_time + timedelta(minutes=i) for i in
,→ range(num_points)]
    best_bid = np.random.uniform(50, 60, num_points)
    best_ask = best_bid + np.random.uniform(0.1, 0.3, num_points)
    mid_price = (best_bid + best_ask) / 2
    data = {
        "time": times,
        "bestBid": best_bid,
        "bestAsk": best_ask,
        "volumeAtBid": np.random.randint(100, 1000, num_points),
        "volumeAtAsk": np.random.randint(100, 1000, num_points),
        "midPrice": mid_price
    }
    return pd.DataFrame(data)
def merge_data(onchain_df, macro_df, orderbook_df):
"""
Merges the three datasets on the common 'time' column.
"""
    merged_df = pd.merge(onchain_df, macro_df, on='time', how='inner')
    merged_df = pd.merge(merged_df, orderbook_df, on='time', how='inner')
return merged_df
def build_features(merged_df):
"""
Builds a feature matrix X and target vector y (short-horizon
,→ price change).
We define 'priceChange' as the difference in midPrice after 1
,→ step (shift).
"""
# Sort by time to ensure correct shifting
merged_df =
,→ merged_df.sort_values(by='time').reset_index(drop=True)
# Create target: Next-step midPrice change
merged_df['futureMidPrice'] = merged_df['midPrice'].shift(-1)
merged_df['priceChange'] = merged_df['futureMidPrice'] -
,→ merged_df['midPrice']
# Drop the last row which has no futureMidPrice
merged_df = merged_df.dropna(subset=['priceChange'])
# Features to consider (excluding time, midPrice,
,→ futureMidPrice, priceChange)
feature_cols = [
c for c in merged_df.columns
if c not in ['time', 'midPrice', 'futureMidPrice',
,→ 'priceChange']
]
X = merged_df[feature_cols].values
y = merged_df['priceChange'].values
return X, y, merged_df
def train_ensemble_model(X_train, y_train):
"""
Trains two models: a Random Forest and an MLP, then returns
,→ them.
"""
rf_model = RandomForestRegressor(n_estimators=50,
,→ random_state=42)
mlp_model = MLPRegressor(hidden_layer_sizes=(32,16),
activation='relu',
solver='adam',
max_iter=500,
random_state=42)
rf_model.fit(X_train, y_train)
mlp_model.fit(X_train, y_train)
return rf_model, mlp_model
def ensemble_prediction(X, models):
"""
Performs prediction by averaging the predictions of each model
,→ in 'models'.
This can be extended to a weighted average if needed.
"""
predictions = []
for model in models:
        predictions.append(model.predict(X))
    predictions = np.array(predictions)
    return predictions.mean(axis=0)
def quote_spread(row, predicted_change, base_spread=0.05):
"""
    Generates a quote for a single time step, adjusting the spread based on
    the magnitude of the predicted price change. A larger predicted move
    widens the spread to cushion against the anticipated shift, while small
    predicted changes keep the quotes tight around the mid-price.
"""
current_price = row['midPrice']
# Confidence can be derived from predicted_change magnitude
confidence_factor = min(max(abs(predicted_change)*10, 0.5), 2.0)
# The final spread is base_spread multiplied by a confidence
,→ factor
final_spread = base_spread * confidence_factor
# We define a hypothetical bid, ask around midPrice
quoted_bid = current_price - final_spread / 2
quoted_ask = current_price + final_spread / 2
return quoted_bid, quoted_ask
if __name__ == "__main__":
# 1. Simulate the data
onchain_df = simulate_onchain_data()
macro_df = simulate_macro_data()
orderbook_df = simulate_orderbook_data()
# 2. Merge data
merged_df = merge_data(onchain_df, macro_df, orderbook_df)
# 3. Build features and target
X, y, merged_df_features = build_features(merged_df)
# 4. Split into train/test
split_index = int(len(X) * 0.8)
X_train, y_train = X[:split_index], y[:split_index]
X_test, y_test = X[split_index:], y[split_index:]
df_test =
,→ merged_df_features.iloc[split_index:].reset_index(drop=True)
# 5. Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)
# 6. Train models and create an ensemble
rf_model, mlp_model = train_ensemble_model(X_train_scaled,
,→ y_train)
models = [rf_model, mlp_model]
# 7. Predict on test set
y_pred_test = ensemble_prediction(X_test_scaled, models)
mse = mean_squared_error(y_test, y_pred_test)
print(f"Ensemble Test MSE: {mse:.4f}")
# 8. Quoting demonstration using predictions
sample_size = 5 # Show 5 sample quotes from the test set
for i in range(sample_size):
row = df_test.iloc[i]
x_row = X_test_scaled[i].reshape(1, -1)
pred_change = ensemble_prediction(x_row, models)[0]
bid, ask = quote_spread(row, pred_change, base_spread=0.05)
print(f"Time={row['time']} | TrueChange={y_test[i]:.4f} |
,→ PredChange={pred_change:.4f} "
f"| QuotedBid={bid:.2f}, QuotedAsk={ask:.2f}")
This code contains the following major components:
• simulate_onchain_data, simulate_macro_data,
simulate_orderbook_data: Functions that generate synthetic
data for demonstration purposes, covering on-chain, macro,
and order book metrics.
• merge_data: Merges the data sets on a common time index
to ensure correct temporal alignment.
• build_features: Constructs the core features by merging
slices of each data set, creating the short-horizon price change
(priceChange) as the modeling target.
• train_ensemble_model: Trains both a Random Forest Re-
gressor and an MLP, capturing an ensemble approach that
leverages different model architectures.
• ensemble_prediction: Averages the predictions from individual
models to produce a fused output (a weighted variant is sketched
after this list).
• quote_spread: Demonstrates a simplified quoting logic, ad-
justing spreads based on the magnitude of the predicted price
change.
• The main block ties these functions together, running the
entire pipeline from generating synthetic data to final quoting
decisions.
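As the docstring of ensemble_prediction notes, the simple average can be replaced by a weighted one. The following sketch is not part of the original listing; it assumes the numpy and mean_squared_error imports already used above, and the weighting scheme (inverse validation MSE) is only one illustrative choice.

def weighted_ensemble_prediction(X, models, weights):
    """Average model predictions using normalized, user-supplied weights."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    preds = np.array([model.predict(X) for model in models])
    return np.average(preds, axis=0, weights=weights)

# Hypothetical usage with a held-out validation split:
# val_errors = [mean_squared_error(y_val, m.predict(X_val)) for m in models]
# weights = [1.0 / (e + 1e-9) for e in val_errors]
# y_pred = weighted_ensemble_prediction(X_test_scaled, models, weights)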
Chapter 30
Self-Supervised
High-Frequency
Forecaster
This chapter presents a self-supervised learning methodology de-
signed to uncover subtle predictive cues from massive tick-level
data. By masking (or removing) portions of the input and train-
ing a model (e.g., transformer or CNN) to reconstruct price and
volume signals, the system learns a deep representation of short-
horizon behavior. Once trained, these learned embeddings inform
how the market maker places quotes at precisely timed intervals.
The key idea is that by focusing on reconstruction tasks, the model
uncovers latent structural patterns—like fleeting volume anomalies
or sudden “pinches” in the order book—that can be leveraged for
optimal quoting.
When implementing in practice, the pipeline requires:
1. An efficient data ingestion step that samples large-scale tick
data (e.g., price, volume, order book depth).
2. A masking or segment removal mechanism to create a self-
supervised reconstruction objective.
3. A deep architecture, such as a transformer or a convolutional
neural network, to map partial input sequences into a learned em-
bedding.
4. A bridging logic that uses the embeddings to produce actionable
quote recommendations in real time.
Below is a Python code snippet that encompasses the core self-
supervised training pipeline and real-time quoting logic for a high-
frequency forecaster.
Python Code
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import numpy as np
import random

def generate_synthetic_data(num_sequences=1000, seq_length=50, seed=42):
    """
    Generate synthetic tick-level data for demonstration purposes.
    Each sequence simulates price and volume over a short horizon.
    :param num_sequences: Number of synthetic sequences to generate.
    :param seq_length: Length of each sequence.
    :param seed: Random seed for reproducibility.
    :return: Numpy ndarray of shape [num_sequences, seq_length, 2]
             where the last dimension is [price, volume].
    """
    np.random.seed(seed)
    data = []
    for _ in range(num_sequences):
        # Random walk prices
        prices = 100 + np.cumsum(np.random.randn(seq_length) * 0.1)
        # Volumes as random positive values
        volumes = np.random.uniform(low=1, high=100, size=seq_length)
        seq = np.stack([prices, volumes], axis=-1)
        data.append(seq)
    return np.array(data, dtype=np.float32)
def mask_data(sequence, mask_ratio=0.2):
    """
    Randomly mask a portion of the sequence for self-supervised learning.
    :param sequence: A single sequence of shape [seq_length, features].
    :param mask_ratio: Fraction of the sequence to mask.
    :return: masked_sequence, mask_indicator
             masked_sequence has the same shape as the input;
             mask_indicator is a boolean mask indicating where data is masked.
    """
    seq_length = sequence.shape[0]
    num_to_mask = int(seq_length * mask_ratio)
    mask_indices = random.sample(range(seq_length), num_to_mask)
    mask_indicator = np.zeros(seq_length, dtype=bool)
    mask_indicator[mask_indices] = True
    masked_sequence = sequence.copy()
    # Replace masked positions with zeros (or could use a special token)
    masked_sequence[mask_indicator] = 0.0
    return masked_sequence, mask_indicator
class SelfSupervisedTickDataset(Dataset):
    """
    PyTorch Dataset for self-supervised learning on tick data.
    Each item returns an original sequence, a masked sequence,
    and a mask indicator.
    """
    def __init__(self, data, mask_ratio=0.2):
        """
        :param data: Numpy array of shape [num_sequences, seq_length, features].
        :param mask_ratio: Fraction of each sequence to mask.
        """
        super().__init__()
        self.data = data
        self.mask_ratio = mask_ratio

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        sequence = self.data[idx]
        masked_seq, mask_indicator = mask_data(sequence, self.mask_ratio)
        # Convert to torch tensors
        sequence_torch = torch.tensor(sequence, dtype=torch.float32)
        masked_torch = torch.tensor(masked_seq, dtype=torch.float32)
        mask_torch = torch.tensor(mask_indicator, dtype=torch.bool)
        return masked_torch, sequence_torch, mask_torch
class HFTransformer(nn.Module):
    """
    A simplified Transformer-based architecture for self-supervised
    reconstruction of masked segments in tick-level data.
    """
    def __init__(self, input_dim=2, model_dim=16, nhead=2, num_layers=2):
        """
        :param input_dim: Number of features in each time step (price, volume, etc.).
        :param model_dim: Embedding dimensionality in the transformer.
        :param nhead: Number of attention heads in the transformer layers.
        :param num_layers: Number of transformer encoder layers.
        """
        super().__init__()
        self.input_dim = input_dim
        self.model_dim = model_dim
        # Learnable linear transform for input -> model_dim
        self.input_proj = nn.Linear(input_dim, model_dim)
        # Positional encoding (basic sine/cosine method or learnable)
        self.positional_encoding = nn.Parameter(torch.randn(1, 1024, model_dim))
        encoder_layer = nn.TransformerEncoderLayer(d_model=model_dim,
                                                   nhead=nhead,
                                                   dim_feedforward=model_dim * 4)
        self.transformer_encoder = nn.TransformerEncoder(encoder_layer,
                                                         num_layers=num_layers)
        # Output layer to reconstruct input features
        self.output_layer = nn.Linear(model_dim, input_dim)

    def forward(self, x):
        """
        :param x: Tensor of shape [batch_size, seq_length, input_dim].
        :return: Reconstructed sequence of the same shape as x.
        """
        batch_size, seq_len, _ = x.shape
        # Project input to model_dim
        x_embed = self.input_proj(x)
        # Add trivial positional encoding (slice to avoid out-of-bounds if seq_len > 1024)
        pe = self.positional_encoding[:, :seq_len, :]
        x_embed = x_embed + pe
        # Transformer expects [seq_length, batch_size, model_dim]
        x_embed = x_embed.permute(1, 0, 2)
        # Apply transformer encoder
        enc_output = self.transformer_encoder(x_embed)
        # Convert back to [batch_size, seq_length, model_dim]
        enc_output = enc_output.permute(1, 0, 2)
        # Reconstruct the original input
        out = self.output_layer(enc_output)
        return out
def train_self_supervised(model, dataloader, epochs=5, lr=1e-3, device='cpu'):
    """
    Train the model to reconstruct masked segments.
    :param model: HFTransformer instance.
    :param dataloader: PyTorch DataLoader over SelfSupervisedTickDataset.
    :param epochs: Number of training epochs.
    :param lr: Learning rate.
    :param device: 'cpu' or 'cuda' device string.
    :return: None
    """
    model.to(device)
    optimizer = optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for epoch in range(epochs):
        total_loss = 0.0
        model.train()
        for masked_seq, original_seq, mask_indicator in dataloader:
            masked_seq = masked_seq.to(device)
            original_seq = original_seq.to(device)
            mask_indicator = mask_indicator.to(device)
            optimizer.zero_grad()
            reconstruction = model(masked_seq)
            # We only compute the loss on masked positions
            masked_positions = mask_indicator.unsqueeze(-1).expand_as(original_seq)
            loss = criterion(reconstruction[masked_positions],
                             original_seq[masked_positions])
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        avg_loss = total_loss / len(dataloader)
        print(f"Epoch {epoch+1}/{epochs}, Loss: {avg_loss:.4f}")
def extract_embeddings(model, sequence, device='cpu'):
    """
    Pass a full unmasked sequence through the model and
    return the final outputs as embeddings.
    :param model: Trained HFTransformer instance.
    :param sequence: Single sequence of shape [seq_length, features].
    :param device: 'cpu' or 'cuda' string.
    :return: Array of shape [seq_length, input_dim] (the reconstruction,
             used here as a stand-in for the internal embedding).
    """
    model.eval()
    x = torch.tensor(sequence, dtype=torch.float32).unsqueeze(0).to(device)
    with torch.no_grad():
        # Forward pass
        reconstruction = model(x)  # shape [1, seq_length, input_dim]
        # Ideally we would hook into the transformer encoder and return its
        # hidden states as "embeddings". For simplicity we return the
        # reconstruction here, although in practice you'd want the
        # transformer's internal representation.
        embeddings = reconstruction[0]
    return embeddings.cpu().numpy()
def place_quote(embeddings):
    """
    Dummy bridging logic to place quotes based on final embeddings.
    A real system could incorporate spread calculations or RL-based logic.
    :param embeddings: Numpy array of shape [seq_length, input_dim].
    :return: A dict representing the chosen quote parameters.
    """
    # For demonstration, we'll compute the average embedding
    avg_vector = np.mean(embeddings, axis=0)
    # If the average price reconstruction is high, we place an ask quote;
    # else we place a bid quote. This is purely for illustration.
    quote_side = "ASK" if avg_vector[0] > 100 else "BID"
    # Construct a dummy quote
    quote = {
        "side": quote_side,
        "price": float(avg_vector[0]),
        "volume": max(1, int(avg_vector[1]))  # ensure at least 1 lot
    }
    return quote
if __name__ == "__main__":
    # 1. Generate synthetic tick data
    raw_data = generate_synthetic_data(num_sequences=2000, seq_length=60)
    # 2. Create dataset and dataloader
    dataset = SelfSupervisedTickDataset(raw_data, mask_ratio=0.2)
    dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
    # 3. Initialize model
    model = HFTransformer(input_dim=2, model_dim=16, nhead=2, num_layers=2)
    # 4. Train model with self-supervised objective
    train_self_supervised(model, dataloader, epochs=3, lr=1e-3, device='cpu')
    # 5. Extract embeddings from a single unmasked sequence
    sample_seq = raw_data[0]  # pick the first sequence
    embeddings = extract_embeddings(model, sample_seq, device='cpu')
    # 6. Place a quote using bridging logic
    quote_decision = place_quote(embeddings)
    print("Quote decision:", quote_decision)
This code defines several key components essential for a self-
supervised high-frequency forecaster:
• generate_synthetic_data simulates tick-level data for demon-
stration.
• mask_data randomly masks a fraction of a sequence for the
reconstruction objective.
• SelfSupervisedTickDataset creates a PyTorch-compatible
dataset returning masked sequences, original sequences, and
mask indicators.
• HFTransformer is a simplified Transformer-based model that
employs an encoder architecture to learn reconstructions of
masked data.
• train_self_supervised orchestrates the training loop, com-
puting MSE loss only on masked elements.
• extract_embeddings demonstrates how to retrieve final hid-
den states (or reconstruction outputs) for subsequent tasks,
such as quoting.
• place_quote is a placeholder bridging logic that turns em-
beddings into a basic quote decision.
Through self-supervised training on large tick-level datasets,
the model learns nuanced short-term price and volume patterns.
These learned representations can significantly enhance market mak-
ing algorithms by providing timely and accurate signals for quote
placement.
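As noted in extract_embeddings, a production version would read the transformer's internal hidden states rather than the reconstruction. A minimal sketch of that idea, assuming the HFTransformer defined above and using a standard PyTorch forward hook, might look like this (the helper name is illustrative and not part of the original listing):

def extract_hidden_states(model, sequence, device='cpu'):
    """Capture the transformer encoder's output for one sequence (hypothetical helper)."""
    captured = {}

    def hook(module, inputs, output):
        # output shape: [seq_length, batch_size, model_dim]
        captured['hidden'] = output.detach()

    handle = model.transformer_encoder.register_forward_hook(hook)
    model.eval()
    x = torch.tensor(sequence, dtype=torch.float32).unsqueeze(0).to(device)
    with torch.no_grad():
        model(x)
    handle.remove()
    # Reshape to [seq_length, model_dim]
    return captured['hidden'].permute(1, 0, 2)[0].cpu().numpy()

These hidden states could then be fed to place_quote (or a richer quoting policy) in place of the reconstruction output.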
Chapter 31
Regret Minimization
Market Making
In this section, we illustrate a comprehensive Python implementa-
tion of a regret-minimizing market making algorithm inspired by
multi-armed bandit theory. The core idea is to measure the differ-
ence between the strategy’s cumulative profit and the best possible
profit in hindsight. By iteratively updating the quoting policy with
an online gradient or weighting method, the algorithm seeks to re-
duce regret over time, thereby continuously improving the quality
of market quotes.
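Formally, if $\mathcal{A}$ denotes the set of quoting actions (spread levels), $r_t(a)$ the reward action $a$ would have earned at step $t$, and $a_t$ the action actually chosen, the external regret after $T$ steps is
\[
R_T \;=\; \max_{a \in \mathcal{A}} \sum_{t=1}^{T} r_t(a) \;-\; \sum_{t=1}^{T} r_t(a_t).
\]
The implementation below tracks a slightly stronger, per-step variant of this baseline: at every step it compares the realized reward against the best action available at that step.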
Below, we provide a single, self-contained code listing that sim-
ulates a simplified market environment, implements a multi-armed
bandit-inspired market making class, calculates regret, and demon-
strates how the policy can adapt to approach an optimal perfor-
mance baseline.
Python Code
import numpy as np
import matplotlib.pyplot as plt
class MarketEnvironment:
"""
A simplified market environment that simulates a price process
over time. The environment randomly generates a price series and
provides a reward for each potential quoting action.
"""
    def __init__(self, n_steps=200, seed=42):
        """
        Initialize the market environment with a random price series.
        :param n_steps: Number of time steps to simulate.
        :param seed: Random seed for reproducibility.
        """
        np.random.seed(seed)
        self.n_steps = n_steps
        # Generate a random-walk-like price path
        self.prices = self._simulate_prices()

    def _simulate_prices(self):
        """
        Create a random walk for prices starting from a base value.
        """
        prices = [100.0]  # Start price
        for _ in range(self.n_steps - 1):
            # Random price move: small delta around zero
            prices.append(prices[-1] + np.random.normal(loc=0.0, scale=1.0))
        return np.array(prices)

    def get_price(self, t):
        """
        Retrieve the price at a given time step.
        :param t: Time step index.
        :return: Price at time t.
        """
        return self.prices[t]
    def get_reward_for_action(self, t, action):
        """
        Compute a simplified reward for a chosen action at time t.
        Here, 'action' might represent different quoting spreads or offsets.
        We define a simple reward model: the market maker's profit or
        loss depends on how well the chosen action aligns with the price
        movement between t and t+1.
        :param t: Current time step index.
        :param action: Index of the chosen action (representing a spread level).
        :return: Reward (profit) from taking that action.
        """
        if t >= self.n_steps - 1:
            # No next step, so no additional reward
            return 0.0
        current_price = self.prices[t]
        next_price = self.prices[t + 1]
        # Example logic: the narrower the spread, the higher the fill probability,
        # but the more potential slippage if price moves unfavorably.
        # 'action' in {0, 1, 2, ...} is mapped to different "spread widths."
        # For demonstration, we define a linearly changing reward:
        # a larger action index means a wider spread, typically fewer fills
        # but a safer profit target. We add random noise to emulate the market.
        spread_width = 1.0 + action  # minimal example mapping
        price_move = next_price - current_price
        # Reward logic: profit might degrade with larger spread widths if the
        # market is stable, but wider spreads protect from adverse moves.
        base_reward = (0.8 - 0.05 * spread_width) * price_move
        # Some random fill effect: narrow spreads get filled more often
        fill_effect = np.random.rand() - (action * 0.1)
        reward = base_reward + fill_effect
        return reward
class RegretMinimizationMarketMaker:
    """
    A market maker that uses a regret-based approach to select quoting actions.
    The algorithm tracks the difference (regret) between its cumulative reward
    and the best possible cumulative reward in hindsight.
    """
    def __init__(self, n_actions=5, learning_rate=0.1, time_weight=0.99):
        """
        Initialize the market maker with a set of possible actions.
        :param n_actions: Number of discrete actions (spread levels) to choose from.
        :param learning_rate: Step size for online gradient-like updates.
        :param time_weight: Factor for time-weighted updates in regret tracking.
        """
        self.n_actions = n_actions
        self.learning_rate = learning_rate
        self.time_weight = time_weight
        # Initialize probabilities for each action (uniform distribution)
        self.action_probs = np.ones(n_actions) / n_actions
        # Cumulative reward for each action, used in the regret calculation
        self.cumulative_rewards_per_action = np.zeros(n_actions)
        self.times_chosen = np.zeros(n_actions, dtype=int)
        # Logging for the overall strategy
        self.cumulative_reward_strategy = 0.0
        self.cumulative_best_hindsight = 0.0
        self.regret_log = []
    def choose_action(self):
        """
        Select an action according to the current action probabilities.
        :return: Index of the chosen action.
        """
        return np.random.choice(self.n_actions, p=self.action_probs)
    def update_strategy(self, chosen_action, reward, best_action_reward):
        """
        Update the market maker's strategy by measuring regret and applying
        weight updates.
        :param chosen_action: Index of the action that was selected.
        :param reward: The reward (profit) obtained from the chosen action.
        :param best_action_reward: The maximum reward available from any action
                                   at this time step (for hindsight comparison).
        """
        # Update cumulative reward for the strategy
        self.cumulative_reward_strategy += reward
        # Update the best possible (hindsight) cumulative reward
        self.cumulative_best_hindsight += best_action_reward
        # Compute the current regret
        current_regret = self.cumulative_best_hindsight - self.cumulative_reward_strategy
        self.regret_log.append(current_regret)
        # Update tracking for the chosen action
        self.cumulative_rewards_per_action[chosen_action] += reward
        self.times_chosen[chosen_action] += 1
        # Online gradient or time-weighted probability update:
        # a simplified gradient step that increases the probability of actions
        # with higher average reward and decreases it for lower-reward actions.
        # The update is tempered by 'learning_rate' and by a time weighting
        # factor that slightly forgets old steps.
        # Estimate the average reward per action
        avg_rewards = np.zeros(self.n_actions)
        for a in range(self.n_actions):
            if self.times_chosen[a] > 0:
                avg_rewards[a] = (self.cumulative_rewards_per_action[a]
                                  / self.times_chosen[a])
        # Exponential weighting to slightly discount old data
        self.cumulative_rewards_per_action *= self.time_weight
        self.times_chosen = np.floor(self.times_chosen * self.time_weight).astype(int)
        # Gradient-like update using the difference from the average reward
        baseline = np.mean(avg_rewards)
        for a in range(self.n_actions):
            # Probability increment or decrement depends on whether
            # avg_rewards[a] is better than the baseline
            gradient = avg_rewards[a] - baseline
            self.action_probs[a] += self.learning_rate * gradient
        # Re-normalize probabilities so they sum to 1 and remain valid
        self.action_probs = np.maximum(self.action_probs, 0.0)
        if self.action_probs.sum() == 0.0:
            self.action_probs = np.ones(self.n_actions) / self.n_actions
        else:
            self.action_probs /= self.action_probs.sum()
def simulate_regret_minimization(env, market_maker):
    """
    Run a simulation across the environment using the regret-minimizing
    market maker.
    :param env: MarketEnvironment instance.
    :param market_maker: RegretMinimizationMarketMaker instance.
    :return: (list of regrets across time, final action probabilities)
    """
    regrets = []
    n_steps = env.n_steps
    for t in range(n_steps - 1):
        # Determine the best possible action at this step in hindsight
        # by comparing the rewards of all actions:
        possible_rewards = [env.get_reward_for_action(t, a)
                            for a in range(market_maker.n_actions)]
        best_action_reward = max(possible_rewards)
        # Market maker chooses an action
        chosen_action = market_maker.choose_action()
        reward = possible_rewards[chosen_action]
        # Update the strategy
        market_maker.update_strategy(chosen_action, reward, best_action_reward)
        regrets.append(market_maker.regret_log[-1])
    return regrets, market_maker.action_probs
def main():
    # Create a market environment
    env = MarketEnvironment(n_steps=200, seed=42)
    # Initialize a regret-minimizing market maker with 5 possible quoting actions
    mm = RegretMinimizationMarketMaker(n_actions=5, learning_rate=0.1,
                                       time_weight=0.99)
    # Simulate and retrieve regrets over time
    regrets, final_probs = simulate_regret_minimization(env, mm)
    print("Final action probabilities:", final_probs)
    print("Final cumulative regret:", regrets[-1])
    # Plot regret over time
    plt.figure(figsize=(10, 6))
    plt.plot(regrets, label='Cumulative Regret')
    plt.title('Regret Over Time')
    plt.xlabel('Time Step')
    plt.ylabel('Regret')
    plt.legend()
    plt.show()

if __name__ == "__main__":
    main()
Below is a summary of the key components of this code:
• The MarketEnvironment simulates a random-walk-like price
path and provides a reward function for chosen quoting ac-
tions at each time step.
• The RegretMinimizationMarketMaker determines which spread
or quoting action to use. It tracks:
– Action probabilities (initially uniform),
– Cumulative rewards for each action,
– Times each action is chosen,
– Cumulative regret, computed against the best action in
hindsight.
• The update_strategy method calculates the regret and ap-
plies an online gradient-based adjustment.
• The main function orchestrates the simulation, logging re-
gret, and finally visualizing the result, which should gener-
ally show regret growing more slowly over time as the strategy
improves.
By comparing the strategy’s cumulative profit to the maximum
profit of the best actions in retrospect, the algorithm quantifies
and minimizes regret, adapting its quoting probabilities so that it
converges toward near-optimal performance.
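A common alternative to the gradient-style probability update used above is a multiplicative, exponential-weights (Hedge-style) rule. The sketch below is not part of the original implementation; it assumes the full-information setting of the simulation (rewards of all actions are observable each step), reuses the numpy import from the listing, and the eta parameter is illustrative.

def exponential_weights_update(action_probs, possible_rewards, eta=0.1):
    """Hedge-style update: reweight each action by exp(eta * reward), then normalize."""
    rewards = np.asarray(possible_rewards, dtype=float)
    weights = action_probs * np.exp(eta * rewards)
    return weights / weights.sum()

# Hypothetical usage inside the simulation loop:
# mm.action_probs = exponential_weights_update(mm.action_probs, possible_rewards, eta=0.1)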
Chapter 32
Multivariate Hawkes
Inventory Control
Using multivariate Hawkes processes to model correlated arrival of
orders and trades, this technique projects how intense order flow
events will cluster over short intervals. By forecasting the arrival
rate of buys and sells, the algorithm manages quotes so that in-
ventory stays balanced even during bursts of activity. The primary
steps involve:
• Calibrating a 2-dimensional (or higher) Hawkes model on intra-
day data (times of buy and sell arrivals).
• Forecasting intensities in real time.
• Plugging those intensity forecasts into a quoting engine that
adjusts spreads and inventory targets.
Below is a self-contained Python code example that demon-
strates the core concepts of calibrating a simple 2-dimensional ex-
ponential Hawkes model and building a basic real-time quoting
strategy that smooths positions in anticipation of correlated activ-
ity clusters. While it is provided as a demonstration, the building
blocks can be adapted to production environments with additional
optimizations, validations, and safeguards.
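Concretely, for the 2-dimensional exponential kernel used in the listing, the conditional intensity of stream $i \in \{\text{buy}, \text{sell}\}$ is
\[
\lambda_i(t) \;=\; \mu_i \;+\; \sum_{j=1}^{2} \sum_{t_k^{(j)} < t} \alpha_{ij}\, e^{-\beta_{ij}\,(t - t_k^{(j)})},
\]
where $\mu_i$ is the baseline rate, $\alpha_{ij}$ the excitation of stream $i$ by events in stream $j$, $\beta_{ij}$ the corresponding decay rate, and $t_k^{(j)}$ the past event times of stream $j$. The simulation, intensity evaluation, and likelihood below all use exactly this parameterization.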
Python Code
import numpy as np
from scipy.optimize import minimize
def simulate_hawkes_2d(mu, alpha, beta, horizon=100.0, seed=42):
    """
    Simulate a 2D Hawkes process with exponential kernels.
    :param mu: 1D array of base intensities, shape (2,).
    :param alpha: 2x2 array of interaction coefficients.
    :param beta: 2x2 array of decay rates.
    :param horizon: Desired simulation time horizon.
    :param seed: Random seed.
    :return: A list of event times for each dimension [buy_times, sell_times].
    """
    np.random.seed(seed)
    # We'll store event times in lists
    events = [[], []]  # events[0] = buy events, events[1] = sell events
    # Current time starts at zero
    current_time = 0.0
    # We'll use an iterative thinning approach.
    # Initialize the dominating intensity so we can sample candidate events.
    lambda_star = max(mu)
    while current_time < horizon:
        # Propose the next arrival time via an exponential random variable
        wait_time = np.random.exponential(scale=1.0 / lambda_star)
        candidate_time = current_time + wait_time
        if candidate_time > horizon:
            break
        # Compute the actual intensities for each dimension i at candidate_time
        lambdas = []
        for i in range(2):
            intensity_val = mu[i]
            # Add contributions from past events for dimension i
            for j in range(2):
                # All events from stream j
                for t_j in events[j]:
                    if t_j < candidate_time:
                        intensity_val += alpha[i, j] * np.exp(
                            -beta[i, j] * (candidate_time - t_j))
            lambdas.append(intensity_val)
        # Sum of intensities across all dimensions
        lambda_sum = np.sum(lambdas)
        # Accept or reject
        if np.random.rand() <= lambda_sum / lambda_star:
            # Pick which dimension occurred with prob = lambdas[i] / lambda_sum
            which_dim = np.random.choice([0, 1], p=np.array(lambdas) / lambda_sum)
            # Record the event
            events[which_dim].append(candidate_time)
            # Update current time
            current_time = candidate_time
            # Update lambda_star if needed
            lambda_star = max(lambda_star, lambda_sum)
        else:
            # Rejected candidate, move time forward anyway
            current_time = candidate_time
    return events
def hawkes_intensity_2d(t, mu, alpha, beta, events):
    """
    Compute the 2D Hawkes intensity vector at time t.
    :param t: Current time at which intensity is computed.
    :param mu: 1D array of base intensities, shape (2,).
    :param alpha: 2x2 array of interaction coefficients.
    :param beta: 2x2 array of decay rates.
    :param events: [event_times_dim0, event_times_dim1].
    :return: np.array of intensities [lambda_buy, lambda_sell].
    """
    intensities = np.array([0.0, 0.0])
    for i in range(2):
        intensities[i] = mu[i]
        # Add contributions from past events for dimension i
        for j in range(2):
            for t_j in events[j]:
                if t_j < t:
                    intensities[i] += alpha[i, j] * np.exp(-beta[i, j] * (t - t_j))
    return intensities
def full_log_likelihood(params, all_times):
    """
    Log-likelihood for the 2D exponential Hawkes model.
    We assume 'params' is a vector containing:
    mu0, mu1, alpha00, alpha01, alpha10, alpha11, beta00, beta01, beta10, beta11
    :param params: Flattened array of parameters.
    :param all_times: [event_times_buy, event_times_sell], each sorted.
    :return: Negative log-likelihood (we'll minimize this).
    """
    # Unpack parameters
    mu0, mu1, a00, a01, a10, a11, b00, b01, b10, b11 = params
    if any(x < 0 for x in params):
        # Penalize invalid (negative) parameters
        return 1e10
    # Sort events
    buy_times = np.sort(all_times[0])
    sell_times = np.sort(all_times[1])
    T = max(buy_times[-1] if len(buy_times) > 0 else 0,
            sell_times[-1] if len(sell_times) > 0 else 0)
    # Calculate the log-likelihood for dimension 0 (buy events)
    ll_0 = 0.0
    for t_i in buy_times:
        intensity_val = mu0
        for t_j in buy_times[buy_times < t_i]:
            intensity_val += a00 * np.exp(-b00 * (t_i - t_j))
        for t_j in sell_times[sell_times < t_i]:
            intensity_val += a01 * np.exp(-b01 * (t_i - t_j))
        ll_0 += np.log(intensity_val)
    # Subtract the compensator (integral of the intensity from 0 to T).
    # For the exponential kernel this integral has a closed form:
    #   int_0^T mu dt = mu * T
    #   int_0^T alpha * e^{-beta (t - t_j)} dt = (alpha / beta) * (1 - e^{-beta (T - t_j)})
    integral_0 = mu0 * T
    for t_j in buy_times:
        integral_0 += (a00 / b00) * (1.0 - np.exp(-b00 * (T - t_j)))
    for t_j in sell_times:
        integral_0 += (a01 / b01) * (1.0 - np.exp(-b01 * (T - t_j)))
    # Dimension 1 (sell events)
    ll_1 = 0.0
    for t_i in sell_times:
        intensity_val = mu1
        for t_j in buy_times[buy_times < t_i]:
            intensity_val += a10 * np.exp(-b10 * (t_i - t_j))
        for t_j in sell_times[sell_times < t_i]:
            intensity_val += a11 * np.exp(-b11 * (t_i - t_j))
        ll_1 += np.log(intensity_val)
    integral_1 = mu1 * T
    for t_j in buy_times:
        integral_1 += (a10 / b10) * (1.0 - np.exp(-b10 * (T - t_j)))
    for t_j in sell_times:
        integral_1 += (a11 / b11) * (1.0 - np.exp(-b11 * (T - t_j)))
    log_likelihood = (ll_0 - integral_0) + (ll_1 - integral_1)
    return -log_likelihood  # We minimize the negative log-likelihood
def calibrate_hawkes_2d(events):
    """
    Calibrate Hawkes parameters via maximum likelihood (scipy minimize)
    for the 2D exponential kernel.
    :param events: [buy_times, sell_times].
    :return: (mu, alpha, beta) for dimension=2.
    """
    # Flattened initial guess: [mu0, mu1, a00, a01, a10, a11, b00, b01, b10, b11]
    init_guess = np.array([0.5, 0.5,
                           0.2, 0.1,
                           0.1, 0.2,
                           1.0, 1.0,
                           1.0, 1.0])
    bounds = [(1e-6, None)] * len(init_guess)  # positivity constraints
    res = minimize(fun=full_log_likelihood,
                   x0=init_guess,
                   args=(events,),
                   bounds=bounds,
                   method="L-BFGS-B")
    if not res.success:
        print("Calibration did not converge:", res.message)
    # Unpack
    mu0, mu1, a00, a01, a10, a11, b00, b01, b10, b11 = res.x
    mu = np.array([mu0, mu1])
    alpha = np.array([[a00, a01],
                      [a10, a11]])
    beta = np.array([[b00, b01],
                     [b10, b11]])
    return mu, alpha, beta
def realtime_quote_engine(current_time, position, mu, alpha, beta, events):
    """
    Simple real-time quoting engine based on intensities from the Hawkes process.
    If the intensity of buys is high, assume the market might move up, so keep
    a wider spread on the sell side. If the intensity of sells is high, do the
    opposite.
    :param current_time: Current time in the simulation/market.
    :param position: Current inventory position.
    :param mu: Baseline intensity array.
    :param alpha: 2x2 cross-effects array.
    :param beta: 2x2 decay array.
    :param events: [buy_times, sell_times].
    :return: new_quotes, new_position
    """
    intensities = hawkes_intensity_2d(current_time, mu, alpha, beta, events)
    buy_intensity, sell_intensity = intensities
    # Example logic: if buy_intensity > sell_intensity, we anticipate upward
    # pressure, so we keep a safe distance on the ask side. Conversely, if
    # sell_intensity is bigger, we keep a safe distance on the bid side.
    base_spread = 0.01     # base tick or spread
    dynamic_spread = 0.01  # additional dynamic spread
    if buy_intensity > sell_intensity:
        # Shift the ask up
        bid_price = 100.0  # example reference price
        ask_price = 100.0 + base_spread + dynamic_spread
    else:
        # Shift the bid down
        bid_price = 100.0 - base_spread - dynamic_spread
        ask_price = 100.0
    # A trivial fill simulation: if the intensities are large, we simulate partial fills
    fill_size = int((buy_intensity + sell_intensity) * 0.01)  # a small fraction
    # If buy_intensity is high, more buy fills on the ask side => reduce position.
    # If sell_intensity is high, more sell fills on the bid side => increase position.
    # This is an oversimplified fill logic for demonstration.
    position_delta = (fill_size * (sell_intensity - buy_intensity)
                      / max(buy_intensity + sell_intensity, 1e-6))
    new_position = position + position_delta
    # Return updated quotes (this is just a conceptual example)
    new_quotes = {'bid_price': bid_price, 'ask_price': ask_price}
    return new_quotes, new_position
def main():
    # Step 1: Simulate synthetic events for demonstration
    true_mu = np.array([0.4, 0.3])
    true_alpha = np.array([[0.25, 0.10],
                           [0.05, 0.20]])
    true_beta = np.array([[1.5, 1.5],
                          [1.5, 1.5]])
    events = simulate_hawkes_2d(true_mu, true_alpha, true_beta,
                                horizon=50.0, seed=42)
    buy_times, sell_times = events
    print("Simulated buy events:", buy_times[:10], "...")
    print("Simulated sell events:", sell_times[:10], "...")
    # Step 2: Calibrate the model
    mu_est, alpha_est, beta_est = calibrate_hawkes_2d(events)
    print("Estimated mu:", mu_est)
    print("Estimated alpha:\n", alpha_est)
    print("Estimated beta:\n", beta_est)
    # Step 3: Real-time quoting over a small discrete time loop.
    # We jump in small increments and update quotes.
    times = np.linspace(0, 50, 20)  # 20 discrete moments
    position = 0.0  # inventory
    for t in times:
        new_quotes, position = realtime_quote_engine(
            current_time=t,
            position=position,
            mu=mu_est,
            alpha=alpha_est,
            beta=beta_est,
            events=events
        )
        print(f"Time={t:.2f}, Quotes={new_quotes}, Position={position:.2f}")

if __name__ == "__main__":
    main()
Below is a concise summary of the code sections:
• simulate_hawkes_2d: Generates synthetic correlated ar-
rival streams (buys and sells) via a thinning algorithm for a
2D exponential Hawkes process.
• hawkes_intensity_2d: Computes the instantaneous in-
tensities for each dimension (buy/sell) at any given time t, us-
ing the exponential kernel formula and recorded event times.
• full_log_likelihood: Defines the negative log-likelihood of
the 2D exponential Hawkes model, which is used for param-
eter estimation.
• calibrate_hawkes_2d: Employs a numerical optimizer to
fit the Hawkes parameters (mu, alpha, beta) by maximizing the
log-likelihood on historical event data.
• realtime_quote_engine: Demonstrates a basic inventory-
aware quoting approach that adjusts spreads dynamically
based on the relative intensities of buy and sell arrivals.
• main: Orchestrates the flow—simulating synthetic data, cal-
ibrating parameters, and running a simple time-stepped sce-
nario in which quotes and inventory are updated in response
to predicted intensities.
By integrating the intensity forecasts from the 2D Hawkes model
into a quoting engine, a market-making strategy can proactively
manage inventory and adjust its spreads in anticipation of corre-
lated order flow “bursts.” This forms the basis for more sophisti-
cated real-time trading architectures leveraging Hawkes processes
for short-term order flow predictions.
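To make the "forecasting intensities in real time" step explicit, one simple approximation is to treat the current intensity as locally constant over a short horizon Delta, so the expected number of arrivals in (t, t + Delta] is roughly lambda_i(t) * Delta. The helper below is a minimal sketch in the spirit of the listing above (the function name and usage are illustrative, not part of the original code):

def forecast_expected_arrivals(t, delta, mu, alpha, beta, events):
    """Approximate expected buy/sell counts over (t, t + delta] by freezing
    the current intensity -- a crude but fast short-horizon forecast."""
    lam = hawkes_intensity_2d(t, mu, alpha, beta, events)
    return {'expected_buys': lam[0] * delta, 'expected_sells': lam[1] * delta}

# Hypothetical usage after calibration:
# forecast = forecast_expected_arrivals(t=25.0, delta=1.0,
#                                       mu=mu_est, alpha=alpha_est,
#                                       beta=beta_est, events=events)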
Chapter 33
Cross-Exchange
Liquidity Aggregator
Below is a comprehensive explanation and Python code snippet
for a cross-exchange liquidity aggregator, which monitors multiple
exchanges, aggregates order book data, detects arbitrage oppor-
tunities, and places trades while respecting inventory constraints.
This example uses threading to simulate parallel data collection
and order management, and it implements simplified functions for
interacting with exchanges, scanning for price discrepancies, and
risk management. In real-world usage, these placeholders would
be replaced with production-ready APIs, robust data handling, and
advanced risk monitoring.
Python Code
import time
import threading
import random
from collections import deque
class ExchangeClient:
"""
Simulates or connects to an exchange to fetch order book data
and execute trades. In real systems, this would wrap actual
REST/WebSocket APIs.
"""
    def __init__(self, name, max_position=10):
        self.name = name
        self.order_book = None  # Will store the order book as a dict
        self.latency = random.uniform(0.05, 0.15)  # Simulated network latency
        self.position = 0
        self.max_position = max_position
        self.running = True

    def fetch_order_book(self):
        """
        Simulates fetching an order book from the exchange.
        For demonstration, the best bid and ask are random.
        """
        # Simulate fetching data with possible network delay
        time.sleep(self.latency)
        best_bid = round(random.uniform(99.0, 101.0), 2)
        best_ask = best_bid + round(random.uniform(0.1, 0.5), 2)
        self.order_book = {
            'best_bid': best_bid,
            'best_ask': best_ask,
            'timestamp': time.time()
        }

    def get_order_book(self):
        """
        Returns the latest order book data.
        """
        return self.order_book

    def place_order(self, side, quantity, price):
        """
        Placeholder for placing an order on this exchange.
        This function simply prints out the action and updates the position.
        """
        time.sleep(self.latency)  # simulate order placement latency
        if side.lower() == 'buy':
            self.position += quantity
            print(f"{self.name} BUY: Qty={quantity} @ Price={price} "
                  f"| New Position={self.position}")
        elif side.lower() == 'sell':
            self.position -= quantity
            print(f"{self.name} SELL: Qty={quantity} @ Price={price} "
                  f"| New Position={self.position}")
        return True

    def run(self):
        """
        Continuously fetch updates for the order book while running.
        """
        while self.running:
            self.fetch_order_book()

    def stop(self):
        """
        Signal to stop the client.
        """
        self.running = False
class CrossExchangeAggregator:
    """
    Aggregates data from multiple exchanges and monitors
    for cross-exchange discrepancies.
    """
    def __init__(self, exchange_clients, risk_limit=20):
        self.exchange_clients = exchange_clients
        self.aggregated_data = {}
        # Re-entrant lock: detect_arbitrage holds it while placing trades,
        # which also acquire the lock to update the trade log.
        self.lock = threading.RLock()
        self.risk_limit = risk_limit
        self.global_position = 0
        self.trade_log = deque(maxlen=1000)  # track recent trades
        self.running = True

    def update_aggregated_data(self):
        """
        Gathers order book data from each exchange
        and stores it in a local structure.
        """
        with self.lock:
            for client in self.exchange_clients:
                ob = client.get_order_book()
                if ob is not None:
                    self.aggregated_data[client.name] = ob
    def detect_arbitrage(self):
        """
        Simple arbitrage detection:
        Compare the lowest ask vs the highest bid across exchanges.
        If an opportunity exists, place trades.
        """
        with self.lock:
            if not self.aggregated_data:
                return
            best_bid = None
            best_bid_exchange = None
            best_ask = None
            best_ask_exchange = None
            # Find the highest bid and lowest ask among all exchanges
            for name, data in self.aggregated_data.items():
                bid = data['best_bid']
                ask = data['best_ask']
                if best_bid is None or bid > best_bid:
                    best_bid = bid
                    best_bid_exchange = name
                if best_ask is None or ask < best_ask:
                    best_ask = ask
                    best_ask_exchange = name
            # Check if there is an arbitrage spread
            if best_bid is not None and best_ask is not None:
                if best_bid > best_ask:
                    spread = best_bid - best_ask
                    print(f"[Arb Detected] Buy on {best_ask_exchange} at {best_ask}, "
                          f"Sell on {best_bid_exchange} at {best_bid}, "
                          f"Spread={spread:.2f}")
                    # Attempt to place trades if it doesn't violate risk
                    self.place_arbitrage_trades(best_bid_exchange,
                                                best_ask_exchange,
                                                best_bid,
                                                best_ask)
    def place_arbitrage_trades(self, bid_exch_name, ask_exch_name,
                               bid_price, ask_price):
        """
        Places offsetting trades on two exchanges to capture the spread,
        respecting position limits.
        """
        quantity = 1  # Simplified fixed quantity for demonstration
        # Check risk constraints
        if abs(self.global_position + quantity) > self.risk_limit:
            print("Risk limit reached; skipping arbitrage trades.")
            return
        bid_exchange = None
        ask_exchange = None
        for client in self.exchange_clients:
            if client.name == bid_exch_name:
                bid_exchange = client
            elif client.name == ask_exch_name:
                ask_exchange = client
        if bid_exchange and ask_exchange:
            # Buy at ask_exchange
            success_buy = ask_exchange.place_order('buy', quantity, ask_price)
            # Sell at bid_exchange
            success_sell = bid_exchange.place_order('sell', quantity, bid_price)
            if success_buy and success_sell:
                with self.lock:
                    # Update the aggregator's global position: a buy and a sell
                    # of equal size net out to roughly flat (real fill slippage
                    # is ignored here)
                    self.global_position += quantity  # net buy
                    self.global_position -= quantity  # net sell
                    self.trade_log.append((time.time(), bid_exch_name,
                                           "SELL", quantity, bid_price))
                    self.trade_log.append((time.time(), ask_exch_name,
                                           "BUY", quantity, ask_price))
            else:
                print("Trade execution error. Could not complete both sides.")
        else:
            print("Exchange references not found for arbitrage trades.")
    def monitor_inventory(self):
        """
        Continuously check inventory across all exchanges to ensure
        global exposures stay within risk boundaries.
        """
        total_pos = 0
        for client in self.exchange_clients:
            total_pos += client.position
        # For demonstration, the aggregator also maintains a local global
        # position to unify risk checks; a real system might use a more
        # sophisticated aggregator-based netting approach.
        with self.lock:
            self.global_position = total_pos
        if abs(self.global_position) > self.risk_limit:
            print(f"Warning: Global Position {self.global_position} "
                  f"exceeds risk limit {self.risk_limit}")
            # Potentially reduce positions or widen quotes
    def run(self):
        """
        Main loop that updates aggregated data and checks for arbitrage.
        """
        while self.running:
            self.update_aggregated_data()
            self.detect_arbitrage()
            self.monitor_inventory()
            time.sleep(0.5)

    def stop(self):
        """
        Stops the aggregator logic.
        """
        self.running = False
def main():
    """
    Sets up multiple exchange clients and the aggregator, and
    orchestrates each thread. In a real system, error-handling,
    advanced scheduling, and robust data ingestion will be required.
    """
    # Create simulated exchange clients
    exchange1 = ExchangeClient("ExchangeA", max_position=10)
    exchange2 = ExchangeClient("ExchangeB", max_position=10)
    exchange3 = ExchangeClient("ExchangeC", max_position=10)
    # Start exchange threads
    t_exch1 = threading.Thread(target=exchange1.run, daemon=True)
    t_exch2 = threading.Thread(target=exchange2.run, daemon=True)
    t_exch3 = threading.Thread(target=exchange3.run, daemon=True)
    t_exch1.start()
    t_exch2.start()
    t_exch3.start()
    # Initialize the aggregator
    aggregator = CrossExchangeAggregator([exchange1, exchange2, exchange3],
                                         risk_limit=5)
    t_agg = threading.Thread(target=aggregator.run, daemon=True)
    t_agg.start()
    try:
        # Let the aggregator run for a while
        time.sleep(10)
    finally:
        # Stop systems
        aggregator.stop()
        exchange1.stop()
        exchange2.stop()
        exchange3.stop()
        t_exch1.join()
        t_exch2.join()
        t_exch3.join()
        # Graceful shutdown of the aggregator
        t_agg.join()
    print("All threads stopped. Final aggregator data:")
    print(aggregator.aggregated_data)
    print(f"Global Position: {aggregator.global_position}")
    print("Trade log:", list(aggregator.trade_log))

if __name__ == '__main__':
    main()
Below is a brief explanation of the main components:
• ExchangeClient: Represents an individual exchange con-
nection. It simulates fetching an order book and placing or-
ders, and keeps track of local inventory.
• CrossExchangeAggregator: Gathers order book data from
all connected exchanges, detects arbitrage by searching for
spread opportunities, and manages a global position to en-
force risk limits.
• place_arbitrage_trades: Demonstrates how spread-based
cross-exchange orders might be placed. In a production sys-
tem, robust error handling, partial-fill logic, and advanced
order sizing would be added.
• monitor_inventory: Ensures that the aggregator’s global
position remains within a specified risk limit, thereby pre-
venting uncontrolled inventory exposures as orders fill unpre-
dictably across multiple venues.
• main: Orchestrates the startup and synchronization of mul-
tiple exchange threads and the aggregator, illustrating how
this system might run in a real-time environment.
This code is a simplified demonstration and would need signifi-
cantly more robust handling for real-world trading, including pro-
duction APIs, concurrency/cancellation logic, advanced risk man-
agement, and fault tolerance.
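One practical refinement implied by the caveats above is checking that a detected spread survives trading costs before firing. The following is a minimal, illustrative sketch only; the fee level and slippage buffer are assumed values, and the helper is not part of the original listing.

def net_arbitrage_edge(best_bid, best_ask, taker_fee_bps=10, slippage_buffer=0.02):
    """Return the spread remaining after estimated fees and a slippage buffer.
    A trade is only worthwhile when this value is positive."""
    gross_spread = best_bid - best_ask
    # Approximate round-trip taker fees as basis points of the traded notional
    fees = (best_bid + best_ask) * (taker_fee_bps / 10000.0)
    return gross_spread - fees - slippage_buffer

# Hypothetical use inside detect_arbitrage, before placing trades:
# if net_arbitrage_edge(best_bid, best_ask) > 0:
#     self.place_arbitrage_trades(best_bid_exchange, best_ask_exchange,
#                                 best_bid, best_ask)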