The Eveince data science team is always trying to enhance the ML pipelines behind the scenes for our self-driving funds to decide wiser and hedge smarter!
Business insights from the data team
Here is a high-level summary of what the data team has achieved since the last report. If you are a data scientist, you may prefer to skip ahead to the "deep dive".
- Our direction model (which tells us about market direction and gives us buy/sell signals) has been restructured and is now more robust. It is also gaining predictive power by watching information extracted from the market over different horizons.
- Order placement works like an agent responsible for breaking orders into a series of smaller ones and executing them at the best possible time. We have designed a better simulation for this agent so that it learns more accurately about the consequences of its actions when placing orders.
- We are now using volume indicators to give our self-driving fund a better view of market status based on trading volume. A better view leads to better investment decisions.
- In the newly deployed bet sizing, our system calculates the risk of betting on each market separately. Investors can also set the maximum loss they can afford, so that value at risk complies with their choice.
Deep dive into data team advances
Enhancing direction model
Our direction model (which is responsible for predicting market movements) has undergone a major restructuring to comply with the system team’s new design.
Algo service prime is now in the testing stage and will be deployed afterward.
We are also in the R&D phase for multi-horizon momentum to enhance our prediction power. Our current momentum model works on a fixed window and predicts whether the market is going to follow the current momentum or not. The measured momentum, however, changes with the window size and the resolution. This task aims to refine the momentum estimate by using multiple window sizes at once.
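To make the multi-window idea concrete, here is a minimal sketch, not our production model. It assumes momentum is defined as the simple return over the last `window` steps, which is only one of several possible definitions; the function names and the sample prices are hypothetical.

```python
from typing import Dict, Sequence


def momentum(prices: Sequence[float], window: int) -> float:
    """Simple return over the last `window` steps (an assumed definition)."""
    if window <= 0 or len(prices) <= window:
        raise ValueError("need a positive window and more than `window` prices")
    return prices[-1] / prices[-1 - window] - 1.0


def multi_horizon_momentum(
    prices: Sequence[float], windows: Sequence[int]
) -> Dict[int, float]:
    """Compute momentum for several window sizes at once."""
    return {w: momentum(prices, w) for w in windows}


# Hypothetical price series, oldest first.
prices = [100.0, 101.0, 103.0, 102.0, 105.0, 104.0, 108.0]
features = multi_horizon_momentum(prices, windows=(1, 3, 6))
```

Each window gives a different reading of the same series, so a downstream model can see where short and long horizons agree or disagree.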
Order placement advances
The order placement service is responsible for generating the best plan for placing an order in the market, aiming for a better volume-weighted average price (VWAP). We use reinforcement learning (RL) to simulate the order placement environment and train agents to create the best placement plans in different markets.
The latest version of our order placement model is now more stable, and several parts of the RL training pipeline have improved. After redesigning the match engine and RL environment, we have been working on the training components such as replay memory and parameter schedulers. Several evaluation metrics have also been added to the pipeline for white-box analysis.
Using Huber loss on Q-values for choosing better action plans: we now use the Huber loss for Q-value updates, which makes the policy more stable when deciding what to do next. It improves the learning agent’s behavior and results in better actions in the market.
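As an illustration of why the Huber loss stabilizes Q-value updates, here is a minimal sketch in plain Python. The one-step TD target, the `delta` threshold, and all function names are assumptions for the example, not a description of our training code.

```python
from typing import Sequence


def huber_loss(td_error: float, delta: float = 1.0) -> float:
    """Quadratic for small errors, linear for large ones."""
    abs_err = abs(td_error)
    if abs_err <= delta:
        return 0.5 * td_error ** 2
    return delta * (abs_err - 0.5 * delta)


def td_target(
    reward: float, next_q_values: Sequence[float], gamma: float, done: bool
) -> float:
    """One-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Hypothetical numbers: the current Q estimate is far from its target.
q_pred = 1.0
target = td_target(reward=0.5, next_q_values=[2.0, 3.0], gamma=0.9, done=False)
loss = huber_loss(target - q_pred)
```

Because the loss grows only linearly for large errors, a single surprising transition produces a bounded gradient instead of a very large one.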
Adding an auxiliary task to enrich the pipeline with informative embeddings: with the new direction prediction task added to the placement training model, the model learns useful embeddings from order book data and features.
Incentivizing the model to be sensitive to the market price change between decision and action: implementation shortfall (IS) is employed as the reward function. Whenever the agent takes an action in the environment, it gets a reward that teaches it about the consequences. IS is calculated as the difference between the price when the system makes an investment decision and the final price achieved.
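The IS definition above can be sketched as follows. The sign convention (a positive shortfall is bad, so the reward is its negative), the per-unit normalization, and the use of the fills' VWAP as the "final price achieved" are assumptions made for this example.

```python
from typing import Sequence, Tuple

Fill = Tuple[float, float]  # (price, quantity)


def vwap(fills: Sequence[Fill]) -> float:
    """Volume-weighted average price of the executed fills."""
    total_qty = sum(qty for _, qty in fills)
    if total_qty <= 0:
        raise ValueError("fills must have positive total quantity")
    return sum(price * qty for price, qty in fills) / total_qty


def implementation_shortfall(
    decision_price: float, fills: Sequence[Fill], side: str
) -> float:
    """Per-unit cost of executing away from the decision price."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    sign = 1.0 if side == "buy" else -1.0
    return sign * (vwap(fills) - decision_price)


def reward(decision_price: float, fills: Sequence[Fill], side: str) -> float:
    """Assumed reward: the negative of the shortfall."""
    return -implementation_shortfall(decision_price, fills, side)


# Hypothetical buy order decided at 100.0 and filled at higher prices.
fills = [(100.5, 2.0), (101.0, 2.0)]
r = reward(decision_price=100.0, fills=fills, side="buy")
```

In this example the agent paid 0.75 more per unit than the decision price, so it receives a reward of -0.75.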
Making wiser decisions with prioritized observation via replay memory: prioritized experience replay makes the model observe informative trajectories more often. This capability is now supported, with priorities based on the TD error.
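A minimal sketch of TD-error-based prioritization, under stated assumptions: a plain list instead of a sum tree, no importance-sampling correction, and hypothetical `alpha` and `eps` values. It is meant to show the sampling idea only.

```python
import random
from typing import Any, List


class PrioritizedReplay:
    """Toy replay memory that samples in proportion to |TD error|^alpha."""

    def __init__(self, capacity: int, alpha: float = 0.6, eps: float = 1e-3) -> None:
        self.capacity = capacity
        self.alpha = alpha  # how strongly priorities skew sampling
        self.eps = eps      # keeps zero-error transitions sampleable
        self.transitions: List[Any] = []
        self.priorities: List[float] = []

    def add(self, transition: Any, td_error: float) -> None:
        # Drop the oldest transition once the memory is full.
        if len(self.transitions) >= self.capacity:
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size: int, rng: random.Random) -> List[Any]:
        # Sampling with replacement, weighted by priority.
        return rng.choices(self.transitions, weights=self.priorities, k=batch_size)


memory = PrioritizedReplay(capacity=1000)
memory.add("boring transition", td_error=0.0)
memory.add("surprising transition", td_error=10.0)
batch = memory.sample(batch_size=1000, rng=random.Random(0))
```

Transitions with a large TD error dominate the batch, which is the intended effect; a full implementation would also reweight updates to correct the bias this introduces.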
Tuning schedulers to optimize learning rate: alpha, epsilon, and gamma schedulers have been optimized.
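For readers unfamiliar with parameter schedulers, here is a hedged sketch of one common form, linear annealing. Whether our schedulers are linear, and the start, end, and step values below, are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearSchedule:
    """Moves a value linearly from `start` to `end` over `steps` steps."""

    start: float
    end: float
    steps: int

    def value(self, step: int) -> float:
        # Clamp so the value stays at `end` after the schedule finishes.
        frac = min(max(step, 0), self.steps) / self.steps
        return self.start + frac * (self.end - self.start)


# Hypothetical exploration schedule: explore a lot early, little later.
epsilon = LinearSchedule(start=1.0, end=0.05, steps=10_000)
eps_early = epsilon.value(0)
eps_mid = epsilon.value(5_000)
eps_late = epsilon.value(50_000)
```

The same class could drive any scalar hyperparameter that should change over the course of training.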
Using Monte Carlo for episode rollout: to estimate the reward of each state, we now use Monte Carlo simulation (in the DQN model) to average the rewards over different upcoming states of the agent. Discounted rewards are then computed.
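The rollout idea can be sketched as below: compute discounted returns for each simulated episode, then average the returns from the starting state. The rollouts and the discount factor are hypothetical, and how rollouts are generated in our environment is not shown.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def monte_carlo_value(rollouts: Sequence[Sequence[float]], gamma: float) -> float:
    """Average discounted return from the start state over sampled rollouts."""
    if not rollouts:
        raise ValueError("need at least one rollout")
    return sum(discounted_returns(r, gamma)[0] for r in rollouts) / len(rollouts)


# Two hypothetical reward sequences starting from the same state.
rollouts = [[1.0, 0.0, 2.0], [0.0, 0.0, 4.0]]
per_step = discounted_returns(rollouts[0], gamma=0.5)
value = monte_carlo_value(rollouts, gamma=0.5)
```

Working backwards through the episode keeps the computation linear in episode length.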
We are monitoring the end-to-end pipeline with our brand new dashboard containing auxiliary tasks and evaluation metrics, as illustrated in the following figure.
Feature space upgrade: We now predict the market much faster, with an eye on volume indicators
After testing and validation, volume indicators have been added to the production algorithm. Volume-based indicators give the system a sense of market status in terms of traded volume. Traded volume can be used as a validation criterion for other technical indicators, which is a common practice in technical analysis. We added several volume-based indicators to the pipeline after backtests showed improvements in several metrics, such as the Sharpe ratio (SR). The new feature space comprises several indicator categories, leading to a diverse and predictive embedding.
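As one example of a volume-based indicator, here is a sketch of on-balance volume (OBV), a standard technical analysis indicator. This post does not say which volume indicators are in production, so the choice of OBV and the sample data are assumptions.

```python
from typing import List, Sequence


def on_balance_volume(
    closes: Sequence[float], volumes: Sequence[float]
) -> List[float]:
    """Running total that adds volume on up closes and subtracts it on down closes."""
    if len(closes) != len(volumes):
        raise ValueError("closes and volumes must have the same length")
    if not closes:
        return []
    obv = [0.0]
    for i in range(1, len(closes)):
        if closes[i] > closes[i - 1]:
            obv.append(obv[-1] + volumes[i])
        elif closes[i] < closes[i - 1]:
            obv.append(obv[-1] - volumes[i])
        else:
            obv.append(obv[-1])  # unchanged close leaves OBV flat
    return obv


# Hypothetical closing prices and traded volumes per bar.
closes = [10.0, 11.0, 10.5, 10.5, 12.0]
volumes = [100.0, 150.0, 80.0, 60.0, 200.0]
obv = on_balance_volume(closes, volumes)
```

A rising OBV alongside a rising price suggests the move is backed by volume, which is the kind of confirmation described above.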
In addition, feature calculation is now 5x faster! With the new design and feature buffering for inference calls, we can analyze the market faster than ever.
Investors can set the maximum loss they can afford, and our algorithms comply
Following our previous R&D on bet sizing and the promising results of our tests, we have deployed a new version of bet sizing that assesses the risk of each market individually, based on that market’s own behavior. Individuals can also set the maximum loss they can afford. Our algorithm sets VaR dynamically so that, with 99% confidence, this objective is met.
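To show how a maximum-loss limit can translate into a position size, here is a sketch under stated assumptions. It estimates VaR by historical simulation on one market's past returns, which may differ from the estimator we use in production; the function names, the return sample, and the capital figures are all hypothetical.

```python
import math
from typing import Sequence


def historical_var(returns: Sequence[float], confidence: float = 0.99) -> float:
    """Loss fraction not exceeded in `confidence` of the historical samples."""
    if not returns:
        raise ValueError("need at least one return")
    losses = sorted(-r for r in returns)  # ascending, losses are positive
    idx = min(len(losses) - 1, math.ceil(confidence * len(losses)) - 1)
    return max(losses[idx], 0.0)


def max_position(
    max_loss: float, capital: float, returns: Sequence[float], confidence: float = 0.99
) -> float:
    """Largest position whose VaR stays within the investor's loss limit."""
    var = historical_var(returns, confidence)
    if var == 0.0:
        return capital  # no observed loss at this confidence level
    return min(capital, max_loss / var)


# Hypothetical return history for one market: 100 periods.
returns = [-0.05] + [-0.02] * 4 + [0.01] * 95
var_99 = historical_var(returns)
position = max_position(max_loss=1_000.0, capital=100_000.0, returns=returns)
```

Because VaR is estimated per market, a more volatile market yields a larger VaR and therefore a smaller allowed position for the same loss limit.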
Parameter optimization R&D
Backtests of parameter optimization did not confirm that the pipeline generalizes. We have to confirm the algorithm's robustness across different markets and configurations and, until then, cannot move it to production.