A Junior Quant's Guide to Prediction [Code Included]
The prediction game is tough, except for when it's not.
As a quant, trader, or even just a regular market participant, your job is simple: to predict things.
Sure, you also have to deal with data preparation, risk management, and a million other things, but at the end of the day, you have to make a prediction.
As modern humans, we’ve pretty much perfected the art of prediction (heart attack risk, likelihood of loan defaults, text that accurately responds to given prompts) everywhere except in finance.
Take a look at this new study, where even traders who were given inside information in advance struggled to break 50/50: ‘Crystal Ball’ Breaks as Traders Fail to Get Rich in New Study
Most market participants have resigned themselves to the idea that it all comes down to a coin flip, but of course, quantitative investing wouldn’t be a multi-billion-dollar industry if that were the case.
So, today, we’ll be doing a deep dive into the nuts and bolts of actionable, quantitative predictions, allowing you to see that it’s a much bigger game than just predicting a 50/50 up or down.
Once Upon a Time, There Was a Regression Model
We want to start simple, so we’ll start with the simplest form of model — regression.
Regression models are simple: lower/higher values of X, our input, lead to lower/higher values of Y, our output. Easy.
While these are the “trust me bro, the final value will be somewhere around this general area, probably, more likely than not…” of models, that’s not a bug, it’s a feature.
To see why, let’s quickly scrape together a regression model that we can use to predict SPX returns.
But first, we must address the most crucial core concept in predictive modeling:
Garbage In, Garbage Out.
Models are only as good as the underlying data we give them, so it’s crucial to make sure that our features (inputs) make solid fundamental sense.
So, on that thread, think about what kinds of things would plausibly be good at predicting our target, which in this case is S&P 500 returns.
Seriously, think for a second about what things might be drivers of future returns. For each variable you think of, try to iron out the rationale of why it makes sense.
One option is the VIX index:
The VIX is largely derived from how expensive out-of-the-money options on SPX get.
If investors are paying more for protection, it’s likely because something bad has happened, is happening, or will happen.
If investors are paying significantly less for protection, it’s a sign that the bad times are over, or at least perceptions of the bad times are easing.
If we want to model that simple relationship (higher expected volatility = lower expected future SPX returns), a linear model would be perfect.
So, to start, we’ll create just two features from the VIX index: the 1-day return and the daily value of the index.
The 1-day return represents how much the VIX went up that_day
The daily value represents the value of the VIX that_day
that_day is defined as the given date
Our target will be the return of the S&P 500 the next day.
Now, returns are not normally distributed, and we might screw with our model’s head if the VIX reaches 100 but the actual next-day return is only -0.03%, so we’ll convert this into a binary classification task.
So, if the next day’s return was positive, we convert that value into a 1; if it was negative (or flat), we convert it into a 0.
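To make the feature and target definitions concrete, here is a minimal, dependency-free sketch. It is not the code from the repo: the function name `build_dataset`, the plain-list inputs, and the sample prices are all assumptions made for illustration.

```python
def build_dataset(vix_closes, spx_closes):
    """Return one row per day t: (vix_1d_return, vix_level, next_day_up).

    Hypothetical helper: both inputs are lists of daily closes on the same dates.
    """
    if len(vix_closes) != len(spx_closes):
        raise ValueError("both series must cover the same dates")
    rows = []
    # Day t needs day t-1 (for the VIX return) and day t+1 (for the target).
    for t in range(1, len(vix_closes) - 1):
        vix_return = vix_closes[t] / vix_closes[t - 1] - 1.0
        vix_level = vix_closes[t]
        spx_next_return = spx_closes[t + 1] / spx_closes[t] - 1.0
        target = 1 if spx_next_return > 0 else 0  # negative or flat becomes 0
        rows.append((vix_return, vix_level, target))
    return rows


# Made-up closing prices, for illustration only.
vix = [10.0, 11.0, 9.9, 12.0]
spx = [100.0, 101.0, 100.0, 102.0]
dataset = build_dataset(vix, spx)
```

Note that the first and last dates drop out: the first has no prior VIX close to compute a return from, and the last has no next-day SPX return to label.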
Here’s a look at what this data will look like:
As you can visually inspect in this snippet, when the VIX goes up (daily return > 0), the next day return tends to be negative (0) — when it goes down (daily return < 0), the next day return tends to be positive (1).
Once we have our dataset, we’ll deploy walk-forward testing:
This is essentially going forward 1 day at a time, training the model on data only available on/before that day, then passing in that day’s data to get a prediction for the next.
As to the ideal training size for each day, we personally prefer 252 prior samples, with a max of 504.
There are 252 trading days per year, so 252 ensures your data has enough samples to establish relationships, while still being relevant enough to predict contemporary data. You don’t want to use data from 2013 to predict a 2024 outcome just because you’ll have more data points.
This helps us get a better idea of how our model would’ve performed in real-life since it is the format we would actually deploy in production.
Unlike methods common in other sciences, such as k-fold cross-validation, we won’t be predicting hundreds of values at once, since each value is a future point in time: in production, we would only have the data up to today, and we would pass in just today’s data for tomorrow’s prediction.
We’re not ruling out k-fold cross-validation, as it can still be effective in gauging model skill, but walk-forward testing keeps things as realistic as possible.
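The walk-forward loop can be sketched as a generator of index windows. This is a rough illustration, not the repo's implementation: the name `walk_forward_windows` is hypothetical, and it reads "252 prior samples, with a max of 504" as "start predicting once 252 rows exist, and never train on more than the latest 504", which is an assumption.

```python
def walk_forward_windows(n_samples, min_train=252, max_train=504):
    """Yield (train_start, train_end, test_index) for each walk-forward step.

    Training rows are [train_start, train_end); the test row is test_index.
    Only rows strictly before the test row are used for training.
    """
    for test_index in range(min_train, n_samples):
        train_start = max(0, test_index - max_train)
        yield train_start, test_index, test_index


# Example: 600 daily rows produce one prediction per day from row 252 onward.
windows = list(walk_forward_windows(600))
first_window = windows[0]
last_window = windows[-1]
```

At each step you would fit the model on `rows[train_start:train_end]` and predict on `rows[test_index]`. Because each row's target is the next day's return, every training label is already known by the close of the test day, so nothing from the future leaks into training.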
After running the model, we’ll run a few performance metrics to see how well it was able to capture this relationship.
So, let’s see how it did:
Classification Report
First, we have a classification report used to get a few key performance indicators (KPIs) for a binary classification model:
Precision: The precision value essentially answers the question of “when the model predicted a 1/0, how often did SPX actually go up/down?”
Over our sample, from 2024-01-01 to the current day (at time of writing), the precision for the 1 class was 59%, and 43% for the 0 class.
Recall: For each class (1 or 0), this essentially answers “what percent of all instances of this class did the model pick up?” or “of all times the market went up the next day, how often did our model predict that?”
Over the sample, the model accurately spotted 88% of the times when the market went up and 12% of the times when the market went down.
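For anyone who wants to see what sits behind those two numbers, here is a small sketch of precision and recall computed by hand. The function name and the toy labels are made up for illustration; a library's classification report computes the same quantities per class.

```python
def precision_recall(y_true, y_pred, positive_class):
    """Compute precision and recall for one class from paired labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive_class and t == positive_class)
    fp = sum(1 for t, p in pairs if p == positive_class and t != positive_class)
    fn = sum(1 for t, p in pairs if p != positive_class and t == positive_class)
    # Precision: of the times we predicted this class, how often were we right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the times this class actually occurred, how many did we catch?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (not the article's data): 1 = SPX up next day, 0 = down or flat.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 1, 0, 0]
up_precision, up_recall = precision_recall(y_true, y_pred, positive_class=1)
down_precision, down_recall = precision_recall(y_true, y_pred, positive_class=0)
```

A model that predicts 1 most of the time will naturally show high recall on the 1 class and low recall on the 0 class, which is the pattern in the numbers above.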
Receiver operating characteristic (ROC) curve
Next, we have the ROC curve. This curve is essentially a way of evaluating our model’s skill compared to a 50/50, random coin flip:
The area under the curve (AUC) represents the skill score, which in our case was 0.58 (see lower right-hand corner).
So, an AUC of 0.58, paired with ~57% accuracy over the sample, lets us start with the idea that our model isn’t complete garbage and is at least better than just a random coin flip.
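One way to build intuition for the AUC: it equals the probability that a randomly chosen up day received a higher predicted score than a randomly chosen down day, with ties counting as half. The sketch below computes it by brute force over all pairs; the function name and toy scores are assumptions, and a real backtest would normally lean on a library routine instead.

```python
def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (not the article's data): model scores for P(next day up).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of 4 pairs ranked correctly -> 0.75
```

An AUC of 0.5 means the scores rank up days and down days no better than chance, so 0.58 is a modest but real tilt.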
Now, remember when we mentioned how the simplicity of linear models was a feature, not a bug?
Well, to see that, let’s take a look at some of the prediction outputs generated by the model:
As demonstrated, our model did not try to be ambitious and dime-tick the ups and downs of each day. For weeks at a time, the model just predicted either blanket 1s or blanket 0s.
This might seem “bad” at first, but it makes sense:
On a de facto basis, the S&P goes up on more days than it goes down.
Take a look at any SPX/SPY options chain and create an at-the-money put credit spread. The implied probability of this spread paying off (max_loss / spread_width) will be at least 53%, meaning that even the options market prices in this de facto phenomenon.
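The max_loss / spread_width arithmetic can be sketched as follows. The credit and width are made-up numbers rather than live quotes, the function name is hypothetical, and the result is only a rough, market-implied estimate that ignores fees and the partial-loss zone between the strikes.

```python
def implied_prob_of_profit(credit, spread_width):
    """Rough implied probability that a put credit spread pays off.

    max_loss is what you lose if the spread finishes fully in the money:
    the width between strikes minus the credit collected up front.
    """
    if not 0 < credit < spread_width:
        raise ValueError("credit must be between 0 and the spread width")
    max_loss = spread_width - credit
    return max_loss / spread_width


# Hypothetical at-the-money spread: strikes 5 points apart, 2.25 credit collected.
prob = implied_prob_of_profit(credit=2.25, spread_width=5.0)
```

The logic is that a fairly priced bet has zero expected value: if you risk 2.75 to make 2.25, the market is implying you win roughly 55% of the time.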
Now, regression models are simple, but they aren’t stupid. When our VIX features increase, the model also knows when to switch gears:
So, even if a regression model won’t try to pinpoint the exact day-to-day moves, it generalizes well and captures simple relationships.
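To see the "switching gears" behavior in miniature, here is a toy sketch that assumes the model is a logistic regression on a single feature (the post only says "regression" on a binary target, so that is an assumption). The data is invented, and the hand-rolled gradient descent is there only to keep the example free of dependencies.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Invented data: VIX 1-day return (in %) and next-day SPX label (1 = up).
vix_ret = [-6.0, -4.0, -3.0, -1.0, -0.5, 0.5, 1.0, 2.0, 5.0, 8.0]
label = [1, 1, 1, 1, 1, 1, 0, 1, 0, 0]

w, b = fit_logistic(vix_ret, label)
p_calm = sigmoid(w * -2.0 + b)   # VIX fell 2%
p_spike = sigmoid(w * 10.0 + b)  # VIX jumped 10%
```

On this toy data the fitted slope is negative, so the predicted probability of an up day stays above 0.5 on quiet days and only drops below it when the VIX jumps. That is the same pattern as the long runs of 1s punctuated by switches to 0.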
If this model were traded outright, your returns would likely be about the same as holding the index, possibly less after transaction costs. Nevertheless, the job was done.
By first assembling features that made intuitive sense, converting the target into a binary label for robustness, and finally choosing the appropriate model, we were able to walk away with a solid predictor that can make money.
Now, this simple model is just a start, but if you want to possibly outperform with it, you might have to get a bit more creative:
If you have the direction of the S&P, what about the constituents or even just correlated stocks?
At that point, if you “knew” the S&P was likely to continue going up, you could go long a basket of the top 10 holdings while going short the underlying index.
This basket is likely to be highly correlated to the index, but also likely to outperform due to the higher volatility. You would flatten your exposure when the model expects markets to go down.
If your model continues to expect the S&P to go up, would that, by proxy, be a bet that the VIX is likely to go down?
Could long signals be construed as a signal to get exposed to short volatility products like SVIX?
As always, we’ll leave the code and instructions for you to replicate this on your own with even more data, with an additional file allowing you to get the predictions for the next day in real-time.
Now, in the land of predicting up/down returns, you’re really not going to see numbers much above 55%, especially for single stocks and across longer horizons. So, while we might max out at coin-flip territory when it comes to predicting returns, what if we tried to predict something a little easier?
Instead of just predicting an up/down return, we can focus more on complex, second-order relationships — “if this thing happens, it makes this other thing more likely to occur, which then makes this other thing likely to not do this”.
This is where the “edge” of having an intimate understanding of the dataset/market you’re modeling kicks in.
These second-order relationships are intuitively more complex, so while regression models have been good to us, we need a war-time model.
The Fancy Models
When you’ve got a problem too complex for a regression task, it’s time to move on to the tougher breeds of model, namely decision trees. Boosted, bagged, we love ’em all, and you’ll see why.
Decision tree models have major advantages in the quant space, but doing them justice will need a post of its own. So, that’s what we’ll do: when we next take a look at the land of predictions, we’ll go over some more complex experiments that give these models a better chance to shine.
Code
The given repository contains the logic for running the backtest on your own, with the options to change the backtest period or symbols as you wish, followed by the logic for getting market predictions in real-time.
Backtest Workflow
To begin, head over to our “prediction-models” GitHub repo and download the “spx-vix-regression.py” file:
If you run this file right out of the box, the console will output the classification report and a plot.
You can change the dates in the “trading_dates” variable to change the size of the dataset. Additionally, you can change the dates in the “backtest_dates” to see the predictions over a longer/shorter sample.
The trained model and predictions are stored in their respective variables, so we recommend using the Spyder IDE to explore them and get a more intimate understanding of what’s going on:
Production Workflow
To begin, download the “spx-vix-regression-prod.py” file.
You can run this right out-of-the-box, and it will output a string of the prediction:
Happy trading! 😄