16 Comments

Hi! I love your posts! :-)

Two questions:

1) How do you mathematically determine the underlying probability distributions? I am not able to find this in the code. Also, did you compare the results to those obtained from Breeden & Litzenberger? Right now, I find it hard to get a feeling for how well your algorithms perform.

2) It would be interesting to see how the change in implied vols in the original trades affected the price. Did you look at the historic prices, or do you know where we could get them from?
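
For context on the Breeden & Litzenberger comparison in question 1: their result states that the risk-neutral density of the terminal price is the second derivative of the call price with respect to strike, scaled by the discount factor, f(K) = e^{rT} ∂²C/∂K². The sketch below is not the repository's code. It applies a central finite difference to synthetic Black-Scholes call prices, and every name and parameter value in it is an illustrative assumption.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, t):
    """Black-Scholes call price, used here only to generate synthetic quotes."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)


def implied_density(strikes, call_prices, rate, t):
    """Breeden-Litzenberger: f(K) = exp(r*T) * d2C/dK2 by central differences.

    Assumes evenly spaced strikes; returns (strike, density) for interior strikes.
    """
    step = strikes[1] - strikes[0]
    out = []
    for i in range(1, len(strikes) - 1):
        second_diff = call_prices[i - 1] - 2.0 * call_prices[i] + call_prices[i + 1]
        out.append((strikes[i], math.exp(rate * t) * second_diff / step ** 2))
    return out


# Hypothetical inputs: spot 100, 5% rate, 20% vol, six months to expiry.
spot, rate, vol, t = 100.0, 0.05, 0.2, 0.5
strikes = [40.0 + 0.5 * i for i in range(321)]  # 40.0 .. 200.0 in 0.5 steps
calls = [bs_call(spot, k, rate, vol, t) for k in strikes]
density = implied_density(strikes, calls, rate, t)

total_prob = sum(f for _, f in density) * 0.5  # Riemann sum with step 0.5
mode_strike = max(density, key=lambda pair: pair[1])[0]
```

With real quotes the strikes are coarser and the prices noisier, so the second difference usually needs smoothing (for example, fitting the implied-vol curve first) before the density is usable.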

Boeing might be a good one to try this on.

Good call! I just ran it a moment ago and it looks like the market is pricing in a normal distribution, with the expectation that price will be below $220 at every expiration until 02/02/2024. From there, the market is pricing in a good probability of the price being above $220 and recovering.

Btw, to find this, I ran the "implied-probability-production.py" file in the repository.

Thanks!

Hi! I just noticed that on the GitHub page containing the code for this project, there are API keys filled in for the variables in some of the scripts. I don’t know if that was intentional. Just wanted to let you know :)

Hi Troy,

So, I started leaving the API keys in there because I wanted to make the experiments as accessible as possible to readers who may not be willing/able to purchase licenses (data is expensive haha!)

But thank you for looking out! I keep things like the database credentials obfuscated to prevent accidental deletions and things, but I’ll try to keep the API keys in there (for as long as I can).

Thanks!

I love that! You’re right, I’ve been trying to follow your articles, but as a college student, it’s hard to get expensive data. Thank you for all you do, your articles are amazing and inspiring!

I'm so glad you're enjoying them, thank you for the kind words! We're only just getting started :)

Hi, thanks for another great article. I'm curious why you use "ticker_call_contracts" and not puts too?

Also, when I run the option-probability-distribution.py file, I only see 'last_quote.last_updated' printed about 9 times, and I don't get any other output. I can generate the "ticker_call_contracts", "call", and "put" dataframes when I call the respective names, but nothing else. Just wondering if the API has something to do with it? Thanks for your help.

Hi there, apologies for the delay!

So, the ticker_call_contracts line is just to get the available strikes of that ticker. Some stocks have strike increments of 2.5 (e.g., 192.5, 195), and some have increments of 1 (e.g., 192, 193). Every strike has a call and a put, so if we take the 10 appropriate strikes from the calls, they will be the same 10 strikes for the puts (i.e., the 110 strike has both a 110 call and a 110 put).
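
A minimal sketch of that strike selection, assuming "appropriate" means the strikes closest to the current share price (the source does not spell out the rule). The function name `nearest_strikes` and the sample values are hypothetical, not taken from the repository.

```python
def nearest_strikes(strikes, spot, n=10):
    """Return the n listed strikes closest to the spot price, in ascending order."""
    unique = sorted(set(strikes))
    closest = sorted(unique, key=lambda k: abs(k - spot))[:n]
    return sorted(closest)


# Hypothetical call strikes listed in 2.5 increments: 185.0 .. 205.0
call_strikes = [185 + 2.5 * i for i in range(9)]
selected = nearest_strikes(call_strikes, spot=193.1, n=4)
print(selected)
```

Because each listed strike has both a call and a put, the same selected list can be reused to look up the put contracts.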

And you likely received no output because of the date parameter defined at the start of the code. It uses snapshots of the most recent trading day, so that parameter should be set to the last trading day. Currently, it takes today's date and subtracts 2 days from it because the post was released on Sunday, so for the people running it that day, the most recent trading day was Friday, 2 days prior. If you ran it yesterday, the date would be set to Saturday which has no data.

If you're running this during the week and/or on a trading day, simply comment out the first date variable and un-comment the second one like this:

original code:

date = (datetime.today() - timedelta(days=2)).strftime('%Y-%m-%d')

# date = datetime.today().strftime('%Y-%m-%d')

weekday code:

# date = (datetime.today() - timedelta(days=2)).strftime('%Y-%m-%d')

date = datetime.today().strftime('%Y-%m-%d')
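
As an alternative to switching between the two lines by hand, here is a small sketch that steps back to the most recent weekday automatically. It is an assumption-laden helper rather than part of the original script: `last_weekday` is a hypothetical name, and it only skips Saturdays and Sundays, so market holidays would still need separate handling.

```python
from datetime import date, timedelta


def last_weekday(today=None):
    """Most recent Monday-Friday date on or before `today` (holidays not handled)."""
    day = today or date.today()
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


# Named date_str rather than `date` to avoid shadowing the imported class.
date_str = last_weekday().strftime('%Y-%m-%d')
```

Run on a Sunday or Saturday this yields the preceding Friday, and on a weekday it yields the same day.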

Hope that helped!

Thanks a lot for replying. Oddly enough, I could obtain the data when I entered "calls" or "puts" in my Jupyter notebook.

No problem, if anything else is tripping you up just let me know!

Hi, thanks again for the detailed article. I appreciate you sharing the knowledge. I just want to probe your thoughts on this a little more. I am trying to leverage this framework to see whether tracking the imbalance on an intraday basis shows any impact on short-duration price direction. Do you have any opinions, thoughts, or suggestions on adapting it for short-duration price direction?

Hi there,

Seeing if the imbalances have an effect intraday can definitely be an interesting test! 1-minute data might be noisy, but having the imbalance per hour might show some predictability.

There might also be some edge on a higher-frequency basis where, for example, if a big option trade comes in on SPY, the MM may try to hedge it through shares almost instantly (<5 seconds), etc.
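
A rough sketch of the per-hour idea, under loud assumptions: the imbalance is defined here as (call size - put size) / (call size + put size), which is one possible definition and not necessarily the one used in the article, and the trade tuples are made-up sample data rather than output from any vendor API.

```python
from collections import defaultdict
from datetime import datetime


def hourly_imbalance(trades):
    """Group (timestamp, side, size) trades by hour and compute a call/put imbalance.

    Imbalance is (call - put) / (call + put), a hypothetical definition.
    """
    totals = defaultdict(lambda: {"call": 0, "put": 0})
    for ts, side, size in trades:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        totals[hour][side] += size
    result = {}
    for hour in sorted(totals):
        call, put = totals[hour]["call"], totals[hour]["put"]
        result[hour] = (call - put) / (call + put) if call + put else 0.0
    return result


# Made-up trades for illustration only.
trades = [
    (datetime(2024, 1, 19, 9, 35), "call", 300),
    (datetime(2024, 1, 19, 9, 50), "put", 100),
    (datetime(2024, 1, 19, 10, 5), "put", 200),
]
imbalance = hourly_imbalance(trades)
```

Each hourly value could then be compared against the share price change over the following hour to check whether it carries any predictive signal.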

Another great article. Thanks!

A few questions for ya. I'm assuming we're looking for option strikes around the red line, i.e., the implied least likely strike?

If I understood the code correctly, I think I saw an API call for all strike prices for 2 days being used for the model.

Does this model only require end-of-day (historical) data, meaning no live data is needed?

How much data is being pulled from Polygon per ticker used?

I'm trying to get an idea of whether the data being pulled with the API stays within Polygon's free plan.

Is Polygon the only place to get the options data, or is it just easier to get it from Polygon? I'm considering other sources like Yahoo, Databento, OpenBB, or brokers with an API that may be free or lower cost than Polygon, since I may need data for several tickers.

Hi TJ, thanks!

The code calls Polygon’s snapshot endpoint, which only has data for the last updated timestamp (usually instant). The reason for going 2 days back is that the date we pass in needs to be a trading day, and if you're running it on a Sunday, Friday is the last one, 2 days back.

The reason I don’t think this specific experiment can extend to other data vendors is that it uses a snapshot as opposed to OHLCV/quotes. If an option wasn't traded that specific day, there are no OHLCV points for it anywhere, but a snapshot shows what the entire option chain looked like most recently, as if you were looking at it through a broker. Being able to get those most recent quotes on the otherwise unavailable small, quiet stocks is highly important, since the goal is to be the one who makes the first trades in those options. Outside of big names like SPY and QQQ, reliable option data can be tricky to get otherwise.

According to the pricing page, it’s available on the starter plan, but Polygon doesn’t charge by usage, so there are no limits. What’s important is that, whatever API you end up choosing, you pick one that you won’t outgrow: yfinance (the scraper package, not Yahoo) and OpenBB, while free, don’t have reseller agreements with OPRA and the exchanges, so they won’t be good for long-term, serious operations beyond the basic stuff.

And yes, the red line implies what the market thinks is the least likely strike (where investors are bidding the least), so if your thesis is that the share price has a higher likelihood of ending there, then you would know that you’re legitimately betting against what’s priced in. This is also true for areas that are in the far tail ends of the curve.

That makes sense. Thanks for the response.
