blackswan wrote: ↑Mon Jan 25, 2021 2:36 am
OK, I've updated my simulations, too.
https://nbviewer.jupyter.org/urls/gitla ... ETFs.ipynb
NOW with over 50 leveraged ETFs! New and improved, more accurate data!**
**(Probably. I hope.)
CSV:
https://gitlab.com/doctorj/quantitative ... line=false
The main update is empirical selection of a curve-fitting model, i.e., deciding which variables to "fudge." TL;DR: a simple additive constant is best, and throwing out bad data is even better.
Taking Uncorrelated's and siamond's suggestions, I assembled about 50 leveraged ETFs and mutual funds, fit models to the last half of each fund's history, and measured their accuracy on the first half. I also threw out any "rough start" data at the beginning of each ETF's history, determined by manually eyeballing the fit. (Wouldn't you know, this usually boils down to throwing out ProShares data before 2009 and Direxion data before 2013.)
So: model selection. I took the basic equation:
Code: Select all
lev = factor * ret - exp - (factor - 1) * borrow
And added fudge factors: a scale on the leverage factor, applied to returns only (A); an addition to the leverage factor that multiplies borrow rates (B); a scale on the borrow rate (C); and an overall additive constant (D).
Code: Select all
lev = factor * A * ret - exp - (factor + B - 1) * borrow * C + D
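For concreteness, here's a minimal Python sketch of that daily-return formula, with the fudge factors defaulting to "off." The function name is mine, and it assumes ret, borrow, exp, and D are already daily decimal rates; none of this is copied from the notebook.
Code: Select all
def sim_lev_return(ret, borrow, factor, exp, A=1.0, B=0.0, C=1.0, D=0.0):
    """Simulated daily return of a leveraged fund from the underlying index.

    ret, borrow, exp, and D are assumed to be daily decimal rates; with the
    defaults (A=1, B=0, C=1, D=0) this reduces to the basic formula above.
    """
    return factor * A * ret - exp - (factor + B - 1) * borrow * C + D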
I then took all 16 possible subsets of the fudge factors A, B, C, D and found the best fit with each subset on the most recent half of the data, for each of the ~50 leveraged funds independently. That is, for each combination of fitting a parameter or leaving it at its default (1 for scales, 0 for increments), I found the best fit on the last half of the fund's history, then measured the error on the first half. I used median relative RMSE (of the telltale) as the main figure of merit; this essentially weights each fund equally. Fitting only D (an overall additive constant) gives the best out-of-sample RMSE of .0124 and the smallest IQR spread, followed closely by the combo of A and D (factor scale and additive constant) at .0129. D is typically in the range .001 to .01 (annualized).
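To illustrate the train/test split for the single-constant model: since D enters additively, its least-squares fit is just the mean daily residual on the fitting half. This is only a sketch (the function name is mine, and it scores daily residuals rather than the telltale for brevity); it assumes aligned daily return arrays, oldest first.
Code: Select all
import numpy as np

def fit_and_score_D(actual, ret, borrow, factor, exp):
    """Fit D on the most recent half of the data, score on the earlier half."""
    base = factor * ret - exp - (factor - 1) * borrow     # formula with D = 0
    resid = actual - base                                  # daily misses
    half = len(actual) // 2
    D = resid[half:].mean()                                # least-squares fit of D on the last half
    oos_rmse = np.sqrt(((resid[:half] - D) ** 2).mean())   # out-of-sample daily RMSE on the first half
    return D, oos_rmse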
I also tried adding a factor for T-bill rates, but it didn't really help.
One nice additional property of using only a single additive constant is that it makes it easy to spot bad data from a telltale.
The single additive constant is also very near the best even when the bad data is included (RMSE within .004), so it's fairly robust. Incidentally, most leveraged funds had large misses during the coronavirus crash in March 2020, which also tweaks the fit. I did not exclude it from the overall final fit because it didn't make a huge difference. In the "future work" department, filtering individual large outlier days (large deviations from "the formula") and/or bootstrapping could provide an even more robust fit.
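One hypothetical way to do that outlier filtering (the 5-sigma threshold is arbitrary, and this reuses resid from the sketch above) would be to drop days whose miss is far from the formula before refitting D:
Code: Select all
keep = np.abs(resid - resid.mean()) < 5 * resid.std()   # drop extreme "miss" days
D_robust = resid[keep].mean()                            # refit D on the surviving days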
However, the results are already not terrible. Using the single-constant model and fitting all the data (except bad initial data), all simulated funds are virtually always within 30% of actual over the life of the fund. Most funds are within 6%. The median (absolute) CAGR difference is .003 and the largest is .014.
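For anyone who wants to reproduce that check, here's a rough sketch of the telltale and CAGR comparison. The names are mine, and reading "within 30% of actual" as the telltale staying between roughly 0.7 and 1.3 is my interpretation.
Code: Select all
import numpy as np

def telltale_and_cagr_diff(sim, actual, periods_per_year=252):
    """Telltale ratio and CAGR difference from aligned daily return series."""
    growth_sim = np.cumprod(1 + sim)       # growth of $1 in the simulated fund
    growth_act = np.cumprod(1 + actual)    # growth of $1 in the actual fund
    telltale = growth_sim / growth_act     # "within 30%" ~ stays in [0.7, 1.3]
    years = len(actual) / periods_per_year
    cagr_diff = growth_sim[-1] ** (1 / years) - growth_act[-1] ** (1 / years)
    return telltale, cagr_diff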
So, just to be explicit, the final model / formula is:
Code: Select all
lev = factor * ret - exp - (factor - 1) * borrow + D
where D is the only curve-fit parameter.
I included leveraged mutual funds in model selection because they have a good long history. They seem to be playing the same basic game under the hood as ETFs, but if there are reasons why they would not use the same leverage formula, let me know. Incidentally, I had to throw out many mutual funds because Yahoo just has complete garbage prices for them. Their charts are correct, but the historical data download is completely different.
Let me know what you think!