A Tale of Two Backtests

3 simple tests to employ when evaluating a trading strategy, using this promising idea from 2014 as an example of how not to do it.

During the course of my working week, I research a lot of trading strategies – some of them originated by me while others are ideas from other sources. I am constantly hunting through blogs, news articles and research papers for concepts that showed promise when published that I can update and test with the benefit of today’s data.

Developing trading algorithms through data modeling techniques can lead to robust strategies that perform year in, year out for long periods of time. In my experience, however, only around 10% of these strategies turn out to be profitable in real life, and few live up to the heady expectations that our backtests set for them.

There are two reasons for this, both of which apply to practitioners of technical and fundamental analysis techniques as well as data miners. We all look for patterns in the data in front of us and intuit reasons why that pattern will repeat in the future – computers and data modeling software simply allow us to identify the best performing sets of parameters to use by running through hundreds of scenarios in minutes rather than days.

This leads to the first problem, curve fitting: tuning a model until it matches a data pattern that was profitable in the past but won’t repeat itself in the future.

Even once you have identified a real data pattern that continues to repeat itself, you face the second problem: the anomaly can simply stop working at some time in the future. This could be because the concept has been published and too many people are now using it (Greenblatt’s Magic Formula is a good example, which I have written about here), or because market dynamics alter to dampen the strategy’s ability to perform. You just need to look at the performance of strategies focusing on value factors over the last few years to see this phenomenon in action.

Various techniques exist to identify these issues, from modeling with in-sample and out-of-sample data sets to what I call “common sense” checks (i.e. is there a logical reason why the factors your strategy uses should work together?) and paper trading.

I recently came across an article from 2014 showing a very simple ETF trading strategy with great headline returns that has underperformed in real life since publication. A few tests back in 2014 would have shown us that the promised returns were unlikely to ever materialize.

As usual, I’ve used the InvestorsEdge.net platform to perform my study.

Introducing the Simple Diversified Portfolio

The strategy, authored by Investors FastTrack, is a simple rotational one: each month it selects the security with the best trailing one-month performance from a universe of three Vanguard mutual funds, the Index Trust 500 Fund (VFINX), the GNMA Fund (VFIIX) and the International Explorer Fund (VINEX).
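
The selection rule can be sketched in a few lines of Python. This is only an illustration of the idea, not the original authors' code: the function names are hypothetical, the prices are made up, and the 22-trading-day lookback is the one-month window used later in this article.

```python
def trailing_return(prices, lookback):
    """Simple return over the last `lookback` observations (oldest price first)."""
    return prices[-1] / prices[-1 - lookback] - 1.0


def pick_fund(price_history, lookback=22):
    """Return the ticker with the highest trailing return.

    price_history maps ticker -> list of daily prices, oldest first.
    """
    scores = {t: trailing_return(p, lookback) for t, p in price_history.items()}
    return max(scores, key=scores.get)


# Made-up, steadily rising prices; NOT real fund data.
history = {
    "VFINX": [100 + 0.10 * i for i in range(30)],
    "VFIIX": [100 + 0.02 * i for i in range(30)],
    "VINEX": [100 + 0.25 * i for i in range(30)],
}
choice = pick_fund(history)  # the fund to hold for the coming month
```

In a full backtest this choice would be repeated at every monthly rebalance, with the portfolio moved entirely into the selected fund.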

The original backtests were performed from 2003-13 and showed an impressive 19% annual compound return, which I broadly agree with when recreating the model (trading commissions are included in my version):

Source: InvestorsEdge.net

This strategy seems to have it all – great headline and risk-adjusted returns, low correlation and beta, high dividend yield, and a low drawdown out of a simple momentum strategy. However, performing a few simple robustness tests would have provided some clues that the strategy wasn’t going to perform as well going forward.

One method I use to test a strategy is to rerun my analysis with a series of varying parameters – the returns of a robust strategy won’t change dramatically as a result of these changes. This strategy only has one moving part, the performance of each fund over the last 22 trading days: varying this lookback period has a negligible effect on returns, so the model passes this test.
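
A parameter sweep of this kind might look like the sketch below. It is a toy backtest under loud assumptions: the prices are seeded random walks rather than fund data, trading costs are ignored, the portfolio is rebalanced every 21 trading days, and `backtest_cagr` and `random_walk` are hypothetical helpers, not part of any platform.

```python
import random


def random_walk(rng, n, drift, vol):
    """Synthetic daily price series; a stand-in for real fund prices."""
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1.0 + rng.gauss(drift, vol)))
    return prices


def backtest_cagr(prices, lookback, start=40, hold=21, days_per_year=252):
    """Rotate into the asset with the best trailing `lookback`-day return,
    hold it for `hold` trading days, and return the annualised growth rate.
    `start` is fixed so every lookback is tested over the same period."""
    n = len(next(iter(prices.values())))
    equity, t = 1.0, start
    while t + hold < n:
        best = max(prices, key=lambda k: prices[k][t] / prices[k][t - lookback])
        equity *= prices[best][t + hold] / prices[best][t]
        t += hold
    years = (t - start) / days_per_year
    return equity ** (1.0 / years) - 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable
prices = {name: random_walk(rng, 2520, 0.0003, 0.01) for name in ("A", "B", "C")}

# Re-run the same strategy with a range of lookbacks around the original 22 days.
results = {lb: backtest_cagr(prices, lb) for lb in (10, 15, 22, 30, 40)}
spread = max(results.values()) - min(results.values())
```

A small `spread` relative to the headline return suggests the result does not hinge on one lucky parameter value; a large one is a warning sign.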

Our first clue that the model is less than robust comes from changing the day on which the strategy rebalances. A simple change from the last day of the month to the first drops its annual performance by 4% and doubles the drawdown. Examining other days, it becomes apparent that the average return from the strategy across other trading days is around 14%, with a Sharpe ratio of around 1.0, so perhaps this is where we should set our expectations for future performance?
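
To run this kind of test yourself, you first need the list of rebalance dates for each choice of day. A minimal sketch, assuming you already have a sorted list of trading days (here approximated by weekdays, ignoring holidays; `rebalance_dates` is a hypothetical helper):

```python
from datetime import date, timedelta
from itertools import groupby


def rebalance_dates(trading_days, nth):
    """Pick one rebalance date per calendar month.

    nth = 0 is the first trading day of the month, nth = -1 the last,
    nth = 4 the fifth, and so on. Months with too few days are skipped.
    `trading_days` must be sorted in ascending order.
    """
    picks = []
    for _, days in groupby(trading_days, key=lambda d: (d.year, d.month)):
        days = list(days)
        if -len(days) <= nth < len(days):
            picks.append(days[nth])
    return picks


# Weekdays in Q1 2014 as a stand-in for a real trading calendar.
start = date(2014, 1, 1)
calendar = [start + timedelta(days=i) for i in range(90)]
calendar = [d for d in calendar if d.weekday() < 5]

first_days = rebalance_dates(calendar, 0)
last_days = rebalance_dates(calendar, -1)
```

Running the same backtest once per value of `nth` and comparing the results gives a more honest picture than the single best-looking rebalance day.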

Our last clue comes from performing a rolling three-year test, in which the strategy’s returns show a concerning degree of variance:

Test Period   CAGR    Sharpe Ratio
2003-5        28.9%   2.8
2004-6        19.3%   1.4
2005-7        11.3%   0.9
2006-8         6.1%   0.5
2007-9        10.4%   0.8
2008-10        9.8%   0.7
2009-11       13.4%   0.9
2010-12        9.3%   0.8
2011-13       15.6%   1.3
2012-14       11.7%   1.1
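
A table like the one above can be produced from a strategy's monthly returns along these lines. This is a sketch under assumptions, not the exact calculation behind the figures shown: the returns are randomly generated, the Sharpe ratio assumes a zero risk-free rate and is annualised from monthly data, and all function names are hypothetical.

```python
import math
import random
import statistics


def cagr(monthly_returns):
    """Compound annual growth rate from a list of monthly returns."""
    growth = math.prod(1.0 + r for r in monthly_returns)
    return growth ** (12.0 / len(monthly_returns)) - 1.0


def sharpe(monthly_returns, risk_free_monthly=0.0):
    """Annualised Sharpe ratio; the zero risk-free rate is an assumption."""
    excess = [r - risk_free_monthly for r in monthly_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(12)


def rolling_windows(returns_by_year, years=3):
    """Map (first_year, last_year) -> (CAGR, Sharpe) for each rolling window.

    returns_by_year maps a calendar year -> list of that year's monthly returns.
    """
    ordered = sorted(returns_by_year)
    out = {}
    for i in range(len(ordered) - years + 1):
        window = ordered[i:i + years]
        flat = [r for y in window for r in returns_by_year[y]]
        out[(window[0], window[-1])] = (cagr(flat), sharpe(flat))
    return out


# Randomly generated monthly returns for 2003-2014; NOT the strategy's real returns.
rng = random.Random(1)
data = {y: [rng.gauss(0.01, 0.03) for _ in range(12)] for y in range(2003, 2015)}
table = rolling_windows(data)

for (first, last), (growth, ratio) in table.items():
    print(f"{first}-{last}  CAGR {growth:6.1%}  Sharpe {ratio:4.1f}")
```

What matters is the dispersion across windows: a robust strategy should show broadly similar figures whichever three-year period you happen to pick.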

These clues would have been enough to make me wary of putting a lot of faith (let alone cash!) into this strategy. Let’s see how it actually performed.

2014-18 Results

Source: InvestorsEdge.net

You can see that if we had invested in this strategy expecting returns of around 14%, we’d have been extremely disappointed. Annual returns have dropped to 2.4%, with the Sharpe ratio falling from 1.4 to 0.6.

Looking at the breakdown of returns, the strategy saw a capital loss – total returns were only positive because of its relatively high dividend yield of 2.8%.

Source: InvestorsEdge.net

Conclusion

This was a great opportunity to test a strategy against several years of out-of-sample data. It shows that there are some simple tests you can perform on any trading strategy (data mined or not) to get some clues as to how it might perform in the future.

This study also highlights the importance of walk-forward testing (i.e. paper trading) on a strategy before using it in anger. This is a perfect example of a situation where you’d have immediately seen the returns trailing both expectations and the benchmark, which should have given pause for thought before investing in this model.

You can see further statistics, charts and analysis about this model by clicking here.

Part 2 of this series will analyze another model that we’ve found from a few years ago, with very different results.
