I'm a Survivor

For the last couple of years I’ve routinely stated how important it is to deal with survivorship bias in testing. I hope you agree that it’s a critical issue: left unaddressed, it makes it nearly impossible to run a historical test and then repeat those results in the future. The only problem is that it has historically been really hard to run tests without survivorship bias. Well, we now have the ability to do this!

First - what is Survivorship Bias?

Survivorship bias or survival bias is the logical error of concentrating on the people or things that made it past some selection process—and overlooking those that did not—typically because of their lack of visibility. This can lead to false conclusions in several different ways.

Wikipedia Definition

Target Zones

The classic illustration comes from World War II, when engineers examined bullet-riddled planes that had returned from missions. They focused on strengthening the armour in the locations that had the most bullet holes. But these planes were the ones that made it back, so hits in those areas were clearly not critical to the survival of the plane. Instead, the armour should have been added to the places where none of the survivors had been hit, since planes hit there presumably never returned, in the hope that more planes would survive the mission.

In our application, when we run a back test on equities, we often say that we want to focus our test on the members of a popular index like the S&P500, FTSE100 or ASX200. We collect together the securities that make up that index and do our test on those over the last ten years. The trouble is that members of the S&P500 today are not the same as the members ten years ago. Lehman Bros ring a bell? How can we run a test without including “LEH” in the list?

Similarly, a number of current names in the S&P500 were not included ten years ago. For example, Netflix was added to the S&P500 in 2010, so we should not consider any signals in Netflix before 2010, when it was not part of the index.

Grading Bias

Both these conditions (ignoring companies that are no longer in the index and including those that had not yet “made it”) lead to survivorship bias, which skews our tests positively.
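To make the two conditions concrete, here is a minimal Python sketch of point-in-time index membership. It is only an illustration and not how Optuma stores its data: the symbols (`AAA`, `BBB`, `CCC`), the dates, and the helper names `is_member`, `current_members` and `historical_universe` are all hypothetical.

```python
from datetime import date

# Hypothetical membership records: (symbol, date added, date removed or None).
# The symbols and dates are placeholders, not real index history.
MEMBERSHIP = [
    ("AAA", date(2000, 1, 3), None),               # member throughout
    ("BBB", date(2000, 1, 3), date(2008, 9, 16)),  # removed part-way through
    ("CCC", date(2010, 12, 20), None),             # added part-way through
]

def is_member(symbol, on_date):
    """Return True if `symbol` was in the index on `on_date`."""
    for sym, added, removed in MEMBERSHIP:
        if sym == symbol and added <= on_date and (removed is None or on_date < removed):
            return True
    return False

def current_members(today):
    """The biased universe: only the symbols that are members today."""
    return sorted(sym for sym, _, _ in MEMBERSHIP if is_member(sym, today))

def historical_universe(start, end):
    """Every symbol that was a member at any point between start and end."""
    return sorted({sym for sym, added, removed in MEMBERSHIP
                   if added <= end and (removed is None or removed > start)})
```

A test over `current_members(...)` never sees `BBB` at all, and it treats `CCC` as tradable for the whole period. Using `historical_universe(...)` fixes the first problem, and checking `is_member(...)` on the signal date fixes the second.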

To learn how to create tests using the bias-free data see this KnowledgeBase article.

Now for the results

Ok, so now that that is out of the way, let’s have a look at this in real life. Following is a simple test of a 50-period moving average crossing above a 200-period moving average run over the last ten years. Here is the script:

MA(BARS=50) CrossesAbove MA(BARS=200)

50-period moving average crossing above a 200-period moving average

Remember that in this chart the blue shaded plot is the equity from our test. There is obviously not a lot of alpha, but it shows a moderate return over the index (red line). The issue is that we have only used the current 504 stocks in our test. We need to set this up to include all the companies that were ever in the index. Not only that, but we also need to adjust our script so that Optuma only takes a signal if the company was in the index when the signal occurred.

Historical Comparison Chart

In the properties of the test I need to change the new “Membership” property from “Current” to “Historical”. Note that this will only show for Optuma Symbol Lists where we have set up the survivorship data. At this stage, all we are doing is including all the 700(ish) companies that were in the S&P 500 over the last ten years.

To make sure that we only take a signal when the company was actually in the Index, I need to update my script to add the new “IsMember()” function. Note that we don’t change the exit script since we want to exit regardless of membership in the index (although you could exit on removal from the index if you wanted to).

Here is that script:

MA(BARS=50) CrossesAbove MA(BARS=200) AND IsMember()

Using IsMember Function

Suddenly this does not look so good anymore. Our idea did not “beat” the market. Anyone who has ever tried trading a MA crossover like this knows that it’s a great strategy in theory, but the results are really hard to replicate. Finally, the tests are telling us what we already know.
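For readers who want to see the logic outside Optuma, the membership-filtered entry rule can be sketched in plain Python. This is an illustration under stated assumptions and not Optuma's implementation: the function names `sma` and `entry_signals` are hypothetical, `member_flags` stands in for whatever per-bar membership data you have, and a "cross above" is assumed to mean the fast average was at or below the slow average on the previous bar and is above it on the current bar.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            window = values[i + 1 - period : i + 1]
            out.append(sum(window) / period)
    return out

def entry_signals(closes, member_flags, fast=50, slow=200):
    """Bar indices where the fast MA crosses above the slow MA
    while the stock is flagged as an index member on that bar.
    Assumes fast < slow, so the fast MA exists wherever the slow MA does."""
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    signals = []
    for i in range(1, len(closes)):
        if slow_ma[i - 1] is None:
            continue  # not enough history for the slow average yet
        crossed = fast_ma[i - 1] <= slow_ma[i - 1] and fast_ma[i] > slow_ma[i]
        if crossed and member_flags[i]:
            signals.append(i)
    return signals
```

Passing a list of all `True` values for `member_flags` reproduces the unfiltered crossover test, so the two runs can be compared on the same price data.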

The positive side of this is for those who are working on Short or Long/Short strategies. Removing survivorship bias gives a “lift” to Short results, because the companies that dropped out of the index are now back in the test.

The main point of this is to highlight how important survivorship bias is, and to ensure that you don’t ignore it in testing. If you have ever been frustrated by your inability to repeat test results in real life, then this will help you see why that has happened.

A simple rule of thumb, for when you don’t have access to correct survivorship bias-free data, is to subtract around 3% per annum from your results. That will give you a better idea of what you can expect. Just don’t plan your trading strategy by only looking at the survivors. Make sure you properly consider the securities that didn’t make it.
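To put the 3% rule of thumb in numbers, here is a small Python sketch. The 10% backtested return and the ten-year horizon are hypothetical inputs chosen only for illustration, and `haircut_growth` is a made-up helper name.

```python
def haircut_growth(backtest_annual_return, years, haircut=0.03):
    """Growth multiple of 1 unit of capital after `years`,
    before and after subtracting the annual haircut."""
    raw = (1 + backtest_annual_return) ** years
    adjusted = (1 + backtest_annual_return - haircut) ** years
    return raw, adjusted

# Hypothetical example: a backtest showing 10% per annum over ten years.
raw, adjusted = haircut_growth(0.10, 10)
print(f"unadjusted: {raw:.2f}x, adjusted: {adjusted:.2f}x")
# prints "unadjusted: 2.59x, adjusted: 1.97x"
```

Because the haircut compounds, a seemingly small 3% per annum removes well over a third of the profit in this example.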

Considering what didn’t make it

Note: at present this only works for the US S&P500 index back to 2000 and the Australian size indices (e.g. ASX200, ASX300) back to 2012. It’s not easy to find historical changes to these indices, so please contact us if you can help!