This post summarizes some of my research on how to measure the performance of a trading strategy.
The immediate goal is to build a collection of performance measurement tools in R for evaluating and comparing trading strategies. These tools should aid in developing systematic or quantitative trading strategies in the future.
1. Defining a Quantitative Trading Strategy and Trading Signal
Before identifying performance measurement tools, I define a simple quantitative trading strategy. The output of a quantitative trading strategy is one or more trading signals. A trading signal identifies a security and how much to go long or short that particular security.
A simple strategy involving one security would have one signal bounded between -1 and +1, where -1 means to go fully short the security and +1 means to go fully long the security. Intermediate values between -1 and +1 are possible, where 0 means take no position in the security.
The signal can be continuous (the signal can take values like +0.5 which means enter into a long position with only 50% of allocated capital) or discrete (the signal can only take values -1, 0, and +1, for example).
Once you have a time series containing the trading signal and a time series containing the return of the underlying security, you can multiply the two time series to get a time series of the return of the trading strategy.
For the purposes of this post, I will measure the performance of a simple trading strategy that consists of going fully long Chipotle Mexican Grill (CMG) equity for the first nine months of a year and going fully short for the remaining three months of a year. I label this strategy as M01 in the plots below. This trading strategy is for illustrative purposes.
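The signal-times-return calculation for M01 can be sketched as follows. This is a minimal illustration in Python, not the R code behind the plots. The helper names `m01_signal` and `strategy_returns` are hypothetical, the monthly returns are made up rather than CMG data, and the sketch assumes each signal value is already aligned with the period whose return it multiplies.

```python
from datetime import date


def m01_signal(day: date) -> float:
    """Hypothetical helper: +1 (fully long) January through September,
    -1 (fully short) October through December."""
    return 1.0 if day.month <= 9 else -1.0


def strategy_returns(signals, asset_returns):
    """Element-wise product of a signal series and a return series."""
    if len(signals) != len(asset_returns):
        raise ValueError("series must have the same length")
    return [s * r for s, r in zip(signals, asset_returns)]


# Made-up monthly returns for illustration only (not CMG data).
dates = [date(2015, m, 1) for m in (8, 9, 10, 11)]
asset_returns = [0.02, -0.01, -0.03, 0.04]

signals = [m01_signal(d) for d in dates]          # [1.0, 1.0, -1.0, -1.0]
m01_returns = strategy_returns(signals, asset_returns)
# While short, a falling price is a gain and a rising price is a loss:
# m01_returns is [0.02, -0.01, 0.03, -0.04]
```

In practice the signal for a period has to be decided before that period's return is known, otherwise the backtest looks into the future.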
2. The Equity Curve Versus Benchmark
The equity curve is the standard tool for measuring performance of a strategy and the first one that I will look at. The equity curve refers to the cumulative return of the strategy over time. Here I plot the performance of the M01 strategy against SPY, the benchmark that I selected for this strategy.
When observing equity curve plots, I look at the performance of the strategy versus the benchmark, the correlation of the strategy with the benchmark, the overall return over time, how volatile the strategy is, what the historical drawdowns are like, and where the periods of underperformance fall.
For this particular strategy, I make the following observations:
- The strategy has not consistently outperformed the benchmark, which means that the strategy may not be very good. It is important to compare the strategy to some benchmark to determine how much alpha the strategy generates.
- The strategy is highly correlated with the benchmark and for the most part looks like a more volatile version of it, which also suggests the strategy may not be very good. Good strategies have high return with low volatility and low correlation with the benchmark. The alternative to any active investment strategy is always passively tracking some index, so the active strategy needs to be a more attractive investment than the index for it to be worthwhile.
- Maximum drawdown occurs in 2008 and is roughly 75%. This is not a good number.
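A minimal Python sketch of the two quantities read off the equity curve plot, the cumulative return and the correlation with the benchmark, is shown here. It assumes simple per-period returns that are compounded. The function names and the sample numbers are illustrative and are not taken from the plots.

```python
import math


def equity_curve(returns, start=1.0):
    """Growth of `start` units of capital, compounding each simple return."""
    curve = []
    value = start
    for r in returns:
        value *= 1.0 + r
        curve.append(value)
    return curve


def correlation(xs, ys):
    """Pearson correlation of two equal-length return series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up returns: the strategy is exactly five times the benchmark,
# i.e. a more volatile version of the same thing.
strategy = [0.10, -0.20, 0.05, 0.15]
benchmark = [0.02, -0.04, 0.01, 0.03]

curve = equity_curve(strategy)
total_return = curve[-1] - 1.0          # about 0.0626
corr = correlation(strategy, benchmark)  # about 1.0
```

A correlation near 1 with larger swings is the pattern described in the second observation: the strategy behaves like a leveraged copy of the index.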
3. The Trading Signal
A plot of the trading signal and the price of the underlying security is the second plot that I look at. The purpose of this plot is to ensure that the trading signal is well behaved and to identify and investigate periods where the trading signal did not do well. This provides a starting point to investigate the logic and data behind the trading signal and eventually make improvements.
Regarding the shape of the trading signal, I recommend constructing a trading signal that has no net long or short bias if you plan on investing other people’s capital and are charging fees. Not only does this mean that you can profit on the long and short side, but it ensures that the strategy will have low correlation to the benchmark. A strategy that is highly correlated with the benchmark is useless because that means the strategy can be replicated cheaply.
As a reminder, the trading signal above is to go fully long the first nine months of a year and fully short the last three months of a year.
An alternative way to view the two plots is to map the trading signal onto the price of the underlying using a color gradient. This plot allows me to understand the signal in the context of the underlying’s closing price. It can clearly show periods where the signal was wrong.
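One way to build such a color gradient is to convert each signal value into an RGB triple and use it to color the matching point on the price line. The sketch below is in Python, and the color scheme (red for short, grey for flat, green for long) is an assumption, not necessarily the one used in the plot.

```python
def signal_to_rgb(signal: float) -> tuple:
    """Map a signal in [-1, +1] to an (r, g, b) triple with components in [0, 1].

    Assumed scheme: -1 is pure red, 0 is mid grey, +1 is pure green.
    """
    s = max(-1.0, min(1.0, signal))  # clamp out-of-range values
    grey = 0.5
    if s >= 0:
        # Blend from grey toward pure green as the signal approaches +1.
        return (grey * (1 - s), grey + (1 - grey) * s, grey * (1 - s))
    # Blend from grey toward pure red as the signal approaches -1.
    w = -s
    return (grey + (1 - grey) * w, grey * (1 - w), grey * (1 - w))


colors = [signal_to_rgb(s) for s in (-1.0, 0.0, 0.5, 1.0)]
# colors[0] is (1.0, 0.0, 0.0), colors[1] is (0.5, 0.5, 0.5),
# colors[3] is (0.0, 1.0, 0.0)
```

Most plotting libraries accept per-point or per-segment colors in this form, so the list can be passed along with the closing prices.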
4. Rolling Returns
Rolling returns are the cumulative return of the strategy over a specified window of time. Compared to looking at the equity curve, rolling return plots highlight periods of underperformance, volatility, and drawdowns over smaller periods of time.
Below I look at the three-month and one-year rolling returns. These are logical windows that provide a short-term and long-term view on how the strategy has performed.
The extreme volatility of the strategy is clear in these plots. The strategy underperformed the benchmark from 2012 to 2015. Three-month returns of -25% or worse have occurred several times in the past, and an annual return of -75% occurred in 2009.
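A rolling return can be sketched as the compounded return over each trailing window. The Python example below assumes monthly simple returns and uses made-up numbers. The function name `rolling_returns` is illustrative.

```python
def rolling_returns(returns, window):
    """Compounded return over each trailing `window` of periods.

    The first `window - 1` entries are None because a full window
    is not yet available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    for i in range(len(returns)):
        if i + 1 < window:
            out.append(None)
            continue
        growth = 1.0
        for r in returns[i + 1 - window : i + 1]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out


# Made-up monthly returns for illustration only.
monthly = [0.10, -0.10, 0.05, -0.20, 0.10]
three_month = rolling_returns(monthly, 3)
# three_month is approximately [None, None, 0.0395, -0.244, -0.076]
```

With monthly data, a window of 3 gives the short-term view and a window of 12 gives the one-year view.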
5. Drawdowns
Drawdown refers to the percentage decline in the strategy from its historical peak at each point in time. This plot focuses on the downside potential and volatility of the strategy. Drawdowns measure the pain, fear, and uncertainty that a trader would feel while the strategy is operational.
I examine the frequency of drawdowns, the size of the maximum drawdown, and the time it takes to recover from drawdowns. In this case, the maximum drawdown was around -80% and it took almost 10 years to recover from the maximum drawdown.
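The drawdown series, the maximum drawdown, and a simple recovery measure can be sketched in Python as follows. The equity values are made up, the function names are illustrative, and "recovery" is assumed here to mean the number of periods spent below a previous peak.

```python
def drawdowns(curve):
    """Percentage decline of an equity curve from its running peak.

    Values are 0.0 at a new high and negative otherwise.
    Assumes all equity values are positive.
    """
    peak = float("-inf")
    out = []
    for value in curve:
        peak = max(peak, value)
        out.append(value / peak - 1.0)
    return out


def longest_underwater(curve):
    """Largest number of consecutive periods spent below a previous peak.

    A drawdown that has not recovered yet counts up to the last observation.
    """
    peak = float("-inf")
    longest = current = 0
    for value in curve:
        if value >= peak:
            peak = value
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest


# Made-up equity values for illustration only.
curve = [100.0, 120.0, 90.0, 60.0, 100.0, 130.0, 125.0]
dd = drawdowns(curve)
max_drawdown = min(dd)                 # -0.5, the fall from 120 to 60
underwater = longest_underwater(curve)  # 3 periods below the 120 peak
```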
6. Sharpe Ratio
The Sharpe ratio measures the risk-adjusted performance of a strategy. This ratio is commonly reported by asset managers and provides an easy way to compare the performance between strategies or asset managers.
There are many subtle variations on how to calculate the Sharpe ratio, but in its most common form it is the excess return of the strategy (the return in excess of either a benchmark or the risk-free rate) divided by the standard deviation of the returns. The ratio is interpreted as the amount of excess return earned per unit of risk, with higher values being more desirable.
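One common variant of the calculation can be sketched in Python as follows. The choices here are assumptions rather than the definition used for the plots: a constant per-period risk-free rate, the sample standard deviation, and annualization by the square root of the number of periods per year.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=12):
    """Annualized Sharpe ratio: mean excess return divided by the
    sample standard deviation of excess returns, scaled by
    sqrt(periods_per_year)."""
    excess = [r - risk_free for r in returns]
    mean = statistics.mean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation (n - 1)
    if stdev == 0:
        raise ValueError("returns have zero volatility; the ratio is undefined")
    return mean / stdev * math.sqrt(periods_per_year)


# Made-up monthly returns for illustration only.
monthly = [0.01, 0.03, -0.01, 0.02, 0.00]
annualized = sharpe_ratio(monthly)                      # roughly 2.19
per_period = sharpe_ratio(monthly, periods_per_year=1)  # roughly 0.63
```

Because the variants differ in the risk-free rate, the deviation estimator, and the annualization factor, two reported Sharpe ratios are only comparable when they use the same conventions.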
Here I plot the 12-month rolling Sharpe ratio for the strategy and the benchmark. I smooth the Sharpe ratio a little bit so that the interpretation is clearer.
The interpretation is that the M01 strategy only rarely outperforms the benchmark. The period from 2012 to 2016 was a long stretch of underperformance.
An additional note regarding negative Sharpe ratio values: the ratio is hard to interpret when excess returns are negative. For a given negative excess return, lower volatility (which is good) makes the Sharpe ratio more negative, yet a more negative excess return (which is bad) also makes it more negative.
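The problem with negative values can be seen in a small numeric sketch. The Python example below uses made-up excess returns and an unannualized ratio. The series names are illustrative.

```python
import statistics


def simple_sharpe(excess_returns):
    """Unannualized ratio of mean excess return to sample standard deviation."""
    return statistics.mean(excess_returns) / statistics.stdev(excess_returns)


# Same average loss of 2% per period, very different volatility.
calm = [-0.02, -0.01, -0.03, -0.02]
wild = [0.08, -0.12, 0.10, -0.14]

# Same volatility as `calm`, but twice the average loss.
calm_bigger_loss = [-0.04, -0.03, -0.05, -0.04]

sharpe_calm = simple_sharpe(calm)                # about -2.45
sharpe_wild = simple_sharpe(wild)                # about -0.16
sharpe_bigger = simple_sharpe(calm_bigger_loss)  # about -4.90
```

The calm series loses the same amount with far less volatility, yet it scores worse than the wild one, while a larger loss at the same volatility also scores worse. The ranking therefore stops being meaningful once excess returns turn negative.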
There are a lot of other risk-adjusted measures (information ratio, Treynor ratio, Sortino ratio, Jensen’s alpha, and so on), but they largely attempt to capture the same thing. These ratios are used more to compare strategies and to communicate with others, and are of somewhat limited use to the designer of a trading strategy, so I don’t spend a lot of time thinking about the Sharpe ratio. A designer should be able to develop a better understanding of the strategy using other tools and plots.
7. Further Research
The code to generate these plots is only 100 lines of R and can be found on my GitHub. I plan to develop this performance reporting code into generalized functions later on. This should help with the simple quantitative trading systems that I plan on building.