I was just reading about this in Stalla, and my impression is that if your portfolio strategy takes on a substantial amount of non-systematic risk relative to systematic risk (say, a market-neutral strategy), then Treynor is not a good measure, because beta will be low and will understate the total risk you are taking. However, if you are well diversified and aren’t banking on non-systematic risk, Treynor is fine to use.
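To put rough numbers on that, here is a minimal sketch in Python. The two portfolios, the risk-free rate, and the function names are all made up for illustration (none of it comes from Stalla). It only shows how a low beta can flatter a market-neutral book under Treynor while Sharpe, which uses total risk, ranks it lower.

```python
def sharpe(rp, rf, sigma_p):
    """Excess return per unit of total risk (standard deviation)."""
    return (rp - rf) / sigma_p


def treynor(rp, rf, beta_p):
    """Excess return per unit of systematic risk (beta)."""
    return (rp - rf) / beta_p


# Hypothetical inputs, chosen only to illustrate the point.
rf = 0.03
diversified = {"rp": 0.10, "sigma": 0.15, "beta": 1.0}
market_neutral = {"rp": 0.07, "sigma": 0.12, "beta": 0.1}

sharpe_div = sharpe(diversified["rp"], rf, diversified["sigma"])
sharpe_mn = sharpe(market_neutral["rp"], rf, market_neutral["sigma"])
treynor_div = treynor(diversified["rp"], rf, diversified["beta"])
treynor_mn = treynor(market_neutral["rp"], rf, market_neutral["beta"])

print(f"Sharpe:  diversified={sharpe_div:.3f}  market-neutral={sharpe_mn:.3f}")
print(f"Treynor: diversified={treynor_div:.3f}  market-neutral={treynor_mn:.3f}")
```

With these assumed inputs the market-neutral portfolio has the higher Treynor ratio but the lower Sharpe ratio, because almost none of its 12% volatility shows up in beta.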
The main advantage of Treynor (thank you, Stalla) is that it is much easier to compute the beta of a portfolio (the weighted average of the betas of its components) than to compute the standard deviation of a portfolio, particularly if the portfolio has a short history. So it is simpler to compute Treynor and not bother much with Sharpe unless you have a long enough history to generate historical standard deviations.
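The weighted-average shortcut looks roughly like this in Python. The weights, betas, returns, and the helper name `portfolio_beta` are hypothetical, just to show the arithmetic.

```python
def portfolio_beta(weights, betas):
    """Portfolio beta as the weighted average of component betas."""
    if len(weights) != len(betas):
        raise ValueError("weights and betas must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * b for w, b in zip(weights, betas))


# Hypothetical three-asset portfolio.
weights = [0.5, 0.3, 0.2]
betas = [1.2, 0.8, 0.5]

beta_p = portfolio_beta(weights, betas)

# Treynor then needs only the portfolio return and the risk-free rate
# (both assumed here).
rp, rf = 0.09, 0.03
treynor_ratio = (rp - rf) / beta_p

print(f"portfolio beta = {beta_p:.2f}, Treynor = {treynor_ratio:.4f}")
```

The same shortcut does not work for standard deviation: portfolio volatility is not a weighted average of component volatilities, since it also depends on the covariances between the holdings.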
But what about this M^2 measure? As far as I can tell, it’s just Sharpe reformulated. It’s nifty to see on a chart, but why would one use it (other than because CFAI asks for it)?
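The "Sharpe reformulated" impression can be checked with a quick sketch. It assumes the usual textbook form M^2 = R_f + Sharpe x sigma_market; the three funds and all the numbers are invented for illustration.

```python
def sharpe(rp, rf, sigma_p):
    """Excess return per unit of total risk."""
    return (rp - rf) / sigma_p


def m_squared(rp, rf, sigma_p, sigma_m):
    """Return the portfolio would earn if levered or de-levered to market volatility."""
    return rf + sharpe(rp, rf, sigma_p) * sigma_m


# Hypothetical inputs: fund -> (return, standard deviation).
rf, sigma_m = 0.03, 0.15
funds = {"A": (0.10, 0.20), "B": (0.08, 0.10), "C": (0.12, 0.30)}

by_sharpe = sorted(
    funds, key=lambda k: sharpe(funds[k][0], rf, funds[k][1]), reverse=True
)
by_m2 = sorted(
    funds, key=lambda k: m_squared(funds[k][0], rf, funds[k][1], sigma_m), reverse=True
)

print("ranked by Sharpe:", by_sharpe)
print("ranked by M^2:   ", by_m2)
```

Since M^2 is just the Sharpe ratio scaled by a positive constant and shifted by R_f, the two rankings always agree. The only visible difference is the units: M^2 comes out as a percentage return that can be set next to the market's return.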