
Portfolio Construction and Risk Management with Unequal Returns Histories

Peter Carl edited this page Mar 13, 2017 · 3 revisions

Summary

Standard quantitative portfolio analysis techniques, including mean-variance and mean-ES optimization and portfolio risk and performance analysis, implicitly require all assets under consideration to have return histories of the same length. In practice, a portfolio of interest is often built from assets with histories of different lengths, e.g., a fund of hedge funds with managers hired at different times, or a portfolio of ETFs that came into existence at different times. Truncating all returns to the length of the shortest asset history wastes information. Three methods, one of them quite new, avoid this loss by backfilling the missing returns in a way that reflects the fat-tailed and skewed non-normality of the returns. The goal of this project is to implement the three methods in an “Unequal Histories” package that: (1) facilitates use of the methods in portfolio optimization and risk analysis applications, and (2) supports the comparative study of the efficacy of the three methods.

Description

We first briefly describe the three backfilling methods. Each involves iteratively fitting multi-factor regression models of groups of short-history returns on longer-history returns.

Multiple Imputation (MI)

The “Multiple Imputation” (MI) method is a missing data method that is tailored for the monotone missing-ness that occurs in the context of a portfolio whose assets have unequal histories. The method consists of using the fitted regression models of short histories of returns on long histories to predict the missing returns of the short histories, and then filling in the missing error term by bootstrap resampling (Efron and Tibshirani, 1994) of the residuals from the fitted models. For each such backfill the portfolio quantities of interest (individual asset risk and performance measures, portfolio risk and performance measures, efficient frontier, etc.) are estimated, and then the estimates are averaged over a large number B of such backfills to obtain the final estimates. A more detailed description of MI for portfolios of asset returns is provided by Page (2013), who provides a few simple but convincing applications of MI to risk analysis. A general treatment of MI for monotone missing-ness is provided in Little and Rubin (2002).
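The steps above can be sketched in base R for the simplest case of one short-history asset and one long-history asset. This is a minimal illustration on simulated data, not the implementation of Page (2013): the variable names, the helper `mi_backfill`, the sample sizes, and the choice of volatility as the quantity of interest are all assumptions made for the example.

```r
# Minimal MI-style backfill sketch (hypothetical names, simulated data).
set.seed(42)

n_long  <- 120   # months of history for the long asset
n_short <- 48    # months of history for the short asset

long_ret  <- rnorm(n_long, mean = 0.006, sd = 0.04)
short_ret <- c(rep(NA_real_, n_long - n_short),
               0.002 + 0.8 * tail(long_ret, n_short) +
                 rnorm(n_short, sd = 0.02))

dat      <- data.frame(short = short_ret, long = long_ret)
observed <- !is.na(dat$short)

# Step 1: regress the short history on the long history over the overlap.
fit <- lm(short ~ long, data = dat[observed, ])

# Step 2: predict the missing short-history returns.
pred_missing <- predict(fit, newdata = dat[!observed, ])
res          <- residuals(fit)

# Step 3: one backfill = predictions plus bootstrapped residuals.
mi_backfill <- function() {
  filled <- dat$short
  filled[!observed] <- pred_missing +
    sample(res, size = sum(!observed), replace = TRUE)
  filled
}

# Step 4: estimate the quantity of interest on each of B backfills
# and average the B estimates.
B <- 500
vol_estimates <- replicate(B, sd(mi_backfill()))
mi_vol        <- mean(vol_estimates)

# For comparison: the estimate that uses only the truncated history.
truncated_vol <- sd(dat$short[observed])
```

With more than two history lengths, the same regression and resampling steps would be applied group by group, from the longest histories to the shortest.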

The strength of the MI method is its conceptual simplicity. However, Jiang and Martin (2016) discuss several disadvantages of the method, including the following: (1) there are no simple rules for determining how many backfills B are needed for a given portfolio problem, (2) generating a large number of backfills, e.g., B = 1,000 or 10,000, and computing an estimate (e.g., an optimized portfolio) for each backfill can be expensive in computing time, and (3) as implemented in Page (2013), no standard errors are provided for the final estimates.

Factor Model Monte Carlo (FMMC)

A second and relatively new approach is the “Factor Model Monte Carlo” (FMMC) method described in Jiang and Martin (2015). A major difference between FMMC and MI is that, while MI uses only the assets in a specified portfolio, FMMC regresses the short histories of assets in a portfolio on much longer histories of relevant factors, e.g., various indexes, thereby “borrowing strength” from those factors to improve the accuracy of the prediction step of the backfill relative to that obtainable by MI. The backfill is then completed by adding a special fixed sample of regression residuals to the predicted values (not an arbitrarily large bootstrap sample). A second difference is that FMMC computes standard errors of the risk and performance measures based on the backfill by making use of a special bootstrap method. A third difference is that FMMC is consistent with the Stambaugh (1997) maximum-likelihood method of computing means and covariances for unequal-histories portfolios, but MI is not.
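The factor-regression idea can be sketched in base R as follows. This is only an illustration on simulated data under stated assumptions: the factor names `mkt` and `credit`, the helper `es`, and all sample sizes are hypothetical. In particular, the “special fixed sample” of residuals and the special bootstrap for standard errors are not specified here, so the sketch substitutes a plain resample of the residuals with replacement, which is not necessarily what Jiang and Martin (2015) do.

```r
# FMMC-style sketch (hypothetical names, simulated data).
set.seed(1)

n_factor <- 240   # months of factor history
n_asset  <- 36    # months of asset history

factors <- data.frame(
  mkt    = rnorm(n_factor, mean = 0.005, sd = 0.04),
  credit = rnorm(n_factor, mean = 0.002, sd = 0.02)
)

# The asset is observed only over the most recent n_asset months.
overlap <- tail(factors, n_asset)
overlap$asset <- 0.001 + 0.6 * overlap$mkt + 0.3 * overlap$credit +
  rnorm(n_asset, sd = 0.015)

# Fit the factor model on the short overlap period.
fit <- lm(asset ~ mkt + credit, data = overlap)

# Predict over the full factor history, then add residuals.
# ASSUMPTION: simple resampling with replacement stands in for the
# method's fixed residual sample.
fitted_full <- predict(fit, newdata = factors)
res         <- residuals(fit)
sim_ret     <- fitted_full + sample(res, size = n_factor, replace = TRUE)

# Example risk measure: historical expected shortfall at tail level p.
es <- function(x, p = 0.05) {
  q <- quantile(x, probs = p)
  -mean(x[x <= q])
}

es_fmmc  <- es(sim_ret)         # uses the long simulated history
es_short <- es(overlap$asset)   # uses only the short observed history
```

The estimate `es_fmmc` is based on 240 simulated returns rather than 36 observed ones, which is the sense in which the asset borrows strength from the factor history.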

Combined Backfilling (CBF)

A third and very new method called “Combined Backfilling” (CBF) is described in Jiang and Martin (2016). This method is quite different from both MI and FMMC in that it creates a fixed complete data table that parsimoniously represents a highly redundant backfill of all possible combinations of predictor values and residuals from regressing shorter histories on longer histories. As such, the CBF table is a parsimonious representation of the empirical joint distribution of the predictors and regression residuals, conditional on all the observed data. The advantages of CBF include: (1) CBF removes the ambiguity of the MI method associated with getting different results for different averaging methods (such as averaging the Sharpe ratios over all MI backfills versus averaging the excess returns and volatilities and then computing the Sharpe ratio as the ratio of those two averages), (2) CBF is more computationally efficient than MI, and (3) CBF is consistent with the Stambaugh (1997) method. A method for computing standard errors for risk and performance estimators computed with the CBF method has not yet been implemented.
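To make the idea of “all possible combinations of predictor values and residuals” concrete, the following base R sketch builds such a table for one short asset and one long asset. It is a literal reading of the description above, not the algorithm of Jiang and Martin (2016): the explicit enumeration with `expand.grid`, the weighting scheme (each missing period carries total weight 1, spread evenly over the residuals), and all names are assumptions, and the paper's table is described as a more parsimonious representation than the one materialized here.

```r
# CBF-style sketch for two assets (hypothetical names, simulated data).
set.seed(7)

n_long  <- 60
n_short <- 24
n_miss  <- n_long - n_short

long_ret  <- rnorm(n_long, mean = 0.006, sd = 0.04)
long_obs  <- tail(long_ret, n_short)   # overlap period
long_miss <- head(long_ret, n_miss)    # period where the short asset is missing
short_obs <- 0.002 + 0.8 * long_obs + rnorm(n_short, sd = 0.02)

fit <- lm(short_obs ~ long_obs,
          data = data.frame(short_obs = short_obs, long_obs = long_obs))
res       <- unname(residuals(fit))
pred_miss <- unname(predict(fit, newdata = data.frame(long_obs = long_miss)))

# Every missing period paired with every residual.
combos <- expand.grid(res_idx = seq_len(n_short), t_idx = seq_len(n_miss))

cbf_missing <- data.frame(
  long   = long_miss[combos$t_idx],
  short  = pred_miss[combos$t_idx] + res[combos$res_idx],
  weight = 1 / n_short   # ASSUMPTION: each period's weight of 1 is split evenly
)
cbf_observed <- data.frame(long = long_obs, short = short_obs, weight = 1)
cbf_table    <- rbind(cbf_missing, cbf_observed)

# Estimates are computed once, from the single weighted table.
w        <- cbf_table$weight / sum(cbf_table$weight)
cbf_mean <- sum(w * cbf_table$short)
cbf_var  <- sum(w * (cbf_table$short - cbf_mean)^2)
```

Because the table is fixed, there is no resampling noise and no choice of B. In this two-asset case, with an intercept in the regression, the residuals sum to zero, so `cbf_mean` equals the fitted intercept plus the fitted slope times the full-sample mean of the long asset.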

Project Goals

The student who works on this project will accomplish the following, working closely with the mentors listed below:

1. Create an R package “Unequal Histories” that implements all three methods (MI, FMMC, CBF), with suitable input and output data structures to support separate or simultaneous use of the three methods for: (a) optimization of portfolios whose returns have unequal histories, and (b) risk and performance analysis of portfolios of assets and of optimized portfolios.

2. The package should have the feature that if a complete data set of returns is provided, along with an artificially created set of returns with unequal histories, then the output provides a convenient comparison of the MI, FMMC and CBF backfill methods with the results of using the complete data set of returns.

3. The MI method should be implemented with a method of computing standard errors for the risk and performance estimators obtained with the MI backfill.

4. Standard errors for CBF risk and performance measures can be computed with a bootstrap method similar to that used by the FMMC method, and this should be implemented for CBF.

5. For the MI and FMMC methods, an option needs to be provided to fit one of several distributions, including normal, symmetric t, and skewed t-distributions, to the regression residuals, and to replace the bootstrapped residuals with draws from the fitted distribution (as had previously been done by Eric Zivot for equal-histories time series factor models in the factorAnalytics package).

6. Create a very extensive and detailed vignette for the “Unequal Histories” package that includes a substantial number of usage examples.
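The comparison feature described in the goals needs an artificially created unequal-histories data set derived from a complete one. A minimal base R sketch of that step is shown here. The helper `make_unequal`, its arguments, and the simulated returns are hypothetical and are not part of any existing package.

```r
# Hypothetical helper: blank out the early returns of selected assets so
# that a complete return matrix acquires a monotone missing pattern.
make_unequal <- function(returns, start_rows) {
  stopifnot(is.matrix(returns), length(start_rows) == ncol(returns))
  out <- returns
  for (j in seq_len(ncol(returns))) {
    if (start_rows[j] > 1) {
      out[seq_len(start_rows[j] - 1), j] <- NA
    }
  }
  out
}

set.seed(123)
complete <- matrix(rnorm(60 * 3, mean = 0.005, sd = 0.03), ncol = 3,
                   dimnames = list(NULL, c("A", "B", "C")))

# Asset A has a full history; B starts at row 25; C starts at row 43.
unequal <- make_unequal(complete, start_rows = c(1, 25, 43))

# Baseline that the backfill methods aim to improve on: truncate every
# asset to the shortest common history.
truncated <- unequal[complete.cases(unequal), , drop = FALSE]

# Side-by-side volatility estimates; rows for MI, FMMC and CBF results
# could be added in the same way.
comparison <- rbind(
  complete  = apply(complete, 2, sd),
  truncated = apply(truncated, 2, sd)
)
```

Keeping the complete data as the benchmark row makes it possible to see how far each method's estimate is from the value that would have been obtained had no returns been missing.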

Project Milestones

Phase 1

Complete initial R unequal histories package, with commented R code implementation and testing of the CBF method, that satisfies goals 1-a, 1-b, 2, 4 above.

Phase 2

Complete next version of the R unequal histories package, with commented R code implementation and testing of the MI method that satisfies goals 1-a, 1-b, 2, 3, 5 above.

Final

Complete final version of the R unequal histories package, with commented code implementation and testing of all three methods (CBF, MI and FMMC), that satisfies all of the above goals, with a very solid package vignette.

References

Efron, B. and Tibshirani, R. J. (1994). An Introduction to the Bootstrap. CRC Press.

Jiang, Y. and Martin, R. D. (2015). “Better risk and performance estimates with factor-model Monte Carlo”, Journal of Risk, pp. 1–38.

Jiang, Y. and Martin, R. D. (2016). “Turning Long and Short Return Histories into Equal Histories: A Better Way to Backfill Returns”, https://ssrn.com/abstract=2833057.

Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, 2nd ed., John Wiley & Sons.

Page, S. (2013). “How to Combine Long and Short Return Histories Efficiently”, Financial Analysts Journal, pp. 45–52.

Stambaugh, R. F. (1997). “Analyzing investments whose histories differ in length”, Journal of Financial Economics, 45, pp. 285–331.

Skills Required

Applicants should have:

Familiarity with the factorAnalytics package.

Proficiency with R and experience in developing in R.

Knowledge of, or comfort in quickly learning, tools such as GitHub, roxygen2 and LaTeX.

An undergraduate degree in math, natural sciences or engineering.

Ability to understand and use mathematical and statistical aspects of references.

Graduate education in quantitative finance.

Test

A successful applicant will:

Discuss the proposed package functionality.

Write a development timeline for code implementation, documentation and testing.

Provide a complete code example of a function with documentation and a test package that demonstrates familiarity with R, GitHub and roxygen2.

Identify any personal commitments that would compete for their time during summer 2017.

Mentors

Doug Martin (martinrd@comcast.net)

Yindeng Jiang (yindeng@uw.edu)

Peter Carl (pcarl@gsb.uchicago.edu)

Brian Peterson (brian@braverock.com)
