Quantitative Investment Analysis, from the CFA Institute, is a basic textbook on models and statistics.
Regression and Time Series
- for linear regression, it recommends testing the significance of each term by checking that its t-stat is sufficiently high in absolute value
- for time series, it recommends checking the autocorrelations of the error terms: their t-stats should be low enough in absolute value that the hypothesis of zero autocorrelation cannot be rejected; alternatively, a Durbin-Watson statistic near 2 is ok.
- if serial correlation is observed, use an AR(1) or higher-order model to get rid of it.
- an AR(1) model $x_t = b_0 + b_1 x_{t-1} + e_t$ has mean-reverting level $\lambda=b_0/(1-b_1)$
- as the number of parameters is increased, the estimated parameters can differ widely depending on sample size
- when $b_1$ is 1 (in practice, when its estimate is near 1), we have a unit root process; the Augmented Dickey-Fuller test can test for this (see also Phillips-Perron).
- applying an SMA first can remove seasonality before fitting an AR model
- the ARMA(p,q) model is supposed to combine moving-average and AR terms, but its parameter estimates turn out to be unstable.
- if ARCH(1) is relevant, as exhibited by $e_t^2$ being correlated with $e_{t-1}^2$, the standard errors (and hence the t-stats) of the estimated parameters are not valid; generalized least squares (GLS) needs to be used.
- regression of a series $Y_t$ on a series $X_t$ is valid if neither has a unit root; if exactly one has a unit root, it is not valid; if both have a unit root, one needs to show that they are cointegrated by applying an ADF-type test to the residual $e_t$ of $Y_t = b_0 + b_1 X_t + e_t$.
- check the out-of-sample performance of the model
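
The diagnostic steps in this list can be sketched in a few lines of plain Python. This is a minimal illustration on simulated data, not the textbook's own procedure: the helper names (`ols_simple`, `durbin_watson`, `autocorr_t`), the simulated coefficients $b_0 = 1.0$, $b_1 = 0.6$, and the use of $1/\sqrt{T}$ as the standard error of a residual autocorrelation are all assumptions made for the example.

```python
import math
import random

def ols_simple(x, y):
    """OLS fit of y = b0 + b1*x + e. Returns (b0, b1, t-stat of b1, residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    resid_var = sum(e * e for e in resid) / (n - 2)
    t_b1 = b1 / math.sqrt(resid_var / sxx)
    return b0, b1, t_b1, resid

def durbin_watson(resid):
    """Durbin-Watson statistic: near 2 suggests no first-order serial correlation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e * e for e in resid)

def autocorr_t(resid, lag):
    """Lag-k autocorrelation of resid and its t-stat (standard error 1/sqrt(T))."""
    n = len(resid)
    mean_e = sum(resid) / n
    denom = sum((e - mean_e) ** 2 for e in resid)
    num = sum((resid[t] - mean_e) * (resid[t - lag] - mean_e) for t in range(lag, n))
    rho = num / denom
    return rho, rho * math.sqrt(n)

# Simulated AR(1) data; b0 = 1.0 and b1 = 0.6 are arbitrary illustration values.
random.seed(0)
true_b0, true_b1 = 1.0, 0.6
x = [true_b0 / (1 - true_b1)]
for _ in range(2000):
    x.append(true_b0 + true_b1 * x[-1] + random.gauss(0.0, 1.0))

# Step 1: a constant-only model leaves serially correlated errors (DW well below 2).
mean_x = sum(x) / len(x)
dw_constant_only = durbin_watson([xi - mean_x for xi in x])

# Step 2: fit AR(1) by regressing x_t on x_{t-1}; check the t-stat of b1.
b0, b1, t_b1, resid = ols_simple(x[:-1], x[1:])
mean_reverting_level = b0 / (1 - b1)

# Step 3: residual autocorrelations of the AR(1) fit should have small |t|.
resid_t_stats = [autocorr_t(resid, lag)[1] for lag in (1, 2, 3, 4)]

# Step 4: ARCH(1) check: regress e_t^2 on e_{t-1}^2 and inspect the slope's t-stat.
squared = [e * e for e in resid]
_, a1, t_a1, _ = ols_simple(squared[:-1], squared[1:])

print(f"DW, constant-only model: {dw_constant_only:.2f}")
print(f"b0={b0:.3f}  b1={b1:.3f}  t(b1)={t_b1:.1f}")
print(f"mean-reverting level: {mean_reverting_level:.3f}")
print("residual autocorrelation t-stats:", [round(t, 2) for t in resid_t_stats])
print(f"ARCH(1) slope a1={a1:.3f}  t(a1)={t_a1:.2f}")
```

The Durbin-Watson statistic is computed here on the constant-only model because it is not reliable when a lagged dependent variable is a regressor; for the AR(1) fit itself, the residual autocorrelation t-stats are used instead. Unit root and cointegration tests need their own critical values and are better taken from a statistics library than hand-rolled.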
Factor Models
- CAPM (Sharpe); its problems are summarized in Bodie, Kane, Marcus (2014)
- APT (Ross 1976): $R_t = a_0 + \sum_k b_k I_k(t) + e_t$, with $E(I_k) = \lambda_k$
- the Carhart (1997) model, an extension of the Fama French (1992) model, uses four factors: the market factor plus SMB, HML, and WML (the latter is the relative momentum of the top 30% winners versus the bottom 30% losers; WML was called PR1YR in the Carhart paper)
- we distinguish macroeconomic factor models, fundamental factor models, and statistical factor models
- a macro model can lead to a growth/inflation factor matrix of assets
- Connor (1994) compared fundamental and macroeconomic models: the macroeconomic model explained 10% of variance, whereas the fundamental one, with industry encoded as 55 one-hot (dummy) factors, explained 42% of variance
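
As an illustration of how a winners-minus-losers (WML) momentum factor like the one in the Carhart bullet can be built for a single period, here is a minimal sketch. The function name `wml_return`, the dictionary inputs, and the toy returns are assumptions for the example; the ranking window and rebalancing schedule are left to the caller and are not taken from the paper.

```python
def wml_return(past_returns, next_returns, fraction=0.3):
    """Equal-weighted winners-minus-losers return for one holding period.

    past_returns: dict ticker -> trailing return used for ranking
    next_returns: dict ticker -> return over the following holding period
    fraction:     share of stocks in each of the winner and loser portfolios
    """
    ranked = sorted(past_returns, key=past_returns.get)  # worst to best
    k = int(round(len(ranked) * fraction))
    if k == 0:
        raise ValueError("not enough stocks to form winner and loser portfolios")
    losers, winners = ranked[:k], ranked[-k:]

    def average(names):
        return sum(next_returns[name] for name in names) / len(names)

    return average(winners) - average(losers)

# Toy data, invented for illustration: ten hypothetical tickers A..J.
past = {"A": -0.30, "B": -0.20, "C": -0.10, "D": 0.00, "E": 0.05,
        "F": 0.10, "G": 0.15, "H": 0.20, "I": 0.30, "J": 0.40}
nxt = {"A": -0.02, "B": 0.00, "C": 0.01, "D": 0.01, "E": 0.00,
       "F": 0.02, "G": 0.01, "H": 0.02, "I": 0.03, "J": 0.04}

wml = wml_return(past, nxt)
print(f"WML for the period: {wml:.4f}")
```

With ten stocks and `fraction=0.3`, the losers are A, B, C and the winners are H, I, J, so the factor return is the average next-period return of H, I, J minus that of A, B, C. Repeating this each period gives a time series that can enter the factor regression as one of the $I_k(t)$.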