Markowitz Portfolio Optimizer

Wiki Wiki Web

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow

Aurélien Géron, 2019

corr_matrix = df.corr(); corr_matrix["regressand"] shows how strongly each feature correlates linearly with the target, i.e. which features might be of interest
scatter_matrix helps to see which features have predictive power without necessarily being linearly correlated with the target
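
A minimal sketch of the point above, on made-up data (the column names "linear" and "symmetric" and all values are assumptions for illustration): "symmetric" fully determines the regressand (regressand = symmetric squared), yet its linear correlation is zero, which is the kind of relationship a scatter plot reveals and corr() misses.

```python
import pandas as pd

# Toy data, invented for illustration only.
df = pd.DataFrame({
    "linear": [9.0, 3.0, 1.0, 3.0, 9.0],         # 2 * regressand + 1
    "symmetric": [-2.0, -1.0, 0.0, 1.0, 2.0],    # regressand = symmetric ** 2
    "regressand": [4.0, 1.0, 0.0, 1.0, 4.0],
})

corr_matrix = df.corr()  # pairwise Pearson correlation
target_corr = corr_matrix["regressand"].sort_values(ascending=False)
print(target_corr)
```

pandas.plotting.scatter_matrix(df) would plot every pair of columns against each other; it needs matplotlib, so it is left out of the sketch.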
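missing values: the feature can be removed, NAs can be replaced with the median, or rows with NAs can be dropped
SimpleImputer(strategy="median") does the median replacement; pandas equivalents: df.dropna(subset=["feature"]), df.drop("feature", axis=1), median = df["feature"].median() followed by df["feature"].fillna(median)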
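
The three options can be sketched side by side as follows. The toy frame and the column name "other" are assumptions; "feature" is the placeholder name used in the notes.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy data with one missing value, invented for illustration.
df = pd.DataFrame({
    "feature": [1.0, np.nan, 3.0, 8.0],
    "other": [10.0, 20.0, 30.0, 40.0],
})

dropped_rows = df.dropna(subset=["feature"])  # option 1: drop rows with NA
dropped_col = df.drop("feature", axis=1)      # option 2: drop the feature
median = df["feature"].median()               # option 3: fill with the median
filled = df["feature"].fillna(median)

# SimpleImputer learns the median of every column in fit() and can reuse
# it later on new data via transform().
imputer = SimpleImputer(strategy="median")
imputed = pd.DataFrame(imputer.fit_transform(df),
                       columns=df.columns, index=df.index)
```

None of these calls modifies df in place; each returns a new object.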
renormalisations (ratios of existing columns) such as house/households or bedroom/room can help uncover better features
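
A small sketch of such ratio features. The column names (total_rooms, total_bedrooms, households) and the values are hypothetical stand-ins for the ratios noted above, not taken from a real dataset.

```python
import pandas as pd

# Hypothetical columns and values, for illustration only.
df = pd.DataFrame({
    "total_rooms": [100.0, 200.0],
    "total_bedrooms": [20.0, 50.0],
    "households": [25.0, 40.0],
})

# Ratios are often more informative than the raw totals.
df["rooms_per_household"] = df["total_rooms"] / df["households"]
df["bedrooms_per_room"] = df["total_bedrooms"] / df["total_rooms"]
print(df[["rooms_per_household", "bedrooms_per_room"]])
```

Rerunning df.corr() after adding the ratios is one way to check whether they relate to the target more strongly than the raw columns do.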
string data needs to be converted to ordinals (OrdinalEncoder) or to a one-hot bool array (OneHotEncoder); the latter is expensive in space, use sparingly
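
Both encoders on a made-up categorical column (the category names are assumptions). By default OneHotEncoder returns a SciPy sparse matrix, which stores only the non-zero entries and so softens the space cost until it is converted to a dense array.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# One categorical column, four samples; values invented for illustration.
categories = np.array([["inland"], ["coast"], ["inland"], ["island"]])

# Categories are sorted, so coast -> 0, inland -> 1, island -> 2.
ordinal = OrdinalEncoder().fit_transform(categories)

one_hot = OneHotEncoder().fit_transform(categories)  # sparse by default
dense = one_hot.toarray()  # one column per category, one 1 per row
print(ordinal.ravel())
print(dense)
```

Ordinal codes imply an order between categories, so they are the better fit when the categories really are ordered; one-hot encoding avoids that implied order.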
feature scaling: min-max scaling (normalisation, rescales to [0, 1]) and standardisation (subtract the mean, divide by the stdev)
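
A sketch comparing scikit-learn's MinMaxScaler and StandardScaler with the formulas written out by hand, on made-up data. StandardScaler divides by the population standard deviation, which is also NumPy's default (ddof=0).

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# One feature, five samples; values invented for illustration.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])

min_max = MinMaxScaler().fit_transform(X)        # (x - min) / (max - min)
standardised = StandardScaler().fit_transform(X)  # (x - mean) / stdev

# The same two transformations written out by hand.
manual_min_max = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
manual_standardised = (X - X.mean(axis=0)) / X.std(axis=0)
print(min_max.ravel())
print(standardised.ravel())
```

Standardisation has no fixed output range, but it is less distorted by outliers than min-max scaling, where a single extreme value squeezes all other values into a narrow band.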