Normalizing the output will not affect the shape of ff, so it's generally not necessary.
Probabilistic inference from TensorFlow?
The way I calculate non-linear, non-monotonic "correlation" (dependency? association?) is very simple: I just train a non-linear model between the two variables and see how well they predict each other. You can do this for each pair of features in a dataset, then range-standardize between the highest and lowest RMSE and subtract the values from 1, so you get what looks like a typical correlation matrix.
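A rough sketch of that pairwise cross-prediction idea (the `dependency_matrix` helper and the choice of a random forest as the non-linear model are my own illustration, not a fixed recipe):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

def dependency_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """For each ordered pair (a, b), fit a non-linear model a -> b and
    record the RMSE; then range-standardize and flip so 1 = strongest
    dependency, like a correlation matrix."""
    cols = df.columns
    rmse = pd.DataFrame(0.0, index=cols, columns=cols)
    for a in cols:
        for b in cols:
            if a == b:
                continue
            model = RandomForestRegressor(n_estimators=50, random_state=0)
            model.fit(df[[a]], df[b])
            pred = model.predict(df[[a]])
            rmse.loc[a, b] = np.sqrt(mean_squared_error(df[b], pred))
    # Range-standardize between the highest and lowest RMSE, subtract from 1
    lo, hi = rmse.values.min(), rmse.values.max()
    dep = 1 - (rmse - lo) / (hi - lo)
    for c in cols:
        dep.loc[c, c] = 1.0  # a variable trivially "predicts" itself
    return dep
```

Note this is scale-sensitive (RMSE is not unit-free), and in-sample fits will flatter the scores; out-of-fold predictions would be the more honest variant.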
RFE - Closer to causality than feature importance
from sklearn.feature_selection import RFE
from lightgbm import LGBMClassifier

estimator = LGBMClassifier()
# Eliminate one feature per iteration until 8 remain
selector = RFE(estimator, n_features_to_select=8, step=1)
selector = selector.fit(X, y)
## For now I will do this
# selector.support_ is a boolean mask over the columns (ranking_ == 1),
# so this keeps exactly the selected features
X_df = X.loc[:, selector.support_]
Recursive Feature Elimination makes sense in some regard, as it might eliminate a correlated feature that would have been powerful on its own but shares its predictive power with another. In that sense it is very similar to dropping one of a pair of highly correlated features.
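A quick toy illustration of that shared-power effect (my own example, using random-forest impurity importances rather than LightGBM's): duplicating a strong feature splits its importance between the two near-copies.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
signal = rng.normal(size=(1000, 1))
noise = rng.normal(size=(1000, 1))
y = (signal[:, 0] > 0).astype(int)  # target depends only on the signal

# The signal twice (slightly perturbed copy) plus a pure-noise feature
X_dup = np.hstack([signal, signal + 0.01 * rng.normal(size=(1000, 1)), noise])

imp = RandomForestClassifier(random_state=0).fit(X_dup, y).feature_importances_
# The two correlated columns split between them the importance the signal
# would have earned alone; RFE would simply eliminate one of the pair.
```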
This is the trap you stepped into: a predictor is the prediction model, while predictor variables are the inputs to the predictor. Instead of saying predictor variables, you can just say features.
It does not matter if you overfit for feature importance, as long as you are not using it for feature selection.
To be truthful, the prediction exercise is not that interesting. What is more interesting is fitting the data to the response, i.e. training the model and looking at the feature interactions it proposes. Feature importance is only interesting to the extent that it differs from a correlation plot; in that regard, higher-dimensional interactions become visible.
Feature Importance + Selection Draft
There are just 4 worthwhile approaches for measuring feature importance.
- Backward Induction - drop a feature, retrain the model, and measure the change in performance
- Permutation Importance - randomly shuffle a feature's values and measure the change in performance
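The permutation approach can be sketched minimally like this (the synthetic dataset and choice of model are placeholders of my own, not part of the notes):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
baseline = model.score(X_test, y_test)

rng = np.random.default_rng(0)
importances = []
for j in range(X_test.shape[1]):
    X_perm = X_test.copy()
    # Shuffle one column to break its link with the target
    X_perm[:, j] = rng.permutation(X_perm[:, j])
    importances.append(baseline - model.score(X_perm, y_test))
# Larger score drop => more important feature
```

sklearn also ships this as `sklearn.inspection.permutation_importance`, which repeats the shuffle several times and averages.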