You always leak information when you fit preprocessing on the full dataset before cross-validation. It is an easy mistake to make, and once it happens every train/test split is contaminated, because the preprocessing has already seen the rows that end up in the test folds.
This pipeline makes a lot more sense, and it is the pipeline that should be passed to cross-validation. In fact, I don't see it used in cross-validation in this mate's work, so there might be an issue in his results.
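To make the point concrete, here is a minimal sketch of cross-validating a whole pipeline, assuming scikit-learn is the library in use. The synthetic data, the scaler, and the logistic regression are placeholders I picked for illustration, not the steps from the work being reviewed.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: the real features are numeric).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Preprocessing lives inside the pipeline, so it is re-fitted on the
# training part of every fold and never sees that fold's test rows.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Pass the pipeline itself to cross-validation, not pre-scaled data.
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```

The leaky version would call `StandardScaler().fit_transform(X)` once up front and then cross-validate only the model on the scaled data.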
purpose3 = pd.read_csv("purpose3.csv")
purpose3.head()
# Build a lookup from each old purpose label to its new label.
deet = dict(zip(purpose3["previous"], purpose3["new"]))
df_bor['loan_purpose3'] = df_bor['loan_purpose1'].map(deet)
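A small self-contained sketch of the same mapping step, using made-up labels in place of purpose3.csv and df_bor (both frames below are hypothetical). It also shows one thing worth checking: `Series.map` with a dict returns NaN for any value that has no key in the dict.

```python
import pandas as pd

# Hypothetical lookup table standing in for purpose3.csv.
lookup = pd.DataFrame({
    "previous": ["car", "auto", "home"],
    "new": ["vehicle", "vehicle", "housing"],
})
mapping = dict(zip(lookup["previous"], lookup["new"]))

# Hypothetical loans frame standing in for df_bor.
loans = pd.DataFrame({"loan_purpose1": ["car", "home", "boat"]})
loans["loan_purpose3"] = loans["loan_purpose1"].map(mapping)

# Labels missing from the lookup come back as NaN, so list them.
unmapped = loans.loc[loans["loan_purpose3"].isna(), "loan_purpose1"].unique()
print(unmapped)
```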
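import datetime

import numpy as np
import pandas as pd

# Workaround for older pandas_datareader releases, which still import
# is_list_like from pandas.core.common. Must run before the import below.
pd.core.common.is_list_like = pd.api.types.is_list_like
from pandas_datareader import data, wb

# fix_yahoo_finance has since been renamed to yfinance.
import fix_yahoo_finance as yf
yf.pdr_override()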
The start date is optional; here both the start and the end of the range are set.
start = datetime.datetime(1990, 1, 1)
end = datetime.datetime(2016, 1, 1)
spx = data.get_data_yahoo('^GSPC', start, end)
vix = data.get_data_yahoo('^VIX', start, end)
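Flexible arguments: *args lets a function take any number of positional arguments, which arrive inside the function as a tuple called args.
def f(*args):
    # args is a tuple holding every positional argument passed in.
    for i in args:
        print(i)

f(1)
print("")
f(1, 2, 3, 4)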
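The same star also works in the other direction at the call site, where it unpacks a list into separate arguments. A minimal sketch, with `total` and `values` as made-up names for illustration:

```python
def total(*args):
    """Sum any number of positional arguments."""
    # Inside the function, args is a tuple of everything passed in.
    return sum(args)

values = [1, 2, 3, 4]

packed = total(1, 2, 3, 4)   # four separate arguments
unpacked = total(*values)    # the * unpacks the list into four arguments
empty = total()              # no arguments gives an empty tuple

print(packed, unpacked, empty)
```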