Notes on machine learning with Python (1)


Environment Problem

Regression

  • reference
  • dataframe.shift(num): shifts the values num rows down along the index (a negative num shifts them up); the vacated rows are filled with NaN
  • dataframe.iloc[num]: selection by position; primarily integer-position based (from 0 to length-1 of the axis)
  • dataframe.loc[label]: selection by label
  • python list: list[-1] is the last element of the list
  • python date:

    • time string: time.ctime() gives 'Mon Dec 17 21:02:55 2012'
    • datetime tuple (datetime obj): datetime.now() gives datetime.datetime(2012, 12, 17, 21, 3, 44, 139715)
    • time tuple (time obj): time.struct_time(tm_year=2008, tm_mon=11, tm_mday=10, tm_hour=17, tm_min=53, tm_sec=59, tm_wday=0, tm_yday=315, tm_isdst=-1)
    • timestamp: the number of seconds since 1 January 1970 (00:00:00 GMT)
  • python iteration: in [np.nan for _ in range(num)] the _ has no special meaning; it is a conventional name for a loop variable that is not going to be used. Example of a row padded this way: [nan, nan, nan, nan, nan, 781.29480101781269]
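
The pandas notes above can be tried on a tiny frame. This is only a sketch: the column names (price, prev, label), the dates and the prices are made up for illustration and are not from the original data set.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; dates and prices are invented for illustration.
df = pd.DataFrame(
    {"price": [10.0, 11.0, 12.0, 13.0]},
    index=pd.to_datetime(["2012-12-17", "2012-12-18", "2012-12-19", "2012-12-20"]),
)

# shift(1) moves values one row down; shift(-1) moves them one row up.
# The rows that are vacated are filled with NaN.
df["prev"] = df["price"].shift(1)
df["label"] = df["price"].shift(-1)

# iloc selects by integer position, loc selects by index label.
first_row = df.iloc[0]
by_label = df.loc[pd.Timestamp("2012-12-18")]

# The underscore is a throwaway loop variable: build three NaN cells,
# then append one real value (the value is taken from the note above).
row = [np.nan for _ in range(3)] + [781.29480101781269]

print(df)
print(row)
```

A negative shift is a common way to build a "future value" label column for regression, which is why the last row of label ends up as NaN.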
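
The four date representations listed above can be converted into one another with the standard library. A minimal sketch, assuming everything is handled in UTC; the timestamp value is chosen by hand so that it matches the time string in the note.

```python
import time
from datetime import datetime, timezone

# Hypothetical timestamp: seconds since 1970-01-01 00:00:00 UTC.
ts = 1355778175

# timestamp -> datetime object (timezone-aware, UTC)
dt = datetime.fromtimestamp(ts, tz=timezone.utc)

# datetime object -> time tuple (time.struct_time)
tt = dt.timetuple()

# time tuple -> time string, same layout as time.ctime()
s = time.asctime(tt)

# datetime object -> timestamp again
back = dt.timestamp()

print(dt)
print(tt)
print(s)
print(back)
```

Using a timezone-aware datetime keeps the round trip independent of the machine's local time zone; with naive datetimes the results depend on local settings.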

Matplotlib

import matplotlib.pyplot as plt
from matplotlib import style

style.use('ggplot')

df['Adj. Close'].plot()
df['Forecast'].plot()
plt.legend(loc=4)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()

Pickle


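import pickle

# Train once and save the classifier (uncomment on the first run):
# clf.fit(X_train, y_train)
# with open('linearregression.pickle', 'wb') as f:
#     pickle.dump(clf, f)

# Later runs: load the saved classifier instead of retraining
with open('linearregression.pickle', 'rb') as pickle_in:
    clf = pickle.load(pickle_in)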
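
The save/load cycle can be checked without a trained model. The sketch below uses a plain dict as a stand-in for the classifier and a temporary directory instead of the working directory; both are assumptions made only so the example runs on its own.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model; any picklable Python object works the same way.
model = {"coef": [0.5, 1.5], "intercept": 2.0}

# Hypothetical file location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.pickle")

# Serialize the object to disk (binary mode is required).
with open(path, "wb") as f:
    pickle.dump(model, f)

# Read it back; the result is an equal but separate object.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored)
```

Only unpickle files you created yourself or otherwise trust, since loading a pickle can execute arbitrary code.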