Finance With Python and MPT
Nathan George
Data Science Professor
DataCamp Machine Learning for Finance in Python
Joining data
stocks = ['AMD', 'CHK', 'QQQ']
full_df = pd.concat([amd_df, chk_df, qqq_df], axis=1).dropna()
full_df.head()
Calculating returns
# calculate daily returns of stocks
returns_daily = full_df.pct_change()
print(returns_monthly.tail())
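The slide prints `returns_monthly` without showing how it is built. A minimal sketch of one way to derive daily and monthly returns follows. It assumes synthetic prices and assumes the monthly series is formed from the first available price of each month; the resampling rule `"MS"` is an illustrative choice, not necessarily the one used in the course.

```python
import numpy as np
import pandas as pd

# Synthetic daily closing prices on business days (illustrative data only)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2018-01-01", "2018-06-29")
steps = 1 + rng.normal(0, 0.01, size=(len(dates), 3))
full_df = pd.DataFrame(100 * np.cumprod(steps, axis=0),
                       index=dates, columns=["AMD", "CHK", "QQQ"])

# Daily returns: percent change between consecutive rows (first row is NaN)
returns_daily = full_df.pct_change()

# Assumption: keep the first available price in each month, then take
# the percent change from one month to the next
monthly_df = full_df.resample("MS").first()
returns_monthly = monthly_df.pct_change().dropna()

print(returns_monthly.tail())
```

Six months of prices yield five monthly returns, because the first month has no earlier month to compare against.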
Covariances
# daily covariance of stocks (for each monthly period)
covariances = {}
for i in returns_monthly.index:
    rtd_idx = returns_daily.index
    # mask daily returns for each month (and year) and calculate covariance
    mask = (rtd_idx.month == i.month) & (rtd_idx.year == i.year)
    covariances[i] = returns_daily[mask].cov()

# print the covariance matrix for the last month
print(covariances[i])
portfolio_returns.setdefault(date, []).append(returns)
portfolio_volatility.setdefault(date, []).append(volatility)
portfolio_weights.setdefault(date, []).append(weights)
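The three `setdefault` lines above store the results of a loop that is not shown on the slide. A minimal sketch of such a loop, for a single date, might look like the following. The asset returns, covariance matrix, number of portfolios, and the `date` key are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical inputs for one date: monthly returns of three assets and
# the covariance matrix of their daily returns within that month
asset_returns = np.array([0.02, -0.01, 0.015])
cov = np.array([[4.0e-4, 1.0e-4, 1.2e-4],
                [1.0e-4, 9.0e-4, 0.8e-4],
                [1.2e-4, 0.8e-4, 1.0e-4]])

portfolio_returns, portfolio_volatility, portfolio_weights = {}, {}, {}
date = "2018-06"  # placeholder key; the slides key on monthly timestamps

for _ in range(1000):
    # random long-only weights, normalized so they sum to 1
    weights = rng.random(3)
    weights /= weights.sum()
    # portfolio return is the weighted sum of the asset returns
    returns = np.dot(weights, asset_returns)
    # portfolio volatility is sqrt(w^T * Cov * w)
    volatility = np.sqrt(weights @ cov @ weights)
    portfolio_returns.setdefault(date, []).append(returns)
    portfolio_volatility.setdefault(date, []).append(volatility)
    portfolio_weights.setdefault(date, []).append(weights)

print(len(portfolio_returns[date]))
```

Using `setdefault(date, [])` creates the list the first time a date is seen, so no separate initialization per date is needed.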
# containers for the sharpe ratios and the index of the best one per date
sharpe_ratio, max_sharpe_idxs = {}, {}

# loop through dates and get sharpe ratio for each portfolio
for date in portfolio_returns.keys():
    for i, ret in enumerate(portfolio_returns[date]):
        volatility = portfolio_volatility[date][i]
        sharpe_ratio.setdefault(date, []).append(ret / volatility)

    # get the index of the best sharpe ratio for each date
    max_sharpe_idxs[date] = np.argmax(sharpe_ratio[date])
Create features
# calculate exponentially-weighted moving average of daily returns
ewma_daily = returns_daily.ewm(span=30).mean()
targets = np.array(targets)
features = np.array(features)
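The slide converts `targets` and `features` to arrays without showing how the lists are filled. One plausible construction, sketched here with small hypothetical stand-ins for `ewma_monthly`, `portfolio_weights`, and `max_sharpe_idxs`, pairs each month's moving-average values (features) with the weights of that month's best-Sharpe portfolio (targets).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for objects built earlier in the slides
dates = pd.to_datetime(["2018-01-01", "2018-02-01", "2018-03-01"])
ewma_monthly = pd.DataFrame(
    [[0.001, -0.002, 0.0005],
     [0.002, 0.001, 0.0010],
     [-0.001, 0.000, 0.0020]],
    index=dates, columns=["AMD", "CHK", "QQQ"])
portfolio_weights = {
    d: [np.array([0.2, 0.3, 0.5]), np.array([0.6, 0.1, 0.3])] for d in dates
}
max_sharpe_idxs = {dates[0]: 1, dates[1]: 0, dates[2]: 1}

targets, features = [], []
for date, ewma in ewma_monthly.iterrows():
    # target: weights of the best-Sharpe portfolio for this month
    best_idx = max_sharpe_idxs[date]
    targets.append(portfolio_weights[date][best_idx])
    # feature: the moving-average values for this month
    features.append(ewma.values)

targets = np.array(targets)
features = np.array(features)
print(features.shape, targets.shape)
```

If the moving average is meant to predict the coming month, it would need to be shifted so that each row only uses information available before that month; that step is omitted here.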
cur_returns = portfolio_returns[date]
cur_volatility = portfolio_volatility[date]
plt.scatter(x=cur_volatility,
            y=cur_returns,
            alpha=0.1,
            color='blue')
best_idx = max_sharpe_idxs[date]
plt.scatter(cur_volatility[best_idx],
            cur_returns[best_idx],
            marker='x',
            color='orange')
plt.xlabel('Volatility')
plt.ylabel('Returns')
plt.show()
Get Sharpe!
test_features = features[train_size:]
test_targets = targets[train_size:]
print(features.shape)
(230, 3)
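The slide shows only the test half of the split and does not define `train_size`. A minimal sketch of a chronological split follows, using random arrays with the same shape as the printed `(230, 3)`. The 80% training fraction is an assumption made for illustration, not a value taken from the slides.

```python
import numpy as np

# Random stand-ins with the same shape as the slide's features
rng = np.random.default_rng(0)
features = rng.normal(size=(230, 3))
targets = rng.dirichlet(np.ones(3), size=230)  # each row sums to 1, like weights

# Assumption: the first 80% of rows (earliest months) are used for training
train_fraction = 0.8
train_size = int(train_fraction * features.shape[0])

train_features = features[:train_size]
train_targets = targets[:train_size]
test_features = features[train_size:]
test_targets = targets[train_size:]

print(train_features.shape, test_features.shape)
```

Slicing by position rather than shuffling keeps the test set strictly later in time than the training set, which avoids training on information from the future.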
print(rfr.score(train_features, train_targets))
print(rfr.score(test_features, test_targets))
0.8382262317599827
0.09504859048985377
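The two scores above are R² values on the training and test sets; a training score far above the test score, as here, is a common sign of overfitting. The model definition is not shown on the slide, so the sketch below fits scikit-learn's `RandomForestRegressor` to random stand-in data. The hyperparameters, the data, and the split size are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Random stand-ins: the targets are unrelated to the features on purpose
rng = np.random.default_rng(0)
features = rng.normal(size=(230, 3))
targets = rng.dirichlet(np.ones(3), size=230)

train_size = 184  # hypothetical split point
train_features, test_features = features[:train_size], features[train_size:]
train_targets, test_targets = targets[:train_size], targets[train_size:]

# Assumed hyperparameters; the slides do not show the ones actually used
rfr = RandomForestRegressor(n_estimators=100, random_state=42)
rfr.fit(train_features, train_targets)

# score() returns R^2, averaged over the three target columns
train_score = rfr.score(train_features, train_targets)
test_score = rfr.score(test_features, test_targets)
test_predictions = rfr.predict(test_features)

print(train_score, test_score)
```

Because the stand-in targets carry no signal, the forest can only memorize the training rows, so the gap between the two scores illustrates the same pattern as the numbers on the slide.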
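# predict portfolio weights for the test period with the fitted random forest
test_predictions = rfr.predict(test_features)
# calculate and plot returns from our RF predictions and the QQQ returns
test_returns = np.sum(returns_monthly.iloc[train_size:] * test_predictions,
                      axis=1)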
plt.plot(test_returns, label='algo')
plt.plot(returns_monthly['QQQ'].iloc[train_size:], label='QQQ')
plt.legend()
plt.show()
for r in test_returns:
    cash *= 1 + r
    algo_cash.append(cash)
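The loop above compounds a cash balance through each period's return, but the slide does not show the starting values. A small self-contained sketch follows; the starting balance of 1000 and the three returns are hypothetical.

```python
# Hypothetical monthly returns and starting balance
test_returns = [0.05, -0.02, 0.03]
cash = 1000.0
algo_cash = [cash]  # balance history, starting with the initial amount

for r in test_returns:
    # grow (or shrink) the balance by this period's return
    cash *= 1 + r
    algo_cash.append(cash)

print(algo_cash)
```

With these inputs the balance moves from 1000 to 1050, then 1029, and ends near 1059.87, which is the product of the three growth factors applied to the starting amount.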
Final thoughts
Toy examples
Python 3 multiprocessing
Dask
Spark
AWS or other cloud solutions
satellite images
sentiment analysis (e.g. PsychSignal)
analyst predictions
fundamentals data