Vasilis/local linear #13

Merged (63 commits) on Apr 6, 2019

Changes from 1 commit

Commits (63)
f089231
api
vasilismsr Mar 7, 2019
342ff6c
doc dml
vasilismsr Mar 7, 2019
f1f05fa
removed second model_t
vasilismsr Mar 7, 2019
7fb788f
pr comments:
vasilismsr Mar 7, 2019
48ed712
removed legacy
vasilismsr Mar 7, 2019
0e44588
spec forests
vasilismsr Mar 8, 2019
dc9ed7c
forests doc
vasilismsr Mar 9, 2019
a47096f
forest doc
vasilismsr Mar 9, 2019
d345f06
forest docs
vasilismsr Mar 10, 2019
c1b31c0
forest docs
vasilismsr Mar 10, 2019
a4b3e90
local-linear correction for ortho forest
vasilismsr Mar 10, 2019
8245041
local linear ortho forest
vasilismsr Mar 10, 2019
dedc17b
local linear multiple treatments fix
vasilismsr Mar 10, 2019
1c8056c
added ell_2 regularization of the offset part
vasilismsr Mar 10, 2019
7519ca0
some fixes in the ell2 regularization
vasilismsr Mar 10, 2019
bbd3d42
made lambda in the ridge penalty parameterizable
vasilismsr Mar 10, 2019
208a313
replaced inv with pinv in parameter_estimate for stability and also r…
vasilismsr Mar 11, 2019
93969d4
reverted back to incoprorating second order parameters in the splitti…
vasilismsr Mar 11, 2019
a5eefea
enabled separate parameter estimate for first and second stage and se…
vasilismsr Mar 12, 2019
92faa26
changed default subsample ratio for local linear forestS
vasilismsr Mar 12, 2019
7ff98a9
improvements on causal tree runtime
vasilismsr Mar 14, 2019
4d4c448
causal tree optimization
vasilismsr Mar 14, 2019
78eab3a
removed internal storing of data in causal tree class
vasilismsr Mar 14, 2019
b00e54a
causal tree optimizations
vasilismsr Mar 14, 2019
bc07bcf
added local linear correction to discrete treatment ortho forest
vasilismsr Mar 15, 2019
a6492a2
merged local linear and continuous treatment forest into a single cla…
vasilismsr Mar 15, 2019
9dcb889
changed default subsample rate to reduce bias for better confidence i…
vasilismsr Mar 15, 2019
88d3a3d
removed cython stuff
vasilismsr Mar 15, 2019
c3d40ae
removed estimate field from node structure in causal tree as it is no…
vasilismsr Mar 15, 2019
4baae7a
lint formatting
vasilismsr Mar 15, 2019
4019265
lint formatting
vasilismsr Mar 15, 2019
2c7caec
linting formats
vasilismsr Mar 15, 2019
957c9cf
removed benchmark rst
vasilismsr Mar 15, 2019
9773f4b
added weighted wrappers to the first stage nuisance estimators for co…
vasilismsr Mar 15, 2019
edd53b6
some comments in the new additions for local linear correction
vasilismsr Mar 15, 2019
789dac0
comments on new code
vasilismsr Mar 15, 2019
6adf20f
lint errors
vasilismsr Mar 15, 2019
2e7023a
changed the comments on the input parameters of the changed classes a…
vasilismsr Mar 15, 2019
d05145c
notebook for ortho forest re-run
vasilismsr Mar 16, 2019
bc0e514
Merge branch 'master' into vasilis/local-linear
vasilismsr Mar 16, 2019
d1c856b
Update metalearners.rst
vasilismsr Mar 16, 2019
291b752
Update forest.rst
vasilismsr Mar 16, 2019
d1cfd02
linting errors
vasilismsr Mar 16, 2019
1aaecde
Merge branch 'vasilis/local-linear' of github.com-microsoft:Microsoft…
vasilismsr Mar 16, 2019
fa0ab64
added random seed in the data generating function of the orf tests
vasilismsr Mar 16, 2019
2bc30e2
added random seed in the data generating function of the orf tests
vasilismsr Mar 16, 2019
7c6e7d5
added thread based parallelism in the predict part of the ortho fores…
vasilismsr Mar 17, 2019
04f8bf8
some further improvements on computation time when deciding splits in…
vasilismsr Mar 17, 2019
7291ead
small fix when importing data from ihdp. was not using os.path.join f…
vasilismsr Mar 17, 2019
c15111c
Update econml/data/dgps.py
kbattocchi Mar 20, 2019
5ee3f3c
Update econml/ortho_forest.py
kbattocchi Mar 20, 2019
419626f
updated doc string in ortho forests with new default values
vasilismsr Mar 20, 2019
8ea437a
updated find_tree_node function to be a single public one.
vasilismsr Mar 20, 2019
797d2dc
better version to get coordinates from flat index in proposals of cau…
vasilismsr Mar 20, 2019
1a70e52
comment in docstring for balancedness_tol
vasilismsr Mar 20, 2019
45a16e5
pr comments
vasilismsr Apr 5, 2019
9a4a260
pr comments
vasilismsr Apr 5, 2019
2dc6913
pr comments
vasilismsr Apr 5, 2019
bb57e04
notebook final run
vasilismsr Apr 5, 2019
1a13254
linting
vasilismsr Apr 5, 2019
62f543d
notebook final run
vasilismsr Apr 5, 2019
6acf451
Merge branch 'master' into vasilis/local-linear
kbattocchi Apr 5, 2019
9f956f5
Fix docstring issues
kbattocchi Apr 6, 2019
pr comments
vasilismsr committed Apr 5, 2019
commit 45a16e52c50ba4a1c120e1fa606025c7e93ec380
19 changes: 14 additions & 5 deletions econml/causal_tree.py
@@ -36,6 +36,15 @@ def __init__(self, sample_inds, estimate_inds):
self.right = None

def find_tree_node(self, value):
"""
Recursively find and return the node of the causal tree that corresponds
to the input feature vector.

Parameters
----------
value : array-like, shape (d_x,)
Feature vector whose node we want to find.
"""
if self.feature == -1:
return self
elif value[self.feature] < self.threshold:
@@ -65,7 +74,7 @@ class CausalTree:
min_leaf_size : integer, optional (default=10)
The minimum number of samples in a leaf.

- max_splits : integer, optional (default=10)
+ max_depth : integer, optional (default=10)
The maximum number of splits to be performed when expanding the tree.

n_proposals : int, optional (default=1000)
@@ -91,7 +100,7 @@ def __init__(self,
parameter_estimator,
moment_and_mean_gradient_estimator,
min_leaf_size=10,
- max_splits=10,
+ max_depth=10,
n_proposals=1000,
balancedness_tol=.3,
random_state=None):
@@ -101,7 +110,7 @@ def __init__(self,
self.moment_and_mean_gradient_estimator = moment_and_mean_gradient_estimator
# Causal tree parameters
self.min_leaf_size = min_leaf_size
- self.max_splits = max_splits
+ self.max_depth = max_depth
self.balancedness_tol = balancedness_tol
self.n_proposals = n_proposals
self.random_state = check_random_state(random_state)
@@ -137,7 +146,7 @@ def create_splits(self, Y, T, X, W):
node, depth = node_list.pop()

# If by splitting we have too small leaves or if we reached the maximum number of splits we stop
- if node.split_sample_inds.shape[0] // 2 >= self.min_leaf_size and depth < self.max_splits:
+ if node.split_sample_inds.shape[0] // 2 >= self.min_leaf_size and depth < self.max_depth:

# Create local sample set
node_X = X[node.split_sample_inds]
@@ -223,7 +232,7 @@ def create_splits(self, Y, T, X, W):

# calculate the average influence vector of the samples in the left child
left_diff = np.matmul(rho.T, valid_side)
- # calculate the average influence vector of the samples in the left child
+ # calculate the average influence vector of the samples in the right child
right_diff = np.matmul(rho.T, 1 - valid_side)
# take the square of each of the entries of the influence vectors and normalize
# by size of each child
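For orientation, here is a minimal standalone sketch of the recursive lookup that the new docstring documents. The leaf test (feature == -1) and the threshold comparison come from the hunk above; the remaining attribute names and the descent into self.left / self.right past the truncated lines are illustrative assumptions, not a copy of econml/causal_tree.py.

    class Node:
        """Minimal causal-tree node; real nodes in econml carry more state."""

        def __init__(self, sample_inds, estimate_inds):
            self.feature = -1            # -1 marks a leaf
            self.threshold = None
            self.split_sample_inds = sample_inds
            self.estimate_inds = estimate_inds
            self.left = None
            self.right = None

        def find_tree_node(self, value):
            # Leaf reached: this node covers the query point.
            if self.feature == -1:
                return self
            # Otherwise follow the split on the chosen feature.
            elif value[self.feature] < self.threshold:
                return self.left.find_tree_node(value)
            else:
                return self.right.find_tree_node(value)

Calling root.find_tree_node(x) on a fitted tree returns the unique leaf whose region contains the feature vector x.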
8 changes: 4 additions & 4 deletions econml/ortho_forest.py
@@ -497,8 +497,8 @@ def parameter_estimator_func(Y, T, X,
diagonal[:T_res.shape[1]] = 0
reg = lambda_reg * np.diag(diagonal)
# Ridge regression estimate
- param_estimate = np.matmul(np.linalg.pinv(np.matmul(weighted_XT_res.T, XT_res) + reg),
-                            np.matmul(weighted_XT_res.T, Y_res.reshape(-1, 1))).flatten()
+ param_estimate = np.linalg.lstsq(np.matmul(weighted_XT_res.T, XT_res) + reg,
+                                  np.matmul(weighted_XT_res.T, Y_res.reshape(-1, 1)), rcond=None)[0].flatten()
# Parameter returned by LinearRegression is (d_T, )
return param_estimate

@@ -787,8 +787,8 @@ def parameter_estimator_func(Y, T, X,
diagonal[0] = 0
reg = lambda_reg * np.diag(diagonal)
# Ridge regression estimate
- param_estimate = np.matmul(np.linalg.pinv(np.matmul(weighted_X_aug.T, X_aug) + reg),
-                            np.matmul(weighted_X_aug.T, pointwise_params)).flatten()
+ param_estimate = np.linalg.lstsq(np.matmul(weighted_X_aug.T, X_aug) + reg,
+                                  np.matmul(weighted_X_aug.T, pointwise_params), rcond=None)[0].flatten()
# Parameter returned by LinearRegression is (d_T, )
return param_estimate

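Both ortho_forest.py hunks make the same numerical change: the weighted ridge system itself is unchanged, but it is now solved with np.linalg.lstsq instead of being multiplied by an explicit pseudo-inverse. A small self-contained sketch of the two routes follows; the shapes, data, and lambda_reg value are made up for illustration, and only the solver swap mirrors the diff.

    import numpy as np

    rng = np.random.RandomState(0)
    n, d = 200, 5
    lambda_reg = 0.01                        # illustrative value

    XT_res = rng.randn(n, d)                 # stand-in for the residualized design
    weights = rng.rand(n)                    # stand-in for the forest kernel weights
    weighted_XT_res = XT_res * weights[:, np.newaxis]
    Y_res = rng.randn(n)

    diagonal = np.ones(d)
    diagonal[0] = 0                          # first coordinate unpenalized, mirroring diagonal[...] = 0 in the diff
    reg = lambda_reg * np.diag(diagonal)

    A = np.matmul(weighted_XT_res.T, XT_res) + reg           # (d, d) ridge system
    b = np.matmul(weighted_XT_res.T, Y_res.reshape(-1, 1))   # (d, 1)

    old_estimate = np.matmul(np.linalg.pinv(A), b).flatten()        # before this commit
    new_estimate = np.linalg.lstsq(A, b, rcond=None)[0].flatten()   # after this commit

    assert np.allclose(old_estimate, new_estimate, atol=1e-6)

On a well-conditioned system the two estimates coincide; the lstsq call simply solves the system directly instead of materializing a pseudo-inverse first.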