-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial implicit/truncated backward modes #29
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Left some minor comments, after which we should merge this and try to debug the implicit grad offset issue in a separate PR (unless you already fixed it).
Something failed in the GPU test. Might be a CI issue, don't think it was related to any commits here.
Hi @mhmukadam and @luisenp, I just went through and addressed all the inline comments. Let me know if there's anything else! Otherwise this is ready to merge from my perspective, and I'll keep debugging some of the gradient scaling issues separately for #39. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, some minor comments and a question!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Thanks for the pass @luisenp! I just addressed your comments and pushed the latest that fixes the scaling issues and passes the backward tests with |
* Initial WIP commit of implicit/truncated backward modes * spacing * add numdifftools requirement * fix mypy and GPU issues * import BackwardMode as part of the main thesus module * add ValueError messages * add comments to backward_modes and add it to examples/README * Remove error_increase_induces * move converged_indices from the info back into the optimizaiton loop * fix gradient scaling for facebookresearch#39 * update backward tests * add type hints/remove unused track_best_solution * remove erroneous update
Migrating this from the private repo (where we have some more context. I've updated the example/test to use the quadratic cost fitting example as I like it's simplicity to the older example we were using with the cost weights. Here's what the current backward pass mode example looks like:
This is almost ready to merge but still WIP as I'm still debugging some scaling issues in the implicit mode, which should be easier to debug with this simpler example.