Add diagonal scaling method and re-enable LM adaptive + ellipsoidal #393
Conversation
LGTM. Only some minor comments.
assert isinstance(self._damping, torch.Tensor)
damping = self._damping.view(-1, 1)
if ellipsoidal_damping:
    damping = linearization.diagonal_scaling(damping)
# Deliberately using Atb before updating the variables, according to
# the LM reference above
den = (delta * (damping * delta + linearization.Atb.squeeze(2))).sum(dim=1) / 2
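For context, the denominator computed above is the standard Levenberg-Marquardt predicted reduction, ½ δᵀ(λδ + Aᵀb), evaluated per batch element. A minimal NumPy sketch of the batched computation (the names delta, damping, and Atb mirror the snippet; the batch size, dimensions, and values are made-up assumptions for illustration):

```python
import numpy as np

# Assumed setup: a batch of B problems, each with n variables.
B, n = 2, 3
rng = np.random.default_rng(0)
delta = rng.standard_normal((B, n))   # LM step, one row per batch element
damping = np.full((B, 1), 0.1)        # per-batch damping, broadcast over n
Atb = rng.standard_normal((B, n))     # gradient term A^T b

# Predicted reduction of the quadratic model, per batch element:
#   den[i] = 0.5 * delta[i]^T (damping[i] * delta[i] + Atb[i])
den = (delta * (damping * delta + Atb)).sum(axis=1) / 2
print(den.shape)  # one scalar per batch element
```

The elementwise multiply-then-sum pattern avoids materializing any batched matrix products, which is why the snippet in the PR can stay fully vectorized.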
In trust region methods, den is also called the "predicted reduction". In addition, a test can be added for den, as we discussed before.
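One way such a test could look (a sketch, not the project's actual test): with spherical damping and the exact LM step δ solving (AᵀA + λI)δ = Aᵀb, the value den = ½ δᵀ(λδ + Aᵀb) equals the decrease of the quadratic model ½‖r + Aδ‖², so the two can be checked against each other. The sign convention Atb = -Aᵀr is an assumption made for this sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3
A = rng.standard_normal((m, n))
r = rng.standard_normal(m)      # current residual
lam = 0.5                       # damping (lambda)

Atb = -A.T @ r                  # assumed sign convention: b = -r
# Exact LM step: (A^T A + lam * I) delta = Atb
delta = np.linalg.solve(A.T @ A + lam * np.eye(n), Atb)

# Predicted reduction as computed in the snippet under review.
den = 0.5 * delta @ (lam * delta + Atb)

# Decrease of the quadratic model 0.5 * ||r + A d||^2 at d = delta.
model_decrease = 0.5 * (r @ r) - 0.5 * np.sum((r + A @ delta) ** 2)

# The two agree (expand the model and substitute A^T A delta = Atb - lam*delta).
assert np.allclose(den, model_decrease)
```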
I'll add this to the TrustRegion class in the Dogleg PR (remind me if I forget). We'll unify these two in a later PR.
@fantaosha That's correct. Out of our 3 sparse solvers, 2 don't need to explicitly construct AtA; the only exception is … On the other hand, your question made me realize that caching the solution of linear solvers could also be useful now that we are allowing retries. I'll add this in a later PR.
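The reason most sparse solvers can skip building AtA here: diag(AᵀA)ⱼ is just the sum of squares of column j of A, which falls out of the stored nonzeros in one vectorized pass. A sketch using a hand-rolled CSR-style layout (the values and column indices below are made up for illustration):

```python
import numpy as np

# Hypothetical CSR-style storage: nonzero values and their column indices
# (row pointers are not needed for this particular computation).
n_cols = 4
values = np.array([2.0, -1.0, 3.0, 0.5, 4.0])
cols = np.array([0, 2, 1, 2, 3])

# diag(A^T A)[j] = sum of value^2 over the nonzeros in column j,
# computed in one vectorized pass -- no Python loop over rows, and
# no explicit A^T A.
diag_AtA = np.bincount(cols, weights=values ** 2, minlength=n_cols)
print(diag_AtA)  # [ 4.    9.    1.25 16.  ]
```

Since the sparsity pattern is fixed across iterations, a result like this is a natural candidate for the caching mentioned above.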
LGTM. Only one minor comment.
Force-pushed 4bbea0f to 9dd36ba
Force-pushed 146e0bf to 1d703b1
Force-pushed 122798c to 8c5c348
Force-pushed 4a78a03 to be37500
Force-pushed 8c5c348 to bed3dab
Force-pushed be37500 to 1e9dcb9
Force-pushed bed3dab to b4535c7
LGTM!
Add diagonal scaling method and re-enable LM adaptive + ellipsoidal (facebookresearch#393) * Add a linearization method to do Hessian scaling. * Add caching for diag(AtA).
This adds a linearization diagonal scaling method, with implementations for both dense and sparse linearizations. The sparse implementation could be faster with a custom kernel that avoids the Python loop over rows, but it should be good enough for an initial implementation.