
Add Region Averaged Acquisition Functions #2699

Open

wants to merge 25 commits into base: main
Conversation

SaiAakash
Contributor

@SaiAakash commented on Jan 25, 2025

Motivation

This PR addresses #2684. LogREI and qLogREI have been implemented in the botorch_community folder.

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

Tests for both acquisition functions have been added in test_community/acquisition/test_rei.py.

Related PRs

NA

@facebook-github-bot added the CLA Signed label on Jan 25, 2025
@facebook-github-bot
Contributor

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


codecov bot commented Jan 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.98%. Comparing base (acae12d) to head (281cba7).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2699   +/-   ##
=======================================
  Coverage   99.98%   99.98%           
=======================================
  Files         202      203    +1     
  Lines       18604    18667   +63     
=======================================
+ Hits        18602    18665   +63     
  Misses          2        2           


Member

@esantorella left a comment


Thanks for getting this in! A couple things:

  • Could you add a module-level docstring to rei.py giving a broad overview of what acquisition functions are in this file and what they do, with pointers to the relevant academic literature? And could you also add docstrings to the acquisition functions?
  • Some usage examples in the docstrings would be particularly helpful, especially regarding X_dev. How should one choose those points? Can this be used in a similar way to other acquisition functions, or do the trust regions make things very different?
  • If possible, it might be nice to have an input constructor in botorch_community/acquisition/input_constructors.py (following the pattern of the constructors in botorch/acquisition/input_constructors.py), but I don't think we are requiring that.

Comment on lines 29 to 34
best_f: Union[float, Tensor],
X_dev: Union[float, Tensor],
posterior_transform: Optional[PosteriorTransform] = None,
maximize: bool = True,
length: float = 0.8,
bounds: Optional[Union[float, Tensor]] = None,
Member

Nit: PEP 604

Suggested change
best_f: Union[float, Tensor],
X_dev: Union[float, Tensor],
posterior_transform: Optional[PosteriorTransform] = None,
maximize: bool = True,
length: float = 0.8,
bounds: Optional[Union[float, Tensor]] = None,
best_f: float | Tensor,
X_dev: float | Tensor,
posterior_transform: PosteriorTransform | None = None,
maximize: bool = True,
length: float = 0.8,
bounds: float | Tensor | None = None,

Comment on lines 55 to 60
self.maximize = maximize

dim = X_dev.shape[1]
self.n_region = X_dev.shape[0]
self.X_dev = X_dev.reshape(self.n_region, 1, 1, -1)
self.length = length
Member

Suggested change
self.maximize = maximize
dim = X_dev.shape[1]
self.n_region = X_dev.shape[0]
self.X_dev = X_dev.reshape(self.n_region, 1, 1, -1)
self.length = length
self.maximize: bool = maximize
dim: int = X_dev.shape[1]
self.n_region: int = X_dev.shape[0]
self.X_dev: Tensor = X_dev.reshape(self.n_region, 1, 1, -1)
self.length: float = length

maximize: bool = True,
length: float = 0.8,
bounds: Optional[Union[float, Tensor]] = None,
):
Member

Suggested change
):
) -> None:

Comment on lines 166 to 167
(obj - self.best_f.unsqueeze(-1).to(obj))
.clamp_min(0)
Member

@SebastianAment , would it make sense to use botorch.utils.safe_math.fatminimum or something for differentiability here?
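For background on the differentiability concern: clamp_min(0) has zero gradient everywhere below the threshold, which can stall gradient-based acquisition optimization. A smooth surrogate such as a temperature-scaled softplus (the idea that the fat-tailed non-linearities in qLogEI build on) keeps gradients alive. A minimal plain-Python sketch, not BoTorch's API; `tau` is a hypothetical temperature parameter:

```python
import math

def smooth_relu(x: float, tau: float = 0.01) -> float:
    """Temperature-scaled softplus: tau * log(1 + exp(x / tau)).

    Approaches max(x, 0) as tau -> 0, but remains differentiable with a
    non-zero gradient below the threshold, unlike a hard clamp.
    """
    z = x / tau
    if z > 30.0:  # softplus(z) ~= z for large z; avoids float overflow
        return x
    return tau * math.log1p(math.exp(z))

# hard clamp vs. smooth surrogate around the kink at 0
hard = [max(x, 0.0) for x in (-0.5, 0.0, 0.5)]
soft = [smooth_relu(x) for x in (-0.5, 0.0, 0.5)]
```

The fat-tailed variants in botorch.utils.safe_math go a step further and decay polynomially rather than exponentially below the threshold, preserving usable gradients even far from the incumbent.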

Contributor

Depends on what the goal is:

  • If the goal is to add REI as it was proposed in the REI paper (code), we should keep the hard non-linearities.
  • If we seek to make this acquisition function as performant as it can be, you could add a qLogRegionalExpectedImprovement that computes the log acquisition value based on smoothed fat-tailed non-linearities, similar to qLogExpectedImprovement (code, NeurIPS '23 spotlight). You should be able to take the existing qLogEI code and adapt it to this use case with few modifications. If you decide to do this, let me know if you have questions.

Also, @Nobuo-Namura great work on REI! cc'ing you here since this conversation on leveraging qLogEI techniques for qLogREI might be of interest to you.

Contributor Author

Hi @SebastianAment! Theoretically, any improvements that qLogEI provides over qEI should translate to qLogREI as well, so I think it's worth modifying qREI to qLogREI. I'll try to do this now. Thanks!

Contributor Author

I have modified qREI to qLogREI and the corresponding tests. Let me know if that looks alright.

Contributor

Great, thanks for working on this, left a few more comments on the conversion, it's close!


Thank you for the mention and for implementing REI. I’ll definitely refer to your qLogREI implementation.

@SaiAakash
Contributor Author

Hi @esantorella @SebastianAment! I have made the requested changes. I have also modified qREI to qLogREI to benefit from the robustness of log-based acquisition functions against numerical instabilities. I have not written an example usage in the module docstring because the usage of these acquisition functions cannot readily be shown in a short code snippet (it requires a bit of boilerplate code). We could instead demonstrate this in a notebook in the notebooks_community folder that implements efficient TuRBO with the new acquisition functions, in a separate PR, if that's okay. Thanks for all your inputs! Let me know if you have any further comments.

@facebook-github-bot
Contributor

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@esantorella
Member

Thanks for these changes; this is looking good! I'm reluctant to give @SaiAakash even more to-dos, but it would be ideal to check that performance is not worse after some of the more substantive changes such as using logmeanexp. Anyone have any thoughts on how to most efficiently confirm that this is still working? It looks like some benchmarking functionality is provided at https://github.com/Nobuo-Namura/regional-expected-improvement/

@SaiAakash
Contributor Author

Hi @esantorella ! After these changes, I have tested both of these acquisition functions on a problem I am working on at the moment about which I cannot disclose any details. But I can confirm that there are no significant performance degradations since these changes have been made. I can put some convergence plots here if you're interested though.

@facebook-github-bot
Contributor

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@SebastianAment
Contributor

Hi @SaiAakash, just making sure you’ve seen the pending suggestions above. Once we have these in and tests pass, we should be ready to land this!

@SaiAakash
Contributor Author

Hi @SebastianAment! Those changes are already in; I can see the Outdated flag on those conversations. So those can be resolved?

@SebastianAment
Contributor

Hi @SaiAakash, there are a few more that are still pending, e.g. replacing MCAcquisitionFunction with LogImprovementMCAcquisitionFunction, changing forward to _sample_forward, etc.

Attaching a screenshot of the first comment that is still pending above:
[Screenshot: pending review comment, 2025-01-29]

@SaiAakash
Contributor Author

SaiAakash commented Jan 30, 2025

Ah, I see! This is strange, because for some reason I am unable to see the comments shown in the screenshot on this PR. Anyway, I will make these changes and get back to you. Thanks!

return logrei


class qLogRegionalExpectedImprovement(MCAcquisitionFunction):
Contributor

Making this a derived class of LogImprovementMCAcquisitionFunction will allow you to automatically leverage the smooth log-space reductions over the q-batch (number of candidates) and the Monte-Carlo dimension (see additional comment below).

Suggested change
class qLogRegionalExpectedImprovement(MCAcquisitionFunction):
class qLogRegionalExpectedImprovement(LogImprovementMCAcquisitionFunction):


@concatenate_pending_points
@t_batch_mode_transform()
def forward(self, X: Tensor) -> Tensor:
Contributor

Using LogImprovementMCAcquisitionFunction as the parent class, you can replace the forward call here with a _sample_forward (similar to here), and you can get rid of the decorators.

Suggested change
def forward(self, X: Tensor) -> Tensor:
def _sample_forward(self, X: Tensor) -> Tensor:

obj = _log_improvement(obj, self.best_f, self.tau_relu, self.fat).reshape(
-1, self.n_region, batch_shape, q
)
q_log_rei = obj.max(dim=-1)[0].mean(dim=(0, 1))
Contributor

This is the reduction over q (the max) and the Monte Carlo samples (the mean), and needs to be smoothed and converted to log-space as well. Fortunately, LogImprovementMCAcquisitionFunction takes care of this automatically and you can simply remove these operations here, only returning obj, similar to here.

Suggested change
q_log_rei = obj.max(dim=-1)[0].mean(dim=(0, 1))
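For intuition on why the mean must move to log space: the per-sample acquisition values here live in log space, and the log of a mean is not the mean of logs; the stable identity is logmeanexp(log v_i) = log(mean(v_i)). A minimal plain-Python illustration (a sketch of the idea, not BoTorch's implementation):

```python
import math

def logmeanexp(xs):
    """Numerically stable log(mean(exp(x))) via the max-shift trick."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs) / len(xs))

# per-sample acquisition values and their log-space counterparts
vals = [0.1, 0.5, 0.9]
log_vals = [math.log(v) for v in vals]

log_of_mean = math.log(sum(vals) / len(vals))  # the quantity we want
lme = logmeanexp(log_vals)                     # smooth log-space reduction
naive = sum(log_vals) / len(log_vals)          # mean of logs: geometric mean
```

logmeanexp recovers log_of_mean exactly, while the naive mean of log values computes the log of the geometric mean and would bias the acquisition value downward; this is why the reduction over MC samples is handled in log space by the base class.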

Contributor

@SaiAakash There are a few pending suggestions left here regarding qLogREI. Could you have look?


@SebastianAment
Contributor

SebastianAment commented Jan 30, 2025

Ah, I see! This is strange, because for some reason I am unable to see the comments shown in the screenshot on this PR. Anyway, I will make these changes and get back to you. Thanks!

@SaiAakash I forgot to submit the review, sorry about that! Hopefully it's visible now.

@SaiAakash
Contributor Author

@SebastianAment I have incorporated your suggested changes. qLogREI now inherits from LogImprovementMCAcquisitionFunction and implements a _sample_forward method. Let me know if you have any further comments.

@SaiAakash
Contributor Author

@SebastianAment I have a question regarding this change, though. The _sample_forward method in general expects the MC objective function values rather than X elsewhere. Here, we want to provide X to the evaluation function and compute the objective function value within it. Do you feel this inconsistency is okay? I feel this can sometimes lead to dimension mismatches of the tensors. Let me know if you have any thoughts about this.

@SebastianAment
Contributor

The _sample_forward method in general expects the MC objective function values instead of X in other places. Here, we want to provide X to the evaluation function and compute the objective function value within the evaluation function.

@SaiAakash That's right, so this is more subtle, thanks for raising this! There are two adjustments that would solve this issue:

  1. We should move the code that generates the samples and objectives to a custom _get_samples_and_objectives, and can just reuse the existing _sample_forward implemented by qLogExpectedImprovement (you could even derive from this class). Something like adding

     def _get_samples_and_objectives(self, X: Tensor) -> tuple[Tensor, Tensor]:
         """doc-string ..."""
         # region-averaged EI specific code
         q = X.shape[1]
         d = X.shape[2]

         # make N_x samples in design space
         X_min = (X - 0.5 * self.length).clamp_min(self.bounds[0]).unsqueeze(0)
         X_max = (X + 0.5 * self.length).clamp_max(self.bounds[1]).unsqueeze(0)
         Xs = (self.X_dev * (X_max - X_min) + X_min).reshape(-1, q, d)

         # calling the original method with the modified inputs
         return super()._get_samples_and_objectives(Xs)
  2. I overlooked that the mean call went over both (0, 1) before, i.e. the function-sample and the input-sample dimension. This can be taken care of with an adjustment in the constructor:
# difference to `SampleReducingMCAcquisitionFunction` constructor being the "+1"
sample_dim = tuple(range(len(self.sample_shape) + 1))   
self._sample_reduction = partial(logmeanexp, dim=sample_dim)

It is important to do this assignment after the super().__init__ call, which itself assigns this variable here.

Let me know if you have thoughts or questions on this!

@SaiAakash
Contributor Author

Hi @SebastianAment! Thanks for the detailed code snippets and the explanation. I implemented the _get_samples_and_objectives function. I think the second adjustment, updating sample_dim, is not required: the original code reshaped the tensor to four dimensions (see here), so it needed an additional dimension over which to take the mean, which is no longer the case. Things are working fine without overwriting the sample_dim value. Let me know if you have any further comments.
