-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable models to emit lambda functions. #5026
Comments
You mean this would be like a promise that can the be computed asynchronously? Could you give an example of a lambda you would want to pass please? From the example you give would it be something like: observe(i, "oob_error", "Out-of-bag Error", [=](){return get_oob_error()}; And then you tell the observer whether to compute it? |
@gf712 yes, the example you wrote is exactly what I had in mind :) Then if the user wants to observe the parameter I guess the second solution would be harder to implement because of the interfaces (SWIG support for lambdas seems quite limited at the moment http://www.swig.org/Doc4.0/SWIGDocumentation.html#CPlusPlus11_lambda_functions_and_expressions) |
We wouldnt want to store the lambda. The class just offers the lambda to the observer, which can decide to call it (and store the information), or not. It shouldnt be valid after the call is over |
The use case is that a user has applied a filter to the observer to choose what to store....in that case we dont always want to compute the oob error here |
so the observe function becomes a blocking call right? Because now there is a synchronisation issue, the main routine either copies the data and lets the observer do what it wants with it (for example call it in a seperate thread), or awaits for the observer to do something (and in this case training is stopped until it is computed)
So the user passes a string? Or is this with the AnyParameter flags? |
basically we want this semantic (inside the RF)
as opposed to not having the if statement (i.e. always computing the oob error even if the user doesnt care about it. |
Something that would be more OOP, but quite a shift in design, is that we have |
The user can pass a list of strings (e.g., |
I would say just pass the function pointer, like watch_method. The lambda here might not bring much benefit, but you can try. Lambdas can be negative cost :) |
btw, I was just thinking about this and would this condition be declared once? And then when the algorithm adds a new oob index set it computes the oob error? |
not sure what you mean |
in this case it makes sense to store a function pointer, rather than passing the oob error itself (a scalar) |
I see. That seems like more work to me :) |
I am just not sure I understand why you need to pass the function pointer in that case?
you will achieve the same no? Unless what you want is to compute oob error in a separate thread? And this |
No need if this is done. But this is just the semantics. In practice, we would like to not have to do this if statement in the train code. Rather, just the observe call. And then the observer decides. Sorry should have been more clear |
OK, I think in that case we could look into the thing I posted above? It is more effort, but it would be neat that a combination of events triggers the observer. If this is not possible with Rxcpp you could always extend the functionality using conditional variables. Reread your comment and I think I finally get it. I think that would work fine. You would have to pass the function pointer at the start. Store that (inside it has the reference to the vectors with machines and indices) and then a second call to the observer inside the loop triggers the function if one has been registered. |
|
I am sorry if I go back to the original idea, but why do we have to store twice the information for calling the // Once the observer receives this observation, it checks if he wants to observe
// "oob_error" objects and then, since it is a lambda, it executes it and stores the result.
train()
this->observe("oob_error", "Out-of-bag Error", [&](){
return get_oob_error();
}); From the observer point of view we will have something like: on_next(observed_value)
if (observed_value can be executed) {
// here we execute the lambda and we pass the result
// to the real implementation of the observer (e.g. ParameterObserverCV,
// ParameterObserverScalar, etc.)
result = observed_value()
on_next_impl(result)
} I apologize if this is a silly question :) Anyway, if I look at the |
From what I understand this observe call would have to be wrapped by a condition inside a loop right? This is the moment you are telling the observer to emit something right? Or is emitting the value by this observer triggered by something else?
That is why I was thinking that rather than having an std::mutex kObserverMutex;
std::condition_variable cv;
worker() {
std::unique_lock<std::mutex> lk(kObserverMutex);
// the Observer thread is now locked until the two class members change
cv.wait(lk, []{return m_bags_changed() && m_oob_indices_changed();});
// both parameters changed -> execute callback
execute_callback();
} This could even lock any other call to the callback to avoid some undefined behaviour. Granted this is quite complex logic, but IMO this would be really neat, and I think this falls within the observer pattern? |
Gil, I agree this is neat, but what about a simpler thing where we could just do (not wrapped in if)
|
However, if the observer has no way of filtering/selecting values/functions that it receives, all hope is lost anyways, because we either need to always compute the oob, or always store the whole cloned objects ... |
So, the observer (e.g. |
So then what I wrote above (passing the lambda/pointer) should work right? |
Yes, it should work perfectly fine (aside from possible issues about copying memory around and synchronization of the various threads). By passing the lambda, we can defer the execution of expensive functions and demand it to the observer (rather than the model) and only if the user has decided to do so. It is the most viable option imho |
I guess in terms of sync issues we should just have a helper function for oob error that uses arguments to compute the error rather than using the state of the object (so we store the parameters in the lambda which are just memory management objects (SGVector and std::shared_ptr), so quite cheap to store/copy). |
We might want to emit some information from a given model which may be costly to compute and it could slow down the training ( see #4655 for an example of such scenario ). This is important because those kinds of information would be computed even if the user wants to observe different parameters.
The solution here would be to emit lambdas instead, which can then be executed to obtain the real value. I think there are two possible ways to proceed:
ParameterObserver
(if the user wants to observe that particular parameter). The result of the lambda is then stored and shown to the user. In such a case, the end-user will not manage directly the emitted lambdas and it will see only the final result. We might experience anyway reduced training speed in this scenario since the observer will need to execute the lambda anyway;ParameterObserver
. The end-user will need to execute this lambda in order to retrieve its value. In such a case, the end-user will be aware of the existence of these lambdas. Basically, besides emitting scalars/vectors/matrices, we introduce also the possibility of emitting functions.@karlnapf @gf712 what do you think is the best way to implement this?
The text was updated successfully, but these errors were encountered: