HITL - Data collection #1967
Clarification questions mostly.
I skipped the files that largely looked the same as #1965. I couldn't tell what differed; let me know if you want me to look again at the diff, if any, once the previous one is merged.
"t": elapsed_time,
"users": [],
"object_states": self.get_objects_state(),
"agent_states": self.get_agents_state(),
I can add a "world-graph" API here if you want to save those object-to-furniture/agent/receptacle relations here.
That would be perfect.
"task_completed": u.episode_finished,
"task_succeeded": u.episode_success,
"camera_transform": u.cam_transform,
"held_object": u.ui._held_object_id,
We've discussed this already, but if this ID were passed to grasp_mgr in a form the sim understands, it would cover what habitat-llm needs.
# Register UI callbacks
self.ui.on_pick.registerCallback(self._on_pick)
self.ui.on_place.registerCallback(self._on_place)
self.ui.on_open.registerCallback(self._on_open)
self.ui.on_close.registerCallback(self._on_close)
<3
To use these, would I need to pass the ui object to LLMController?
You can probably just do it from the rearrange_v2 initialization code, which has access to both the UI and LLMController.
Something like this would work:
self._user_data[0].ui.on_pick.registerCallback(llm_controller.on_pick)
If we ever want to scale this to N users, we would just pass the user index in the event data. For now, this does the job.
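The wiring suggested above can be sketched in isolation as follows. This is only an illustration: `Event`, `PickEventData`, and `FakeLLMController` are hypothetical stand-ins written for this example, not the actual habitat-hitl or habitat-llm classes, and the `user_index` field is an assumption about how the N-user case might be handled. Only the `registerCallback` name comes from the diff. The bound method is passed as a callback rather than being called at registration time.

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class Event(Generic[T]):
    """Hypothetical stand-in for the UI event type seen in the diff."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[T], None]] = []

    def registerCallback(self, callback: Callable[[T], None]) -> None:
        # Store the callable itself; it is invoked later, when the event fires.
        self._callbacks.append(callback)

    def invoke(self, data: T) -> None:
        for callback in self._callbacks:
            callback(data)


@dataclass
class PickEventData:
    """Hypothetical event payload; user_index is an assumed extension."""

    object_handle: str
    object_id: int
    user_index: int = 0


class FakeLLMController:
    """Hypothetical controller that just remembers the pick events it receives."""

    def __init__(self) -> None:
        self.picked: List[PickEventData] = []

    def on_pick(self, e: PickEventData) -> None:
        self.picked.append(e)


on_pick = Event[PickEventData]()
llm_controller = FakeLLMController()

# Pass the bound method, not the result of calling it.
on_pick.registerCallback(llm_controller.on_pick)
on_pick.invoke(PickEventData(object_handle="cup_0", object_id=42, user_index=0))
```

Carrying the user index in the event payload keeps the controller signature unchanged if more users are added later.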
def _on_open(self, e: UI.OpenEventData):
    self.ui_events.append(
        {
            "type": "open",
            "obj_handle": e.object_handle,
            "obj_id": e.object_id,
        }
    )

def _on_close(self, e: UI.CloseEventData):
    self.ui_events.append(
        {
            "type": "close",
            "obj_handle": e.object_handle,
            "obj_id": e.object_id,
        }
    )
This is unrelated to the PR, but do we need open/close for single-learn data collection?
We don't need to do it right now. That will most likely change with the inclusion of object states.
def on_exit(self):
    super().on_exit()

    episode_success = all(
episode_success here means "user thinks episode was done/success", right?
In the current state, success is the only outcome.
In a following PR, I'll be adding a way for users to report either success or failure via a form.
@@ -486,9 +589,12 @@ def sim_update(self, dt: float, post_sim_update_dict):

# Collect data.
self._elapsed_time += dt
# TODO: Always record with non-human agent.
What does this mean?
In single-user and multi-user modes, we can skip recording of frames when there's no user input.
With the LLM agent, we'll most likely have to record every frame.
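The recording policy described here could be expressed roughly as below. This is a sketch under assumptions: `should_record_frame` and `frames_to_record` are hypothetical helpers invented for illustration and do not appear in the PR.

```python
from typing import List


def should_record_frame(has_user_input: bool, has_non_human_agent: bool) -> bool:
    """Record every frame when a non-human agent is present;
    otherwise record only frames that carry user input."""
    return has_non_human_agent or has_user_input


def frames_to_record(inputs: List[bool], has_non_human_agent: bool) -> List[int]:
    """Return the indices of frames that would be recorded.

    `inputs[i]` says whether frame i had any user input.
    """
    return [
        i
        for i, has_input in enumerate(inputs)
        if should_record_frame(has_input, has_non_human_agent)
    ]
```

With users only, idle frames are skipped; with an LLM agent in the scene, every frame is kept.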
def record_frame(
    self,
    frame_data: Dict[str, Any],
):
    self.data["end_timestamp"] = timestamp()
    self.data["frame_count"] += 1

    self.data["episodes"][-1]["end_timestamp"] = timestamp()
    self.data["episodes"][-1]["frame_count"] += 1
    self.data["episodes"][-1]["frames"].append(frame_data)
Is this frame different from the frame you already have the FrameRecorder for? Why twice?
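For reference, the bookkeeping in record_frame above can be tried standalone with a minimal sketch like the one below. It assumes a session-level dictionary holding a list of episodes. The `start_episode` method, the `episode_id` field, and the `timestamp` implementation are assumptions made for this example; only the keys touched by `record_frame` come from the diff.

```python
import time
from typing import Any, Dict


def timestamp() -> str:
    # Assumed implementation; the PR's own timestamp() helper may differ.
    return str(int(time.time()))


class SessionRecorder:
    """Minimal sketch of a recorder tracking session-level and episode-level counters."""

    def __init__(self) -> None:
        now = timestamp()
        self.data: Dict[str, Any] = {
            "start_timestamp": now,
            "end_timestamp": now,
            "frame_count": 0,
            "episodes": [],
        }

    def start_episode(self, episode_id: str) -> None:
        # Hypothetical method: opens a new episode that later frames attach to.
        now = timestamp()
        self.data["episodes"].append(
            {
                "episode_id": episode_id,
                "start_timestamp": now,
                "end_timestamp": now,
                "frame_count": 0,
                "frames": [],
            }
        )

    def record_frame(self, frame_data: Dict[str, Any]) -> None:
        # Session-level totals.
        self.data["end_timestamp"] = timestamp()
        self.data["frame_count"] += 1
        # Totals and payload for the episode currently in progress.
        episode = self.data["episodes"][-1]
        episode["end_timestamp"] = timestamp()
        episode["frame_count"] += 1
        episode["frames"].append(frame_data)
```

In this sketch, `record_frame` requires at least one episode to have been started, since it indexes the last entry of the episode list.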
I trust you've run this before so this works :)
Code LGTM, a couple nits.
if os.path.exists(output_folder):
    shutil.rmtree(output_folder)
Just making sure this will not delete useful data?
This only contains the .json.gz file. The directory is expected to contain more data in the future (e.g. replay file, screenshots, etc.).
You could however change the path in the config to any directory 🤔
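To make the concern concrete, here is a minimal sketch of the clear-then-write pattern under discussion, using only the standard library. `save_session` and its parameters are hypothetical names chosen for this example, not the PR's actual function. Because `shutil.rmtree` removes everything under the folder, pointing the configured path at a directory holding other data would delete that data as well.

```python
import gzip
import json
import os
import shutil
import tempfile
from typing import Any, Dict


def save_session(
    data: Dict[str, Any],
    output_folder: str,
    filename: str = "session.json.gz",
) -> str:
    """Clear output_folder, then write data as gzip-compressed JSON.

    WARNING: any existing content of output_folder is deleted.
    """
    if os.path.exists(output_folder):
        shutil.rmtree(output_folder)
    os.makedirs(output_folder)

    path = os.path.join(output_folder, filename)
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(data, f)
    return path


# Usage: write into a throwaway directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "output")
    path = save_session({"episodes": []}, out)
    with gzip.open(path, "rt", encoding="utf-8") as f:
        loaded = json.load(f)
```

Reserving a dedicated subfolder for the recorder, as the reply above implies, keeps the deletion scoped to session output.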
* Add session management.
* Formatting changes.
* Add clarifications to episode resolution.
* Document temporary hack to check for client-side loading status.
* Add session recorder, ui events and data upload.
* Change path handling in session upload code.
Motivation and Context
This changeset enables the rearrange_v2 app to record data for HITL experiments.
How it works
This adds a session recorder, whose lifetime is tied to a single HITL session.
When reaching the End Session state (see this PR), the session is uploaded to S3 before being destroyed.
Configuration:
The following configuration will do the following:
output/session.json.gz
Placeholder/[completed/incomplete]/session.json.gz
The bucket is defined via the environment variable S3_BUCKET.
Notes
Depends on:
How Has This Been Tested
Tested on EC2 instances running single and multi user applications.
Types of changes
Checklist