This folder contains training, validation, and unlabeled test sets for HellaSwag, in .jsonl
format. Here's what each dataset example contains:
- `ind`: dataset ID
- `activity_label`: The ActivityNet or WikiHow label for this example
- context: There are two formats. The full context is in `ctx`. When the context ends in an (incomplete) noun phrase, like for ActivityNet, this incomplete noun phrase is in `ctx_b`, and the context up until then is in `ctx_a`. This can be useful for models such as BERT that need the last sentence to be complete. However, it's never required. If `ctx_b` is nonempty, then `ctx` is the same thing as `ctx_a`, followed by a space, then `ctx_b`.
- `endings`: a list of 4 endings. The correct index is given by `label` (0, 1, 2, or 3)
- `split`: train, val, or test.
- `split_type`: `indomain` if the activity label is seen during training, else `zeroshot`
- `source_id`: Which video or WikiHow article this example came from
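
The fields described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the helper names (`read_jsonl`, `full_context`, `gold_ending`) and the sample record are made up for illustration, and the sketch assumes that `label` is simply absent from the unlabeled test set.

```python
import json
from typing import Iterator, Optional


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one example (a dict) per non-empty line of a .jsonl file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def full_context(example: dict) -> str:
    """Rebuild the full context: ctx_a, a space, then ctx_b (when ctx_b is nonempty)."""
    ctx_b = example.get("ctx_b", "")
    if ctx_b:
        return example["ctx_a"] + " " + ctx_b
    # Nothing to join when ctx_b is empty, so fall back to the stored ctx.
    return example["ctx"]


def gold_ending(example: dict) -> Optional[str]:
    """Return the correct ending, or None if the example has no label (assumed for test)."""
    label = example.get("label")
    if label is None:
        return None
    return example["endings"][int(label)]


# Hypothetical record with placeholder values; not taken from the dataset.
example = {
    "ind": 0,
    "activity_label": "Placeholder activity",
    "ctx_a": "A person stands in a kitchen.",
    "ctx_b": "The person",
    "ctx": "A person stands in a kitchen. The person",
    "endings": ["ending zero", "ending one", "ending two", "ending three"],
    "label": 2,
    "split": "train",
    "split_type": "indomain",
    "source_id": "placeholder-source",
}

assert full_context(example) == example["ctx"]
print(gold_ending(example))  # ending two
```

To iterate over a real file, pass its path to `read_jsonl`, for example `for ex in read_jsonl(path_to_split): ...`, where `path_to_split` points at one of the .jsonl files in this folder.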