Hi, thanks for releasing the code! I want to use the video emotion recognition network, and I noticed a possible issue in the TransformerEncoder module it uses. It looks like a newly computed encoded_feature overwrites the encoded_feature that was previously computed with the ALiBi mask, so the masked result is discarded. This does not match the description in the paper. A minimal sketch of the pattern I mean follows.
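Something like this, where the class and variable names are my own illustration rather than the actual code from the repository:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the suspected pattern (names are illustrative,
# not copied from the repo): the first encoder call uses the ALiBi mask,
# but its result is immediately overwritten by a second call without it,
# so the ALiBi bias never reaches the output.

class VideoEmotionEncoder(nn.Module):
    def __init__(self, d_model=128, nhead=4, num_layers=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)

    def forward(self, x, alibi_mask):
        # Pass 1: attention biased by the additive ALiBi mask.
        encoded_feature = self.encoder(x, mask=alibi_mask)
        # Pass 2: recomputed without the mask -- this assignment discards
        # the masked result above, which is what looks like the bug.
        encoded_feature = self.encoder(x)
        return encoded_feature
```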
I also wanted to ask: what value do you usually use for the sequence length T?
Can you be a bit more specific with your question? What lines of code are you referring to?
Thanks to ALiBi, the transformer should be fairly robust to varying T. In training we set T=150 (i.e., up to 6 s). Most MEAD videos are shorter than that, though.
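For readers wondering why ALiBi gives this robustness: the attention bias depends only on the relative distance between positions, not on absolute indices, so the same construction works for any T. Here is a sketch of an ALiBi-style additive bias (following Press et al., "Train Short, Test Long"); the symmetric bidirectional form and the exact slope schedule are my assumptions, not necessarily what this repository does:

```python
import torch

# Sketch of an ALiBi-style additive attention bias. Because the bias is a
# function of |i - j| only, it can be rebuilt for any sequence length T,
# which is why the encoder trained with T=150 (~6 s at 25 fps) can still
# handle shorter or longer clips.

def alibi_bias(num_heads: int, T: int) -> torch.Tensor:
    # Geometric head slopes as in the ALiBi paper (power-of-two head counts).
    slopes = torch.tensor(
        [2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)]
    )
    # Relative-distance matrix: distance[i, j] = -|i - j| (symmetric,
    # bidirectional variant -- an assumption for a non-causal encoder).
    pos = torch.arange(T)
    distance = -(pos[None, :] - pos[:, None]).abs().float()
    # (num_heads, T, T) additive bias, added to the attention logits per head.
    return slopes[:, None, None] * distance[None, :, :]

bias = alibi_bias(num_heads=4, T=150)  # works equally for T=80, T=300, ...
```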