A platform for quick and easy development of deep learning networks for recognition and detection in videos. Includes popular models like C3D and SSD.
Check out our wiki!
Model Architecture | Dataset | ViP Accuracy (%) |
---|---|---|
I3D | HMDB51 (Split 1) | 72.75 |
C3D | HMDB51 (Split 1) | 50.14 ± 0.777 |
C3D | UCF101 (Split 1) | 80.40 ± 0.399 |
Model Architecture | Dataset | ViP Accuracy (%) |
---|---|---|
SSD300 | VOC2007 | 76.58 |
Model Architecture | Dataset | ViP Accuracy (%) |
---|---|---|
DVSA (+fw, obj) | YC2-BB (Validation) | 30.09 |
fw: framewise weighting, obj: object interaction
Please cite ViP when releasing any work that uses this platform: https://arxiv.org/abs/1910.02793
```
@article{ganesh2019vip,
  title={ViP: Video Platform for PyTorch},
  author={Ganesh, Madan Ravi and Hofesmann, Eric and Louis, Nathan and Corso, Jason},
  journal={arXiv preprint arXiv:1910.02793},
  year={2019}
}
```
Dataset | Task(s) |
---|---|
HMDB51 | Activity Recognition |
UCF101 | Activity Recognition |
ImageNetVID | Video Object Detection |
MSCOCO 2014 | Object Detection, Keypoints |
VOC2007 | Object Detection, Classification |
YC2-BB | Video Object Grounding |
DHF1K | Video Saliency Prediction |
Model | Task(s) |
---|---|
C3D | Activity Recognition |
I3D | Activity Recognition |
SSD300 | Object Detection |
DVSA (+fw, obj) | Video Object Grounding |
- Python 3.6
- CUDA 9.0
- (Suggested) Virtualenv
```bash
# Set up Python3 virtual environment
virtualenv -p python3.6 --no-site-packages vip
source vip/bin/activate

# Clone ViP repository
git clone https://github.com/MichiganCOG/ViP
cd ViP

# Install requirements and model weights
./install.sh
```
Run `train.py` and `eval.py` to train or test any implemented model. The parameters of every experiment are specified in its `config.yaml` file. Use the `--cfg_file` command line argument to point to a different config yaml file. Additionally, any config parameter can be overridden with a command line argument.
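For example, assuming the model's config yaml defines a `batch_size` key (the key name here is illustrative), it can be overridden without editing the file:

```bash
python train.py --cfg_file models/c3d/config_train.yaml --batch_size 4
```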
Run `eval.py` with the `--cfg_file` argument pointing to the desired model's config yaml file. For example, from the root directory of ViP, evaluate the action recognition network C3D on HMDB51:

```bash
python eval.py --cfg_file models/c3d/config_test.yaml
```
Run `train.py` with the `--cfg_file` argument pointing to the desired model's config yaml file. For example, from the root directory of ViP, train the action recognition network C3D on HMDB51:

```bash
python train.py --cfg_file models/c3d/config_train.yaml
```
Additional examples can be found on our wiki.
New models and datasets can be added without needing to rewrite any training, evaluation, or data loading code.
To add a new model:

- Create a new folder `ViP/models/custom_model_name`
- Create a model class in `ViP/models/custom_model_name/custom_model_name.py`
  - Complete the `__init__`, `forward`, and (optional) `__load_pretrained_weights` functions
- Add `PreprocessTrain` and `PreprocessEval` classes within `custom_model_name.py` (see the sketch after this list)
- Create `config_train.yaml` and `config_test.yaml` files for the new model
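For concreteness, here is a minimal sketch of what `ViP/models/custom_model_name/custom_model_name.py` could look like. The class layout follows the steps above, but the layers, weight path, and transform bodies are placeholder assumptions, not the platform's actual API:

```python
import torch
import torch.nn as nn

class CustomModelName(nn.Module):
    def __init__(self, **kwargs):
        super().__init__()
        # Hypothetical 3D-conv backbone and classifier for a recognition task
        self.backbone   = nn.Conv3d(3, 64, kernel_size=3, padding=1)
        self.classifier = nn.Linear(64, kwargs.get('labels', 51))

    def forward(self, x):
        # x: video clips shaped (batch, channels, frames, height, width)
        feats = self.backbone(x)
        feats = feats.mean(dim=[2, 3, 4])  # global average pooling
        return self.classifier(feats)

    def __load_pretrained_weights(self):
        # Optional: load weights fetched by install.sh (path is illustrative)
        self.load_state_dict(torch.load('weights/custom_model_name.pkl'))

class PreprocessTrain(object):
    # Training-time clip transforms; real models apply augmentation here
    def __call__(self, input_data):
        return input_data

class PreprocessEval(object):
    # Evaluation-time clip transforms; typically deterministic resize/crop
    def __call__(self, input_data):
        return input_data
```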
Examples of previously implemented models can be found here.
Additional information can be found on our wiki.
To add a new dataset:

- Convert annotation data to our JSON format
- Create a dataset class in `ViP/datasets/custom_dataset_name.py` (see the sketch after this list)
  - Inherit `DetectionDataset` or `RecognitionDataset` from `ViP/abstract_dataset.py`
  - Complete the `__init__` and `__getitem__` functions
  - An example skeleton dataset can be found here
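Below is a minimal sketch of `ViP/datasets/custom_dataset_name.py` following the steps above. The JSON file name, its keys, and the stubbed clip loading are illustrative assumptions, not the exact annotation schema; consult the skeleton dataset and wiki for the real format:

```python
import json

import torch
from abstract_dataset import RecognitionDataset  # base class from ViP/abstract_dataset.py

class CustomDatasetName(RecognitionDataset):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Load annotations converted to the JSON format in step 1
        # (file name and keys below are assumptions, not the exact schema)
        with open('custom_dataset_name.json') as f:
            self.samples = json.load(f)

    def __getitem__(self, idx):
        sample = self.samples[idx]
        # Clip decoding is stubbed with random data for illustration
        clip  = torch.randn(3, 16, 112, 112)   # (channels, frames, H, W)
        label = torch.tensor(sample['action'])  # assumed annotation key
        return {'data': clip, 'labels': label}
```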
Additional information can be found on our wiki.
A detailed FAQ can be found on our wiki.