This code is based on the official code base for "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives" (Faghri, Fleet, Kiros, Fidler. 2017).
We recommend using Anaconda to install the following packages.
import nltk
nltk.download()
> d punkt
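The interactive downloader session above can also be replaced by a single non-interactive call, which is convenient on a headless machine. The sketch below assumes that only the `punkt` tokenizer data is needed, as the session above suggests. The helper name `download_punkt` is introduced here for illustration and is not part of the code base.

```python
def download_punkt(quiet=True):
    """Fetch the NLTK 'punkt' tokenizer data without the interactive prompt.

    Returns True if the download succeeded or the data is already up to date.
    """
    # Imported lazily so that merely defining this helper does not require nltk.
    import nltk

    return nltk.download("punkt", quiet=quiet)
```

Calling `download_punkt()` once after installing the packages should have the same effect as typing `d punkt` in the interactive downloader.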
Download the Multi30K caption data by cloning the official repo:
git clone https://github.com/multi30k/dataset
The pre-computed image features for Multi30K are available on Google Drive.
To run experiments on COCO and F30K, download the dataset files and pre-trained models. The splits are the same as those used by Andrej Karpathy. The precomputed image features are from here and here. To use full image encoders, download the images from their original sources here, here and here.
wget http://www.cs.toronto.edu/~faghri/vsepp/vocab.tar
wget http://www.cs.toronto.edu/~faghri/vsepp/data.tar
wget http://www.cs.toronto.edu/~faghri/vsepp/runs.tar
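Once the three archives are downloaded they still need to be unpacked. The following is a minimal sketch of that step, assuming the files are plain tar archives sitting in the current directory. The helper `extract_archives` and the choice of destination directory are illustrative assumptions, not something this README specifies.

```python
import tarfile
from pathlib import Path


def extract_archives(archive_names, src_dir=".", dest_dir="."):
    """Extract each named tar archive found in src_dir into dest_dir.

    Returns the names of the archives that were actually extracted.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for name in archive_names:
        archive = src / name
        if not archive.is_file():
            # Skip archives that have not been downloaded yet.
            continue
        with tarfile.open(archive) as tar:
            tar.extractall(dest)
        extracted.append(name)
    return extracted
```

For example, `extract_archives(["vocab.tar", "data.tar", "runs.tar"])` unpacks whichever of the three files are present and reports which ones it handled.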
All commands should be concatenated with `--data_name m30k --img_dim 2048 --max_violation --patience 10` and given a `--seed`.
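To make the composition of a full command concrete, here is a small sketch that joins the method-specific arguments from the tables with the shared flags and a seed. The entry-point name `train.py` is an assumption (this README does not name the script), and `build_command` is a hypothetical helper rather than part of the code base.

```python
import shlex

# Flags that the text says every command shares.
COMMON_ARGS = "--data_name m30k --img_dim 2048 --max_violation --patience 10"


def build_command(method_args, seed, script="train.py"):
    """Return the full argument list for one run.

    `script` is a hypothetical entry point; adjust it to the actual
    training script in the repository.
    """
    return (
        ["python", script]
        + shlex.split(method_args)
        + shlex.split(COMMON_ARGS)
        + ["--seed", str(seed)]
    )


# Example: the "Bilingual + c2c" row, run with seed 1.
cmd = build_command("--lang en-de --sentencepair", seed=1)
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run`, which makes it straightforward to loop over several seeds for the same table row.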
Method | Arguments |
---|---|
Monolingual English | --lang en --num_epochs 1000 |
Monolingual German | --lang de --num_epochs 1000 |
Bilingual | --lang en-de |
Bilingual + c2c | --lang en-de --sentencepair |
Method | Arguments |
---|---|
Monolingual English | --lang en1 --num_epochs 1000 |
Monolingual German | --lang de1 --num_epochs 1000 |
Bi-translation | --lang en1-de1 |
Bi-translation + c2c | --lang en1-de1 --sentencepair |
Bi-comparable | --lang en-de --undersample |
Bi-comparable + c2c | --lang en-de --undersample --sentencepair |
Method | Arguments |
---|---|
Full Monolingual English | --lang en --num_epochs 1000 |
Full Monolingual German | --lang de --num_epochs 1000 |
Half Monolingual English | --lang en --half --num_epochs 1000 |
Half Monolingual German | --lang de --half --num_epochs 1000 |
Bi-aligned | --lang en-de --half |
Bi-aligned + c2c | --lang en-de --half --sentencepair |
Bi-disjoint | --lang en-de --half --disaligned |
Method | Arguments |
---|---|
Monolingual English | --lang en1 --num_epochs 1000 |
Monolingual German | --lang de1 --num_epochs 1000 |
Monolingual French | --lang fr --num_epochs 1000 |
Monolingual Czech | --lang cs --num_epochs 1000 |
Multi-translation | --lang en1-de1-fr-cs |
Multi-translation + c2c | --lang en1-de1-fr-cs --sentencepair |
Multi-comparable | --lang en-de-fr-cs --undersample |
Multi-comparable + c2c | --lang en-de-fr-cs --undersample --sentencepair |
Method | Arguments |
---|---|
Monolingual French | --lang fr --num_epochs 1000 |
Monolingual Czech | --lang cs --num_epochs 1000 |
Multilingual French | --lang en1-de1-fr-cs --primary fr |
Multilingual Czech | --lang en1-de1-fr-cs --primary cs |
+ Comparable French | --lang en-de-fr-cs --primary fr |
+ Comparable Czech | --lang en-de-fr-cs --primary cs |
+ Comparable + c2c French | --lang en-de-fr-cs --primary fr --sentencepair |
+ Comparable + c2c Czech | --lang en-de-fr-cs --primary cs --sentencepair |
If you found this code useful, please cite the following paper:
@article{kadar2018lessons,
title={Lessons learned in multilingual grounded language learning},
author={K{\'a}d{\'a}r, {\'A}kos and Elliott, Desmond and C{\^o}t{\'e}, Marc-Alexandre and Chrupa{\l}a, Grzegorz and Alishahi, Afra},
journal={arXiv preprint arXiv:1809.07615},
year={2018}
}