CrossModalLearning

Introduction

Recent work such as Text2Shape and Y2Seq2Seq bridges the gap between natural language descriptions and 3D shapes by learning a cross-modal representation. For shape-to-text (S2T) and text-to-shape (T2S) retrieval, we further learn the correlation between 3D shapes and text descriptions by combining cross-modal reconstruction with metric learning, producing a joint representation that captures the many-to-many relations between language and physical properties.
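To make the metric-learning and retrieval ideas concrete, here is a minimal, framework-free sketch. It is not the code from the notebooks: the function names (`cosine`, `triplet_loss`, `retrieve`), the margin value, and the toy embeddings are all assumptions chosen for illustration, and the cross-modal reconstruction part is not covered.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: the matching pair should beat the mismatched pair by `margin`.

    The margin of 0.2 is an arbitrary illustrative value.
    """
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))

def retrieve(query, gallery, k=1):
    """Return indices of the k gallery embeddings most similar to the query."""
    ranked = sorted(
        range(len(gallery)),
        key=lambda i: cosine(query, gallery[i]),
        reverse=True,
    )
    return ranked[:k]

# Toy, hand-written embeddings (hypothetical values, not model outputs).
text_emb = [[1.0, 0.1], [0.0, 1.0]]   # two descriptions
shape_emb = [[0.9, 0.2], [0.1, 0.8]]  # the two matching shapes, same order

t2s = retrieve(text_emb[0], shape_emb)  # text-to-shape retrieval
s2t = retrieve(shape_emb[1], text_emb)  # shape-to-text retrieval
loss = triplet_loss(text_emb[0], shape_emb[0], shape_emb[1])
```

Because both modalities live in one embedding space, the same `retrieve` function serves T2S and S2T; only the roles of query and gallery are swapped. The loss is zero once the matching pair is already more similar than the mismatched pair by at least the margin.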

Video

Important considerations:

All of the code was written to run on Google Colab so that GPUs can be used to speed up computation. You may need to make minor changes to adapt it to a different environment.
Using GPUs is strongly recommended. Otherwise, adversarial attack generation and model predictions become very slow.
All results can be reproduced using these scripts. Data is also provided so that the whole process need not be repeated and to help future researchers.
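When adapting the code to an environment other than Colab, a quick check of whether a GPU is visible can save time. The sketch below assumes the project uses PyTorch, which this README does not state explicitly; `pick_device` is a hypothetical helper, not part of the repository.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # assumed framework; not confirmed by this README
    except ImportError:
        # Without PyTorch installed there is nothing to query, so fall back.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Running on: {device}")
```

If this prints `cpu` on Colab, the runtime type is probably not set to a GPU runtime.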

Folders and files

AttnDecoderRNN.py
CrossModal.ipynb
CrossModal_Triplet.ipynb
CrossModal_Triplet_new.ipynb
PointNet.py
PointNet1.py
PointNet2.py
README.md
TextDecoder.py
descriptions.csv
embedding.py
embeddings.py
test.csv
text_extraction_torch.py
vocabulary.py

shrey-1995/CrossModalLearning

Folders and files

Latest commit

History

Repository files navigation

CrossModalLearning

Introduction

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages