update README
jadore801120 committed Jun 20, 2017
1 parent 5d69e6f commit 89a0743
Showing 1 changed file with 28 additions and 18 deletions.
46 changes: 28 additions & 18 deletions README.md
This is a PyTorch implementation of the Transformer model in "[Attention is All You Need](https://arxiv.org/abs/1706.03762)" (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arxiv, 2017).


A novel sequence-to-sequence framework that utilizes the **self-attention mechanism**, instead of convolution operations or recurrent structures, and achieves state-of-the-art performance on the **WMT 2014 English-to-German translation task**. (2017/06/12)

> The official Tensorflow implementation can be found at: [tensorflow/tensor2tensor](https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py).
> To learn more about the self-attention mechanism, you can read "[A Structured Self-attentive Sentence Embedding](https://arxiv.org/abs/1703.03130)".
<img src="http://imgur.com/1krF2R6.png" width="250">

The project now supports training and translation with a trained model.

Note that this project is still a work in progress.




# Requirements
- Python 3.4+
- PyTorch 0.1.12
- tqdm
- numpy


# Usage

## 0) Prepare the data
```bash
python preprocess.py -train_src train.src.txt -train_tgt train.tgt.txt -valid_src valid.src.txt -valid_tgt valid.tgt.txt -output data.pt
```
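Conceptually, a preprocessing step like this builds a shared vocabulary from the training text and converts each sentence into a sequence of token indices before serializing everything (presumably with `torch.save`) into `data.pt`. The sketch below is a hypothetical, simplified illustration of that idea, not the actual logic of `preprocess.py`; the special-token indices and helper names are assumptions.

```python
# Hypothetical sketch of the preprocessing idea: build a word-to-index
# vocabulary and map sentences to index sequences with BOS/EOS markers.
PAD, UNK, BOS, EOS = 0, 1, 2, 3  # conventional special-token indices (assumed)

def build_vocab(sentences, min_count=1):
    counts = {}
    for sent in sentences:
        for word in sent.split():
            counts[word] = counts.get(word, 0) + 1
    vocab = {'<pad>': PAD, '<unk>': UNK, '<s>': BOS, '</s>': EOS}
    for word, count in counts.items():
        if count >= min_count and word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def to_indices(sentence, vocab):
    # Unknown words fall back to the <unk> index.
    return [BOS] + [vocab.get(w, UNK) for w in sentence.split()] + [EOS]

train_src = ["the cat sat", "the dog ran"]
vocab = build_vocab(train_src)
print(to_indices("the cat ran", vocab))  # → [2, 4, 5, 8, 3]
```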

## 1) Training
```bash
python train.py -data data.pt -save trained.chkpt -save_mode best -embs_share_weight -proj_share_weight
```
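The `-embs_share_weight` and `-proj_share_weight` flags suggest weight tying: the decoder's output projection reuses the (transposed) target embedding matrix, so logits are scored against the same vectors used for lookup, which shrinks the parameter count. The snippet below is a minimal NumPy sketch of that idea under assumed shapes, not the project's actual implementation.

```python
import numpy as np

# Sketch of tied embedding/projection weights (assumed semantics of
# -proj_share_weight): one shared table serves both input lookup and
# output projection.
vocab_size, d_model = 10, 4
rng = np.random.default_rng(0)
embedding = rng.normal(size=(vocab_size, d_model))  # the single shared table

def embed(token_ids):
    # Input side: look up rows of the shared table.
    return embedding[token_ids]

def project(hidden):
    # Output side: logits = hidden @ embedding^T, reusing the same table.
    return hidden @ embedding.T

h = embed([3, 7])        # shape (2, d_model)
logits = project(h)      # shape (2, vocab_size)
print(logits.shape)      # → (2, 10)
```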


## 2) Testing
```bash
python translate.py -model trained.chkpt -vocab data.pt -src test.src.txt
```
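Translation decoders for this kind of model typically use beam search: instead of greedily taking the single best token at each step, the k best partial hypotheses are kept and extended. The toy sketch below (hypothetical scores, simplified interface) illustrates the mechanism only; it is not the decoding code in `translate.py`.

```python
import math

# Toy beam-search sketch: step_logprobs is a list of per-step
# token -> log-probability dicts (hypothetical, for illustration).
def beam_search(step_logprobs, beam_size=2):
    beams = [([], 0.0)]  # (partial sequence, cumulative log-prob)
    for logprobs in step_logprobs:
        candidates = []
        for seq, score in beams:
            for tok, lp in logprobs.items():
                candidates.append((seq + [tok], score + lp))
        # Keep only the beam_size highest-scoring hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams

steps = [
    {'a': math.log(0.6), 'b': math.log(0.4)},
    {'a': math.log(0.5), 'b': math.log(0.5)},
]
best_seq, best_score = beam_search(steps)[0]
print(best_seq)  # → ['a', 'a']
```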

---
If you have any suggestions or spot any errors, feel free to open an issue to let me know. :)
### TODO
- Evaluation of the generated text.
- Attention weight plotting.
---
# Acknowledgement
- The project structure and some scripts are heavily borrowed from [OpenNMT/OpenNMT-py](https://github.com/OpenNMT/OpenNMT-py)
