PyTorch implementation of the Meta-BiLSTM sequence tagger from this paper https://arxiv.org/pdf/1805.08237v1.pdf, along with an added GRU-cell variant for comparing the performance of the two.
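A minimal sketch of the LSTM-vs-GRU comparison, assuming a generic bidirectional tagger rather than the exact meta-model from the paper or the code in this repo (the class and argument names here are hypothetical): the recurrent cell type is a constructor argument, so the LSTM and GRU variants share everything else.

```python
import torch
import torch.nn as nn

class BiRNNTagger(nn.Module):
    """Hypothetical bidirectional tagger; `rnn_cell` switches between LSTM and GRU."""

    def __init__(self, vocab_size, embed_dim, hidden_dim, num_tags, rnn_cell="lstm"):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        rnn_cls = nn.LSTM if rnn_cell == "lstm" else nn.GRU
        self.rnn = rnn_cls(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        outputs, _ = self.rnn(embedded)        # (batch, seq_len, 2 * hidden_dim)
        return self.proj(outputs)              # per-token tag scores

# Same interface, different recurrent cell:
lstm_tagger = BiRNNTagger(vocab_size=10_000, embed_dim=100, hidden_dim=128, num_tags=17)
gru_tagger = BiRNNTagger(vocab_size=10_000, embed_dim=100, hidden_dim=128, num_tags=17, rnn_cell="gru")
scores = gru_tagger(torch.randint(0, 10_000, (2, 12)))   # shape: (2, 12, 17)
```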
- conllu
The dataset download links and instructions are in the scripts directory.
The rar file needs to be extracted into the data/embeddings directory, and the CoNLLU dataset needs to be placed in the root of the directory.
- Name ID - 1
- Title, affiliation of authors - 1
- Description - Aim, Methodology, Outcome - 3/4
- Concepts - 3/4
- Dataset Details - 2/3
- Allotted tasks and progress - 1/2
- Implementation details, pseudocode - 4/6
- Results/discussions - 3/4
- Comparison of results - 1/2
- Challenges - 2/3
- Scope - 1
- Experience/Learning Outcomes - 1/2
Ref 1: Embeddings
- Need for char based models? Some languages lack word segmentation, and they help with handling informal language.
- Benefits of char based models:
- Generate embeddings for unknown words.
- Similar words have similar embeddings.
- Subword Models (see the BPE sketch after this list):
  - BYTE PAIR ENCODING: repeatedly look for the most frequent pair of adjacent symbols (initially single bytes or characters) and add that pair as a new element of the vocabulary. Essentially character n-grams: the most frequent n-gram pairs are encapsulated into new n-grams.
  - WORDPIECE / SENTENCEPIECE: a greedy approximation to maximizing language-model log-likelihood chooses the pieces, adding the n-gram that maximally reduces perplexity. WordPiece tokenizes inside words: it tokenizes into words first, then applies BPE. In the SentencePiece model, whitespace is retained as a special token and grouped like any other symbol.
- Hybrid character and word level models
- The main issue with one-hot encodings is that the transformation does not rely on any supervision. Embeddings can be greatly improved by learning them with a neural network on a supervised task: the embeddings become parameters (weights) of the network and are adjusted to minimize the loss on that task (a minimal sketch follows below).
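A toy illustration of the BPE merge loop described above (not the tokenizer used in this repo): count adjacent symbol pairs over a word-frequency table and repeatedly merge the most frequent pair into a new vocabulary symbol.

```python
from collections import Counter

def bpe_merges(word_freqs, num_merges):
    """Toy BPE: treat each word as a sequence of characters and repeatedly
    merge the most frequent adjacent pair into a new vocabulary symbol."""
    vocab = {tuple(word): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)  # most frequent adjacent pair
        merges.append(best)
        new_vocab = {}
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])  # fuse into one symbol
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] = freq
        vocab = new_vocab
    return merges, vocab

# "low" appears 5 times and "lower" twice; ('l', 'o') is merged first, then ('lo', 'w').
merges, vocab = bpe_merges({"low": 5, "lower": 2}, num_merges=3)
print(merges)   # e.g. [('l', 'o'), ('lo', 'w'), ('low', 'e')]
```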
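And a minimal sketch of the point about learned embeddings (all sizes and data here are made-up placeholders): the `nn.Embedding` weights are ordinary parameters, so training on a supervised objective updates the embedding table along with the rest of the network.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: 1000-word vocabulary, 50-dim embeddings, 5 tag classes.
embedding = nn.Embedding(num_embeddings=1000, embedding_dim=50)
classifier = nn.Linear(50, 5)
model = nn.Sequential(embedding, classifier)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

token_ids = torch.randint(0, 1000, (32,))  # a batch of 32 token ids (random stand-ins)
gold_tags = torch.randint(0, 5, (32,))     # their (random) gold labels

logits = model(token_ids)      # embedding lookup followed by a linear projection
loss = loss_fn(logits, gold_tags)
loss.backward()                # gradients flow into the embedding table
optimizer.step()               # the embedding weights themselves are updated
```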