MetaBiLSTM

PyTorch implementation of the Meta-BiLSTM sequence tagger from the paper https://arxiv.org/pdf/1805.08237v1.pdf, along with a GRU-cell variant, comparing the performance of both.
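As a rough sketch of the idea (illustrative only; module names, dimensions, and the precomputed character features are assumptions, and the paper's separate character-model and word-model tag losses are omitted):

```python
import torch
import torch.nn as nn

class MetaBiLSTMTagger(nn.Module):
    """Sketch: a word BiLSTM and a character-feature BiLSTM each produce
    per-token states; a meta BiLSTM reads their concatenation and tags."""

    def __init__(self, vocab_size, n_tags, word_dim=128, char_dim=64, hidden=64):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.word_lstm = nn.LSTM(word_dim, hidden, bidirectional=True, batch_first=True)
        # In the paper, per-token character features come from a character
        # BiLSTM; here they are assumed precomputed and passed in directly.
        self.char_lstm = nn.LSTM(char_dim, hidden, bidirectional=True, batch_first=True)
        self.meta_lstm = nn.LSTM(4 * hidden, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, word_ids, char_feats):
        w, _ = self.word_lstm(self.word_emb(word_ids))    # (B, T, 2*hidden)
        c, _ = self.char_lstm(char_feats)                 # (B, T, 2*hidden)
        m, _ = self.meta_lstm(torch.cat([w, c], dim=-1))  # (B, T, 2*hidden)
        return self.out(m)                                # (B, T, n_tags)

tagger = MetaBiLSTMTagger(vocab_size=10000, n_tags=17)
scores = tagger(torch.randint(0, 10000, (2, 8)), torch.randn(2, 8, 64))
```

Replacing each nn.LSTM with nn.GRU (which drops the cell state but returns outputs of the same shape) gives the GRU variant compared here.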

Requirements

  1. conllu

Usage Instructions

The dataset download links and instructions are in the scripts directory. The rar file needs to be extracted into the data/embeddings directory, and the CoNLL-U dataset needs to be placed in the root of the dir directory.
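For reference, a minimal sketch of reading a CoNLL-U file with the conllu package (the file path below is a placeholder, not the repo's layout):

```python
from conllu import parse_incr

# Stream sentences from a CoNLL-U file; the path is a placeholder.
with open("data/train.conllu", encoding="utf-8") as f:
    for sentence in parse_incr(f):
        words = [token["form"] for token in sentence]
        tags = [token["upos"] for token in sentence]
        print(words[:5], tags[:5])
        break
```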

TODO

  1. Add a GRU config option in the JSON file (a possible shape is sketched after this list)
  2. Fix the Meta-BiLSTM layer; its loss is horrible compared to the character- and word-based models
  3. Add graphs and plots for performance
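One way the config-driven cell choice could look (hypothetical; the key names "cell", "input_size", and "hidden_size" are illustrative, not the repo's actual JSON schema):

```python
import torch.nn as nn

# Map a config string to the corresponding PyTorch recurrent module.
RNN_CELLS = {"lstm": nn.LSTM, "gru": nn.GRU}

def build_rnn(config):
    cell = RNN_CELLS[config.get("cell", "lstm")]
    return cell(config["input_size"], config["hidden_size"],
                bidirectional=True, batch_first=True)

rnn = build_rnn({"cell": "gru", "input_size": 128, "hidden_size": 64})
```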

Slides (topic - slide count):

  1. Name ID - 1
  2. Title, affiliation of authors - 1
  3. Description - Aim, Methodology, Outcome - 3/4
  4. Concepts - 3/4
  5. Dataset Details - 2/3
  6. Allotted tasks and progress - 1/2
  7. Implementation details, pseudocode - 4/6
  8. Results/discussions - 3/4
  9. Comparison of results - 1/2
  10. Challenges - 2/3
  11. Scope - 1
  12. Experience/Learning Outcomes - 1/2

Interim Notes

Ref 1: Embeddings

  1. Why character-based models? Some languages have no word segmentation, and they handle informal language better.
  2. Benefits of character-based models:
    • They generate embeddings for unknown words.
    • Similar words get similar embeddings.
  3. Subword models:
    a. Byte Pair Encoding (BPE): repeatedly find the most frequent pair of adjacent symbols and add that pair to the vocabulary as a new symbol. Essentially character n-grams: the most frequent n-gram pairs are merged into new n-grams (see the first sketch after this list).
    b. WordPiece/SentencePiece: a greedy approximation that maximizes language-model log-likelihood, adding the n-gram that maximally reduces perplexity. WordPiece tokenizes inside words (it tokenizes into words first, then applies BPE-style merges); SentencePiece retains whitespace as a special token and groups it normally.
  4. Hybrid character- and word-level models.
  5. The main issue with one-hot encoding is that the transformation does not rely on any supervision. Embeddings improve greatly when learned by a neural network on a supervised task: they form the parameters (weights) of the network, which are adjusted to minimize the loss on that task (see the second sketch after this list).
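The BPE merge step from item 3, as a minimal self-contained sketch (toy corpus; tie-breaking between equally frequent pairs is not handled specially):

```python
from collections import Counter

def bpe_merge_step(corpus):
    """One BPE iteration: count adjacent symbol pairs across the corpus
    (each word is a tuple of symbols with a frequency), then merge the
    most frequent pair everywhere it occurs."""
    pairs = Counter()
    for word, freq in corpus.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    best = max(pairs, key=pairs.get)
    merged = {}
    for word, freq in corpus.items():
        out, i = [], 0
        while i < len(word):
            if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                out.append(word[i] + word[i + 1])  # fuse the pair into one symbol
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = freq
    return merged, best

# Toy corpus: ("l", "o") occurs 9 times, so it is merged into "lo" first.
corpus = {("l", "o", "w"): 5, ("l", "o", "g"): 4, ("n", "e", "w"): 3}
corpus, pair = bpe_merge_step(corpus)
print(pair)    # ('l', 'o')
print(corpus)  # {('lo', 'w'): 5, ('lo', 'g'): 4, ('n', 'e', 'w'): 3}
```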
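And for item 5, a few lines showing that learned embeddings are ordinary trainable weights; the loss here is a stand-in for a real supervised objective:

```python
import torch
import torch.nn as nn

emb = nn.Embedding(1000, 16)                     # one 16-dim row per vocab id
opt = torch.optim.SGD(emb.parameters(), lr=0.1)

ids = torch.tensor([3, 7])
loss = emb(ids).sum()    # stand-in for a real supervised task loss
loss.backward()          # gradients flow into the embedding rows...
opt.step()               # ...which are adjusted to reduce the loss
```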
