1 unstable release
| Version | Release date |
|---|---|
| 0.1.3 | Feb 15, 2023 |
| 0.1.2 | |
| 0.1.1 | |
| 0.1.0 | |
#9 in #word-piece
37 downloads per month
Used in bert_create_pretraining
225KB
373 lines
This crate is a Rust port of Google's BERT WordPiece tokenizer.
bert_tokenizer
This crate provides a port of the original BERT tokenizer from the Google BERT repository.
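For readers unfamiliar with WordPiece, the sketch below illustrates the greedy longest-match-first splitting that BERT-style tokenizers apply to each word. It is not this crate's API: the function name `wordpiece_tokenize`, its signature, and the toy vocabulary are hypothetical and chosen only for illustration. It also leaves out steps a full tokenizer performs, such as text normalization, punctuation splitting, and a per-word length limit.

```rust
use std::collections::HashSet;

/// Greedy longest-match-first WordPiece split of one word that has already
/// been separated from its neighbours. Hypothetical helper for illustration,
/// not a function exported by the crate.
fn wordpiece_tokenize(word: &str, vocab: &HashSet<&str>, unk: &str) -> Vec<String> {
    let chars: Vec<char> = word.chars().collect();
    let mut pieces = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let mut end = chars.len();
        let mut found: Option<String> = None;

        // Try the longest remaining substring first, then shrink it.
        while start < end {
            let mut candidate: String = chars[start..end].iter().collect();
            if start > 0 {
                // Non-initial pieces carry the "##" continuation prefix.
                candidate = format!("##{}", candidate);
            }
            if vocab.contains(candidate.as_str()) {
                found = Some(candidate);
                break;
            }
            end -= 1;
        }

        match found {
            Some(piece) => {
                pieces.push(piece);
                start = end;
            }
            // No vocabulary entry matches: the whole word becomes the unknown token.
            None => return vec![unk.to_string()],
        }
    }
    pieces
}

fn main() {
    // Toy vocabulary (an assumption for this example, not a real BERT vocab file).
    let vocab: HashSet<&str> = ["un", "##aff", "##able", "[UNK]"].iter().copied().collect();

    println!("{:?}", wordpiece_tokenize("unaffable", &vocab, "[UNK]")); // ["un", "##aff", "##able"]
    println!("{:?}", wordpiece_tokenize("xyz", &vocab, "[UNK]")); // ["[UNK]"]
}
```

Matching on `char` boundaries rather than bytes keeps the slicing safe for non-ASCII input. A production tokenizer would load the vocabulary from a vocab file instead of hard-coding it.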
License
MIT license. See the LICENSE file for the full text.
Dependencies
~2.5MB
~48K SLoC