Rustling is a blazingly fast library for computational linguistics. It is written in Rust, with Python bindings.
- N-grams
- Language models
- Hidden Markov models
- Word segmentation
- Part-of-speech tagging
- CHAT parsing for TalkBank and CHILDES data

| Component | Task | Speedup | vs. |
|---|---|---|---|
| Language Models | Fit | 10x | NLTK |
| | Score | 1.9x | NLTK |
| | Generate | 106--114x | NLTK |
| Word Segmentation | LongestStringMatching | 9x | wordseg |
| POS Tagging | Training | 5x | NLTK |
| | Tagging | 18x | NLTK |
| HMM | Fit | 13x | hmmlearn |
| | Predict | 0.9x | hmmlearn |
| | Score | 5x | hmmlearn |
| CHAT Parsing | Reading from a ZIP archive | 43x | pylangacq |
| | Reading from strings | 70x | pylangacq |
| | Parsing utterances | 15x | pylangacq |
| | Parsing tokens | 9x | pylangacq |

See benchmarks/ for reproduction scripts.
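
As a rough illustration of how a figure in the Speedup column can be read, the sketch below times two implementations of the same task and reports the ratio of baseline time to candidate time. It assumes that speedup is defined this way; the scripts in benchmarks/ may measure it differently. The helper `speedup` and the two bigram functions are hypothetical stand-ins and are not part of Rustling or of the libraries in the "vs." column.

```python
import timeit


def speedup(baseline, candidate, repeat=3, number=5):
    """Return baseline_time / candidate_time, using the best of `repeat` runs."""
    t_base = min(timeit.repeat(baseline, repeat=repeat, number=number))
    t_cand = min(timeit.repeat(candidate, repeat=repeat, number=number))
    return t_base / t_cand


# Hypothetical stand-in for a reference (baseline) implementation.
def slow_bigrams(tokens):
    out = []
    for i in range(len(tokens) - 1):
        out.append((tokens[i], tokens[i + 1]))
    return out


# Hypothetical stand-in for the implementation being benchmarked.
def fast_bigrams(tokens):
    return list(zip(tokens, tokens[1:]))


tokens = ["the", "cat", "sat"] * 10_000
ratio = speedup(lambda: slow_bigrams(tokens), lambda: fast_bigrams(tokens))
print(f"{ratio:.1f}x")  # the value varies by machine and run
```

Under this definition, a value above 1x means the candidate is faster than the baseline, and a value below 1x means it is slower.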

Install the Python package with `pip install rustling`, or add the Rust crate with `cargo add rustling`.

Licensed under the MIT License.