Releases · huggingface/transformers
4x speed-up using NVIDIA apex, new multi-choice classifier and example for SWAG-like dataset, pytorch v1.0, improved model loading, improved examples...
New:
- 3-4 times speed-ups in fp16 (versus fp32) thanks to NVIDIA's work on apex (by @FDecaYed)
- new sequence-level multiple-choice classification model + example fine-tuning on SWAG (by @rodgzilla) (sketch after this list)
- improved backward compatibility with Python 3.5 (by @hzhwcmhf)
- bump up to PyTorch 1.0
- load fine-tuned models with `from_pretrained()` (sketch after this list)
- add examples on how to save and load fine-tuned models
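For context, a minimal sketch of calling the new multiple-choice head. The tensor shapes follow the SWAG fine-tuning setup (batch, number of choices, sequence length), the tensor values are placeholders, and the `num_choices` keyword mirrors how the example script configures the head; treat the exact details as assumptions rather than a definitive recipe.

```python
import torch
from pytorch_pretrained_bert import BertForMultipleChoice

# Sketch only: shapes follow the SWAG example, values are placeholders.
model = BertForMultipleChoice.from_pretrained('bert-base-uncased', num_choices=4)

batch_size, num_choices, seq_len = 2, 4, 32
input_ids = torch.zeros(batch_size, num_choices, seq_len, dtype=torch.long)       # token ids (0 = [PAD])
token_type_ids = torch.zeros(batch_size, num_choices, seq_len, dtype=torch.long)  # segment ids
attention_mask = torch.ones(batch_size, num_choices, seq_len, dtype=torch.long)   # padding mask
labels = torch.zeros(batch_size, dtype=torch.long)                                # index of the correct choice

loss = model(input_ids, token_type_ids, attention_mask, labels)   # loss when labels are given
logits = model(input_ids, token_type_ids, attention_mask)         # (batch, num_choices) scores otherwise
```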
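And a minimal sketch of one way to save a fine-tuned model and reload it through `from_pretrained()`. The file name is arbitrary, and passing the fine-tuned weights back in via the `state_dict` keyword reflects the pattern used in the updated examples; consider it an illustration under those assumptions.

```python
import torch
from pytorch_pretrained_bert import BertForSequenceClassification

# Fine-tune, then save only the weights; the file name is arbitrary.
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
# ... training loop ...
torch.save(model.state_dict(), 'finetuned_bert.bin')

# Reload: rebuild the architecture from the original checkpoint and
# hand the fine-tuned weights to from_pretrained via its state_dict argument.
state_dict = torch.load('finetuned_bert.bin')
model = BertForSequenceClassification.from_pretrained(
    'bert-base-uncased', state_dict=state_dict, num_labels=2)
```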
Added two pre-trained models and one new fine-tuning class
This release comprises the following improvements and updates:
- added two new pre-trained models from Google: `bert-large-cased` and `bert-base-multilingual-cased`,
- added a model that can be fine-tuned for token-level classification: `BertForTokenClassification` (see the sketch after this list),
- added tests for every model class, with and without labels,
- fixed the tokenizer loading function `BertTokenizer.from_pretrained()` when loading from a directory containing a pretrained model,
- fixed typos in model docstrings and completed the docstrings,
- improved examples (added a `do_lower_case` argument).
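A minimal sketch of using the new token-level classification head together with one of the new cased checkpoints. The label count and input sentence are placeholders, and `do_lower_case=False` mirrors the new example argument for cased models; treat the exact keywords as assumptions for this version.

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertForTokenClassification

# Cased checkpoints should not be lower-cased, hence do_lower_case=False.
tokenizer = BertTokenizer.from_pretrained('bert-base-multilingual-cased', do_lower_case=False)
model = BertForTokenClassification.from_pretrained('bert-base-multilingual-cased', num_labels=9)

tokens = tokenizer.tokenize("Hugging Face is based in New York")
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

logits = model(input_ids)                # (batch, seq_len, num_labels) when no labels are given
labels = torch.zeros_like(input_ids)     # placeholder tag ids, one per token
loss = model(input_ids, labels=labels)   # token-level classification loss
```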
Small improvements and a few bug fixes.
Improvement:
- Added a `cache_dir` option to the `from_pretrained()` function to select a specific path to download and cache the pre-trained model weights. Useful for distributed training (see the readme) (fixes issue #44).
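A minimal sketch of the new option; `local_rank` is a placeholder for the process rank in a distributed job, and the cache path is arbitrary.

```python
from pytorch_pretrained_bert import BertForSequenceClassification

local_rank = 0  # placeholder: in a distributed job this would be the process's rank

# Give each process its own cache directory so concurrent downloads do not clash.
model = BertForSequenceClassification.from_pretrained(
    'bert-base-uncased',
    cache_dir='/tmp/bert_cache/rank_{}'.format(local_rank),
    num_labels=2,
)
```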
Bug fixes in model training and tokenizer loading:
- Fixed error in CrossEntropyLoss reshaping (issue #55).
- Fixed unicode error in vocabulary loading (issue #52).
Bug fixes in examples:
- Fixed weight decay in the examples (bias and layer norm weights were previously also decayed because of an erroneous check in the training loop); see the sketch below.
- Fixed the fp16 "grad norm is None" error in the examples (issue #43).
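For reference, a minimal sketch of the corrected parameter grouping used by the examples: bias and layer norm weights (named `gamma`/`beta` in these checkpoints) get no weight decay. The `weight_decay_rate` key matches the optimizer of this version (later versions rename it to `weight_decay`), and `num_train_steps` is a placeholder; treat the exact names as assumptions.

```python
from pytorch_pretrained_bert import BertForSequenceClassification
from pytorch_pretrained_bert.optimization import BertAdam

model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
num_train_steps = 1000  # placeholder for the real number of optimization steps

# Parameters whose names contain these substrings are excluded from weight decay.
no_decay = ['bias', 'gamma', 'beta']
param_optimizer = list(model.named_parameters())
optimizer_grouped_parameters = [
    {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)],
     'weight_decay_rate': 0.01},   # key later renamed to 'weight_decay'
    {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)],
     'weight_decay_rate': 0.0},
]
optimizer = BertAdam(optimizer_grouped_parameters, lr=5e-5, warmup=0.1, t_total=num_train_steps)
```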
Updated readme and docstrings
First release
This is the first release of pytorch_pretrained_bert.