To learn more, refer to the following resources:
- Check out the Hugging Face Transformers documentation for BERT, available at https://huggingface.co/transformers/model_doc/bert.html.
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, by Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, available at https://arxiv.org/pdf/1810.04805.pdf.