
Microsoft Bling introduces Fire: a Finite state machine and regular expression manipulation library

18 Apr 2019

Yesterday, a Microsoft team named Bling (Beyond Language Understanding) announced Fire, a finite state machine and regular expression manipulation library.

Fire is used inside Bing for a range of linguistic operations, including tokenization, multi-word expression matching, unknown word-guessing, and stemming/lemmatization, among others.

Bling Fire includes a tokenizer designed for fast, high-quality tokenization of natural language text. It follows the tokenization logic of NLTK (Natural Language Toolkit), except that hyphenated words are split and a few errors are fixed. Compared with other popular NLP libraries, Bling Fire is 10X faster at the tokenization task.

The latest release of the Bling Fire model supports most languages, including East Asian languages (Simplified Chinese, Traditional Chinese, Japanese, Korean, and Thai). The tokenizer's high-level API is easy to use from languages such as Python, Perl, C#, and Java. The tokenizer requires no configuration, initialization, or additional files. It is fast because it uses deterministic finite state machines underneath.
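To illustrate why a deterministic finite state machine makes tokenization fast, here is a minimal Python sketch of a table-driven tokenizer. It is a toy written for this article, not Bling Fire's actual automaton: the two states, the three character classes, and the names `char_class`, `TRANSITIONS`, and `tokenize` are all assumptions chosen for illustration. The point is that each character triggers exactly one table lookup, so the text is processed in a single pass with no backtracking.

```python
# Toy deterministic finite state machine tokenizer.
# Illustrative only; this is NOT how Bling Fire is implemented internally.

START, IN_WORD = 0, 1  # the two states of the toy automaton


def char_class(ch):
    """Map a character to one of three classes (an illustrative choice)."""
    if ch.isalnum():
        return "word"
    if ch.isspace():
        return "space"
    return "punct"


# (current state, character class) -> next state.
# Exactly one entry per pair, which is what makes the machine deterministic.
TRANSITIONS = {
    (START, "word"): IN_WORD,
    (START, "space"): START,
    (START, "punct"): START,
    (IN_WORD, "word"): IN_WORD,
    (IN_WORD, "space"): START,
    (IN_WORD, "punct"): START,
}


def tokenize(text):
    """Split text into word and punctuation tokens in one left-to-right pass."""
    tokens = []
    state = START
    token_start = 0
    for i, ch in enumerate(text):
        cls = char_class(ch)
        next_state = TRANSITIONS[(state, cls)]
        if state == START and next_state == IN_WORD:
            token_start = i  # a new word begins here
        if state == IN_WORD and next_state == START:
            tokens.append(text[token_start:i])  # the current word just ended
        if cls == "punct":
            tokens.append(ch)  # each punctuation mark is its own token
        state = next_state
    if state == IN_WORD:
        tokens.append(text[token_start:])  # flush a word that runs to the end
    return tokens


print(tokenize("Bling Fire is fast, well-tested."))
```

Because hyphens fall into the punctuation class, this toy also splits hyphenated words, loosely mirroring the behavior described above. A production automaton would have far more states, but the per-character cost stays the same: one lookup.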

To use the Bling Fire library and its finite state machine manipulation tools to create your own tokenization/segmentation, stemming, and similar models, build the project on Windows or Linux using CMake. To use Bling Fire from Python, install the release with: pip install blingfire
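Once the package is installed, basic use from Python might look like the sketch below. It relies on `text_to_words` from the `blingfire` package, which returns the input as a single string with tokens separated by spaces. The `tokenize` wrapper and the whitespace-splitting fallback are assumptions added here so that the snippet still runs where the package is unavailable. They are not part of Bling Fire.

```python
# Minimal usage sketch for the blingfire Python package.
# The fallback branch is an illustrative assumption, not Bling Fire behavior.
try:
    from blingfire import text_to_words
except (ImportError, OSError):
    text_to_words = None  # package (or its native library) is not available


def tokenize(text):
    """Return a list of tokens, using Bling Fire when it is installed."""
    if text_to_words is None:
        # Naive fallback: plain whitespace splitting, far cruder than Bling Fire.
        return text.split()
    # text_to_words returns one string with tokens separated by spaces.
    return text_to_words(text).split()


print(tokenize("Hello world"))
```

Note that no model file or setup call is needed before tokenizing, which matches the "no configuration" design described earlier.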

For more information, check out Bling Fire on GitHub.

