Last week, Google launched Android 9 Pie, the successor to Android Oreo and a release that leans heavily on machine learning. One of its features, Smart Linkify, is a new version of the existing Android Linkify API. It adds clickable links when it identifies entities such as dates, flights, and addresses in content or text input, and it is available through the TextClassifier API.
The models behind Smart Linkify are trained in TensorFlow and are built on small feedforward neural networks. This lets the feature work out whether a series of numbers or words is a phone number or an address, just like Android Oreo’s Smart Text Selection feature. What’s different with the new feature is that, instead of just making it easier to highlight and copy the associated text manually, it adds a relevant actionable link so that users can take action immediately with just a click.
Smart Linkify follows three basic steps: finding entities in the text, processing the input text, and training the network.
Let’s have a quick look at each of these steps.
Detecting entities within text is not an easy task. People write addresses and phone numbers in many different ways, and the type of an entity can also be ambiguous. For instance, “Confirmation number: 857-555-3556” can look like a phone number even though it’s not.
To address this problem, the Android team designed an inference algorithm built on two small feedforward neural networks. The two networks look at the context surrounding the words and can perform all kinds of entity chunking beyond just addresses and phone numbers.
First, the input text is split into words, and all possible word sequences, called “candidates”, are analyzed. The first network assigns each candidate a score that reflects how likely it is to be a valid entity. Overlapping candidates are then removed, favoring the ones with the higher score. After this, the second neural network takes over and assigns each remaining candidate an entity type: a phone number, an address, or, in some cases, a non-entity.
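The candidate step described above can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: the function names, the `max_len` and `threshold` parameters, and the `toy_score` function are all assumptions made for the example, and `toy_score` merely stands in for the scoring neural network. The second, type-assigning network is not modelled here.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) word indices, end exclusive


def generate_candidates(words: List[str], max_len: int) -> List[Span]:
    """Enumerate every contiguous word span of at most max_len words."""
    return [
        (start, end)
        for start in range(len(words))
        for end in range(start + 1, min(start + max_len, len(words)) + 1)
    ]


def select_entities(
    words: List[str],
    score_fn: Callable[[List[str]], float],
    max_len: int = 4,        # hypothetical limit, chosen for the example
    threshold: float = 0.5,  # hypothetical cut-off, chosen for the example
) -> List[Span]:
    """Score all candidates, then greedily keep the best non-overlapping ones."""
    scored = [(score_fn(words[s:e]), (s, e))
              for s, e in generate_candidates(words, max_len)]
    scored = [item for item in scored if item[0] >= threshold]
    scored.sort(key=lambda item: item[0], reverse=True)  # highest score first

    kept: List[Span] = []
    for _, (s, e) in scored:
        # Keep a span only if it does not overlap any higher-scoring span.
        if all(e <= ks or s >= ke for ks, ke in kept):
            kept.append((s, e))
    return sorted(kept)


def toy_score(span_words: List[str]) -> float:
    """Stand-in for the scoring network: favors runs of three numeric tokens."""
    if not all(w.isdigit() for w in span_words):
        return 0.0
    return min(len(span_words) / 3.0, 1.0)


words = "call 857 555 3556 tomorrow".split()
entities = select_entities(words, toy_score)
```

With the toy scorer, the spans "857 555" and "555 3556" both pass the threshold but overlap the higher-scoring "857 555 3556", so only the full three-word span survives.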
Smart Linkify finding entities in a string of text
Once the entities have been located in the text, they need to be processed. The neural networks determine whether a given entity candidate in the input text is valid, and the network then uses the context surrounding the entity to classify it. To do this, the input text is split into several parts, and each part is fed to the network separately.
Smart Linkify processing the input text
Google uses character n-grams and a binary capitalization feature to “represent the individual words as real vectors suitable as an input of the neural network”.
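To make the idea of character n-grams plus a capitalization flag more concrete, here is a small sketch of one way a word could be turned into a real-valued vector. It is only an approximation under stated assumptions: the hashing into a fixed number of buckets, the `^`/`$` boundary markers, and the names `char_ngrams`, `word_features`, and `num_buckets` are choices made for this example and are not taken from Google's description.

```python
import zlib
from typing import List


def char_ngrams(word: str, n_min: int = 1, n_max: int = 3) -> List[str]:
    """Return all character n-grams of the lowercased word, with boundary markers."""
    padded = f"^{word.lower()}$"
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def word_features(word: str, num_buckets: int = 32) -> List[float]:
    """Build a vector of hashed n-gram counts plus one binary capitalization slot."""
    vec = [0.0] * (num_buckets + 1)
    for gram in char_ngrams(word):
        # crc32 is used because it is deterministic across runs (an assumption
        # of this sketch, not a detail from the source).
        bucket = zlib.crc32(gram.encode("utf-8")) % num_buckets
        vec[bucket] += 1.0
    # Last slot: 1.0 if the word starts with an uppercase letter, else 0.0.
    vec[num_buckets] = 1.0 if word[:1].isupper() else 0.0
    return vec


street_vec = word_features("Street")
```

Because the n-grams are taken from the lowercased word, "Street" and "street" share the same n-gram counts and differ only in the final capitalization slot.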
Google uses a dedicated procedure to build the training datasets. It collects lists of addresses, phone numbers, and named entities (such as product, place, and business names), which are then used to synthesize the training data for the neural networks.
“We take the entities as they are and generate random textual contexts around them (from the list of random words on Web). Additionally, we add phrases like ‘Confirmation number:’ or ‘ID:’ to the negative training data for phone numbers, to teach the network to suppress phone number matches in these contexts,” says the Google team.
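The data synthesis described in the quote could look something like the sketch below. It is an illustration under assumptions rather than Google's pipeline: the function `synthesize_examples`, the `context_len` parameter, the tiny word list, and the 1/0 label convention are all invented for the example.

```python
import random
from typing import List, Tuple


def synthesize_examples(
    entities: List[str],
    filler_words: List[str],
    negative_prefixes: List[str],
    rng: random.Random,
    context_len: int = 3,  # hypothetical number of random words on each side
) -> List[Tuple[str, int]]:
    """Wrap each entity in random context; also emit a negative variant.

    Label 1 means "link this as a phone number", label 0 means "suppress".
    """
    examples = []
    for entity in entities:
        left = " ".join(rng.choice(filler_words) for _ in range(context_len))
        right = " ".join(rng.choice(filler_words) for _ in range(context_len))
        # Positive example: the entity surrounded by random words.
        examples.append((f"{left} {entity} {right}", 1))
        # Negative example: the same entity preceded by a suppressing phrase.
        prefix = rng.choice(negative_prefixes)
        examples.append((f"{left} {prefix} {entity} {right}", 0))
    return examples


rng = random.Random(0)  # fixed seed so the sketch is reproducible
examples = synthesize_examples(
    entities=["857-555-3556"],
    filler_words=["please", "order", "today", "thanks", "call"],
    negative_prefixes=["Confirmation number:", "ID:"],
    rng=rng,
)
```

Each entity yields one positive and one negative example, so the network sees the same digits both as a phone number and as something to leave unlinked.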
Google also used a couple of other techniques for training the network.
Currently, Smart Linkify supports 16 languages, and Google plans to add more in the future.
For flight numbers, dates, times, IBANs, and similar entities, Google still relies on traditional techniques based on standard regular expressions, but it plans to include ML models for these in the future.
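For comparison with the ML approach, a regular-expression matcher for a few of these entity types might look like the sketch below. The patterns are deliberately simplified illustrations written for this article; they are not the expressions Google uses, and names such as `PATTERNS` and `find_entities` are hypothetical.

```python
import re
from typing import List, Tuple

# Simplified, illustrative patterns (assumptions, not Google's actual rules).
PATTERNS = {
    # Two-letter airline code followed by one to four digits, e.g. "LH 400".
    "flight": re.compile(r"\b[A-Z]{2}\s?\d{1,4}\b"),
    # 24-hour clock time, e.g. "17:45".
    "time": re.compile(r"\b(?:[01]?\d|2[0-3]):[0-5]\d\b"),
    # Country code, two check digits, then 11 to 30 letters or digits.
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def find_entities(text: str) -> List[Tuple[int, int, str, str]]:
    """Return (start, end, kind, matched_text) tuples sorted by position."""
    matches = []
    for kind, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append((m.start(), m.end(), kind, m.group()))
    return sorted(matches)


found = find_entities("Flight LH 400 departs at 17:45")
```

Patterns like these are easy to write for rigidly formatted entities, but they cannot use surrounding context the way the neural networks can.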
For more coverage of Smart Linkify, be sure to check out the official Google AI blog.