Splitting text using the Boost Tokenizer library
The boost::split algorithm, which we saw in the last section, splits a string using a predicate and puts the tokens into a sequence container. It requires extra storage for all the tokens, and the user has limited choices for the tokenizing criteria. Splitting a string into a series of tokens based on various criteria is a frequent programming requirement, and the Boost.Tokenizer library provides an extensible framework for accomplishing this. It does not require extra storage for the tokens; instead, it provides a generic interface for retrieving successive tokens from a string, with the criterion for splitting the string passed as a parameter. The Tokenizer library itself provides a few reusable, commonly used tokenizing policies for splitting, but, most importantly, it defines an interface using which we can write our own splitting policies. It treats the input string like a container of tokens from which successive tokens can be retrieved.
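The following is a minimal sketch of this idea, using the char_separator policy shipped with the library; the sample string and the delimiter set (comma, semicolon, and space) are illustrative choices, not anything mandated by the library:

#include <boost/tokenizer.hpp>
#include <iostream>
#include <string>

int main() {
    std::string text = "Boost,Tokenizer;splits strings";

    // char_separator<char> is one of the reusable tokenizing policies;
    // here it splits on commas, semicolons, and spaces.
    boost::char_separator<char> sep(",; ");
    boost::tokenizer<boost::char_separator<char>> tokens(text, sep);

    // Successive tokens are retrieved through iterators; no extra
    // container is needed to hold them.
    for (const auto& token : tokens) {
        std::cout << token << '\n';
    }
}

Because the tokenizer only exposes iterators over the input string, the tokens are produced lazily as the loop advances, which is why no additional storage for the tokens is required.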