Normalizing a string and performing Unicode comparisons
We want to make a filename or URL based on an article title. To do this, we'll have to limit the length to an appropriate number of characters, strip out characters that don't belong in a filename or URL, and format the string in a consistent way. The result must also remain valid UTF-8.
How to do it…
Let's normalize a string by executing the following steps:
1. Use std.uni.normalize to get Unicode characters into a consistent format.
2. Use std.string.toLower to convert everything to lowercase for consistency.
3. Use std.regex to strip out all but a small set of characters.
4. Use std.string.squeeze to collapse consecutive whitespace.
5. Use std.array.replace to change spaces into dashes.
6. Use std.range.take to get the right number of characters, then convert the result back to string.
The code is as follows:
void main() {
    string title = "The D Programming Language: Easy Speed!";
    import std.uni, std.string, std.conv, std.range, std.regex;
    title = normalize(title);
    title...
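The six steps can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the recipe's own listing: the function name slugify, the maxChars parameter with its default of 32, and the allowed character set (lowercase ASCII letters, digits, and spaces) are all hypothetical choices made for illustration. It also collapses whitespace with a second regular expression instead of std.string.squeeze, since squeeze may be missing from newer Phobos releases.

```d
import std.array : replace;
import std.conv : to;
import std.range : take;
import std.regex : regex, replaceAll;
import std.stdio : writeln;
import std.uni : normalize, toLower;

// Hypothetical helper: turns a title into a filename/URL-friendly string.
// maxChars (an assumed default of 32) counts code points, not bytes.
string slugify(string title, size_t maxChars = 32)
{
    // Step 1: bring the text into one normalization form (NFC is the default).
    string s = normalize(title);

    // Step 2: lowercase everything for consistency.
    s = toLower(s);

    // Step 3: drop everything except a small, assumed set of characters.
    s = replaceAll(s, regex(`[^a-z0-9 ]`), "");

    // Step 4: collapse runs of spaces into one space
    // (a regex stands in for std.string.squeeze here).
    s = replaceAll(s, regex(` +`), " ");

    // Step 5: change spaces into dashes.
    s = replace(s, " ", "-");

    // Step 6: take() walks the string one code point at a time, so the cut
    // never lands inside a multi-byte UTF-8 sequence; then convert back.
    return to!string(take(s, maxChars));
}

void main()
{
    writeln(slugify("The D Programming Language: Easy Speed!"));
}
```

Because take counts decoded characters rather than bytes, the truncated result stays valid UTF-8. Note that the cut can still leave a trailing dash, which a caller may want to trim.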