Swapping noun cardinals
In a chunk, a cardinal word, tagged as CD, refers to a number, such as 10. These cardinals often occur before or after a noun. For normalization purposes, it can be useful to always put the cardinal before the noun.
How to do it...
The swap_noun_cardinal() function is defined in transforms.py. It swaps any cardinal that occurs immediately after a noun with the noun so that the cardinal occurs immediately before the noun. It uses a helper function, tag_equals(), which is similar to tag_startswith(), but in this case, the function it returns does an equality comparison with the given tag:
def tag_equals(tag):
    def f(wt):
        return wt[1] == tag
    return f
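To make the equality comparison concrete, here is a minimal sketch of how the predicate returned by tag_equals() behaves on (word, tag) pairs. The sample chunk and the variable names are illustrative assumptions, and the prefix comparison uses plain str.startswith() rather than the tag_startswith() helper itself:

```python
def tag_equals(tag):
    """Return a predicate that is true when a (word, tag) pair has exactly this tag."""
    def f(wt):
        return wt[1] == tag
    return f

# Illustrative data (an assumption, not taken from a corpus).
chunk = [('Dec.', 'NNP'), ('10', 'CD')]

is_cardinal = tag_equals('CD')
cardinal_flags = [is_cardinal(wt) for wt in chunk]

# Equality is stricter than a prefix match: 'NNP' starts with 'NN' but is not equal to it.
exact_nn = tag_equals('NN')(('Dec.', 'NNP'))
prefix_nn = ('Dec.', 'NNP')[1].startswith('NN')
```

Here only the second pair is flagged as a cardinal, and the exact test for NN rejects the NNP pair that a prefix test would accept.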
Now we can define swap_noun_cardinal():
def swap_noun_cardinal(chunk):
    cdidx = first_chunk_index(chunk, tag_equals('CD'))
    # cdidx must be > 0 and there must be a noun immediately before it
    if not cdidx or not chunk[cdidx-1][1].startswith('NN'):
        return chunk
    noun, nntag = chunk[cdidx-1]
    chunk[cdidx-1]...
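Because the listing above is cut off, the following is only a self-contained sketch of the swap, not the version in transforms.py. It assumes that first_chunk_index() returns the index of the first matching (word, tag) pair or None, and stands in for it with a hypothetical find_first_index(); the final swap step and the name swap_noun_cardinal_sketch() are likewise assumptions made for illustration:

```python
def tag_equals(tag):
    def f(wt):
        return wt[1] == tag
    return f

def find_first_index(chunk, pred):
    # Hypothetical stand-in for first_chunk_index(): index of the first
    # (word, tag) pair satisfying pred, or None when nothing matches.
    for i, wt in enumerate(chunk):
        if pred(wt):
            return i
    return None

def swap_noun_cardinal_sketch(chunk):
    cdidx = find_first_index(chunk, tag_equals('CD'))
    # cdidx must be > 0 and there must be a noun immediately before it
    if not cdidx or not chunk[cdidx - 1][1].startswith('NN'):
        return chunk
    # Assumed completion: exchange the noun and the cardinal in place.
    chunk[cdidx - 1], chunk[cdidx] = chunk[cdidx], chunk[cdidx - 1]
    return chunk

swapped = swap_noun_cardinal_sketch([('Dec.', 'NNP'), ('10', 'CD')])
already_first = swap_noun_cardinal_sketch([('10', 'CD'), ('Dec.', 'NNP')])
no_noun = swap_noun_cardinal_sketch([('the', 'DT'), ('top', 'JJ'), ('10', 'CD')])
```

Note that `not cdidx` covers both a missing cardinal (None) and a cardinal at index 0, where no preceding word exists. The sketch modifies the list it is given and returns that same list.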