Creating a shallow tree
In the previous recipe, we flattened a deep Tree by keeping only the lowest-level subtrees. In this recipe, we'll keep only the highest-level subtrees instead.
How to do it...
We'll be using the first parsed sentence from the treebank corpus as our example. Recall from the previous recipe that the sentence Tree looks like this:
The shallow_tree() function defined in transforms.py eliminates all the nested subtrees, keeping only the top subtree labels:
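from nltk.tree import Tree

def shallow_tree(tree):
    children = []
    for t in tree:
        if t.height() < 3:
            # A part-of-speech subtree such as (. .): keep only its (word, tag) pair.
            children.extend(t.pos())
        else:
            # A deeper subtree: keep its label and replace its contents
            # with the flat list of (word, tag) pairs it covers.
            children.append(Tree(t.label(), t.pos()))
    return Tree(tree.label(), children)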
Using it on the first parsed sentence in treebank results in a Tree with only two subtrees:
>>> from nltk.corpus import treebank
>>> from transforms import shallow_tree
>>> shallow_tree(treebank.parsed_sents()[0])
Tree('S', [Tree('NP-SBJ', [('Pierre', 'NNP'), ('Vinken', 'NNP'), (',', ','), ('61', 'CD'), (...
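To try the function without downloading the treebank corpus, the following sketch applies it to a small hand-built tree. The sample sentence and the names sample and shallow are made up for illustration and are not part of the recipe. The shallow_tree() definition is repeated so that the snippet runs on its own, assuming only that nltk is installed.

```python
from nltk.tree import Tree

def shallow_tree(tree):
    children = []
    for t in tree:
        if t.height() < 3:
            # Part-of-speech subtree: keep only its (word, tag) pair.
            children.extend(t.pos())
        else:
            # Deeper subtree: keep the label, flatten the contents.
            children.append(Tree(t.label(), t.pos()))
    return Tree(tree.label(), children)

# Hypothetical sample tree (not taken from the treebank corpus).
sample = Tree.fromstring(
    "(S (NP (DT the) (JJ quick) (NN fox)) "
    "(VP (VBD jumped) (PP (IN over) (NP (DT the) (NN dog)))) (. .))"
)
shallow = shallow_tree(sample)

print(shallow)
print(sample.height(), shallow.height())
```

The nested PP and NP inside the VP disappear, leaving two subtrees (NP and VP) plus the final punctuation pair directly under S. The height drops from 6 to 3, and the tagged words are unchanged, so shallow.leaves() matches sample.pos().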