Mastering Python Regular Expressions


Product type Paperback
Published in Feb 2014
Publisher Packt
ISBN-13 9781783283156
Length 110 pages

Overlapping groups


Throughout Chapter 2, Regular Expressions with Python, we've seen several operations where there was a warning about overlapping groups: for example, the findall operation. This is something that seems to confuse a lot of people. So, let's try to bring some clarity with a simple example:

>>> import re
>>> re.findall(r'(a|b)+', 'abaca')
['a', 'a']

What's happening here? Why does the preceding expression give us 'a' and 'a' instead of 'aba' and 'a'?

Let's look at it step by step to understand the solution:

Figure: Overlapping groups matching process

As we can see in the preceding figure, the characters aba are matched, but the captured group contains only a. This is because, even though the group captures every character in turn, each new capture overwrites the previous one, so the group ends up holding only the last a. Keep this in mind because it's the key to understanding how it works. Stop for a moment and think about it: we're requesting the regex engine to capture all the groups made up of a or b, but just for one of the characters and that's the key...
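The difference between what is matched and what is captured can be checked directly. The following is a minimal sketch rather than the book's own listing: it uses re.finditer to print the whole match, group(0), next to the captured group, group(1), for the same pattern. The names pattern, text, pairs, and whole_runs are illustrative only. The last two lines show one common way to capture each whole run, assuming that is the result we want: wrap the repetition in an outer group and make the inner group non-capturing.

```python
import re

pattern = re.compile(r'(a|b)+')
text = 'abaca'

# group(0) is the whole matched text; group(1) is whatever the
# repeated group captured on its final iteration.
pairs = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]
print(pairs)       # [('aba', 'a'), ('a', 'a')]

# Assumption: we want each whole run of a's and b's. The outer group
# captures the full repetition; (?:...) groups without capturing.
whole_runs = re.findall(r'((?:a|b)+)', text)
print(whole_runs)  # ['aba', 'a']
```

Because findall returns the groups rather than the whole match whenever the pattern contains a capturing group, moving the quantifier inside the capturing group changes what ends up in the result list.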
