Python Regular Expressions
The re module in Python's standard library provides support for regular expressions: patterns used to search, match, and manipulate strings.
Import the Module
import re
1. re.search() – Find First Match
import re
text = "The rain in Spain"
match = re.search("rain", text)
print(match)
Output:
<re.Match object; span=(4, 8), match='rain'>
Explanation: re.search() scans the string for the first occurrence of the pattern and returns a Match object, or None if the pattern is not found.
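Because re.search() can return None, code that uses the result usually checks it first. The following is a minimal sketch of that check, reusing the sample sentence from above; the pattern "snow" is only an illustrative value that does not occur in the text.

```python
import re

text = "The rain in Spain"

# re.search() returns None when the pattern is absent, so test the result first.
match = re.search("rain", text)
if match:
    found = match.group()       # the matched text
    start, end = match.span()   # start and end offsets of the match
    print(found, start, end)    # rain 4 8

# A pattern that does not occur yields None rather than raising an exception.
missing = re.search("snow", text)
print(missing)                  # None
```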
2. re.findall() – Find All Matches
import re
result = re.findall("ai", "The rain in Spain")
print(result)
Output:
['ai', 'ai']
Explanation: re.findall() returns all non-overlapping occurrences of the pattern as a list of strings.
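One detail worth knowing is that the return value changes when the pattern contains a capturing group: the list then holds the group's text instead of the full match. The sketch below illustrates this on the same sentence and also shows re.finditer(), which yields Match objects when positions are needed. The variable names are illustrative.

```python
import re

text = "The rain in Spain"

# No groups: the list holds the full matches.
words_with_ai = re.findall(r"\w*ai\w*", text)
print(words_with_ai)   # ['rain', 'Spain']

# One capturing group: the list holds only the group's text.
before_ai = re.findall(r"(\w)ai", text)
print(before_ai)       # ['r', 'p']

# finditer() yields Match objects, which also carry positions.
positions = [m.start() for m in re.finditer("ai", text)]
print(positions)       # [5, 14]
```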
3. re.match() – Match at Start of String
import re
result = re.match("The", "The rain in Spain")
print(result)
Output:
<re.Match object; span=(0, 3), match='The'>
Explanation: re.match() checks for a match only at the beginning of the string.
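The difference from re.search() is easiest to see side by side. This short sketch compares re.match(), re.search(), and re.fullmatch() (which requires the whole string to match) on the same sentence; the variable names are illustrative.

```python
import re

text = "The rain in Spain"

# "rain" is not at position 0, so match() fails while search() succeeds.
at_start = re.match("rain", text)
anywhere = re.search("rain", text)
print(at_start)            # None
print(anywhere.group())    # rain

# fullmatch() succeeds only if the pattern covers the entire string.
whole = re.fullmatch(r"The.*Spain", text)
partial = re.fullmatch("The", text)
print(whole is not None)   # True
print(partial)             # None
```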
4. re.sub() – Replace Using Pattern
import re
result = re.sub("Spain", "India", "The rain in Spain")
print(result)
Output:
The rain in India
Explanation: re.sub() replaces all occurrences of the pattern with a new string.
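re.sub() also accepts a count argument to limit the number of replacements, backreferences such as \1 in the replacement string, and a function in place of the replacement string. The following sketch shows all three on the sample sentence; the patterns are chosen only for illustration.

```python
import re

text = "The rain in Spain"

# count=1 replaces only the first occurrence.
first_only = re.sub("ain", "AIN", text, count=1)
print(first_only)   # The rAIN in Spain

# \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\w+) in (\w+)", r"\2 in \1", text)
print(swapped)      # The Spain in rain

# A function receives each Match object and returns the replacement text.
shouted = re.sub(r"\b\w*ai\w*\b", lambda m: m.group().upper(), text)
print(shouted)      # The RAIN in SPAIN
```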
5. re.split() – Split by Pattern
import re
result = re.split(r"\s", "The rain in Spain")
print(result)
Output:
['The', 'rain', 'in', 'Spain']
Explanation: re.split() splits the string using the given pattern (\s means any whitespace character).
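When the input contains runs of whitespace, splitting on a single \s produces empty strings, while \s+ treats each run as one separator. The sketch below shows the difference, along with the maxsplit argument; the input string is a made-up example with a double space and a tab.

```python
import re

messy = "The  rain\tin Spain"   # two spaces, then a tab

# Splitting on one whitespace character leaves an empty string between the two spaces.
single = re.split(r"\s", messy)
print(single)    # ['The', '', 'rain', 'in', 'Spain']

# \s+ treats each run of whitespace as a single separator.
runs = re.split(r"\s+", messy)
print(runs)      # ['The', 'rain', 'in', 'Spain']

# maxsplit limits the number of splits; the remainder is returned unsplit.
limited = re.split(r"\s+", messy, maxsplit=1)
print(limited)   # ['The', 'rain\tin Spain']
```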
6. Meta Characters & Patterns
Symbol | Meaning | Example | Matches |
---|---|---|---|
. | Any character except newline | a.b | 'acb', 'arb', etc. |
^ | Start of string | ^Hello | 'Hello there' |
$ | End of string | world$ | 'my world' |
[] | Set of characters | [aeiou] | Any one vowel |
\d | Digit (0–9) | \d+ | '123', '4567', etc. |
\w | Word character | \w+ | 'word123', etc. |
\s | Whitespace | \s+ | ' ', '\t', '\n' |
+ | One or more repetitions | \d+ | '12', '9', etc. |
* | Zero or more repetitions | a* | 'a', 'aaa', or '' |
{m,n} | Between m and n repetitions | a{2,3} | 'aa' or 'aaa' |
`\|` | OR operator | `cat\|dog` | 'cat' or 'dog' |
() | Group | (ab)+ | 'ab', 'abab' |
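The table entries can be checked directly with re.search(). The sketch below pairs a handful of the patterns with sample strings and collects what each one matches; the sample strings are illustrative and not taken from the table.

```python
import re

# (pattern, sample string) pairs; the sample strings are made-up inputs.
samples = [
    (r"a.b", "acb"),
    (r"^Hello", "Hello there"),
    (r"world$", "my world"),
    (r"[aeiou]", "sky high"),
    (r"\d+", "order 4567"),
    (r"a{2,3}", "caaaat"),
    (r"cat|dog", "hot dog"),
    (r"(ab)+", "xababy"),
]

# Map each pattern to the text of its first match in the sample.
matched = {pattern: re.search(pattern, s).group() for pattern, s in samples}

for pattern, text in matched.items():
    print(f"{pattern!r:12} -> {text!r}")
```

Note that a{2,3} is greedy: given four consecutive 'a' characters, it takes three.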
7. Grouping and Extracting
import re
text = "My phone number is 987-654-3210"
match = re.search(r"(\d{3})-(\d{3})-(\d{4})", text)
print(match.group())
print(match.group(1)) # Area code
Output:
987-654-3210
987
Explanation: Use parentheses () to group parts of the match and group(n) to extract them.
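Groups can also be retrieved all at once with groups(), or given names with the (?P<name>...) syntax so they can be looked up by name. The sketch below applies this to the phone number example; the group names area, prefix, and line are hypothetical labels chosen for illustration.

```python
import re

text = "My phone number is 987-654-3210"

# Named groups: the names "area", "prefix", and "line" are illustrative labels.
pattern = r"(?P<area>\d{3})-(?P<prefix>\d{3})-(?P<line>\d{4})"
match = re.search(pattern, text)

if match:
    all_parts = match.groups()    # every captured group as a tuple
    area = match.group("area")    # look up a group by name
    as_dict = match.groupdict()   # all named groups as a dictionary
    print(all_parts)              # ('987', '654', '3210')
    print(area)                   # 987
    print(as_dict)                # {'area': '987', 'prefix': '654', 'line': '3210'}
```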
8. Flags (e.g., re.IGNORECASE, re.MULTILINE)
import re
result = re.search("hello", "Hello World", re.IGNORECASE)
print(result)
Output:
<re.Match object; span=(0, 5), match='Hello'>
Explanation: re.IGNORECASE makes the match case-insensitive.
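re.MULTILINE changes ^ and $ so that they match at the start and end of every line, not only of the whole string. Flags can be combined with the | operator. The sketch below uses a made-up three-line string to show the effect.

```python
import re

text = "hello world\nHello again\nsay hello"

# Without MULTILINE, ^ matches only at the very start of the string.
default = re.findall(r"^hello", text, re.IGNORECASE)
print(default)    # ['hello']

# With MULTILINE, ^ also matches right after each newline.
per_line = re.findall(r"^hello", text, re.IGNORECASE | re.MULTILINE)
print(per_line)   # ['hello', 'Hello']
```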
9. Raw String r'' in Regex
Use raw strings for regex patterns so that backslashes reach the regex engine unchanged and do not have to be doubled:
import re
pattern = r"\d{3}"
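The word-boundary escape \b shows why this matters: in a regular string literal, "\b" is the backspace character, so the pattern silently means something else. A minimal sketch, using a made-up input string:

```python
import re

text = "cat concatenate"

# Raw string: the regex engine receives backslash + b, a word boundary.
raw = re.findall(r"\bcat\b", text)
print(raw)      # ['cat']

# Regular string: Python turns "\b" into a backspace character before re sees it,
# so the pattern looks for backspace + "cat" + backspace and finds nothing.
plain = re.findall("\bcat\b", text)
print(plain)    # []

# The two literals differ in length: two characters versus one.
print(len(r"\b"), len("\b"))   # 2 1
```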
Summary Table
Name | Description |
---|---|
search() | Finds first match |
match() | Checks match at start |
findall() | Returns all matches |
sub() | Replaces matches with new string |
split() | Splits string by pattern |
group() | Match object method; gets a matched group from the result |
re.IGNORECASE | Flag; enables case-insensitive matching |