Python Regex

A regular expression (RegEx for short) is a search pattern used to find matching text within a string. Python's built-in re module provides the regular expression operations.

Metacharacters

Metacharacters are characters that have a special meaning inside a regular expression. The metacharacters are []{}()\|.?$^*+. The sections below explain how each one is used.

Note: https://pythex.org/ is a handy website for testing regular expressions.

Square Brackets [ ]

Square brackets specify a set of characters; the pattern matches any single character from that set.

Expression   String           Number of Matches
[aeiou]      a                1 match
[aeiou]      au               2 matches
[aeiou]      rathna schools   4 matches
[aeiou]      myth             No match
[aeiou]      ieauo            5 matches
  • A range of characters can be specified using -, so [a-e] is equivalent to [abcde]
  • Similarly, [0-5] is equivalent to [012345]
  • A caret at the start of the set complements it, so [^a] matches any character except a
  • Similarly, [^0-9] matches any non-digit character
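
The set, range, and complement forms above can be tried with re.findall. This is a minimal sketch; the sample string "rathna schools 2024" is an assumption chosen for illustration, not taken from the table.

```python
import re

# Hypothetical sample text: letters, spaces and digits.
text = "rathna schools 2024"

vowels = re.findall(r"[aeiou]", text)      # any one of a, e, i, o, u
a_to_e = re.findall(r"[a-e]", text)        # range: a, b, c, d or e
non_digits = re.findall(r"[^0-9]", text)   # complement: anything except a digit

print(vowels)            # ['a', 'a', 'o', 'o']
print(a_to_e)            # ['a', 'a', 'c']
print(len(non_digits))   # 15
```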

Period .

Each period matches exactly one character of any kind except the newline \n, so two periods match two consecutive characters, and so on.

Expression   String          Number of Matches
.            a               1 match
..           au              1 match
.            rathnaschools   13 matches
..           rathnaschools   6 matches
...          ieauo           1 match
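
A short sketch of how the period consumes characters, including the newline case. The re.DOTALL flag in the last line is part of the re module but is not covered elsewhere in this article; it is shown only for contrast.

```python
import re

single = re.findall(r".", "au")                      # one character per match
double = re.findall(r"..", "au")                     # two characters per match
pairs = re.findall(r"..", "rathnaschools")           # the 13th character is left over
no_newline = re.findall(r".", "a\nb")                # \n is skipped by default
with_newline = re.findall(r".", "a\nb", re.DOTALL)   # re.DOTALL lets . match \n too

print(single)        # ['a', 'u']
print(double)        # ['au']
print(len(pairs))    # 6
print(no_newline)    # ['a', 'b']
print(with_newline)  # ['a', '\n', 'b']
```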

Caret ^

The caret checks whether the given string starts with the pattern that follows it. The check is case-sensitive.

Expression   String          Result
^a           rathnaschools   No match
^r           rathnaschools   Match
^R           rathnaschools   No match
^rs          rathnaschools   No match
^ra          rathnaschools   Match
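
The prefix check can be reproduced with re.search, which returns None when nothing matches. A minimal sketch; the starts_with dictionary is just an illustrative name.

```python
import re

text = "rathnaschools"
patterns = [r"^a", r"^r", r"^R", r"^rs", r"^ra"]

# True when the string begins with the characters after the caret.
starts_with = {p: re.search(p, text) is not None for p in patterns}

for pattern, matched in starts_with.items():
    print(pattern, "->", "match" if matched else "no match")
```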

Dollar $

The dollar sign checks whether the given string ends with the pattern that precedes it. The check is case-sensitive.

Expression   String          Result
a$           rathnaschools   No match
s$           rathnaschools   Match
S$           rathnaschools   No match
rs$          rathnaschools   No match
ls$          rathnaschools   Match
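
The suffix check works the same way with re.search. A minimal sketch; ends_with is an illustrative name.

```python
import re

text = "rathnaschools"
patterns = [r"a$", r"s$", r"S$", r"rs$", r"ls$"]

# True when the string finishes with the characters before the dollar sign.
ends_with = {p: re.search(p, text) is not None for p in patterns}

for pattern, matched in ends_with.items():
    print(pattern, "->", "match" if matched else "no match")
```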

Star *

The star matches zero or more occurrences of the pattern immediately to its left.

Expression   String        Result
pytho*n      pythn         Match
pytho*n      python        Match
pytho*n      pythoooon     Match
pytho*n      pythoan       No match
pytho*n      learnpython   Match

Plus +

The plus matches one or more occurrences of the pattern immediately to its left.

Expression   String        Result
pytho+n      pythn         No match
pytho+n      python        Match
pytho+n      pythoooon     Match
pytho+n      pythoan       No match
pytho+n      pythonlearn   Match

Question Mark ?

The question mark matches zero or one occurrence of the pattern immediately to its left.

Expression   String        Result
pytho?n      pythn         Match
pytho?n      python        Match
pytho?n      pythoooon     No match (more than one 'o')
pytho?n      pythoan       No match
pytho?n      learnpython   Match
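
The three quantifiers differ only in how many repetitions they allow: * takes zero or more, + takes one or more, and ? takes zero or one. The sketch below runs all three against the same words so the differences line up side by side; the variable names are illustrative.

```python
import re

words = ["pythn", "python", "pythoooon", "pythoan"]
patterns = {"*": r"pytho*n", "+": r"pytho+n", "?": r"pytho?n"}

# For each quantifier, record whether each word contains a match.
table = {
    symbol: [re.search(pattern, word) is not None for word in words]
    for symbol, pattern in patterns.items()
}

for symbol, row in table.items():
    print(symbol, row)
# * [True, True, True, False]
# + [False, True, True, False]
# ? [True, True, False, False]
```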

Braces { }

Braces specify how many times the preceding pattern must repeat, written as {at least,at most}. For example, o{2,3} matches a run of at least two and at most three o characters.

Expression   String                Number of Matches
o{2,3}       verbose               No match
o{2,3}       verboose              1 match
o{2,3}       verboose schools      2 matches
o{2,3}       verbooose schools     2 matches
o{2,3}       verbooooose schools   3 matches
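
Listing the matched text makes the counts easier to follow, especially for the run of five o characters, which is consumed as a run of three followed by a run of two. A minimal sketch using re.findall:

```python
import re

pattern = r"o{2,3}"
samples = ["verbose", "verboose", "verboose schools", "verbooooose schools"]

# Map each sample string to the runs of 'o' that the pattern picks out.
runs = {text: re.findall(pattern, text) for text in samples}

for text, found in runs.items():
    print(text, "->", found)
# verbose -> []
# verboose -> ['oo']
# verboose schools -> ['oo', 'oo']
# verbooooose schools -> ['ooo', 'oo', 'oo']
```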

Vertical Bar |

The vertical bar | works like the OR operator in programming languages: the pattern matches if either alternative matches.

Expression   String          Number of Matches
p|o          python          2 matches
p|o          learn           No match
p|o          python python   4 matches
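
Alternation is not limited to single characters. In the sketch below, the second pattern and its sample sentence are assumptions added for illustration; they show that each side of | can be a whole word.

```python
import re

letters = re.findall(r"p|o", "python")                          # either 'p' or 'o'
nothing = re.findall(r"p|o", "learn")                           # neither letter occurs
words = re.findall(r"python|regex", "learn python and regex")   # either whole word

print(letters)  # ['p', 'o']
print(nothing)  # []
print(words)    # ['python', 'regex']
```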

Parentheses ( )

Parentheses group sub-patterns, so that an operator such as | applies only to the part of the pattern inside the group.

Expression   String          Number of Matches
(a|h|n)on    python          1 match
(a|e|n)on    python          No match
(a|h|r)on    learon python   2 matches
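
One detail worth knowing when trying these patterns in Python: if a pattern contains a capturing group, re.findall returns the text of the group rather than the whole match. The sketch below contrasts that with a non-capturing group, written (?:...), which groups without capturing.

```python
import re

text = "learon python"

captured = re.findall(r"(a|h|r)on", text)   # returns only what the group matched
whole = re.findall(r"(?:a|h|r)on", text)    # non-capturing group: full matches
no_match = re.findall(r"(a|e|n)on", "python")

print(captured)  # ['r', 'h']
print(whole)     # ['ron', 'hon']
print(no_match)  # []
```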

Backslash \

The backslash escapes any character, including every metacharacter, so that it is matched literally. For example, to search for $50 in a string, the $ symbol must be escaped as shown below.

Expression   String                 Number of Matches
$50          The balance is $50.    No match
\$50         The balance is $50.    1 match
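
The same comparison in code, as a minimal sketch. The last two lines use re.escape, a helper in the re module that adds the backslashes automatically; it is not otherwise covered in this article.

```python
import re

text = "The balance is $50."

unescaped = re.findall(r"$50", text)   # $ means end of string, so nothing matches
escaped = re.findall(r"\$50", text)    # \$ matches a literal dollar sign

auto_pattern = re.escape("$50")        # builds the escaped pattern for us
auto = re.findall(auto_pattern, text)

print(unescaped)     # []
print(escaped)       # ['$50']
print(auto_pattern)  # \$50
print(auto)          # ['$50']
```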

Special Sequence

Special sequences are shorthand for commonly used patterns. The special sequences are

Expression   Description                                   String             Result
\Alearn      Matches only at the start of the string       learn python       1 match
python\Z     Matches only at the end of the string         learn python       1 match
\blearn      Matches only at the start of a word           learn python       1 match
python\b     Matches only at the end of a word             learn python       1 match
\Blearn      Matches only when not at the start of a word  learn python       No match
python\B     Matches only when not at the end of a word    learn python       No match
\d           Matches any digit                             learn python 123   1, 2, 3
\D           Matches any non-digit character               learn python 123   learn python (including both spaces)
\s           Matches any whitespace character              learn python 123   2 white spaces
\S           Matches any non-whitespace character          learn python 123   learnpython123
\w           Matches any alphanumeric character or _       learn python 123   learnpython123
\W           Matches any non-alphanumeric character        learn python 123   2 white spaces
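
A quick way to check these sequences is to run them against the sample string "learn python 123". The sketch below joins the single-character matches back together so the output is easy to read; the variable names are illustrative.

```python
import re

text = "learn python 123"

digits = re.findall(r"\d", text)                   # each digit is its own match
non_digits = "".join(re.findall(r"\D", text))      # everything that is not a digit
spaces = re.findall(r"\s", text)                   # whitespace characters
non_spaces = "".join(re.findall(r"\S", text))      # everything except whitespace
word_chars = "".join(re.findall(r"\w", text))      # letters, digits and underscore
non_word = re.findall(r"\W", text)                 # here: just the two spaces

at_word_start = re.search(r"\blearn", text) is not None   # 'learn' begins a word
not_word_start = re.search(r"\Blearn", text) is not None  # so \B cannot match
at_string_start = re.search(r"\Alearn", text) is not None
at_string_end = re.search(r"123\Z", text) is not None

print(digits)        # ['1', '2', '3']
print(non_spaces)    # learnpython123
print(len(spaces))   # 2
print(at_word_start, not_word_start)  # True False
```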

re findall method

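The findall method of the re module returns a list of all non-overlapping matches of the pattern in the given string.

import re

input_data = "Learn 123 python 456"
pattern = r"\d+"  # raw string, so the backslash reaches the regex engine unchanged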

result = re.findall(pattern, input_data)
print(result)

Output

['123', '456']

re split method

The split method of the re module splits the string at every match of the pattern and returns the pieces as a list.

import re

input_data = "Learn 123 python 456"
pattern = r"\d+"  # raw string, so the backslash reaches the regex engine unchanged

result = re.split(pattern, input_data)
print(result)

Output

['Learn ', ' python ', '']

re sub method

The sub method of the re module replaces every match of the pattern in the given string with a replacement string.

import re

input_data = "Learn 123 python 456"
pattern = r"\d+"  # raw string, so the backslash reaches the regex engine unchanged
replace = "***"

result = re.sub(pattern, replace, input_data)
print(result)

Output

Learn *** python ***

re subn method

The subn method of the re module works like sub, but returns a tuple containing the new string and the number of replacements made.

import re

input_data = "Learn 123 python 456"
pattern = r"\d+"  # raw string, so the backslash reaches the regex engine unchanged
replace = "***"

result = re.subn(pattern, replace, input_data)
print(result)

Output

('Learn *** python ***', 2)

re search method

The search method of the re module scans the string for the first match of the pattern. It returns a match object if one is found and None otherwise.

import re

input_data = "Learn 123 python 456"
pattern = r"\d+"  # raw string, so the backslash reaches the regex engine unchanged

result = re.search(pattern, input_data)
if result:
    print("Integer available in the given string")
else:
    print("No integer in the given string")

Output

Integer available in the given string
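
When search succeeds, the object it returns also says what was matched and where. A minimal sketch using the group and span methods of the match object, on the same input as above:

```python
import re

input_data = "Learn 123 python 456"

result = re.search(r"\d+", input_data)
if result:
    matched_text = result.group()   # the text of the first match
    position = result.span()        # (start, end) indexes of that match
    print(matched_text)  # 123
    print(position)      # (6, 9)

missing = re.search(r"\d+", "Learn python")   # no digits, so this is None
print(missing)  # None
```

Only the first run of digits is reported; findall is the method to use when every match is needed.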