Python Regular Expression Tutorial for Beginners

In this post:

  • python regex match
  • python regex search
  • python regular expression findall
  • python non regex search
  • most used wild cards in regular expressions

python regex match

The most basic use of regular expressions, and one every beginner should know, is the match method. It checks whether the pattern matches at the beginning of the string:

import re

pattern = r"Cheese"
text = "Cheese is a dairy product derived from milk that is produced in a wide range of flavors.."
print(re.match(pattern, text).span())

result:
(0, 6)

As you can see from the result, we get the start and end positions of the matched word in the string. If the word is in the middle of the sentence, then re.match(pattern, text) returns None, and calling .span() on it raises an AttributeError. This is because match only checks the beginning of the string; to find a pattern anywhere in the string you need to use search instead.
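A minimal sketch of that failure case and one way to guard against it. The shortened sample sentence and the variable name m are only illustrative:

```python
import re

pattern = r"Cheese"
text = "Do you know that: Cheese is a dairy product derived from milk"

# match only tries the pattern at the very start of the string,
# so a word in the middle of the sentence is not found
m = re.match(pattern, text)
print(m)  # None

# check for None before calling .span() to avoid an AttributeError
if m is not None:
    print(m.span())
else:
    print("no match at the start of the string")
```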

python regex search

Searching for a given word anywhere in a string is easy with the method:

re.search(pattern, text)

Below is the basic usage of this method, which returns the start and end positions of the first occurrence of the word in the string:

import re

pattern = r"Cheese"
text = "Do you know that: Cheese is a dairy product derived from milk that is produced in a wide.."
print(re.search(pattern, text).span())

result:
(18, 24)
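Like match, search returns None when nothing is found, so it is usually safer to test the result before using it. A short sketch, with the variable names found and missing chosen only for illustration:

```python
import re

pattern = r"Cheese"
text = "Do you know that: Cheese is a dairy product derived from milk"

found = re.search(pattern, text)
if found:
    print(found.group())               # the matched text: Cheese
    print(found.start(), found.end())  # 18 24

# a word that is not in the text gives None instead of a match object
missing = re.search(r"Butter", text)
print(missing)  # None
```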

python regular expression findall

Let's say that you want to search for a pattern and then extract the whole sentence containing the word. This can again be achieved with a regular expression and the findall method. The example below searches for the word milk and extracts the sentence:

import re

pattern = r"([^.]*milk[^.]*)"
text = "Cheese is a dairy product derived from milk that is produced in a wide range of flavors.."
print(re.findall(pattern, text))

result:
['Cheese is a dairy product derived from milk that is produced in a wide range of flavors']
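With several sentences in the text, findall returns every sentence that contains the word. The sketch below uses a made-up three-sentence text; note that matches after the first one start with the space that follows the previous period, so strip() is used to clean them up:

```python
import re

pattern = r"([^.]*milk[^.]*)"
# hypothetical sample text with three sentences, two of them mentioning milk
text = "Cheese is made from milk. Bread is made from flour. Butter is also made from milk."

# strip() removes the leading space left over after the previous period
sentences = [s.strip() for s in re.findall(pattern, text)]
print(sentences)
# ['Cheese is made from milk', 'Butter is also made from milk']
```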

python non regex search

This example shows how to perform the same search without regular expressions, using only string methods. The example is:

text = "Do you know that: Cheese is a dairy product derived from milk that is produced in a wide.."
print([sentence + '.' for sentence in text.split('.') if 'milk' in sentence])

result:
['Do you know that: Cheese is a dairy product derived from milk that is produced in a wide.']
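For a plain word, the position lookup done earlier with re.search can also be written without regular expressions. A minimal sketch using the built-in in operator and str.find, which returns -1 when the word is absent:

```python
text = "Do you know that: Cheese is a dairy product derived from milk that is produced in a wide.."

# membership test: is the word anywhere in the text?
print("Cheese" in text)  # True

# str.find gives the start index; the end is start plus the word length
start = text.find("Cheese")
print(start, start + len("Cheese"))  # 18 24

# a missing word gives -1 instead of raising an error
print(text.find("Butter"))  # -1
```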

Most used wild cards in regular expressions

This list contains the wildcards and special characters most often used in regular expressions:

  • . - matches any single character (except newline)
  • \w - matches any letter, digit or underscore (just one)
  • \s - matches exactly one whitespace character
  • \t - tab
  • \d - matches digits from 0 to 9
  • \n - newline
  • [xy] - x or y
  • [a-zA-Z] - matches letters from (a to z) or (A to Z)
  • \A - matches the start of the string
  • \b - matches a word boundary (the beginning or end of a word)
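Several of the characters above can be tried out on one short string. The sample text and variable names below are invented for illustration:

```python
import re

# hypothetical sample text containing digits, a tab and an underscore
text = "Order 66 shipped\tto room_9 today"

digits = re.findall(r"\d", text)           # single digits
words = re.findall(r"\w+", text)           # runs of letters, digits or underscore
spaces = re.findall(r"\s", text)           # single whitespace characters
letters = re.findall(r"[a-zA-Z]+", text)   # runs of letters only
dots = re.findall(r"r..m", text)           # r, any two characters, then m
at_start = re.search(r"\AOrder", text)     # only matches at the start of the string
whole_word = re.findall(r"\bto\b", text)   # "to" as a whole word, not inside "today"

print(digits)      # ['6', '6', '9']
print(words)       # ['Order', '66', 'shipped', 'to', 'room_9', 'today']
print(spaces)      # [' ', ' ', '\t', ' ', ' ']
print(letters)     # ['Order', 'shipped', 'to', 'room', 'today']
print(dots)        # ['room']
print(at_start is not None)  # True
print(whole_word)  # ['to']
```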
