Python 3 How to remove white spaces

Python 3 offers several ways to remove spaces from strings, such as:

  • words.strip()
  • words.replace(" ", "")
  • re.sub(r"\s+", "", words, flags=re.UNICODE)

The right approach can also differ depending on what you need:

  • remove leading and ending spaces
  • remove all spaces
  • remove duplicated spaces
  • dealing with encoding
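On the encoding point, one case that often surprises people is the non-breaking space (U+00A0), which looks like a space but is a different character. The sketch below uses a made-up sample string to show which of the approaches listed above treat it as whitespace. In Python 3, re.UNICODE is already the default for str patterns, so the flag is omitted here; re.ASCII is used only to show the contrast.

```python
import re

# Hypothetical sample: non-breaking spaces (U+00A0) around and inside the text.
words = "\u00a0Python\u00a0is fast\u00a0"

# strip() with no arguments removes any Unicode whitespace, including U+00A0.
stripped = words.strip()

# replace(" ", "") only touches the ordinary space character (U+0020).
replaced = words.replace(" ", "")

# \s matches Unicode whitespace by default for str patterns in Python 3.
unicode_sub = re.sub(r"\s+", "", words)

# With re.ASCII, \s is limited to [ \t\n\r\f\v], so U+00A0 survives.
ascii_sub = re.sub(r"\s+", "", words, flags=re.ASCII)

print(repr(stripped))
print(repr(replaced))
print(repr(unicode_sub))
print(repr(ascii_sub))
```

If the input may contain such characters, the Unicode-aware options (strip(), split(), or \s without re.ASCII) are probably the safer choice.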

Python 3 Remove spaces at the start and end of a string

You have two options if you want to get rid of the leading and trailing spaces. The first is the built-in string method strip():

words = ' Python is powerful... and fast; '
print(words.strip())

result:

Python is powerful... and fast;

You can achieve the same result with a regular expression:

import re
words = ' Python is powerful... and fast; '
new_words = re.sub(r"^\s+|\s+$", "", words, flags=re.UNICODE)
print(new_words)

result:

Python is powerful... and fast;

If you want to remove only the spaces at the beginning you can do:

import re
words = ' Python is powerful... and fast; '
new_words = re.sub(r"^\s+", "", words, flags=re.UNICODE)
print(new_words)

result:

Python is powerful... and fast; 

Or only at the end of the string:

import re
words = ' Python is powerful... and fast; '
new_words = re.sub(r"\s+$", "", words, flags=re.UNICODE)
print(new_words)

result:

 Python is powerful... and fast;
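
For one-sided trimming a regex is not strictly needed: the string methods lstrip() and rstrip() are the counterparts of strip(). A minimal sketch using the same sample string:

```python
words = ' Python is powerful... and fast; '

# lstrip() removes leading whitespace only; the trailing space stays.
left_trimmed = words.lstrip()

# rstrip() removes trailing whitespace only; the leading space stays.
right_trimmed = words.rstrip()

print(repr(left_trimmed))   # 'Python is powerful... and fast; '
print(repr(right_trimmed))  # ' Python is powerful... and fast;'
```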

Python 3 Remove all spaces from a string

You again have two options if you want to remove every space from a string. The first is the built-in string method replace():

words = ' Python is powerful... and fast; '
print(words.replace(' ', ''))

result:

Pythonispowerful...andfast;

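The same can also be achieved with a regex. Note that \s matches any whitespace character (tabs and newlines included), not only the space character: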

import re
words = ' Python is powerful... and fast; '
new_words = re.sub(r"\s+", "", words, flags=re.UNICODE)
print(new_words)

result:

Pythonispowerful...andfast;
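
The two approaches start to differ once the string contains tabs or newlines: replace(' ', '') removes only the space character, while \s+ matches any whitespace. The sample string below is made up for illustration, and "".join(words.split()) is included as a non-regex way to drop all whitespace.

```python
import re

# Hypothetical sample containing a tab and a newline as well as spaces.
words = " Python\tis powerful...\nand fast; "

# Only U+0020 is removed; the tab and the newline remain.
only_spaces_removed = words.replace(" ", "")

# Every whitespace character is removed.
all_whitespace_regex = re.sub(r"\s+", "", words)

# split() with no arguments splits on any whitespace, so joining with ""
# gives the same result as the regex.
all_whitespace_split = "".join(words.split())

print(repr(only_spaces_removed))   # 'Python\tispowerful...\nandfast;'
print(repr(all_whitespace_regex))  # 'Pythonispowerful...andfast;'
print(repr(all_whitespace_split))  # 'Pythonispowerful...andfast;'
```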

Python 3 Remove consecutive spaces

The last example shows how to collapse consecutive spaces in a string into a single space. Again you have two options. The first one splits the string on whitespace and joins the words back together with single spaces:

words = ' Python   is powerful...      and fast; '
print(" ".join(words.split()))

result:

Python is powerful... and fast;

The second option uses a regular expression and works slightly differently: it splits the string on runs of whitespace and joins the pieces with single spaces. The difference between the two is that one leading and one trailing space are preserved in the second case:

import re
words = ' Python   is powerful...      and fast; '
new_words = " ".join(re.split(r"\s+", words, flags=re.UNICODE))
print(new_words)

result:

 Python is powerful... and fast; 
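
A more direct regex for this task replaces each run of whitespace with a single space using re.sub. This is a sketch of an alternative, not the approach shown above, although it gives the same output for this input. Adding strip() makes it match the split()/join() version.

```python
import re

words = ' Python   is powerful...      and fast; '

# Replace every run of whitespace with one space; the edges keep a single space.
collapsed = re.sub(r"\s+", " ", words)

# Trimming afterwards matches " ".join(words.split()).
collapsed_trimmed = collapsed.strip()

print(repr(collapsed))          # ' Python is powerful... and fast; '
print(repr(collapsed_trimmed))  # 'Python is powerful... and fast;'
```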

If you care about the performance of the regex version versus the non-regex version, you can check the following result:

import cProfile
import re


def before():
    # Non-regex version: split on whitespace, join with single spaces.
    for i in range(1, 1000000):
        words = ' Python   is powerful...      and fast; '
        " ".join(words.split())


def after():
    # Regex version: split on runs of whitespace, join with single spaces.
    for i in range(1, 1000000):
        words = ' Python   is powerful...      and fast; '
        new_words = " ".join(re.split(r"\s+", words, flags=re.UNICODE))


cProfile.run('before()')
cProfile.run('after()')

result:

  • non regex - 0.555 seconds
  • regex - 2.890 seconds

Below is the full profiler output for both versions. The regex is much more customizable, but at the cost of performance and memory:

2000002 function calls in 0.555 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.555    0.555 <string>:1(<module>)
        1    0.247    0.247    0.555    0.555 ProfilingSimple.py:3(before)
        1    0.000    0.000    0.555    0.555 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   999999    0.103    0.000    0.103    0.000 {method 'join' of 'str' objects}
   999999    0.205    0.000    0.205    0.000 {method 'split' of 'str' objects}


         4000205 function calls (4000202 primitive calls) in 2.890 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.890    2.890 <string>:1(<module>)
        1    0.398    0.398    2.890    2.890 ProfilingSimple.py:9(after)
       34    0.000    0.000    0.000    0.000 enum.py:267(__call__)
       34    0.000    0.000    0.000    0.000 enum.py:517(__new__)
        3    0.000    0.000    0.000    0.000 enum.py:797(__or__)
       14    0.000    0.000    0.000    0.000 enum.py:803(__and__)
   999999    0.292    0.000    2.366    0.000 re.py:204(split)
   999999    0.261    0.000    0.261    0.000 re.py:286(_compile)
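
Part of the regex cost in the profile appears to come from looking up the pattern on every call (the _compile rows). A possible variation, sketched here with the standard timeit module, is to compile the pattern once and reuse it. The function names and the iteration count are chosen for illustration, and timings vary by machine and Python version, so no numbers are shown.

```python
import re
import timeit

WORDS = ' Python   is powerful...      and fast; '
WHITESPACE = re.compile(r"\s+")  # compiled once, reused on every call


def split_join():
    # Non-regex version: split on whitespace, join with single spaces.
    return " ".join(WORDS.split())


def regex_split_join():
    # Regex version using the precompiled pattern.
    return " ".join(WHITESPACE.split(WORDS))


n = 100_000  # illustrative iteration count
print("non regex:", timeit.timeit(split_join, number=n))
print("regex:    ", timeit.timeit(regex_split_join, number=n))
```

timeit adds less overhead than cProfile, so it is usually the better fit for comparing two small snippets like these.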
