The Python programming language uses regular expressions for pattern matching. Programmers often use regular expressions to search text for patterns of letters, symbols and numbers. The power of regular expressions comes from the fact that they aren't used to search for something specific, like the word 'dog'; instead, they search for words that match a certain pattern, such as email domain names. For example, you can use Python to replace the domain names of a list of email addresses using regular expressions.
Things You'll Need
- Computer with Python programming language installed
Open the IDLE text editor that comes bundled with the Python language by clicking on its icon. The IDLE text editor icon is located in the Python directory in your installed programs list (located under All Programs in the Windows Start Menu, and within the Applications folder in OSX). A blank source code file opens in the main editor window.
Include the 're' module by writing this line at the top of the source code file:
Declare a string and assign some email addresses to it, such as this:
emailAddresses = 'William@amail.com, John@bmail.com, Bruce@cmail.com'
Create a regular expression that searches for all possible text permutations in valid email addresses. Regular expressions work by searching for a pattern of characters in a string of text. The pattern you are interested in is any two words joined by an @ symbol. Since email addresses have many valid characters, you want to match all possible characters in each word before and after the @ symbol. This is accomplished with the regular expression [\w.-], and by adding a + to the end of it, you can repeat this for all the characters. The completed regular expression can be saved to a string like this:
regexPattern = r'([\w.-]+)@([\w.-]+)'
Create a regular expression that replaces all of the domain names with "zmail.com." In this regular expression, the backreference character sequence \1 is used to replace the domain of the email addresses. The backreference refers to a location in a regular expression surrounded in parenthesis. By applying the regular expression to the first backreference, you save the email address but discard the old domain name. You can then add a new domain name, like '@zmail.com.' To save this second regular expression to a variable, you can write this:
regexReplacement = r'\email@example.com'
Apply the regular expressions to the string containing the email addresses like this:
emailAddresses = re.sub(regexPattern, regexReplacement, emailAddresses)
Print out the email addresses using this line of code. Python 3 uses this syntax for printing: print(emailAddresses), while Python 2 uses this syntax: print emailAddresses.
Run the program by pressing the F5 key. The program output is:
William@zmail.com, John@zmail.com, Bruce@zmail.com
How to Use Regular Expressions to Block Websites
Regular expressions use special characters, or metacharacters, to describe possible combinations of text in strings. For example, you could create a regular...