Regular Expressions in Python 3
^: Matches the beginning of a line
$: Matches the end of the line
.: Matches any character (except a newline, by default)
\s: Matches whitespace
\S: Matches any non-whitespace character
*: Repeats a character zero or more times
*?: Repeats a character zero or more times (non-greedy)
+: Repeats a character one or more times
+?: Repeats a character one or more times (non-greedy)
[aeiou]: Matches a single character in the listed set
[^XYZ]: Matches a single character not in the listed set
[a-z0-9]: The set of characters can include a range
(: Indicates where string extraction is to start
): Indicates where string extraction is to end
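A few of the symbols above can be tried directly. This is a minimal sketch; the sample string `line` and the variable names are made up for illustration and do not come from the original post.

```python
import re

# Hypothetical sample text, chosen only to exercise the symbols above
line = 'Hello World 42'

# ^ anchors the pattern at the beginning of the line
starts_with_hello = re.search('^Hello', line) is not None

# $ anchors the pattern at the end of the line
ends_with_digits = re.search('[0-9]+$', line) is not None

# \S+ matches runs of one or more non-whitespace characters
words = re.findall(r'\S+', line)

# [aeiou] matches a single lowercase vowel
vowels = re.findall('[aeiou]', line)

# [a-z0-9]+ matches runs of lowercase letters or digits (a set with ranges)
lower_or_digit = re.findall('[a-z0-9]+', line)

print(starts_with_hello, ends_with_digits)
print(words)
print(vowels)
print(lower_or_digit)
```

Note the raw string (`r'...'`) for patterns that contain a backslash, so Python passes `\S` to the regex engine unchanged.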
Import the library
Use “import re” to load the regular expression library.
re.search()
Checks whether a string matches a regular expression, similar to using the find() method of strings.
re.findall()
Extracts the portions of a string that match your regular expression, similar to a combination of find() and slicing.
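To make the comparison with find() concrete, here is a small sketch. The sample string `text` is hypothetical and only serves the illustration.

```python
import re

# Hypothetical sample string
text = 'From: alice'

# str.find() returns an index, or -1 when the substring is absent
found_with_find = text.find('From:') >= 0

# re.search() returns a match object, or None when nothing matches
found_with_search = re.search('^From:', text) is not None

# re.findall() returns a list of every non-overlapping match
letters_only = re.findall('[A-Za-z]+', text)

print(found_with_find, found_with_search)
print(letters_only)
```

Because re.search() returns None on failure, it can be used directly in an `if` statement.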
Extracting Data
import re
x = 'My 2 favourite numbers are 9 and 32.'
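The post does not show the extraction call itself, so the patterns below are an assumption: a plausible way to pull the numbers out of the sample string with re.findall().

```python
import re

x = 'My 2 favourite numbers are 9 and 32.'

# Assumed pattern: one or more digits in a row
y = re.findall('[0-9]+', x)
print(y)

# When nothing matches, findall() returns an empty list
z = re.findall('[AEIOU]+', x)
print(z)
```

The results are strings, not integers; convert them with int() if arithmetic is needed.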
Greedy and Non-Greedy Matching
The repeat characters (* and +) are greedy: they match the longest possible string.
a = 'From: Using : character'
The regular expression looks for text that starts with “F” and ends with “:”. Because it uses “+”, it is greedy.
Two substrings satisfy the regular expression: “From:” and “From: Using :”. Because the match is greedy, it returns the longest possible result, which is the second one.
The second example, b_not_greedy, uses “+?”, which is non-greedy, so it returns the shortest result, “From:”.
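The greedy and non-greedy behaviour can be compared side by side. The exact patterns are not shown in the post, so `'^F.+:'` and `'^F.+?:'` are assumptions reconstructed from the description (starts with “F”, ends with “:”); the name `a_greedy` is also made up to pair with `b_not_greedy`.

```python
import re

a = 'From: Using : character'

# Greedy: .+ grabs as much as it can while still ending in ':'
a_greedy = re.findall('^F.+:', a)
print(a_greedy)

# Non-greedy: .+? stops at the first ':' it can reach
b_not_greedy = re.findall('^F.+?:', a)
print(b_not_greedy)
```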
Parentheses
c = 'From [email protected] Sat Jan 5 11:16 pm'
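A short sketch of how parentheses change what findall() returns. The address in the sample line is obscured in the source, so the string below substitutes a placeholder address (`user@example.com`); the patterns are also assumptions, not the original author's code.

```python
import re

# Placeholder address: the real one is not recoverable from the post
c = 'From user@example.com Sat Jan 5 11:16 pm'

# Without parentheses, the whole match is returned
whole = re.findall(r'^From \S+@\S+', c)
print(whole)

# With parentheses, only the part inside them is extracted
address = re.findall(r'^From (\S+@\S+)', c)
print(address)
```

The pattern still has to match the full text, including "From ", but only the parenthesised part ends up in the result.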
Tips
[^ ]: matches a single non-blank (non-space) character
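As a quick illustration of this tip, the sketch below uses `[^ ]` to read text up to the next space. The sample strings are hypothetical.

```python
import re

# Hypothetical sample line with a placeholder address
line = 'From user@example.com Sat Jan 5'

# Everything after '@' up to (not including) the next space
host = re.findall('@([^ ]*)', line)
print(host)

# [^ ] excludes only the space character, so a tab still counts as non-blank;
# \S excludes all whitespace
tabbed = 'a\tb c'
non_space_runs = re.findall('[^ ]+', tabbed)
non_whitespace_runs = re.findall(r'\S+', tabbed)
print(non_space_runs, non_whitespace_runs)
```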