regular expression

syntax

\d = 1 digit
\w = 1 letter, digit, or underscore
\s = 1 whitespace char (space, tab, etc.)
* = any number of chars (including 0)
+ = at least one char
? = 0 or 1 char
{n} = n chars
{n,m} = n ~ m chars (no space after the comma)

e.g.

00\d can be 007, but not 00A
\d\d\d can be 010
\w\w\w can be py3
py. can be pyc, pyo, py!, etc
. = any char (except a newline, by default)
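
The tokens above can be tried out with re.fullmatch, which succeeds only when the whole string fits the pattern. A minimal sketch; the helper name fits is made up for illustration, and the b-based patterns are extra examples not in the notes:

```python
import re

def fits(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(fits(r'00\d', '007'))      # True
print(fits(r'00\d', '00A'))      # False: A is not a digit
print(fits(r'\d\d\d', '010'))    # True
print(fits(r'\w\w\w', 'py3'))    # True
print(fits(r'py.', 'py!'))       # True: . matches any single char
print(fits(r'ab*', 'a'))         # True: * allows zero b
print(fits(r'ab+', 'a'))         # False: + needs at least one b
print(fits(r'ab?', 'abb'))       # False: ? allows at most one b
```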

e.g.

\d{3}\s+\d{3,8} = a phone number with an area code, the two parts separated by any number of spaces

\d{3} = 3 digits
\s+ = at least one space
\d{3,8} = 3 ~ 8 digits
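
A small sketch of the phone number pattern in use. The name PHONE and the sample strings are assumptions chosen for illustration:

```python
import re

# Area code (3 digits), one or more whitespace chars, then 3 to 8 digits.
PHONE = re.compile(r'\d{3}\s+\d{3,8}')

samples = ['010 12345', '010    12345678', '010-12345', '01 12345']
for text in samples:
    # fullmatch requires the whole string to fit the pattern.
    print(repr(text), PHONE.fullmatch(text) is not None)
```

The first two samples match; the third uses a hyphen instead of a space and the fourth has only two digits in the area code, so both fail.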

[0-9a-zA-Z_] = 1 digit, letter, or underscore

[0-9a-zA-Z_]+ = at least one digit, letter, or underscore

[a-zA-Z_][0-9a-zA-Z_]* = starts with 1 letter or underscore, followed by any number of digits, letters, or underscores, which is a valid (ASCII) Python variable name.

[a-zA-Z_][0-9a-zA-Z_]{0,19} = same, but limited to length 1 ~ 20
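
The length-limited name pattern can be checked like this. A sketch only: NAME and is_short_name are hypothetical names, and the pattern covers ASCII names only:

```python
import re

# One letter or underscore, then up to 19 more letters, digits or underscores.
NAME = re.compile(r'[a-zA-Z_][0-9a-zA-Z_]{0,19}')

def is_short_name(text):
    """Hypothetical helper: True for an ASCII name of length 1 to 20."""
    return NAME.fullmatch(text) is not None

print(is_short_name('_tmp1'))    # True
print(is_short_name('1abc'))     # False: cannot start with a digit
print(is_short_name('a' * 20))   # True: exactly 20 chars
print(is_short_name('a' * 21))   # False: one char too long
```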

A|B = A or B

(P|p)ython = Python or python

^ = beginning of line

^\d = must start with a digit

$ = end of line

\d$ = must end with a digit
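
Alternation and the two anchors can be seen side by side in a short sketch. The sample strings are made up; re.search is used for the anchors because it scans the whole string, so ^ and $ alone decide the position:

```python
import re

# Written with a group, the alternation accepts either first letter.
print(bool(re.fullmatch(r'(P|p)ython', 'Python')))   # True
print(bool(re.fullmatch(r'(P|p)ython', 'python')))   # True
print(bool(re.fullmatch(r'(P|p)ython', 'Jython')))   # False

print(bool(re.search(r'^\d', '3 apples')))   # True: starts with a digit
print(bool(re.search(r'^\d', 'apples 3')))   # False
print(bool(re.search(r'\d$', 'apples 3')))   # True: ends with a digit
print(bool(re.search(r'\d$', '3 apples')))   # False
```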

python re module

>>> import re
>>> re.match(r'^\d{3}-\d{3,8}$', '010-12345')
<re.Match object; span=(0, 9), match='010-12345'>
>>> re.match(r'^\d{3}-\d{3,8}$', '010 12345')
>>> # no output: the result is None

The match() function returns a Match object if the pattern matches, and None if it does not.
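
Because None is falsy and a Match object is truthy, the result can be tested directly in an if statement. A minimal sketch; check_phone is a hypothetical helper name:

```python
import re

def check_phone(text):
    """Hypothetical helper: 'ok' if text is 3 digits, a hyphen, then 3 to 8 digits."""
    if re.match(r'^\d{3}-\d{3,8}$', text):
        return 'ok'
    return 'failed'

print(check_phone('010-12345'))   # ok
print(check_phone('010 12345'))   # failed
```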

split string

>>> 'a b    c'.split(' ')
['a', 'b', '', '', '', 'c']
# can't handle consecutive spaces

>>> re.split(r'\s+', 'a b   c')
['a', 'b', 'c']

>>> re.split(r'[\s,;]+', 'a,b;; c  d')
['a', 'b', 'c', 'd']
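
One thing to watch with re.split: a separator at the very start or end of the string leaves an empty string in the result. A sketch of one way to handle that, assuming empty fields are not wanted; split_fields is a hypothetical helper name:

```python
import re

def split_fields(text):
    """Hypothetical helper: split on runs of whitespace, commas or semicolons."""
    # Filter out the empty strings left by leading or trailing separators.
    return [part for part in re.split(r'[\s,;]+', text) if part]

print(split_fields('a,b;; c  d'))   # ['a', 'b', 'c', 'd']
print(split_fields(' a, b ;c, '))   # ['a', 'b', 'c']
```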
