regex cheatsheet

Important concepts:

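  • [] Square brackets define a character class, which acts as “or” over single characters. Hence [abc] matches a OR b OR c, and is equivalent to [a-c] or (a|b|c).
  • Prepend a backslash (\) to a character when it is regex shorthand rather than a literal. For example, d matches the letter d, while the escaped form \d matches a digit. Inside a Java string literal the backslash itself must be doubled: "\\d".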
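A minimal sketch of the two concepts above, assuming Java's java.util.regex.Pattern. The class name CharClassDemo and the helper matches are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true when the WHOLE input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Three equivalent ways to match a single a, b, or c.
        System.out.println(matches("[abc]", "b"));   // true
        System.out.println(matches("[a-c]", "b"));   // true
        System.out.println(matches("(a|b|c)", "b")); // true
        System.out.println(matches("[abc]", "d"));   // false

        // Unescaped d is the literal letter d; \d (written "\\d" in Java source) is a digit.
        System.out.println(matches("d", "7"));   // false
        System.out.println(matches("\\d", "7")); // true
    }
}
```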

Characters

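\s a whitespace character
\s+ one or more whitespace characters
\p{Punct} a punctuation character
\d a digit
[^...] ^ as the first character inside square brackets negates the class (e.g. [^0-9] matches any non-digit)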
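As a rough illustration of the character shorthands listed above, the following sketch uses String.replaceAll. The class and method names are hypothetical, and the backslashes are doubled because the patterns sit inside Java string literals.

```java
class CharacterDemo {
    // Removes every punctuation character matched by \p{Punct}.
    static String stripPunctuation(String text) {
        return text.replaceAll("\\p{Punct}", "");
    }

    // Keeps only digits by deleting everything the negated class [^0-9] matches.
    static String digitsOnly(String text) {
        return text.replaceAll("[^0-9]", "");
    }

    // Collapses each run of whitespace (\s+) into a single space.
    static String collapseSpaces(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuation("Hello, world!")); // Hello world
        System.out.println(digitsOnly("tel: 555-0199"));        // 5550199
        System.out.println(collapseSpaces("a  b \t c"));        // a b c
    }
}
```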

public String[] split(String regex)

String[] ops = s.split("regex");

\d  // any digit, short for [0-9]
\D  // a non-digit, short for [^0-9]
\s  // a whitespace character
\S  // a non-whitespace character
\w  // a word character, short for [a-zA-Z0-9_]; note that it includes the underscore (_)
\W  // a non-word character, short for [^\w]
\S+ // one or more non-whitespace characters

ref: https://www.vogella.com/tutorials/JavaRegularExpressions/article.html

Examples:

String[] nums = s.split("[-+*/]"); // split on any single math operator
String[] ops = s.split("[0-9]+"); // split on runs of one or more digits
String[] words = s.split("\\s+"); // split on one or more whitespace characters
String[] tokens = s.split("\\W+"); // split on one or more non-word characters
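The split calls above can be tried out with a small runnable sketch. The class name SplitDemo, its method names, and the sample inputs are made up for illustration. Note that splitting on digits leaves an empty first element when the input starts with a digit.

```java
import java.util.Arrays;

class SplitDemo {
    // Numbers in an arithmetic expression: split on any single operator.
    static String[] operands(String expr) {
        return expr.split("[-+*/]");
    }

    // Operators in an expression: split on runs of digits.
    // A digit run at the very start yields an empty first element.
    static String[] operators(String expr) {
        return expr.split("[0-9]+");
    }

    // Words separated by any amount of whitespace.
    static String[] words(String text) {
        return text.split("\\s+");
    }

    // Word tokens separated by runs of non-word characters.
    static String[] tokens(String text) {
        return text.split("\\W+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(operands("12+7*3")));           // [12, 7, 3]
        System.out.println(Arrays.toString(operators("12+7*3")));          // [, +, *]
        System.out.println(Arrays.toString(words("hello   big  world")));  // [hello, big, world]
        System.out.println(Arrays.toString(tokens("a, b; c!")));           // [a, b, c]
    }
}
```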

public String[] split(String regex, int limit)

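  • The limit parameter controls the number of times the pattern is applied, and therefore the length of the resulting array
  • split(String regex, n)
  • if n = 0: the pattern is applied as many times as possible and ALL trailing empty strings ("") are removed
  • if n < 0: the pattern is applied as many times as possible and trailing empty strings are kept
  • if n > 0: the pattern is applied at most n - 1 times, so the array has at most n elements and the last element holds the rest of the input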
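The three limit cases can be compared on a single input. This is only a sketch: the class SplitLimitDemo, the wrapper splitWithLimit, and the sample string are hypothetical.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Thin wrapper so the three limit cases can be compared side by side.
    static String[] splitWithLimit(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String csv = "a,b,c,,";
        // limit = 0: trailing empty strings are dropped.
        System.out.println(Arrays.toString(splitWithLimit(csv, 0)));  // [a, b, c]
        // limit < 0: trailing empty strings are kept.
        System.out.println(Arrays.toString(splitWithLimit(csv, -1))); // [a, b, c, , ]
        // limit = 2: the pattern is applied once, the remainder stays in the last element.
        System.out.println(Arrays.toString(splitWithLimit(csv, 2)));  // [a, b,c,,]
    }
}
```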

Quantifiers

  • + one or more
  • * zero or more
  • {3} exactly three times
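A short sketch of the three quantifiers applied to \d, assuming whole-string matching via Pattern.matches. The class name QuantifierDemo and the helper matches are hypothetical.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Hypothetical helper: true when the WHOLE input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\d+", "2019"));   // true:  one or more digits
        System.out.println(matches("\\d+", ""));       // false: + needs at least one
        System.out.println(matches("\\d*", ""));       // true:  * accepts zero
        System.out.println(matches("\\d{3}", "123"));  // true:  exactly three digits
        System.out.println(matches("\\d{3}", "1234")); // false: four digits is too many
    }
}
```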

References:

Stack Overflow, square brackets: https://stackoverflow.com/a/26565735
Regex cheat sheet: https://www.rexegg.com/regex-quickstart.html
Java regular expressions cheat sheet: https://zeroturnaround.com/rebellabs/java-regular-expressions-cheat-sheet/

Logs

  • 05/04/2019: updated split() example, added more regex examples