
\b(?!\b(?:ADV|AICE|MYP|PYP|[A-z]2)\b)([A-z])([A-z]*?)\b
The above gobbledygook is a real regex string I used in one of my projects, MyGrades, to identify and format course names. By the end of this article, you'll not only be able to read and understand what it means, but also create regex patterns of your own!
I presented a workshop on this topic in 2021 at a Google DSC Event.

What is Regex?
Regex, or regular expressions, are patterns used to match strings. Regex is commonly used for searching/filtering strings for information, input validation, and web scraping. "Real-world" examples include everything from validating email addresses to formatting class names in a grades app.
Regex is incredibly powerful, but due to its seemingly unintelligible nature, it's also often intimidating to learn and difficult to remember.

But today you're gonna learn it!

Outline
In this article, we'll cover:
- "Flapdoodle" Flags
- "Gibberish" Characters
- "Bafflegab" Special Characters
- "Rigmarole" Ranges
- "Jargon" Quantifiers
- "Gobbledygook" Groups
- "Malarkey" Anchors

"Balderdash" Basics (of Regex)
How does Regex work?
(Besides potentially making your code unintelligible)
Regex, or regular expressions, are based on logic. Regex follows three primary rules:
- Regex engines move from left to right
- Regular expressions start and end with "delimiters." For example, JavaScript regex literals are wrapped in slash characters (/), and Python regex is usually written as a raw string that begins with r" and ends with ". (While Python doesn't have regex literals per se, raw strings make regex easier to write because you don't have to worry about string escapes.)
- Patterns return the first case-sensitive match they find by default.
Therefore: given the sample string I scream, you scream, we all SCREAM for ice cream, /scream/ matches the first instance of "scream."
Another example:
regex string: /mon/
test string: the mopey monkey stole my money
match: the mon in monkey (mopey is skipped because mop is not mon)
This behavior can be modified with flags.
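
To make the rules above concrete, here is a minimal Python sketch using the standard `re` module. The variable names are only for illustration, and the raw strings stand in for the slash-delimited literals shown in this article.

```python
import re

text = "I scream, you scream, we all SCREAM for ice cream"

# A raw string (r"...") keeps backslashes literal, so the pattern
# reaches the regex engine exactly as written.
pattern = r"scream"

# re.search scans left to right and stops at the first case-sensitive match.
first = re.search(pattern, text)
print(first.group(), first.start())  # scream 2

# "mopey" starts with "mo" but not "mon", so the match lands in "monkey".
monkey = re.search(r"mon", "the mopey monkey stole my money")
print(monkey.start())  # 10
```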

Regex Syntax
AKA: How to parse gibberish

🚩 "Flapdoodle" Flags
Regex includes several flags that are appended to the end of the expression to change behavior. Using the string I scream, you scream, we all SCREAM for ice cream, the updated regex /scream/gi will now return scream scream SCREAM.
Syntax | Flag | Behavior | Example |
---|---|---|---|
g | global | Returns additional matches | /foo/g |
i | insensitive | Allows case-insensitive matches | /foo/i |
x | verbose | Ignore whitespace & allow comments | /foo/x |
u | unicode | Expressions are treated as Unicode (UTF-16) | /foo/u |
s | singleline | Treats entire string as one line (allows . to match newline) | /foo/s |
m | multiline | Start & end anchors now trigger on each line | /foo/m |
n | no capture | Plain (...) groups stop capturing; only named groups capture (Perl, PCRE2, .NET) | /foo/n |
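
Flags are spelled differently from language to language. As a rough sketch of how a few of them look in Python (which has no `g` flag, so `re.findall` plays that role here), assuming the same sample sentence as above:

```python
import re

text = "I scream, you scream, we all SCREAM for ice cream"

# Equivalent of /scream/gi: findall returns every match, IGNORECASE is the i flag.
all_matches = re.findall(r"scream", text, flags=re.IGNORECASE)
print(all_matches)  # ['scream', 'scream', 'SCREAM']

# MULTILINE (m): ^ and $ anchor at every line, not just the whole string.
lines = "foo one\nbar two\nfoo three"
line_starts = re.findall(r"^foo", lines, flags=re.MULTILINE)
print(line_starts)  # ['foo', 'foo']

# DOTALL (s): the dot is allowed to match a newline.
spans_lines = re.search(r"one.bar", lines, flags=re.DOTALL)
print(spans_lines is not None)  # True

# VERBOSE (x): whitespace in the pattern is ignored and comments are allowed.
verbose = re.compile(
    r"""
    s      # a literal s
    cream  # followed by cream
    """,
    re.VERBOSE | re.IGNORECASE,
)
print(len(verbose.findall(text)))  # 3
```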
✏️ "Gibberish" Characters
Now we're on to the meat of regular expressions: selecting characters. In regex, a character can be a letter, digit, or symbol. If you're looking to use regex, chances are you'll include some of these in your pattern:
Syntax | Character | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
. | any | Literally any character (except line break) | a-c1-3 | a.c | a-c |
\w | word | ASCII letter, digit, or underscore (Or Unicode word character in Python & C#) | a-c1-3 | \w-\w | a-c |
\d | digit | Digit 0-9 (Or Unicode digit in Python & C#) | a-c1-3 | \d-\d | 1-3 |
\s | whitespace | Space, tab, vertical tab, newline, carriage return (Or Unicode separator in Python, C#, & JS) | a b | a\sb | a b |
\W | NOT word | Anything \w does not match | a-c1-3 | \W | - |
\D | NOT digit | Anything \d does not match | a-c1-3 | \D-\D | a-c |
\S | NOT whitespace | Anything \s does not match | a-c1-3 | \S-\S | a-c |
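
The rows above can be checked directly. This is a small Python sketch, with the sample strings taken from the table and the last line added as an extra illustration of `\w` matching underscores:

```python
import re

sample = "a-c1-3"

any_char = re.search(r"a.c", sample).group()      # a-c
word_pair = re.search(r"\w-\w", sample).group()   # a-c
digit_pair = re.search(r"\d-\d", sample).group()  # 1-3
non_digits = re.search(r"\D-\D", sample).group()  # a-c
non_words = re.findall(r"\W", sample)             # ['-', '-']
spaced = re.search(r"a\sb", "a b").group()        # a b

# \w covers letters, digits, and the underscore, so snake_case stays whole.
words = re.findall(r"\w+", "snake_case + 42")     # ['snake_case', '42']

print(any_char, word_pair, digit_pair, non_digits, non_words, spaced, words)
```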
🖋️ "Bafflegab" Special Characters
Regex also allows you to select special characters like tabs or newlines.
Syntax | Special Character | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
\ | escape | The next character literally, when it is one of: [{()}].*+?$^/\ | )$[]*{ | \[\] | [] |
Syntax | Substitute | Behavior |
---|---|---|
\n | newline | Insert a newline character |
\t | tab | Insert a tab character |
\r | carriage return | Insert a carriage return character |
\f | form-feed | Insert a form feed character |
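
Here is a short Python sketch of escaping in practice. `re.escape` is a standard-library helper that adds the backslashes for you; the sample data in the last step is made up for illustration.

```python
import re

tricky = ")$[]*{"

# Backslashes turn [ and ] into literal characters.
brackets = re.search(r"\[\]", tricky)
print(brackets.group())  # []

# re.escape builds a pattern that matches arbitrary text literally.
literal = re.escape(tricky)
print(re.fullmatch(literal, tricky) is not None)  # True

# \t and \n inside a pattern match a tab and a newline.
fields = re.split(r"\t|\n", "name\tgrade\nAda\t97")
print(fields)  # ['name', 'grade', 'Ada', '97']
```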
🖌️ "Rigmarole" Ranges
Ranges allow you to support several potential matches:
Syntax | Range | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
[pog] | word list | Either p , o , or g | awesomePOSSUM123 | [awesum]+ | awes |
[^pog] | NOT word list | Any character except p , o , or g | awesomePOSSUM123 | [^awesum]+ | o |
[a-z] | word range | Any character between a and z , inclusive | awesomePOSSUM123 | [a-z]+ | awesome |
[^a-z] | NOT word range | Any character not between a and z , inclusive | awesomePOSSUM123 | [^a-z]+ | POSSUM123 |
[0-9] | digit range | Any character between 0 and 9 , inclusive | awesomePOSSUM123 | [0-9]+ | 123 |
[^0-9] | NOT digit range | Any character not between 0 and 9 , inclusive | awesomePOSSUM123 | [^0-9]+ | awesomePOSSUM |
[a-zA-Z] | word range | Any character between a and z or A and Z , inclusive | awesomePOSSUM123 | [a-zA-Z]+ | awesomePOSSUM |
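
As a quick check of how ranges behave, this Python sketch runs each expression against the same sample string and keeps the first match. The `results` dictionary and the final `[A-z]` line are illustrative additions: a range follows character codes, so `[A-z]` also picks up the handful of symbols that sit between `Z` and `a`.

```python
import re

sample = "awesomePOSSUM123"

patterns = [
    r"[awesum]+", r"[^awesum]+", r"[a-z]+", r"[^a-z]+",
    r"[0-9]+", r"[^0-9]+", r"[a-zA-Z]+",
]

# Map each pattern to the first match it finds in the sample.
results = {p: re.search(p, sample).group() for p in patterns}
for pattern, match in results.items():
    print(pattern, "->", match)

# [A-z] spans ASCII codes 65 to 122, which includes [ \ ] ^ _ and `.
wide_range = re.findall(r"[A-z]", "a_Z^")
print(wide_range)  # ['a', '_', 'Z', '^']
```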
There are also a few (mostly) semantically identical patterns in Golang and PHP. These do not appear to be supported in JS or Python:
Syntax | Range | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
[[:alpha:]] | alpha class | Any character between a and z , inclusive, not case sensitive | Woodchuck could chuck 33 wood logs. | [[:alpha:]]+ | Woodchuck |
[[:digit:]] | digit class | Any digit 0-9 | Woodchuck could chuck 33 wood logs. | [[:digit:]]+ | 33 |
[[:alnum:]] | alphanumeric class | Any character between a and z , inclusive, not case sensitive, and any digit 0-9 | Woodchuck could chuck 33 wood logs. | [[:alnum:]]+ | Woodchuck |
[[:punct:]] | punctuation class | Any punctuation character, such as ?!.,:; | Woodchuck could chuck 33 wood logs. | [[:punct:]]+ | . |
In some flavors of regex, the above are also called "Character Classes."
🖊️ "Jargon" Quantifiers
Syntax | Quantifier | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
? | optional | 0 or 1 of the preceding expression | ccc | c? | c |
{X} | X | X of the preceding expression | ccc | c{2} | cc |
{X,} | X+ | X or more of the preceding expression | ccc | c{2,} | ccc |
{X,Y} | range | Between X and Y of the preceding expression | ccc | c{1,3} | ccc |
Beyond standard quantifiers, there are a few additional modifiers: greedy, lazy, and possessive.
Syntax | Quantifier | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
* | 0+ greedy | 0 or more of the preceding expression, using as many chars as possible | abccc | bc* | bccc |
+ | 1+ greedy | 1 or more of the preceding expression, using as many chars as possible | abccc | bc+ | bccc |
*? | 0+ lazy | 0 or more of the preceding expression, using as few chars as possible | abccc | bc*? | b |
+? | 1+ lazy | 1 or more of the preceding expression, using as few chars as possible | abccc | bc+? | bc |
*+ | 0+ possessive | 0 or more of the preceding expression, using as many chars as possible, without backtracking (Not supported in JS, or in Python before 3.11) | abccc | bc*+ | bccc |
++ | 1+ possessive | 1 or more of the preceding expression, using as many chars as possible, without backtracking (Not supported in JS, or in Python before 3.11) | abccc | bc++ | bccc |
Put simply, greedy quantifiers match as much as possible, lazy quantifiers as little as possible, and possessive quantifiers as much as possible without backtracking.
In practice, a possessive quantifier never gives back characters it has consumed, so wherever a greedy quantifier would need to backtrack to succeed, the possessive version fails at that position instead. Use possessive quantifiers when you know backtracking is not necessary; skipping it can improve performance.
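
The difference between greedy and lazy is easiest to see side by side. This is a minimal Python sketch; the `abccc` string comes from the table, while the tag-like string is a made-up example. Possessive quantifiers are left out because they need Python 3.11 or newer.

```python
import re

sample = "abccc"

greedy_star = re.search(r"bc*", sample).group()   # bccc (takes every c)
lazy_star = re.search(r"bc*?", sample).group()    # b    (takes no c at all)
greedy_plus = re.search(r"bc+", sample).group()   # bccc
lazy_plus = re.search(r"bc+?", sample).group()    # bc   (takes the one c it must)

# Counted quantifiers: exactly two c characters per match.
pairs = re.findall(r"c{2}", "ccccc")              # ['cc', 'cc']

# Greedy runs to the last >, lazy stops at the first one.
tags = "<b>bold</b>"
greedy_tag = re.search(r"<.+>", tags).group()     # <b>bold</b>
lazy_tag = re.search(r"<.+?>", tags).group()      # <b>

print(greedy_star, lazy_star, greedy_plus, lazy_plus, pairs, greedy_tag, lazy_tag)
```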

🖍️ "Gobbledygook" Groups
Groups allow you to pull out specific parts of a match. For example, given the string Peter Piper picked a peck of pickled peppers and the regex /[peck]+ of (\w+)/, the match peck of pickled is returned along with an additional "capturing group," group 1, which contains pickled.
By default, the whole match is group 0, and capturing groups are numbered from 1 upward in the order their opening parentheses appear.
Syntax | Group | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
| | alternate | Either the preceding or following expression | truly rural | truly|rural | truly |
(...) | isolate | Everything enclosed; treats as separate capture group | truly rural | truly (rural) | truly rural (group 1: rural) |
(?:...) | include | Everything enclosed; enables using quantifiers on part of regex | truly ruralrural | truly (?:rural)+ | truly ruralrural |
(?|...) | combine | Everything enclosed; treats all matches as same group (Not supported in JS or PY) | truly rural | (?|(rural)|(truly)) | truly |
(?>...) | atomic | Longest possible string without backtracking (Not supported in JS, or in Python before 3.11) | truly rural | (?>rur) | rur |
(?#...) | comment | Everything enclosed; treats as comment and ignores | truly #rural | truly (?#rural) | truly |
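
Here is how group numbering looks in Python, as a small sketch using the tongue twisters from this section. Only the group types that Python's `re` module supports on all versions are shown.

```python
import re

text = "Peter Piper picked a peck of pickled peppers"

match = re.search(r"[peck]+ of (\w+)", text)
whole = match.group(0)        # group 0 is the entire match
first_group = match.group(1)  # group 1 is the first (...) in the pattern
print(whole)        # peck of pickled
print(first_group)  # pickled

# (?:...) groups for the quantifier without creating a capture group.
repeated = re.search(r"truly (?:rural)+", "truly ruralrural")
print(repeated.group(0), repeated.groups())  # truly ruralrural ()

# | tries the left alternative first, then the right, at each position.
either = re.findall(r"truly|rural", "truly rural")
print(either)  # ['truly', 'rural']
```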
⚓ "Malarkey" Anchors
Syntax | Anchor | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
^ | start | Start of string | she sells seashells | ^\w+ | she |
$ | end | End of string | she sells seashells | \w+$ | seashells |
\b | word boundary | Between a character matched and not matched by \w | she sells seashells | s\b | s |
\B | NOT word boundary | Any position that is not a word boundary, such as between two characters matched by \w | she sells seashells | \Bells | ells |
There are additional anchors available that are unaffected by multiline mode m.
Syntax | Anchor | Matches | Example String | Example Expression | Example Match |
---|---|---|---|---|---|
\A | multi-start | Start of string | she sees cheese | \A\w+ | she |
\Z | multi-end | End of string (in PCRE-style flavors, also just before a final trailing newline) | she sees cheese | \w+\Z | cheese |
\z | absolute end | Absolute end of string, with no allowance for a trailing newline (Python's \Z behaves this way) | she sees cheese | \w+\z | cheese |
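
To see the anchors at work, including how multiline mode changes `^` and `$` but leaves `\A` and `\Z` alone, here is a Python sketch. The two-line string is an illustrative variation on the table's example.

```python
import re

text = "she sells seashells"

first_word = re.search(r"^\w+", text).group()   # she
last_word = re.search(r"\w+$", text).group()    # seashells
trailing_s = re.findall(r"s\b", text)           # ['s', 's'] (ends of sells, seashells)
inner = re.search(r"\Bells", text).group()      # ells (inside sells)

lines = "she sees\ncheese"

# With MULTILINE, ^ and $ fire on every line...
each_start = re.findall(r"^\w+", lines, flags=re.MULTILINE)   # ['she', 'cheese']
each_end = re.findall(r"\w+$", lines, flags=re.MULTILINE)     # ['sees', 'cheese']

# ...while \A and \Z still mean the start and end of the whole string.
only_start = re.findall(r"\A\w+", lines, flags=re.MULTILINE)  # ['she']
only_end = re.findall(r"\w+\Z", lines, flags=re.MULTILINE)    # ['cheese']

print(first_word, last_word, trailing_s, inner)
print(each_start, each_end, only_start, only_end)
```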
Regex in the Real World
Regular expressions are an incredibly useful tool for you to have in your programming arsenal. Beyond the regex string I opened this article with, which enabled me to parse class names in a grades app, there are many other applications for parsing strings:
Input Validation
/^.+@.+$/
Emails
/^[a-zA-Z0-9_-]{16}$/
Usernames
/^\+?(\d.*){3,}$/
Phone numbers

Metadata
/^(0?[1-9]|[12][0-9]|3[01])([ /-])(0?[1-9]|1[012])\2([0-9][0-9][0-9][0-9])(([ -])([0-1]?[0-9]|2[0-3]):[0-5]?[0-9]:[0-5]?[0-9])?$/
DateTimes
/^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/
Color Hexcodes
/^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
IPv4 addresses
Those are just a few examples of common applications for regex.
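
As a rough sketch of how two of these patterns might be wrapped into validators in Python: the function names are hypothetical, and the patterns assume the hex code and IPv4 expressions are written with `{6}`, `{3}`, and `\.){3}` quantifiers. `fullmatch` is used so that a trailing newline cannot slip past `$`.

```python
import re

# Hex color: optional #, then exactly 6 or exactly 3 hex digits.
HEX_COLOR = re.compile(r"^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$")

# IPv4: three "octet plus dot" groups followed by a final octet (0 to 255 each).
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4 = re.compile(rf"^(?:{OCTET}\.){{3}}{OCTET}$")


def is_hex_color(value: str) -> bool:
    """Return True when the whole string is a 3- or 6-digit hex color."""
    return HEX_COLOR.fullmatch(value) is not None


def is_ipv4(value: str) -> bool:
    """Return True when the whole string is a dotted-quad IPv4 address."""
    return IPV4.fullmatch(value) is not None


print(is_hex_color("#1a2B3c"), is_hex_color("#12345"))  # True False
print(is_ipv4("192.168.0.1"), is_ipv4("256.1.1.1"))     # True False
```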
Next Steps
You can bookmark a "Regex Cheat Sheet" I created for a workshop in 2021 at github.com/GoldinGuy/UltimateRegexResource.

If you're looking for more ways to practice regex, I created an app, Redoku, which lets you learn the syntax of regular expressions by playing fun and engaging randomly generated regex sudoku puzzles.
Thanks for reading :)