Regex Cheat Sheet: A Complete Guide to Regular Expressions
Master regular expressions with our comprehensive cheat sheet. Test patterns live, learn character classes, quantifiers, lookaheads, and see 10 practical examples.
Key Takeaways
- Character classes (\d, \w, \s) and quantifiers (+, *, ?, {n}) cover most real-world regex use cases.
- Lookaheads, lookbehinds, and named groups handle the rest, but you will not need them daily.
- Regex works in every major language, text editor, and CLI tool. Learn it once, use it everywhere.
- Always test against real data before deploying. Edge cases in regex are relentless.
Character Classes
Character classes match a single character from a defined set. These are the foundation.
| Pattern | Matches | Example |
|---|---|---|
| . | Any character except newline | a.c matches 'abc', 'a1c', 'a-c' |
| \d | Any digit (0-9) | \d\d matches '42', '99' |
| \D | Any non-digit | \D+ matches 'hello' |
| \w | Word character (letter, digit, underscore) | \w+ matches 'hello_123' |
| \W | Non-word character | \W matches '@', ' ', '-' |
| \s | Whitespace (space, tab, newline) | \s+ matches ' ' |
| \S | Non-whitespace | \S+ matches 'hello' |
| [abc] | Any of a, b, or c | [aeiou] matches vowels |
| [^abc] | Any character except a, b, c | [^0-9] matches non-digits |
| [a-z] | Any character in range a to z | [A-Za-z] matches any letter |
The shorthands (\d, \w, \s) and their negations cover most needs. Square-bracket character classes handle everything else.
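As a small illustration of the table above, here is a sketch using Python's built-in re module. The sample sentence is made up for the example, and the results in the comments apply to this ASCII input only.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Order #42 shipped to user_7 on 2026-03-31."

digits = re.findall(r"\d+", text)         # runs of digits
words = re.findall(r"\w+", text)          # runs of letters, digits, underscores
symbols = re.findall(r"[^\w\s]", text)    # neither word character nor whitespace
vowels = re.findall(r"[aeiou]", "regex")  # a custom bracket set

print(digits)   # ['42', '7', '2026', '03', '31']
print(words)    # ['Order', '42', 'shipped', 'to', 'user_7', 'on', '2026', '03', '31']
print(symbols)  # ['#', '-', '-', '.']
print(vowels)   # ['e', 'e']
```

Note that in Python 3 the \d, \w, and \s shorthands are Unicode-aware for text patterns by default, so they can match more than the ASCII ranges listed in the table.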
Quantifiers
Quantifiers control how many times a preceding element must appear.
- * (zero or more): ab*c matches ac, abc, and abbc.
- + (one or more): ab+c matches abc and abbc, but not ac.
- ? (zero or one, i.e. optional): colou?r matches color and colour.
- {3} (exactly 3 times): \d{3} matches exactly three digits.
- {2,5} (between 2 and 5 times).
- {3,} (3 or more times).
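The quantifier examples can be checked with a short Python sketch. The helper full_matches is a name invented for this example; it keeps only the candidates that the pattern matches in full.

```python
import re

def full_matches(pattern, candidates):
    """Return the candidates that the pattern matches from start to end."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

star = full_matches(r"ab*c", ["ac", "abc", "abbc"])
plus = full_matches(r"ab+c", ["ac", "abc", "abbc"])
optional = full_matches(r"colou?r", ["color", "colour", "colouur"])
exact = full_matches(r"\d{3}", ["12", "123", "1234"])
ranged = full_matches(r"\d{2,5}", ["1", "12", "12345", "123456"])

print(star)      # ['ac', 'abc', 'abbc']
print(plus)      # ['abc', 'abbc']
print(optional)  # ['color', 'colour']
print(exact)     # ['123']
print(ranged)    # ['12', '12345']
```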
Greedy vs lazy
By default, quantifiers are greedy — they match as much as possible. Adding ? makes them lazy (match as little as possible). This matters enormously for parsing quoted strings or HTML.
Greedy: ".*" applied to he said "hello" and "goodbye"
Matches: "hello" and "goodbye" as one single match (first quote to last quote)
Lazy: ".*?" applied to he said "hello" and "goodbye"
Matches: "hello", then "goodbye", as two separate matches (shortest possible matches)
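The same comparison in Python, as a minimal sketch. The third pattern, a negated character class, is an addition that is not in the text above; it is a common alternative to the lazy quantifier when the delimiter is a single character.

```python
import re

text = 'he said "hello" and "goodbye"'

greedy = re.findall(r'".*"', text)      # first quote to last quote
lazy = re.findall(r'".*?"', text)       # nearest closing quote each time
negated = re.findall(r'"[^"]*"', text)  # alternative: anything except a quote

print(greedy)   # ['"hello" and "goodbye"']
print(lazy)     # ['"hello"', '"goodbye"']
print(negated)  # ['"hello"', '"goodbye"']
```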
Anchors
Anchors don’t match characters — they match positions. ^ matches the start of a string, $ matches the end, and \b matches a word boundary (the transition between a word character and a non-word character).
- ^ matches the start of the string (or the start of each line in multiline mode).
- $ matches the end of the string (or the end of each line in multiline mode).
- \b matches a word boundary. \bcat\b matches cat but not caterpillar or concatenate.
Word boundaries prevent partial matches
Searching for log without boundaries matches blog, catalog, logarithm, and log. Use \blog\b to match only the standalone word. One of the most useful and underused regex features.
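A quick Python check of the log example, using the four words from the paragraph above as the test string:

```python
import re

text = "blog catalog logarithm log"

loose = re.findall(r"log", text)       # matches inside longer words too
strict = re.findall(r"\blog\b", text)  # standalone word only
span = re.search(r"\blog\b", text).span()

print(len(loose))  # 4
print(strict)      # ['log']
print(span)        # (23, 26)
```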
Groups and Alternation
Capturing groups
Parentheses () extract matched substrings. (\d{4})-(\d{2})-(\d{2}) applied to 2026-03-31 captures 2026, 03, 31.
Named groups
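(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2}) lets you reference matches by name instead of index. Far more readable in complex patterns. Syntax varies slightly between engines: Python's re module, for example, spells named groups as (?P<year>...).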
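A sketch of the date example in Python. Note one difference from the pattern above: Python's re module writes named groups as (?P<name>...), so the pattern is adapted accordingly. The sample sentence is made up.

```python
import re

# Python spells named groups (?P<name>...) rather than (?<name>...).
DATE_RE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

m = DATE_RE.search("Released on 2026-03-31.")

by_index = m.groups()    # positional access still works
by_name = m.groupdict()  # name -> captured text
year = m.group("year")

print(by_index)  # ('2026', '03', '31')
print(by_name)   # {'year': '2026', 'month': '03', 'day': '31'}
print(year)      # 2026
```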
Non-capturing groups
(?:...) groups elements without capturing. Useful when you need grouping for alternation or quantifiers but do not need the match. (?:https?|ftp):// groups protocol options without wasting a capture slot.
Alternation
| means “or”. cat|dog matches either. Use with groups: (cat|dog) food matches cat food or dog food.
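The following Python sketch contrasts capturing, non-capturing, and ungrouped alternation. The menu string and the example.org URL are placeholders chosen for illustration.

```python
import re

menu = "cat food, dog food, bird food"

# Non-capturing group: findall returns the whole match.
whole = re.findall(r"(?:cat|dog) food", menu)
# Capturing group: findall returns only what the group captured.
captured = re.findall(r"(cat|dog) food", menu)
# No group: the alternation splits the entire pattern into "cat" OR "dog food".
ungrouped = re.findall(r"cat|dog food", menu)

print(whole)      # ['cat food', 'dog food']
print(captured)   # ['cat', 'dog']
print(ungrouped)  # ['cat', 'dog food']

# The protocol group does not consume a capture slot, so group 1 is the rest.
m = re.match(r"(?:https?|ftp)://(\S+)", "ftp://example.org/file.txt")
rest = m.group(1)
print(rest)       # example.org/file.txt
```

The ungrouped case shows why alternation is usually wrapped in a group: | has the lowest precedence, so without parentheses it applies to everything on either side of it.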
Lookaheads and Lookbehinds
Zero-width assertions — they check what comes before or after the current position without consuming characters.
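- (?=...) is a positive lookahead. \d+(?= dollars) matches 100 in 100 dollars but finds nothing in 100 euros.
- (?!...) is a negative lookahead. \d+(?! dollars) matches 100 in 100 euros. In 100 dollars it still matches 10, because \d+ backtracks to a position that is not followed by " dollars"; use \b\d+\b(?! dollars) to reject the number entirely.
- (?<=...) is a positive lookbehind. (?<=\$)\d+ matches 50 in $50.
- (?<!...) is a negative lookbehind. (?<!\$)\d+ matches 50 in 50 items. In $50 it still matches the trailing 0, because that digit is preceded by 5 rather than $; use (?<![$\d])\d+ to skip the whole number.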
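The four assertions can be tried in Python as follows. This is a sketch with made-up sample strings; it also shows how a negative lookaround on a bare \d+ can still match part of a number, and one possible way to tighten the pattern.

```python
import re

prices = "100 dollars, 200 euros"

pos_ahead = re.findall(r"\d+(?= dollars)", prices)
# A bare \d+ backtracks: "10" is not followed by " dollars", so it matches.
neg_ahead_naive = re.findall(r"\d+(?! dollars)", prices)
# Word boundaries force the whole number to be considered.
neg_ahead_strict = re.findall(r"\b\d+\b(?! dollars)", prices)

print(pos_ahead)         # ['100']
print(neg_ahead_naive)   # ['10', '200']
print(neg_ahead_strict)  # ['200']

amounts = "$50 for 20 items"

pos_behind = re.findall(r"(?<=\$)\d+", amounts)
# The "0" in "$50" is preceded by "5", not "$", so it matches.
neg_behind_naive = re.findall(r"(?<!\$)\d+", amounts)
# Also forbid a preceding digit so the match cannot start mid-number.
neg_behind_strict = re.findall(r"(?<![$\d])\d+", amounts)

print(pos_behind)         # ['50']
print(neg_behind_naive)   # ['0', '20']
print(neg_behind_strict)  # ['20']
```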
Lookbehind compatibility
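Lookbehinds work in JavaScript (ES2018+), Python, Java, C#, and most modern engines, though some restrict them: Python's re module, for example, only accepts fixed-width lookbehinds. They are not supported in some older environments, and RE2-based engines such as Go's regexp package do not support lookarounds at all. If you need broad compatibility, restructure using lookaheads or capturing groups.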
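As one possible way to restructure, a lookbehind can often be replaced by matching the prefix literally and capturing only the part you want. A minimal Python sketch, with a made-up input string:

```python
import re

text = "Total: $50, deposit: $15"

# Lookbehind version: the match itself excludes the dollar sign.
with_lookbehind = re.findall(r"(?<=\$)\d+", text)

# Portable version: match the dollar sign, capture only the digits.
with_group = re.findall(r"\$(\d+)", text)

print(with_lookbehind)  # ['50', '15']
print(with_group)       # ['50', '15']
```

The capturing-group form changes what the overall match contains (it now includes the $), so it suits extraction better than in-place replacement.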
10 Practical Patterns
Paste any of these into the Regex Tester to see them work.
For real-world examples beyond this reference, the companion article "Regex Patterns You'll Actually Use" shows each pattern against real data, with commentary on edge cases.
1. Email address (simplified)
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Covers the vast majority of real email addresses. The full RFC 5322 spec is absurdly complex and not worth implementing in regex.
2. URL
https?:\/\/[^\s/$.?#].[^\s]*
Matches HTTP and HTTPS URLs. For strict validation, use your language’s URL parser.
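A sketch of how the pattern and a URL parser might be combined in Python: the regex finds candidates in free text, and urllib.parse.urlsplit from the standard library breaks them into components. The sample text and the trailing-punctuation cleanup are assumptions made for this example.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?:\/\/[^\s/$.?#].[^\s]*")

# Hypothetical sample text with a URL at the end of a sentence.
text = "Docs live at https://example.com/docs?x=1 and http://example.org."

found = URL_RE.findall(text)
print(found)  # ['https://example.com/docs?x=1', 'http://example.org.']

# The pattern swallows the sentence-ending period, so strip common
# trailing punctuation before parsing (a simple heuristic, not a rule).
cleaned = [url.rstrip(".,;") for url in found]

parts = urlsplit(cleaned[0])
print(parts.scheme, parts.netloc, parts.path, parts.query)
# https example.com /docs x=1
```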
3. UK phone number
^(?:0|\+44)\d{9,10}$
Numbers starting with 0 or +44 followed by 9-10 digits.
4. Date (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Validates format, restricts months to 01-12, days to 01-31. Does not check whether February 30th exists — use a date library for that.
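One way to combine the two steps in Python: the regex checks the shape, then datetime.date checks that the day actually exists. The function name parse_iso_date is invented for this sketch, and a capture group has been added around the year so all three parts can be extracted.

```python
import re
from datetime import date

# Same pattern as above, with an extra group around the year (an addition).
DATE_RE = re.compile(r"^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")

def parse_iso_date(text):
    """Return a date for a valid YYYY-MM-DD string, otherwise None."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None  # wrong format
    year, month, day = map(int, m.groups())
    try:
        return date(year, month, day)
    except ValueError:
        return None  # right format, impossible date

print(parse_iso_date("2026-03-31"))  # 2026-03-31
print(parse_iso_date("2026-02-30"))  # None
print(parse_iso_date("2026-13-01"))  # None
```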
5. IPv4 address
^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
Validates that each octet is in the range 0-255. Leading zeros such as 01 or 004 are also accepted.
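A short Python sketch of the pattern in use, next to the standard library's ipaddress module as a non-regex alternative. The helper names is_ipv4_regex and is_ipv4_stdlib are made up for this example.

```python
import ipaddress
import re

IPV4_RE = re.compile(
    r"^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$"
)

def is_ipv4_regex(text):
    """Check the dotted-quad format with the pattern above."""
    return IPV4_RE.fullmatch(text) is not None

def is_ipv4_stdlib(text):
    """Let the ipaddress module decide instead of a regex."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False

print(is_ipv4_regex("192.168.1.1"))   # True
print(is_ipv4_regex("256.1.1.1"))     # False
print(is_ipv4_regex("1.2.3"))         # False
print(is_ipv4_regex("01.02.03.004"))  # True: leading zeros pass the regex
print(is_ipv4_stdlib("192.168.1.1"))  # True
print(is_ipv4_stdlib("256.1.1.1"))    # False
```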
6. Hex color code
^#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
Matches #fff and #1a2b3c.
7. Strong password
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
At least 8 characters with one lowercase, one uppercase, one digit, one special character. Four positive lookaheads check each requirement independently.
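A single pass/fail regex cannot tell the user which rule failed. The sketch below, in Python, keeps the pattern above for the overall check and adds a hypothetical helper, missing_requirements, that tests each rule separately. The rule names are invented for the example.

```python
import re

STRONG_RE = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)

# One small pattern per requirement, mirroring the four lookaheads.
CHECKS = {
    "lowercase": r"[a-z]",
    "uppercase": r"[A-Z]",
    "digit": r"\d",
    "special": r"[@$!%*?&]",
}

def missing_requirements(password):
    """List the names of the rules the password does not satisfy."""
    missing = [name for name, pat in CHECKS.items() if not re.search(pat, password)]
    if len(password) < 8:
        missing.append("length")
    return missing

print(bool(STRONG_RE.fullmatch("Passw0rd!")))  # True
print(bool(STRONG_RE.fullmatch("Passw0rd#")))  # False: '#' is not in the allowed set
print(missing_requirements("password"))        # ['uppercase', 'digit', 'special']
print(missing_requirements("Ab1!"))            # ['length']
```

The second result highlights a property of the pattern itself: the final character class acts as an allow-list, so any character outside it (a space, #, or a non-ASCII letter) makes the whole password fail.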
8. HTML tag
<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>
Matches opening and closing tags. \1 backreference ensures they match. Fine for quick extraction — do not use regex to parse HTML in production.
9. Trailing whitespace
[ \t]+$
Useful for cleaning up code files. In multiline mode, it matches whitespace at the end of each line.
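In Python, multiline mode is enabled with the re.MULTILINE flag. A minimal sketch on a made-up two-line snippet:

```python
import re

# Two lines, each with trailing whitespace before the newline.
source = "def f():  \n    return 1\t\n"

# With MULTILINE, $ matches at the end of every line.
cleaned = re.sub(r"[ \t]+$", "", source, flags=re.MULTILINE)

# Without it, $ only matches at the very end (or before a final newline),
# so only the last line is cleaned.
last_only = re.sub(r"[ \t]+$", "", source)

print(repr(cleaned))    # 'def f():\n    return 1\n'
print(repr(last_only))  # 'def f():  \n    return 1\n'
```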
10. Duplicate words
\b(\w+)\s+\1\b
Catches repeated words like “the the” or “is is”. The \1 backreference matches whatever the first group captured.
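A Python sketch that both reports and removes the duplicates; the sample sentence is invented for the example.

```python
import re

DUP_RE = re.compile(r"\b(\w+)\s+\1\b")

text = "This is is a test of the the parser."

# Report each repeated word once.
found = [m.group(1) for m in DUP_RE.finditer(text)]
# Replace "word word" with a single copy of the captured word.
fixed = DUP_RE.sub(r"\1", text)

print(found)  # ['is', 'the']
print(fixed)  # This is a test of the parser.
```

The leading \b matters here: without it, the "is" at the end of "This" could pair with the following "is". Legitimate repetitions such as "that that" are matched too, so review the hits before replacing automatically.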
Common Mistakes
Forgetting to escape special characters. . matches any character, not a literal dot. Use \. for a period. Same for (, ), [, ], {, }, +, *, ?, ^, $, |, \.
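In Python, re.escape handles the escaping when the text to search for comes from a variable. A short sketch, with made-up file names and search text:

```python
import re

# An unescaped dot matches any character.
unescaped = re.findall(r"3.14", "3.14 3x14")
escaped = re.findall(r"3\.14", "3.14 3x14")
print(unescaped)  # ['3.14', '3x14']
print(escaped)    # ['3.14']

# Literal text containing metacharacters, e.g. from user input.
needle = "a+b (draft)"
haystack = "notes: a+b (draft).txt"

raw_hit = re.search(needle, haystack)             # '+' and '()' act as operators
safe_hit = re.search(re.escape(needle), haystack) # treated as literal text

print(raw_hit)           # None
print(safe_hit.group())  # a+b (draft)
```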
Greedy when you need lazy. ".*" matches from the first quote to the last quote in the entire string. Use ".*?" for the nearest closing quote.
Anchoring only one end. ^\d+ checks that the string starts with digits but says nothing about what follows. ^\d+$ ensures the entire string is digits.
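A Python sketch of the difference. It also shows a Python-specific detail that is not covered above: $ also matches just before a trailing newline, so re.fullmatch is the stricter choice for whole-string validation.

```python
import re

start_only = re.match(r"^\d+", "123abc")
both_ends = re.match(r"^\d+$", "123abc")
print(start_only.group())  # 123
print(both_ends)           # None

# In Python, $ also matches before a newline at the end of the string.
with_newline = re.match(r"^\d+$", "123\n")
strict = re.fullmatch(r"\d+", "123\n")
print(with_newline is not None)  # True
print(strict)                    # None
```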
Over-engineering validation. Regex matches format. It does not validate semantics. Match the pattern with regex, then validate the logic in code.
Catastrophic backtracking. Nested quantifiers like (a+)+ can cause exponential backtracking on almost-matching inputs in backtracking engines, which can freeze your application. Avoid quantifying a group whose contents are already quantified.
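A rough way to observe the effect, assuming Python's backtracking re engine. The helper time_failed_match and the input sizes are illustrative; timings depend on the machine, so none are quoted here. Keep n small when experimenting, because the nested pattern's running time grows very quickly with it.

```python
import re
import time

def time_failed_match(pattern, n):
    """Match against n 'a' characters followed by '!' and time the attempt."""
    subject = "a" * n + "!"  # almost matches, then fails at the last character
    start = time.perf_counter()
    match = re.match(pattern, subject)
    return match, time.perf_counter() - start

for n in (10, 14, 18):
    _, nested = time_failed_match(r"(a+)+$", n)  # nested quantifiers
    _, flat = time_failed_match(r"a+$", n)       # same language, no nesting
    print(f"n={n}: nested {nested:.4f}s, flat {flat:.4f}s")
```

Both patterns accept exactly the same strings; only the flat one fails quickly on near misses.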
Keep Going
Start with the Regex Tester and experiment against real data. Build confidence with character classes and quantifiers before tackling lookaheads. When a pattern grows beyond two lines, stop and write a proper parser instead. Regex also pairs naturally with SQL for data validation: the companion SQL cheat sheet on queries and joins explains how to use regex inside WHERE clauses in PostgreSQL and MySQL.