Introduction
Continuing the series of rarely used database features with:
The Oracle REGEXP_COUNT function is one of the most useful regular expression functions available in Oracle Database. It allows you to count how many times a pattern appears in a string using powerful regular expression matching.
It is widely used in:
- Data validation
- ETL processes
- Data quality checks
- Log analysis
- Text parsing
- Security auditing
- Reporting
- CSV and delimiter analysis
- Email and phone validation
If you work with messy real-world data, REGEXP_COUNT can save enormous amounts of SQL code.
What is REGEXP_COUNT?
REGEXP_COUNT returns the number of times a regular expression pattern occurs in a string.
Basic Syntax
REGEXP_COUNT( source_string, pattern, start_position, match_parameter)
Parameters Explained
| Parameter | Description |
| source_string | Text to search |
| pattern | Regular expression pattern |
| start_position | Optional starting position |
| match_parameter | Optional matching behavior |
Simple Example
Count vowels in a word
SELECT REGEXP_COUNT('DATABASE', '[AEIOU]') AS vowel_count FROM dual;
Output
4

Explanation:
- [] defines a character class
- [AEIOU] matches any vowel
- Oracle counts occurrences
Case Sensitive vs Case Insensitive
Case Sensitive (Default)
SELECT REGEXP_COUNT('Oracle oracle ORACLE', 'oracle') FROM dual;
Output:
1
Only lowercase oracle matches.

Case Insensitive
SELECT REGEXP_COUNT( 'Oracle oracle ORACLE', 'oracle', 1, 'i')FROM dual;
Output:
3
i means case-insensitive.

Real-Life Example — Count Words in a Sentence
Suppose you want to count how many words exist in customer comments.
SELECT REGEXP_COUNT( 'Oracle databases are powerful', '\S+') AS word_countFROM dual;
Output:
4
Explanation:
| Pattern | Meaning |
| \S+ | One or more non-space characters |

Real-Life Example — Count Commas in CSV Data
Very common in ETL and data import validation.
SELECT REGEXP_COUNT( 'John,Smith,London,UK', ',') AS comma_countFROM dual;

Output:
3
Useful for validating CSV column consistency.
Count Number of Values in CSV
A practical trick:
Number of values = commas + 1
SELECT REGEXP_COUNT( 'John,Smith,London,UK', ',') + 1 AS total_columnsFROM dual;
Output:
4

Validate Email Structure
Suppose we want to identify rows containing email addresses.
Sample Table
CREATE TABLE customers ( customer_id NUMBER, email VARCHAR2(100));Table CUSTOMERS created.
Insert Data
INSERT INTO customers VALUES (1, 'john@gmail.com');INSERT INTO customers VALUES (2, 'invalid-email');INSERT INTO customers VALUES (3, 'mary@yahoo.com');commit;

Count Valid Emails
SELECT COUNT(*)FROM customersWHERE REGEXP_COUNT( email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$') > 0;
Explanation:
| Symbol | Meaning |
| ^ | Start of string |
| $ | End of string |
| + | One or more |
| \. | Literal dot |

Find Rows with Multiple Spaces
Very common in data cleansing projects.
SELECT REGEXP_COUNT( 'Oracle Database 23c', ' {2,}') AS multiple_spacesFROM dual;
Output:
2
Explanation:
| Pattern | Meaning |
| {2,} | Two or more spaces |

Count Digits in a String
Useful for validating IDs or phone numbers.
SELECT REGEXP_COUNT( 'INV-2025-9988', '[0-9]') AS digit_countFROM dual;
Output:
8

Count Special Characters
Security and password validation frequently require this.
SELECT REGEXP_COUNT( 'Welcome@2025!', '[^A-Za-z0-9]') AS special_charactersFROM dual;
Output:
2
Explanation:
| Pattern | Meaning |
| ^ inside [] | NOT |
| [^A-Za-z0-9] | Non-alphanumeric characters |

Count Occurrences of a Word
SELECT REGEXP_COUNT( 'error warning error failed error', 'error') AS error_countFROM dual;
Output:
3

Count HTML Tags
Useful in content management systems.
SELECT REGEXP_COUNT( '<p>Hello</p><div>World</div>', '<[^>]+>') AS html_tag_countFROM dual;
Output:
4

Count Numeric Values in Mixed Data
SELECT REGEXP_COUNT( 'ABC123XYZ456', '[0-9]+') AS numeric_groupsFROM dual;
Output:
2
Explanation:
- 123
- 456
are counted as separate groups.

Real-Life ETL Validation Example
Suppose bank account numbers must contain exactly 12 digits
Sample Data
CREATE TABLE accounts (
account_no VARCHAR2(50)
);
INSERT INTO accounts VALUES ('123456789012');
INSERT INTO accounts VALUES ('ABC123');
INSERT INTO accounts VALUES ('999999999999');
commit;

Find Invalid Accounts
SELECT account_noFROM accountsWHERE REGEXP_COUNT(account_no, '^[0-9]{12}$') = 0;
Output:
ABC123

Using Start Position
You can start searching from a specific character position.
SELECT REGEXP_COUNT( 'abcabcabc', 'abc', 4)FROM dual;
Output:
2
Oracle starts searching from position 4.

Match Parameters
| Parameter | Meaning |
| i | Case-insensitive |
| c | Case-sensitive |
| n | Dot matches newline |
| m | Multi-line mode |
| x | Ignore whitespace |
Multi-Line Example
SELECT REGEXP_COUNT('ERRORWARNINGERROR','ERROR',1,'m')FROM dual;
Output:
2

Real-Life Log File Analysis
Suppose application logs are stored in a table.
Table
CREATE TABLE app_logs ( log_text CLOB);Table APP_LOGS created.
Count Errors
SELECT REGEXP_COUNT( log_text, 'ERROR') AS total_errorsFROM app_logs;

Data Quality Use Cases
REGEXP_COUNT is heavily used in enterprise data quality systems.
Examples
| Scenario | Example |
| Detect duplicate delimiters | ,, |
| Validate phone number length | Count digits |
| Validate PAN/Tax IDs | Alphanumeric checks |
| Detect malformed emails | Email regex |
| Count invalid symbols | Security screening |
| Analyze text patterns | NLP preprocessing |
| Validate CSV structure | Count delimiters |
| Check password strength | Count symbols/numbers |
Comparing REGEXP_COUNT vs LENGTH
Many developers use LENGTH(str) instead of REGEXP_COUNT(str, ‘,’)
Which is Better?
| Method | Advantage |
| REGEXP_COUNT | More readable and powerful |
| LENGTH/REPLACE | Slightly faster for simple chars |
Use REGEXP_COUNT when pattern complexity matters.
Performance Considerations
Regular expressions are powerful but can be expensive.
Best Practices
Use Simple Patterns
Avoid overly complex regex.
Bad:
([A-Za-z0-9._%+-]+)+
Better:
[A-Za-z0-9._%+-]+
Avoid Regexp on Huge Datasets Without Filters
Instead of:
SELECT * FROM big_table WHERE REGEXP_COUNT(col, 'ERROR') > 0;
Use pre-filters:
SELECT * FROM big_table WHERE col LIKE '%ERROR%' AND REGEXP_COUNT(col, 'ERROR') > 0;
Common Regexp Symbols Cheat Sheet
| Symbol | Meaning |
| . | Any character |
| * | Zero or more |
| + | One or more |
| ? | Optional |
| ^ | Start |
| $ | End |
| [abc] | Any listed char |
| [^abc] | Not listed char |
| [0-9] | Digit |
| [A-Z] | Uppercase |
| \s | Whitespace |
| \S | Non-whitespace |
Advanced Real-Life Example — Password Validation
Suppose password rules require:
- Minimum 1 uppercase
- Minimum 1 lowercase
- Minimum 1 digit
- Minimum 1 special character
Validation Query
SELECT password FROM users WHERE REGEXP_COUNT(password, '[A-Z]') >= 1 AND REGEXP_COUNT(password, '[a-z]') >= 1 AND REGEXP_COUNT(password, '[0-9]') >= 1 AND REGEXP_COUNT(password, '[^A-Za-z0-9]') >= 1;
This is widely used in security systems.
Summary
REGEXP_COUNT is extremely valuable for:
- Data validation
- Text analytics
- Cleansing messy data
- ETL pipelines
- Security validation
- Log analysis
- CSV parsing
- Pattern detection
Its biggest advantage is combining SQL querying with powerful pattern matching.
Most Useful Real-World Use Cases
| Use Case | Example |
| Email validation | Count valid patterns |
| CSV validation | Count delimiters |
| Password validation | Count digits/symbols |
| Log monitoring | Count ERROR/WARNING |
| Data cleansing | Detect extra spaces |
| Financial systems | Validate account numbers |
| Healthcare systems | Validate IDs |
| Telecom | Count digits in phone numbers |
| Retail | Analyze customer comments |
| ETL | Detect malformed records |
Final Tips
Use
REGEXP_COUNT
When:
- Pattern matching is needed
- Data is messy
- Validation rules are complex
- Parsing semi-structured text
Avoid It When:
- Simple LIKE is enough
- Performance is critical on massive tables
- Exact string matching is sufficient
possible topics for future articles, let me know your preference!
- Count vowels in employee names
- Count dots in IP addresses
- Count words in customer reviews
- Detect multiple commas
- Validate SSN format
- Count hashtags in social posts
- Count URLs in text
- Count repeated digits in phone numbers
