Site icon Radu Pârvu

REGEXP_INSTR: Powerful Patterns for SQL Queries

Introduction

We continue the series of articles on database features to often ignored with:

The Oracle REGEXP_INSTR function which is one of the most useful regular expression functions in SQL and PL/SQL. It helps you find the position of a pattern inside a string using regular expressions.

What is

Unlike the classic INSTR function, REGEXP_INSTR understands powerful pattern matching rules such as:
• Digits
• Email patterns
• Phone numbers
• Dates
• Multiple spaces
• Special characters
• Repeated words
• Word boundaries
• Case-insensitive matching

Typical use cases

It is extremely useful in:
• Data validation
• ETL pipelines
• Data cleansing
• Log analysis
• Audit systems
• Customer data processing
• Parsing semi-structured text

Syntax of REGEXP_INSTR

REGEXP_INSTR(
source_string,
pattern,
position,
occurrence,
return_option,
match_parameter,
subexpr
)

Parameters Explained

ParameterDescription
source_stringThe text to search
patternRegular expression pattern
positionStarting position (default = 1)
occurrenceWhich occurrence to find
return_option0 = start position, 1 = end position
match_parameterMatching behavior (i, c, n, m)
subexprReturns position of capturing group

Difference Between INSTR and REGEXP_INSTR

Using INSTR

Works only for exact text:

SELECT INSTR('abc123xyz', '123')
FROM dual;

Using REGEXP_INSTR

SELECT REGEXP_INSTR('abc123xyz', '[0-9]+')
FROM dual;

This works for any numeric sequence!

Basic Example

Find Position of First Digit

SELECT REGEXP_INSTR('OrderID: A12345', '[0-9]')
FROM dual;
Output:
11
Explanation:
The first digit 1 starts at position 11.

Real-Life Example — Validate Email Structure

Suppose a customer table contains email addresses.

Sample Data

CREATE TABLE customers (
customer_id NUMBER,
email VARCHAR2(100)
);
INSERT INTO customers VALUES (1, 'john.doe@gmail.com');
INSERT INTO customers VALUES (2, 'invalid-email');
INSERT INTO customers VALUES (3, 'sales@company.org');

Find Position of @ Symbol

SELECT
email,
REGEXP_INSTR(email, '@')
FROM customers;

Result:

EMAILPOSITION
john.doe@gmail.com9
invalid-email0
sales@company.org6

Real-Life Example: Detect Phone Numbers

Suppose logs contain customer notes.

CREATE TABLE support_notes (
note_text VARCHAR2(200)
);
Table SUPPORT_NOTES created.
INSERT INTO support_notes VALUES
('Customer called from 9876543210 regarding invoice');
INSERT INTO support_notes VALUES
('No contact number available');

Find 10-Digit Phone Number Position

SELECT
note_text,
REGEXP_INSTR(note_text, '[0-9]{10}')
FROM support_notes;

Real-Life Example: Extract Multiple Occurrences

Suppose a text contains several numbers.

SELECT REGEXP_INSTR(
'Invoice 100 Amount 250 Tax 30',
'[0-9]+',
1,
2
)
FROM dual;

Output:

20

Explanation:

Because Oracle returns the starting position of the second numeric match.

Understanding Occurrence Parameter

Syntax

REGEXP_INSTR(text, pattern, start, occurrence)

Example

SELECT
REGEXP_INSTR('cat dog fox dog lion', 'dog', 1, 2)
FROM dual;

Output:

13

Because the second dog begins at position 13.

Using return_option

This parameter decides whether Oracle returns:

ValueMeaning
0Start position
1Position after match ends

Example

SELECT
REGEXP_INSTR(
'ABC123XYZ',
'[0-9]+',
1,
1,
0
) AS start_pos,
REGEXP_INSTR(
'ABC123XYZ',
'[0-9]+',
1,
1,
1
) AS end_pos
FROM dual;

Result:

START_POSEND_POS
47

Case-Insensitive Search

Without Case-Insensitive Matching

SELECT REGEXP_INSTR('Oracle Database', 'database')
FROM dual;

Output:

0

With i

Match Parameter

SELECT REGEXP_INSTR(
'Oracle Database',
'database',
1,
1,
0,
'i'
)
FROM dual;

Output:

8

Real-Life Example: Detect Invalid Characters

Suppose usernames should contain only:
• Letters
• Numbers
• Underscores

Sample Data

CREATE TABLE app_users (
username VARCHAR2(50)
);
Table APP_USERS created.
INSERT INTO app_users VALUES ('john_123');
INSERT INTO app_users VALUES ('mike#admin');
INSERT INTO app_users VALUES ('sarah.smith');

Detect Invalid Characters

SELECT
username,
REGEXP_INSTR(username, '[^a-zA-Z0-9_]')
FROM app_users;
SELECT
username,
REGEXP_INSTR(username, '[^a-zA-Z0-9_]')
FROM app_users;

Result:

USERNAMEPOSITION
john_1230
mike#admin5
sarah.smith6

Find Consecutive Spaces

Very useful in data cleansing.

SELECT REGEXP_INSTR(
'Oracle Database',
'( ){2,}'
)
FROM dual;

Output:

7

Real-Life Example: Parse Log Files

Suppose application logs contain timestamps.

CREATE TABLE app_logs (
log_text VARCHAR2(500)
);
INSERT INTO app_logs VALUES
('ERROR at 2025-04-15 10:45:22 connection timeout');

Detect Timestamp Position

SELECT
REGEXP_INSTR(
log_text,
'[0-9]{4}-[0-9]{2}-[0-9]{2}'
)
FROM app_logs;

Using Capturing Groups (subexpr)

One of the most advanced features.

A capturing group is a part of the regular expression enclosed in parentheses (). Example:

([0-9]{3})-([0-9]{4})
SELECT REGEXP_INSTR(
'Phone: (123) 456-7890',
'(\([0-9]{3}\)) ([0-9]{3})-([0-9]{4})',
1,
1,
0,
NULL,
2
)
FROM dual;

output:

14

Explanation:
Returns position of second capturing group (456).

Match Parameters Explained

ParameterMeaning
iCase-insensitive
cCase-sensitive
nDot matches newline
mMulti-line matching

Real-Life Example: Find HTML Tags

Suppose HTML content is stored in database.

SELECT REGEXP_INSTR(
'<div><h1>Oracle</h1></div>',
'<[^>]+>'
)
FROM dual;

Output:

1

Performance Considerations

Regular expressions are powerful but expensive.

Faster

INSTR(column, 'ABC')

Slower

REGEXP_INSTR(column, '[A-Z]{3}')

Best Practices

Use INSTR when simple search is enough

Avoid regex for exact matches.

Use:
• INSTR → for simple exact searches
• REGEXP_INSTR → for intelligent pattern matching
That balance gives both:
• Better performance
• Cleaner SQL
• Easier maintenance

Avoid Leading Wildcards

Bad:

‘.*abc’

Better:

‘abc’

Pre-Filter Large Datasets

WHERE column_name LIKE '%@%'
--before

REGEXP_INSTR(column_name, '@[a-z]+')

Common Regular Expression Patterns

PatternMeaning
[0-9]Single digit
[0-9]+One or more digits
[A-Z]Uppercase letter
[a-z]Lowercase letter
.Any character
^Start of string
$End of string
\sWhitespace
\wWord character
[^abc]Not a, b, or c

Real Enterprise Use Cases

Banking Systems
• Detect malformed account numbers
• Validate SWIFT codes
• Parse payment messages

Telecom
• Extract phone numbers from logs
• Detect invalid customer IDs

Retail
• Clean product descriptions
• Validate coupon codes

Healthcare
• Detect patient IDs
• Parse diagnosis codes

Data Warehousing
• Standardize incoming text data
• Validate ETL input streams

Common Mistakes

Forgetting Escape Characters

Using Regexp for Everything

Avoid regex when simple SQL functions work.

Ignoring Case Sensitivity

Always specify i if needed.

Summary

REGEXP_INSTR is a highly powerful Oracle SQL function for locating complex text patterns.
It is especially useful for:
• Data validation
• Cleansing
• Parsing
• ETL
• Log processing
• Audit systems
The function becomes extremely valuable when working with:
• Emails
• Phone numbers
• Dates
• Structured text
• Semi-structured logs

Exit mobile version