REGEXP_INSTR: Powerful Patterns for SQL Queries


Introduction

We continue the series of articles on database features to often ignored with:

The Oracle REGEXP_INSTR function which is one of the most useful regular expression functions in SQL and PL/SQL. It helps you find the position of a pattern inside a string using regular expressions.

What is

Unlike the classic INSTR function, REGEXP_INSTR understands powerful pattern matching rules such as:
• Digits
• Email patterns
• Phone numbers
• Dates
• Multiple spaces
• Special characters
• Repeated words
• Word boundaries
• Case-insensitive matching

Typical use cases

It is extremely useful in:
• Data validation
• ETL pipelines
• Data cleansing
• Log analysis
• Audit systems
• Customer data processing
• Parsing semi-structured text

Syntax of REGEXP_INSTR

REGEXP_INSTR(
source_string,
pattern,
position,
occurrence,
return_option,
match_parameter,
subexpr
)

Parameters Explained

ParameterDescription
source_stringThe text to search
patternRegular expression pattern
positionStarting position (default = 1)
occurrenceWhich occurrence to find
return_option0 = start position, 1 = end position
match_parameterMatching behavior (i, c, n, m)
subexprReturns position of capturing group

Difference Between INSTR and REGEXP_INSTR

Using INSTR

Works only for exact text:

SELECT INSTR('abc123xyz', '123')
FROM dual;
SQL query showing the use of the INSTR function to find the position of '123' in 'abc123xyz', with the result indicating position 4.

Using REGEXP_INSTR

SELECT REGEXP_INSTR('abc123xyz', '[0-9]+')
FROM dual;

This works for any numeric sequence!

SQL query result showing the use of REGEXP_INSTR function to find the position of numeric characters in a string.

Basic Example

Find Position of First Digit

SELECT REGEXP_INSTR('OrderID: A12345', '[0-9]')
FROM dual;
Output:
11
Explanation:
The first digit 1 starts at position 11.
SQL query using REGEXP_INSTR function to find the position of digits in the string 'OrderID: A12345'.

Real-Life Example — Validate Email Structure

Suppose a customer table contains email addresses.

Sample Data

CREATE TABLE customers (
customer_id NUMBER,
email VARCHAR2(100)
);
INSERT INTO customers VALUES (1, 'john.doe@gmail.com');
INSERT INTO customers VALUES (2, 'invalid-email');
INSERT INTO customers VALUES (3, 'sales@company.org');
Database query output showing successful insertion of three rows into a customers table, with one row containing an invalid email address.

Find Position of @ Symbol

SELECT
email,
REGEXP_INSTR(email, '@')
FROM customers;

Result:

EMAILPOSITION
john.doe@gmail.com9
invalid-email0
sales@company.org6
SQL query output showing a list of email addresses and their respective positions of the '@' character.

Real-Life Example: Detect Phone Numbers

Suppose logs contain customer notes.

CREATE TABLE support_notes (
note_text VARCHAR2(200)
);
Table SUPPORT_NOTES created.
SQL command output showing the creation of a table named SUPPORT_NOTES, with messages indicating that three rows have been inserted.
INSERT INTO support_notes VALUES
('Customer called from 9876543210 regarding invoice');
INSERT INTO support_notes VALUES
('No contact number available');
Screenshot of a database query execution, showing two SQL insert statements and confirmation of rows inserted.

Find 10-Digit Phone Number Position

SELECT
note_text,
REGEXP_INSTR(note_text, '[0-9]{10}')
FROM support_notes;
SQL query result showing customer support notes. First entry indicates a call from a specified phone number regarding an invoice, and the second entry states there is no contact number available.

Real-Life Example: Extract Multiple Occurrences

Suppose a text contains several numbers.

SELECT REGEXP_INSTR(
'Invoice 100 Amount 250 Tax 30',
'[0-9]+',
1,
2
)
FROM dual;

Output:

20

Explanation:

Because Oracle returns the starting position of the second numeric match.

SQL query code snippet showing REGEXP_INSTR usage to extract numeric values from a string containing invoice details.

Understanding Occurrence Parameter

Syntax

REGEXP_INSTR(text, pattern, start, occurrence)

Example

SELECT
REGEXP_INSTR('cat dog fox dog lion', 'dog', 1, 2)
FROM dual;

Output:

13

Because the second dog begins at position 13.

SQL code snippet showing a query using REGEXP_INSTR function to search for 'dog' in a string.

Using return_option

This parameter decides whether Oracle returns:

ValueMeaning
0Start position
1Position after match ends

Example

SELECT
REGEXP_INSTR(
'ABC123XYZ',
'[0-9]+',
1,
1,
0
) AS start_pos,
REGEXP_INSTR(
'ABC123XYZ',
'[0-9]+',
1,
1,
1
) AS end_pos
FROM dual;

Result:

START_POSEND_POS
47
SQL query using REGEXP_INSTR function to find the start and end positions of numeric sequences in the string 'ABC123XYZ'.

Case-Insensitive Search

Without Case-Insensitive Matching

SELECT REGEXP_INSTR('Oracle Database', 'database')
FROM dual;

Output:

0

SQL query showing the use of REGEXP_INSTR function in Oracle Database, displaying results in the query output.

With i

Match Parameter

SELECT REGEXP_INSTR(
'Oracle Database',
'database',
1,
1,
0,
'i'
)
FROM dual;

Output:

8

SQL code snippet showing the use of REGEXP_INSTR function in Oracle Database to find the position of the word 'database'.

Real-Life Example: Detect Invalid Characters

Suppose usernames should contain only:
• Letters
• Numbers
• Underscores

Sample Data

CREATE TABLE app_users (
username VARCHAR2(50)
);
Table APP_USERS created.
INSERT INTO app_users VALUES ('john_123');
INSERT INTO app_users VALUES ('mike#admin');
INSERT INTO app_users VALUES ('sarah.smith');
Screenshot showing SQL commands to create a table named 'app_users' and insert user data with three sample usernames.

Detect Invalid Characters

SELECT
username,
REGEXP_INSTR(username, '[^a-zA-Z0-9_]')
FROM app_users;
SELECT
username,
REGEXP_INSTR(username, '[^a-zA-Z0-9_]')
FROM app_users;

Result:

USERNAMEPOSITION
john_1230
mike#admin5
sarah.smith6
SQL script showing a SELECT statement for usernames and their respective REGEXP_INSTR values from the app_users table.

Find Consecutive Spaces

Very useful in data cleansing.

SELECT REGEXP_INSTR(
'Oracle Database',
'( ){2,}'
)
FROM dual;

Output:

7

SQL query using REGEXP_INSTR function to find patterns in the string 'Oracle Database'.

Real-Life Example: Parse Log Files

Suppose application logs contain timestamps.

CREATE TABLE app_logs (
log_text VARCHAR2(500)
);
INSERT INTO app_logs VALUES
('ERROR at 2025-04-15 10:45:22 connection timeout');
Screenshot of a database query interface displaying an SQL insert statement for logging an error with a connection timeout.

Detect Timestamp Position

SELECT
REGEXP_INSTR(
log_text,
'[0-9]{4}-[0-9]{2}-[0-9]{2}'
)
FROM app_logs;
SQL query using REGEXP_INSTR to extract timestamps from log_text in the app_logs table, with query results showing '10' listed multiple times.

Using Capturing Groups (subexpr)

One of the most advanced features.

A capturing group is a part of the regular expression enclosed in parentheses (). Example:

([0-9]{3})-([0-9]{4})
SELECT REGEXP_INSTR(
'Phone: (123) 456-7890',
'(\([0-9]{3}\)) ([0-9]{3})-([0-9]{4})',
1,
1,
0,
NULL,
2
)
FROM dual;

output:

14
SQL query displaying the use of REGEXP_INSTR function to extract phone number format from a string.

Explanation:
Returns position of second capturing group (456).

Match Parameters Explained

ParameterMeaning
iCase-insensitive
cCase-sensitive
nDot matches newline
mMulti-line matching

Real-Life Example: Find HTML Tags

Suppose HTML content is stored in database.

SELECT REGEXP_INSTR(
'<div><h1>Oracle</h1></div>',
'<[^>]+>'
)
FROM dual;

Output:

1

SQL query displaying the use of REGEXP_INSTR function to analyze a string containing HTML tags.

Performance Considerations

Regular expressions are powerful but expensive.

Faster

INSTR(column, 'ABC')

Slower

REGEXP_INSTR(column, '[A-Z]{3}')

Best Practices

Use INSTR when simple search is enough

Avoid regex for exact matches.

Use:
• INSTR → for simple exact searches
• REGEXP_INSTR → for intelligent pattern matching
That balance gives both:
• Better performance
• Cleaner SQL
• Easier maintenance

Avoid Leading Wildcards

Bad:

‘.*abc’

Better:

‘abc’

Pre-Filter Large Datasets

WHERE column_name LIKE '%@%'
--before

REGEXP_INSTR(column_name, '@[a-z]+')

Common Regular Expression Patterns

PatternMeaning
[0-9]Single digit
[0-9]+One or more digits
[A-Z]Uppercase letter
[a-z]Lowercase letter
.Any character
^Start of string
$End of string
\sWhitespace
\wWord character
[^abc]Not a, b, or c

Real Enterprise Use Cases

Banking Systems
• Detect malformed account numbers
• Validate SWIFT codes
• Parse payment messages

Telecom
• Extract phone numbers from logs
• Detect invalid customer IDs

Retail
• Clean product descriptions
• Validate coupon codes

Healthcare
• Detect patient IDs
• Parse diagnosis codes

Data Warehousing
• Standardize incoming text data
• Validate ETL input streams

Common Mistakes

Forgetting Escape Characters

Using Regexp for Everything

Avoid regex when simple SQL functions work.

Ignoring Case Sensitivity

Always specify i if needed.

Summary

REGEXP_INSTR is a highly powerful Oracle SQL function for locating complex text patterns.
It is especially useful for:
• Data validation
• Cleansing
• Parsing
• ETL
• Log processing
• Audit systems
The function becomes extremely valuable when working with:
• Emails
• Phone numbers
• Dates
• Structured text
• Semi-structured logs


Discover more from Radu Pârvu

Subscribe to get the latest posts sent to your email.

, , , , , ,

Leave a Reply

Discover more from Radu Pârvu

Subscribe now to keep reading and get access to the full archive.

Continue reading

Discover more from Radu Pârvu

Subscribe now to keep reading and get access to the full archive.

Continue reading