Site icon Radu Pârvu

Master REGEXP_REPLACE in Oracle SQL

“If you are afraid, do not do it.
If you do it, do not
be afraid.”
-Genghis Khan

Introduction

REGEXP_REPLACE is one of the most powerful text-processing functions available in Oracle Database SQL.
It combines the flexibility of regular expressions (regexp) with the simplicity of the SQL REPLACE function, allowing you to:
• Clean messy data
• Standardize formats
• Validate input
• Mask sensitive information
• Parse strings
• Transform text dynamically

What is REGEXP_REPLACE?

REGEXP_REPLACE searches a string using a regular expression pattern and replaces matching text with another value.

Basic Syntax

REGEXP_REPLACE(
source_string,
pattern,
replace_string,
position,
occurrence,
match_parameter
)

Parameters Explained

ParameterDescription
source_stringText to search
patternRegex pattern
replace_stringReplacement text
positionStarting position (default = 1)
occurrenceWhich occurrence to replace
match_parameterMatching rules (i, c, n, m)

Simple Example

Replace all numbers

SELECT REGEXP_REPLACE('Order12345', '[0-9]', '')
FROM dual;

Output
Order
Explanation:
• [0-9] means any digit
• Every digit is replaced with an empty string

Reference

Oracle officially documents the function here: 
Oracle REGEXP_REPLACE Documentation

Why REGEXP_REPLACE is Better than REPLACE

Normal REPLACE works only with exact text.

Example:

SELECT REPLACE('abc123', '123', '')
FROM dual;

Works only if 123 exists exactly.

But:

SELECT REGEXP_REPLACE('abc123', '[0-9]', '')
FROM dual;

Removes ANY digits dynamically:

Most Important Regexp Symbols

PatternMeaning
.Any character
[0-9]Any digit
[A-Z]Uppercase letters
[a-z]Lowercase letters
+One or more
*Zero or more
^Start of string
$End of string
\sWhitespace
()Capture group

Real-Life Examples

Example 1 — Cleaning Phone Numbers

Problem

Users enter phone numbers inconsistently:
+358-40-123-4567
040 123 4567
(040)1234567
You want only digits.

Solution

SELECT REGEXP_REPLACE(
'+358-40-123-4567',
'[^0-9]',
''
)
FROM dual;
Output
358401234567
Explanation

Example 2 — Remove Multiple Spaces

Problem

Imported data often contains irregular spaces.

John Smith

Solution

SELECT REGEXP_REPLACE(
'John Smith',
' {2,}',
' '
)
FROM dual;

Output

John Smith

Explanation

• {2,} means “2 or more”
• Multiple spaces become one space

This is one of the most common data-cleaning operations in ETL systems.

Example 3 — Mask Credit Card Numbers

Problem

Display only the last 4 digits.

Solution

SELECT REGEXP_REPLACE(
'4532123412345678',
'([0-9]{12})([0-9]{4})',
'************\2'
)
FROM dual;
************5678
Output
************5678

Use Cases

Example 4 — Reformat Dates

Problem

Convert:
2026-05-20
to:
20/05/2026

Solution

SELECT REGEXP_REPLACE(
'2026-05-20',
'([0-9]{4})-([0-9]{2})-([0-9]{2})',
'\3/\2/\1'
)
FROM dual;

Output

20/05/2026

Understanding Capture Groups

([0-9]{4})
This becomes group \1
([0-9]{2})
This becomes \2

Another becomes \3

Then:
‘\3/\2/\1’
rearranges the values.

Example 5 — Convert CamelCase to snake_case

Problem

Developers often need naming conversions.

Input

CustomerAddressLine

Solution

SELECT LOWER(
REGEXP_REPLACE(
'CustomerAddressLine',
'([A-Z])',
'_\1'
)
)
FROM dual;

Output

_customer_address_line

Example 6 — Remove HTML Tags

Problem

Imported text contains HTML.

<p>Hello World</p>

Solution

SELECT REGEXP_REPLACE(
'<p>Hello World</p>',
'<[^>]+>',
''
)
FROM dual;

Output

Hello World

Very useful in:
• CMS systems
• Data migration
• Web scraping pipelines

Example 7 — Validate Email-Like Structure

Although validation is usually done with REGEXP_LIKE, cleanup is often done with REGEXP_REPLACE.

Remove invalid characters

SELECT REGEXP_REPLACE(
'john#$%@example.com',
'[^A-Za-z0-9@._-]',
''
)
FROM dual;

Output

john@example.com

Example 8 — Keep Only Alphabets

Problem

Input:
EMP-2026-ABC-100

Need:
EMPABC

Solution

SELECT REGEXP_REPLACE(
'EMP-2026-ABC-100',
'[^A-Za-z]',
''
)
FROM dual;

Output

EMPABC

Example 9 — Replace Only First Occurrence

Input

cat cat cat

Solution

SELECT REGEXP_REPLACE(
'cat cat cat',
'cat',
'dog',
1,
1
)
FROM dual;

Output

dog cat cat

Example 10 — Case-Insensitive Replacement

Input

Oracle oracle ORACLE

Solution

SELECT REGEXP_REPLACE(
'Oracle oracle ORACLE',
'oracle',
'DB',
1,
0,
'i'
)
FROM dual;

Output

DB DB DB

Important Match Parameters

ParameterMeaning
iCase-insensitive
cCase-sensitive
nDot matches newline
mMulti-line mode

Working with Capture Groups

Capture groups are essential for advanced transformations.

Example — Reformat Phone Number
Input
0401234567
Query

SELECT REGEXP_REPLACE(
'0401234567',
'([0-9]{3})([0-9]{3})([0-9]{4})',
'(\1) \2-\3'
)
FROM dual;

Output

(040) 123-4567

Performance Considerations

Regexp is powerful but expensive.


Use simple string functions when possible:
• REPLACE
• TRANSLATE
• SUBSTR
• INSTR

Regexp should be used when:
• Pattern matching is dynamic
• Data formats vary
• Multiple transformations are needed

Common Mistakes

Mistake 1 — Forgetting Escape Characters

Mistake 2 — Greedy Matching

Problematic:
‘<.*>’
May consume too much text.

Better:
‘<[^>]+>’

Mistake 3 — Using Regexp Everywhere


Regexp is slower than simple functions.


Avoid:
REGEXP_REPLACE(col, ‘a’, ‘b’)
Prefer:
REPLACE(col, ‘a’, ‘b’)

Real Enterprise Use Cases

ETL/Data Warehousing
Cleaning imported CSV data.

Banking Systems
Masking customer information.

Healthcare
Removing invalid patient identifiers.

ERP Applications
Standardizing product codes.

Log Analysis
Extracting structured information from logs.

Summary

REGEXP_REPLACE is one of the most valuable Oracle SQL functions for real-world data transformation.
It helps solve problems involving:
• Dirty data
• Text normalization
• Parsing
• Data masking
• Formatting
• Validation support
Mastering it can significantly improve your SQL development and ETL capabilities.

Exit mobile version