Column Level Collation Example

The feature was first introduced in 12.2 . It basically determines how the string comparison takes place. In order to use the feature, extended data types should be enabled.

I work on an great autonomous TP instance and it is already enabled:

select name, value from v$parameter
where name in (lower('MAX_STRING_SIZE'), lower('COMPATIBLE'))
;

compatible	18.0.0
max_string_size	EXTENDED

The feature is supported at different levels:

column
table
schema
session
database
statement

In this post, we will look into column level collation. For the demo purpose, I will use SQL*Plus avoiding SQL Developer which has quite complex GUI based NLS settings. We first set the NLS_LANG environment variable:

radu.parvu:~\==> 
radu.parvu:~\==> export NLS_LANG=AMERICAN_AMERICA.UTF8
radu.parvu:~\==> 
radu.parvu:~\==> sqlplus admin@training2_high

SQL*Plus: Release 18.0.0.0.0 Production on Tue Apr 7 21:39:49 2020
Version 18.1.0.0.0

Copyright (c) 1982, 2018, Oracle.  All rights reserved.

Enter password: 
Last Successful login time: Tue Apr 07 2020 21:27:35 +03:00

Connected to:
Oracle Database 18c Enterprise Edition Release 18.0.0.0.0 - Production
Version 18.4.0.0.0

SQL>

To check the normal behaviour, let us create a table with two columns:

SQL> ed
Wrote file afiedt.buf

  1  create table name_table
  2  (
  3  first_name VARCHAR2(20),
  4  last_name	VARCHAR2(20)
  5* )
SQL> /

Table created.

Next we insert four rows with similar data only differing by case or accent:

SQL> insert into name_table values ('Radu', 'Pârvu');

1 row created.

SQL> insert into name_table values ('RAdu', 'PÂrvu');

1 row created.

SQL> insert into name_table values ('Radu', 'Parvu');       

1 row created.

insert into name_table values ('RAdu', 'PArvu')
SQL> /

1 row created.

SQL> commit
  2  ;

SQL> column FIRST_NAME format A20
SQL> column last_name format A20
SQL> /

FIRST_NAME	     LAST_NAME
-------------------- --------------------
Radu                 Pârvu
RAdu                 PÂrvu
Radu		     Parvu
RAdu		     PArvu

If we do an equality test we will get only one row for an exact match. In a GROUP BY operation each row will appear as unique.

SQL> select * from name_table
  2  where last_name = 'PÂrvu'
  3  ;

FIRST_NAME	     LAST_NAME
-------------------- --------------------
RAdu                 PÂrvu

  1  select first_name, last_name, count(*)
  2  from name_table
  3  group by first_name, last_name
  4* order by first_name, last_name
SQL> /

FIRST_NAME	     LAST_NAME		    COUNT(*)
-------------------- -------------------- ----------
RAdu		     PArvu			   1
RAdu                 PÂrvu                         1
Radu		     Parvu			   1
Radu                 Pârvu                         1

SQL> select * from name_table
  2  	where first_name = 'radu'
  3  /

no rows selected

Next, let us re’create the table with the first_name column is case insensitive and re’run the tests:

SQL> prompt Create table with the first name case insensitive
Create table with the first name case insensitive
SQL> create table name_table
  2    (
  3    first_name	VARCHAR2(20) collate binary_ci,
  4    last_name	VARCHAR2(20)
  5   )
  6  /

Table created.

SQL> select * from name_table
  2  	where last_name = 'PÂrvu'
  3  /

FIRST_NAME	     LAST_NAME
-------------------- --------------------
RAdu                 PÂrvu

SQL> pause

SQL> 
SQL> select first_name, last_name, count(*)
  2  	from name_table
  3  	group by first_name, last_name
  4  	order by first_name, last_name
  5  /

FIRST_NAME	     LAST_NAME		    COUNT(*)
-------------------- -------------------- ----------
RAdu		     PArvu			   1
Radu		     Parvu			   1
RAdu                 PÂrvu                         1
Radu                 Pârvu                         1

SQL> pause

SQL> 
SQL> select * from name_table
  2  	where first_name = 'radu'
  3  /

FIRST_NAME	     LAST_NAME
-------------------- --------------------
Radu                 Pârvu
RAdu                 PÂrvu
Radu		     Parvu
RAdu		     PArvu

As we can see, now the test query comparing the first name is completely case insensitive! All other test queries return the same, as expected. Next, let us make the last_name column accent insensitive and re test adding one extra query for matching the last_name to the string ‘Parvu’:

SQL> create table name_table
  2    (
  3    first_name	VARCHAR2(20) collate binary_ci,
  4    last_name	VARCHAR2(20) collate binary_ai
  5   )
  6  /

SQL> select * from name_table
  2  	where last_name = 'PÂrvu'
  3  /

FIRST_NAME	     LAST_NAME
-------------------- --------------------
Radu                 Pârvu
RAdu                 PÂrvu
Radu		     Parvu
RAdu		     PArvu

SQL> pause

SQL> 
SQL> select first_name, last_name, count(*)
  2  	from name_table
  3  	group by first_name, last_name
  4  	order by first_name, last_name
  5  /

FIRST_NAME	     LAST_NAME		    COUNT(*)
-------------------- -------------------- ----------
Radu                 Pârvu                         4

SQL> pause

SQL> 
SQL> select * from name_table
  2  	where first_name = 'radu'
  3  /

FIRST_NAME	     LAST_NAME
-------------------- --------------------
Radu                 Pârvu
RAdu                 PÂrvu
Radu		     Parvu
RAdu		     PArvu

SQL> select * from name_table where last_name = 'Parvu';

FIRST_NAME	     LAST_NAME
-------------------- --------------------
Radu                 Pârvu
RAdu                 PÂrvu
Radu		     Parvu
RAdu		     PArvu

In conclusion, we see that when defining a column to be case or accent insensitive, the SQL using that column in the predicate return a result set ignoring the case (or accent and case) in the column values being compared.

In the next post, we will look into using collation at statement level.

References:

https://docs.oracle.com/en/database/oracle/oracle-database/18/spuss/enabling-the-new-extended-data-type-capability.html#GUID-88FB7FFD-4392-49C6-843A-45B49F8A1821

Tech & lifestyle

Since 2016