Oracle PL SQL

A category holding articles for Oracles programming language PL SQL.

Recently I had to extract information from a field using Oracles function RegExp_SubStr().

The function is nice but it does only deliver one match to a pattern at a time. Therefore I implemented a PL SQL function, which returns a collection of matches to a handed regular expression.

I do not want to discuss the function in detail here since the Oracle documentation is detailed enough. See link provided under References at the end of this article.

Source Code

First we need to define a new type in which we can collect the matches in that will be returned. Since I only process Strings of type VarChar2 the following Type definition is sufficient. In general the function RegExp_SubStr accepts all kinds of text datatypes like CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. In order to provide support for the different datatypes the implementation has to be adjusted to your needs or overridden. But this is not part of this article since I trust you to be able to adjust it accordingly.

Type textCollection Is Table Of VarChar2(256 Char) Index By Binary_Integer;

The above defines a 1 based collection, which is internally a Table with one column of type accessible like an Array via an integer index.

After defining the type like this I can reference it using the following syntax.

...
IS
 aColl textCollection;
BEGIN
 aColl(1) := 'Hello World'
 aColl(2) := 'another String'
...
END;

In order to get only unique matches I also defined a hashCollection, which allows me to quickly check, if a match has already been added to a collection to be returned.

Type hashCollection Is Table Of VarChar2(256 Char) Index By VarChar2(256 Char);

The above provides a collection which always associates a String key to a String value. It behaves like a HashTable. I do not know about the internal order of the type defined above, but its objective is anyway to provide values for keys not indices.

Now that I have all needed collections prepared I can use them in a Function to get all unique matches to a regular expression from a provided String. 

/* ------------------------------------------------------------------------ */
/*
Author
	Alexander Bolte

ChangeDate
	2014-04-29

Description
	Creates a collection of matches for a provided pattern,
	identified in given String using regular expression.

Param aText
	VarChar2 holding the text, which is beeing sought.
Param pattern
	a valid regular expression, which is beeing searched.

Returns
	an indexed Table holding one column of VarChar2.

Remarks
	the returned collection only contains uniique values. Duplicates
	are ignored.
*/
FUNCTION getRegExpColl(
	aText   IN VARCHAR2,
	pattern IN VARCHAR2)
RETURN textCollection
IS
	matches textCollection;
	hasMatch hashCollection;
	aMatch VARCHAR2(256 CHAR);
	i      NUMBER := 0;
BEGIN
	LOOP
	i      := i + 1;
	aMatch := RegExp_SubStr(aText, pattern, 1, i, 'im');
	EXIT
	WHEN aMatch IS NULL;
		IF NOT hasMatch.Exists(aMatch) THEN
			matches(i)       := aMatch;
			hasMatch(aMatch) := aMatch;
		END IF;
	END LOOP;
	hasMatch.Delete;
	RETURN matches;
END;

The parameter 'im' handed to RegExp_SubStr tells the function to behave case-insensitive ('i') and to treat the source string searched for patterns as having multiple lines ('m'). Equals 'im'. This is a very important parameter as it has influence not only on the behavior of the function but also the syntax and contents in regular expressions. Read the Oracle documentation to RegExp_SubStr carefully before using above function. You might experience it not working, if you do not adjust it to your needs.

Clearing a collection can be applied using the following command.

aColl.Delete 

Example usage (not tested)

Extract 4 digit numbers from a text.

IS 
    matches textCollection;
    i Number := 0;
BEGIN
    matches := getRegExpColl('a text too 78541, 5649 extract foo bar 1258 from', '[0-9]{4}');
    FOR i IN 1 .. matches.count
    LOOP
      dbms_output.put_line(matches(i));
    END LOOP;
END;

Result will look as follows (not tested).

5649
1258

References

Oracle documentation on the function RegExp_SubStr.

http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions131.htm 

Unfortunately you will not always have all rights on the database you would want or need to fulfill your tasks.

Especially when you have to do performance tuning on Queries and you do not have the right to use the Oracle Advisor you can get frustrated because the DBA is not available as you only do stuff on a UAT or even SIT. "Sorry, the DBA is busy with PRD." Nice!

When in addition your company decided to switch off automated update of statistics for the Optimizer, you want to pull out your hair.

Recently I had to rebuild some indexes, since they met two major conditions under which an index rebuild most likely will improve performance.

According to Burleson, an index rebuild can help under the following conditions.

http://www.dba-oracle.com/art_dbazine_idx_rebuild.htm

  • High index fragmentation:  The SQL workload has lots of table DML causing lots of deleted leaf blocks.
  • High index scan access plans:  The SQL workload is rich with index scans (index fast-full scans and index range scans)

The second condition definetely holds valid for my clients database. The application, RSA IGL is used for over 10 years without any archiving or regular house keeping and up to 100 Million entries in several important tables.