Oracle Database

Recently I had to extract information from a field using Oracles function RegExp_SubStr().

The function is nice but it does only deliver one match to a pattern at a time. Therefore I implemented a PL SQL function, which returns a collection of matches to a handed regular expression.

I do not want to discuss the function in detail here since the Oracle documentation is detailed enough. See link provided under References at the end of this article.

Source Code

First we need to define a new type in which we can collect the matches in that will be returned. Since I only process Strings of type VarChar2 the following Type definition is sufficient. In general the function RegExp_SubStr accepts all kinds of text datatypes like CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. In order to provide support for the different datatypes the implementation has to be adjusted to your needs or overridden. But this is not part of this article since I trust you to be able to adjust it accordingly.

Type textCollection Is Table Of VarChar2(256 Char) Index By Binary_Integer;

The above defines a 1 based collection, which is internally a Table with one column of type accessible like an Array via an integer index.

After defining the type like this I can reference it using the following syntax.

...
IS
 aColl textCollection;
BEGIN
 aColl(1) := 'Hello World'
 aColl(2) := 'another String'
...
END;

In order to get only unique matches I also defined a hashCollection, which allows me to quickly check, if a match has already been added to a collection to be returned.

Type hashCollection Is Table Of VarChar2(256 Char) Index By VarChar2(256 Char);

The above provides a collection which always associates a String key to a String value. It behaves like a HashTable. I do not know about the internal order of the type defined above, but its objective is anyway to provide values for keys not indices.

Now that I have all needed collections prepared I can use them in a Function to get all unique matches to a regular expression from a provided String. 

/* ------------------------------------------------------------------------ */
/*
Author
	Alexander Bolte

ChangeDate
	2014-04-29

Description
	Creates a collection of matches for a provided pattern,
	identified in given String using regular expression.

Param aText
	VarChar2 holding the text, which is beeing sought.
Param pattern
	a valid regular expression, which is beeing searched.

Returns
	an indexed Table holding one column of VarChar2.

Remarks
	the returned collection only contains uniique values. Duplicates
	are ignored.
*/
FUNCTION getRegExpColl(
	aText   IN VARCHAR2,
	pattern IN VARCHAR2)
RETURN textCollection
IS
	matches textCollection;
	hasMatch hashCollection;
	aMatch VARCHAR2(256 CHAR);
	i      NUMBER := 0;
BEGIN
	LOOP
	i      := i + 1;
	aMatch := RegExp_SubStr(aText, pattern, 1, i, 'im');
	EXIT
	WHEN aMatch IS NULL;
		IF NOT hasMatch.Exists(aMatch) THEN
			matches(i)       := aMatch;
			hasMatch(aMatch) := aMatch;
		END IF;
	END LOOP;
	hasMatch.Delete;
	RETURN matches;
END;

The parameter 'im' handed to RegExp_SubStr tells the function to behave case-insensitive ('i') and to treat the source string searched for patterns as having multiple lines ('m'). Equals 'im'. This is a very important parameter as it has influence not only on the behavior of the function but also the syntax and contents in regular expressions. Read the Oracle documentation to RegExp_SubStr carefully before using above function. You might experience it not working, if you do not adjust it to your needs.

Clearing a collection can be applied using the following command.

aColl.Delete 

Example usage (not tested)

Extract 4 digit numbers from a text.

IS 
    matches textCollection;
    i Number := 0;
BEGIN
    matches := getRegExpColl('a text too 78541, 5649 extract foo bar 1258 from', '[0-9]{4}');
    FOR i IN 1 .. matches.count
    LOOP
      dbms_output.put_line(matches(i));
    END LOOP;
END;

Result will look as follows (not tested).

5649
1258

References

Oracle documentation on the function RegExp_SubStr.

http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions131.htm 

In order to select a value / a result from a query into a variable in PL SQL, you can use the following syntax.

Recently I had to check over and over again, if tables, which I needed for testing are existing in an Oracle database.

Since checking for existance and completion repeatedly involved the same tasks I decided to implement little PL SQL script, which does the task for me.

SQL

Subcategories

This category will hold examples for Oracle SQL.

This category will hold Oracle SQL Developer specific articles.

A category holding articles for Oracles programming language PL SQL.