My SQL skills are good, and I have rarely faced major performance issues when using a real database. With MS Access I did have performance issues, of course, but I could usually solve them by aggregating data in temp tables as early as possible. On a real database server, even one running only locally on a laptop, I never had issues. Maybe that is because I always try to follow some basics, like using indices and appropriate joins, and of course testing properly. Another reason might be that we have a pretty sick database server, which others would envy us for :0).
Today a new functionality I implemented was supposed to go live, but it did not pass User Acceptance Testing because the performance was very bad. What should have run in only a few minutes did not finish within three hours, and we had to kill the session.
Check all used views
So, we started to look into the issue. Looking at the execution plans of the queries, the DBA's first suspect was a view we used, which is defined to use a regular expression to split one field in its source table into two new fields.
We thought, and still think, that what he said makes sense: using a regular expression in a view is a bad idea. To be honest, I had not checked the referenced view in detail before using it, and you can imagine I was not happy.
So we had a look at the SQL in order to fix the performance issue based on the DBA's insight. First I executed the SQL statements one by one in order to find the bad guy, but even the very big queries were running fine in SQL Developer the next day.
Explain the execution plan
Next I was told to have a look at the execution plans of the single SQL statements. Since I do not have sufficient rights on UAT to look at the execution plans of queries executed in the past, I had to use the "explain plan for ..." command in SQL Developer. This creates an execution plan for a given query using the current state of the database, i.e. statistics for tables and other objects.
Use the following command to create an execution plan for a SQL statement without actually executing it.
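A minimal sketch of the command; the table and filter below are hypothetical stand-ins for whatever statement you actually want to analyze:

```sql
-- Create an execution plan for the query after the FOR keyword
-- without executing the query itself. EMPLOYEES and the filter
-- are placeholders for your own statement.
EXPLAIN PLAN FOR
SELECT e.employee_id, e.last_name
  FROM employees e
 WHERE e.department_id = 10;
```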
So far so good. If the query is correct, you will receive a message like "Explained.". If a query is not correct, you will receive an error message.
Display the last execution plan explanation
Now, to display an explanation for an execution plan, you have to execute the following command. This displays a tabular overview of the last created execution plan with all details. If you need less detail, you can change the last parameter according to Oracle's blog post (the link can be found in the references at the end of this article).
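This can be done with the DBMS_XPLAN package; the format parameter 'ALL' is one of several documented options:

```sql
-- Display the most recently explained plan from the plan table.
-- The third parameter controls the level of detail.
SELECT plan_table_output
  FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, NULL, 'ALL'));
```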
Below is a less detailed execution plan for the above SQL. With 'ALL' you would receive a lot more detail.
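As an illustrative sketch, the 'BASIC' format for a simple indexed query might look roughly like this; the object names are hypothetical, and the exact output depends entirely on your own data and statistics:

```sql
SELECT plan_table_output
  FROM TABLE(DBMS_XPLAN.DISPLAY(NULL, NULL, 'BASIC'));

-- ---------------------------------------------------
-- | Id  | Operation                   | Name        |
-- ---------------------------------------------------
-- |   0 | SELECT STATEMENT            |             |
-- |   1 |  TABLE ACCESS BY INDEX ROWID| EMPLOYEES   |
-- |   2 |   INDEX RANGE SCAN          | EMP_DEPT_IX |
-- ---------------------------------------------------
```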
I will not explain here how to read an execution plan for now, since this can be found in Oracle's blog post.
Interpreting the execution plan explanation
However, I do want to take a closer look at the "Operation", "Name" and "Rows" columns.
The "Name" provides the name of the object, which is referenced may it be a table or a view.
The operation can give you a hint at full table scans executed in a query. Full table scans are expensive, so I started by looking into the impacted tables to ensure they had indices on the fields used in joins and filters.
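One way to check the existing indices of a table is to query the data dictionary; the table name below is a hypothetical example:

```sql
-- List all indexed columns of a given table.
SELECT index_name, column_name, column_position
  FROM all_ind_columns
 WHERE table_name = 'EMPLOYEES'
 ORDER BY index_name, column_position;
```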
Since this was the case for all tables, except of course for the view using regular expressions, I asked a more experienced programmer what I could do. He said that I should have a look at the number of "Rows" in the execution plan, especially for potentially big tables or tables which are joined to big tables. If any table access showed only one or a few rows although there should be far more records in that table, the statistics might be outdated.
I looked at him with a big question mark above my head, and he further explained that outdated statistics for tables cause the Oracle optimizer to fail, as it may build Cartesian joins, which return a join of every row in table A with every row in table B. This of course causes very bad performance, as the result set, which afterwards has to be narrowed down, can be huge depending on the number of rows in the referenced tables. It is like a join without a join condition.
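To illustrate, the following hypothetical queries show the difference; the first produces a Cartesian product of the two tables, the second a proper join:

```sql
-- Cartesian product: every row of DEPARTMENTS is combined with
-- every row of EMPLOYEES (e.g. 100 x 10,000 = 1,000,000 rows).
SELECT d.department_name, e.last_name
  FROM departments d, employees e;

-- Proper join: only matching rows are combined.
SELECT d.department_name, e.last_name
  FROM departments d
  JOIN employees e ON e.department_id = d.department_id;
```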
So my question was: why are the statistics outdated? He answered that one of the applications (RSA - IMG) using the database requires the automated updating of statistics to be switched off in Oracle. Great!
In most environments this will NOT be the reason for your SQL to show bad performance.
Manually update statistics for one table
So we used the following command to update the statistics at least for the tables referenced in my SQL.
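We used the DBMS_STATS package for this; the schema and table names below are placeholders:

```sql
-- Gather fresh optimizer statistics for one table.
-- Replace owner and table name with your own.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'MY_SCHEMA',
    tabname => 'EMPLOYEES');
END;
/
```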
After we had updated the statistics, I explained the plans for some big SQL statements again, and the execution plans now showed the right counts for "Rows".
Alternatively, you can use the following SQL to check when the statistics for one or several tables were last updated.
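The data dictionary records when statistics were last gathered in the LAST_ANALYZED column; the table names here are hypothetical:

```sql
-- Show when optimizer statistics were last gathered per table.
SELECT table_name, num_rows, last_analyzed
  FROM all_tables
 WHERE table_name IN ('EMPLOYEES', 'DEPARTMENTS');
```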
I executed the newly implemented function, triggering all the unchanged queries again. What had not finished in three hours the day before now succeeded in roughly 30 seconds. Wow! So simple and still so annoying.
The new function we implemented could not go live because of the switched-off automated updating of statistics in the database. I wish someone had told me beforehand. On SIT everything ran fine, but of course we had much less data there.
The next step will be to call an existing PL/SQL procedure from Java to update the statistics for all impacted tables after the new data has been imported into our staging tables. I am not happy with this solution and will try to find out if it is really necessary to keep the automated statistics update switched off to run RSA IMG. The more experienced programmer could not tell me why it has to be switched off, just that it is recommended.
I asked him how he and the others are supposed to do their jobs properly, but he just sighed. The database server would be able to do much more complex things without even breaking a sweat if the statistics were updated automatically.
Anyhow, I hope this article gives you some hints on how to analyze the performance of your Oracle SQL queries.