Wednesday, September 19, 2007

Functions

Over the years I have come across lots of developers. When they discover they can use functions in SQL statements they end up using them every where. Coming from that background myself I can see how they can be very seductive. The idea was always to try to reuse as much code as possible. You wanted to make things as modular as possible. Functions allowed you do that. Using functions you could effectively outer join a table or tables to your main query. They seem great by allowing you to cut your query into chunks and even reuse the pieces. Of course, when you see how a function laden query performs you start to see the problem. Developers never believe me when I say they need to remove some or all the functions in the query. I am not one that says you can never use you a function. You just need to be prepared for how much performance functions can suck up. The bigger your resultset, the worse your performance.

Here is a very simple example.

function fn_1 (p1 in number) return number is
v_cnt number;
begin
select count(distinct col3) into v_cnt
from sometable2
where col1 = p1;
return v_cnt;
end;

Select col1, fn_1(col1) fn1
from sometable1 a
where col2 = 'blah';

I had actually come across one query that had 8 functions in it. The developer wanted me to do my magic and make the database go faster. Of course, its not the database having issues its the SQL statement.

A simple solution to my example above is something like this.

Select col1, count(distinct b.col3) cnt
from sometable1 a, sometable2 b
where a.col1 = b.col1 (+)
and a.col2 = 'blah'
group by a.col1;

A single query like this allows the database to evaluate the full task and do that SQL operation. When you have something like the original query, the database executes the query then loops through and executes the function for each row. Looking at the explain plan for the original query you would never see the work that is being done in the function. The function's work would be hidden. You would have to pull out the SQL and run explain plans on each query inside the function to know what its doing. Every step you do is one more step that slows down the query. Developers love using functions and Oracle has responded by adding a special cache in 11g that will help speed up function laden queries. Of course even with a cache you will never go faster than the just doing things in one query. The cache would help functions that get called with the same input often. So if your function took the 2 character state code and returned the full state name, its possible that most of those calls would be cached after a while. However something like taking in an account number and returning a balance would probably never be in the cache.

The main thing know about functions that reference tables being used in SQL is to try to not use them if at all possible, functions that are passed parameters and do logic with them can make the main query simpler. Ask "Is there a table in my function?" Yes - bad. No - good. Sometimes it makes sense to use them and its nice to have the flexibility to do that. I would say using them should be the exception for your code and not the rule.

No comments: