The use of dynamic vs aggregated functions
October 3rd, 2007 at 1:00 pm by JasperAn eyeopener for me when constructing sql queries in a complex environment was the use of analytic functions. As appose to aggregated functions, you can use more analytic functions on one table instead of constructing multiple inline views to accomplish “the same”.
For those of you who like to see examples:
select job_id
, sum(salary)
from employees
group by job_id;
has the same result as:
select distinct job_id
, sum(salary) over (partition by job_id)
from employees;
But using the second query you could also add for example, the avarage salary per job:
select distinct job_id
, sum(salary) over (partition by job_id)
, avg(salary) over (partition by job_id)
from employees;
If I would want this result using an aggregated function, I would have to use a inline views, like this:
select s.job_id, s.ssal, a.asal
from ( select job_id, sum(salary) ssal
from employees
group by job_id
) s
, ( select job_id, avg(salary) asal
from employees
group by job_id
) a
where s.job_id = a.job_id
order by s.job_id
;
So that makes the code much more readable. Advantage number one…
But also, something that can’t be accomplished using aggregated functions, is selecting other data besides the grouped data from the table. Such as:
select job_id
, first_name
, sum(salary) over (partition by job_id) ssal
, avg(salary) over (partition by job_id) asal
from employees
order by job_id
;
Now I can make much better use of this data, and it’s less code!
Sort of automatically I figured that using analytic functions would also be quicker. Because after all, we would only have to do a full table scan once. But I found myself to be greatly mistaken when I checked the explain plans of two different statements, one using aggregated and one using analytic functions. Explain plan for the query above using aggregated function vs the analytic functions shows that the aggregated function has much better performance.
So by using analytic functions on large sets of data… causes some performance difficulties. So using aggregated functions would be better here?
I started searching the net for explanations. Discovering first of all, that analytic functions should not be used to aggregate. Because when you aggregate you return only the aggregated rows. By using a distinct in the query’s above using the analytic functions, you get the same result, but for every row that is selected, the analytic function is added. So logically this is costing us performance.
I discovered this by reading an article on asktom.com. So I guess I should always ask myself the question: do I want to analyse or aggregate?…
Other articles show that the use of analytic of aggregated functions is all up to you. What’s your gain? What’s your personal preference?
I think analytic functions are fantastic and I couldn’t stop using them even if I wanted to
But I guess it’s up to you to deside…
Popularity: 289 points


October 3rd, 2007 at 7:52 pm
On your sum-and-average example, what’s wrong with:
select job_id
, sum(salary)
, avg(salary)
from employees
group by job_id;
??
October 4th, 2007 at 7:12 am
You’re absolutely right. Whatever grouped function you would use would be on the same grouped data for that select statement anyway. I guess the advantage lies more in the remaining data you can select without creating inline views. Thanks for the comment.