SQL Cookbook (Appendix): Window Functions

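Window functions perform aggregate computations over a specified set of rows (a group). The difference from ordinary aggregates is that a window function can return multiple values per group (one for each row), whereas an aggregate function returns only a single value. The set of rows the aggregation operates on is called a "window" (hence the term "window function"). In Oracle these are called analytic functions.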

Window Operations

To count all the employees in the company, the traditional approach is to run COUNT(*):

select count(*) from emp;

Sometimes, however, we need to access this kind of aggregate result from non-aggregated rows, or from rows aggregated along a different dimension.

select ename,
       deptno,
       count(*) over() as cnt
    from emp
order by 2;

The OVER keyword indicates that COUNT is invoked as a window function rather than as an ordinary aggregate function.

Order of Evaluation

Here we add a WHERE clause to the query from the previous section to filter out the employees whose DEPTNO is 20 or 30.

select ename,
       deptno,
       count(*) over() as cnt
    from emp
where deptno = 10
order by 2;

This example shows that window functions are evaluated only after clauses such as WHERE and GROUP BY have been processed.
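To make the evaluation order concrete, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. The six-row emp table is an illustrative stand-in, not the full table used in the book, and it assumes the bundled SQLite supports window functions (3.25 or later).

```python
import sqlite3

# Illustrative sample data (assumption: a cut-down stand-in for the EMP table).
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, deptno integer)")
conn.executemany("insert into emp values (?, ?)", [
    ("CLARK", 10), ("KING", 10), ("MILLER", 10),
    ("SMITH", 20), ("ADAMS", 20), ("ALLEN", 30),
])

# No WHERE clause: the window spans all six rows.
all_rows = conn.execute(
    "select ename, count(*) over() as cnt from emp"
).fetchall()

# WHERE runs first, so the window only sees the three surviving rows.
filtered = conn.execute(
    "select ename, count(*) over() as cnt from emp where deptno = 10"
).fetchall()

# GROUP BY also runs first: the window counts groups, not employees.
grouped = conn.execute("""
    select deptno,
           count(*)        as emp_cnt,   -- ordinary aggregate, per group
           count(*) over() as dept_cnt   -- window over the grouped rows
      from emp
     group by deptno
     order by deptno
""").fetchall()

print(all_rows)
print(filtered)
print(grouped)
```

In the grouped query, dept_cnt is 3 on every row because only three rows (one per department) are left by the time the window function runs.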

Partitions

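You can use the PARTITION BY clause to partition (or group) rows and run the aggregation on each partition. As we saw in the earlier examples, when the OVER keyword is followed by empty parentheses, the window function treats the entire result set as a single partition. It therefore helps to think of PARTITION BY as a "dynamic GROUP BY": unlike a traditional GROUP BY, it allows several different partitions to appear in the final result set.
Consider the following query: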

select ename,
       deptno,
       count(*) over(partition by deptno) as cnt
    from emp 
order by 2;

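Because of the PARTITION BY DEPTNO clause, the COUNT aggregate is now computed separately for each department, giving the number of employees in that department.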
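The contrast with GROUP BY is easiest to see side by side. The sketch below again uses Python's sqlite3 module and a small made-up emp table (both are assumptions for illustration): GROUP BY collapses each department to one row, while PARTITION BY keeps every employee row and attaches the department count to it.

```python
import sqlite3

# Illustrative sample data (assumption: not the book's full EMP table).
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, deptno integer)")
conn.executemany("insert into emp values (?, ?)", [
    ("CLARK", 10), ("KING", 10), ("MILLER", 10),
    ("SMITH", 20), ("ADAMS", 20), ("ALLEN", 30),
])

# GROUP BY: one output row per department.
grouped = conn.execute("""
    select deptno, count(*) as cnt
      from emp
     group by deptno
     order by deptno
""").fetchall()

# PARTITION BY: one output row per employee, each carrying its department's count.
windowed = conn.execute("""
    select ename,
           deptno,
           count(*) over(partition by deptno) as cnt
      from emp
     order by deptno, ename
""").fetchall()

print(grouped)
print(windowed)
```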

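Compared with a traditional GROUP BY, another advantage of PARTITION BY is that a single SELECT statement can partition by different columns, and the different window function calls do not affect one another.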

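The query below lists every employee together with their department, the number of employees in that department, their job, and the number of employees company-wide who hold the same job.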

select ename,
       deptno,
       count(*) over(partition by deptno) as dept_cnt,
       job,
       count(*) over(partition by job) as job_cnt
    from emp 
order by 2;

The Effect of NULLs

Like the GROUP BY clause, the PARTITION BY clause places all NULLs into the same partition (group).
Consider the following query:

select coalesce(comm,-1) as comm,
    count(*)over(partition by comm) as cnt
from emp

If we use COUNT(COMM) instead of COUNT(*), the result changes:

select coalesce(comm,-1) as comm,
    count(comm)over(partition by comm) as cnt
from emp

Aggregate functions ignore NULL values in their argument, so COUNT(COMM) returns 0 for the partition in which COMM is NULL.

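When using COUNT, think about whether NULLs should be included. COUNT(column) ignores NULLs. If you want NULL values to be counted as well, use COUNT(*). (In that case we are counting rows rather than actual column values.)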
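The two counting styles can be compared in a single query. This is a minimal sketch using Python's sqlite3 module; the seven rows and their COMM values are made-up sample data, and the alias comm_shown is a hypothetical name chosen so that it does not clash with the real comm column.

```python
import sqlite3

# Illustrative sample data (assumption): three employees have no commission.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, comm integer)")
conn.executemany("insert into emp values (?, ?)", [
    ("ALLEN", 300), ("WARD", 500), ("MARTIN", 1400), ("TURNER", 0),
    ("SMITH", None), ("JONES", None), ("KING", None),
])

rows = conn.execute("""
    select coalesce(comm, -1)                    as comm_shown,
           count(*)    over(partition by comm)   as cnt_rows,   -- counts rows, NULLs included
           count(comm) over(partition by comm)   as cnt_values  -- counts non-NULL values only
      from emp
     order by comm_shown
""").fetchall()

for row in rows:
    print(row)
```

All three NULL commissions land in one partition: COUNT(*) reports 3 for it, while COUNT(comm) reports 0.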

Ordering

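When ORDER BY is used inside the OVER clause of a window function, we are actually deciding two things:
(1) how the rows within the partition are ordered;
(2) which rows take part in the computation.
Consider the following query, which computes a running total of the salaries of the employees whose DEPTNO is 10.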

select deptno,
       ename,
       hiredate, 
       sal,
       sum(sal) over(partition by deptno) as total1,
       sum(sal) over() as total2,
       sum(sal) over(order by hiredate) as running_total
    from emp
where deptno = 10;

This query is equivalent to the following one, which uses RANGE BETWEEN ... AND to spell out the default behavior of ORDER BY HIREDATE:

select deptno,
       ename,
       hiredate, 
       sal,
       sum(sal) over(partition by deptno) as total1,
       sum(sal) over() as total2,
       sum(sal) over(order by hiredate
                    range between unbounded preceding
                    and current row) as running_total
    from emp
where deptno = 10;

Both queries return the same result.
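The equivalence can be checked directly. The following sketch uses Python's sqlite3 module with three sample rows for department 10; the table layout, the text-formatted hire dates, and the values are assumptions made for illustration.

```python
import sqlite3

# Illustrative sample data (assumption): department 10 only, ISO-formatted dates
# so that text ordering matches chronological ordering.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, hiredate text, sal integer)")
conn.executemany("insert into emp values (?, ?, ?)", [
    ("CLARK", "1981-06-09", 2450),
    ("KING", "1981-11-17", 5000),
    ("MILLER", "1982-01-23", 1300),
])

# ORDER BY alone: the frame defaults to "everything up to the current row".
implicit = conn.execute("""
    select ename, sum(sal) over(order by hiredate) as running_total
      from emp
     order by hiredate
""").fetchall()

# The same frame written out explicitly.
explicit = conn.execute("""
    select ename,
           sum(sal) over(order by hiredate
                         range between unbounded preceding
                         and current row) as running_total
      from emp
     order by hiredate
""").fetchall()

print(implicit)
print(explicit)
```

Each row's running_total is the sum of its own salary and the salaries of everyone hired earlier: 2450, then 2450 + 5000, then 2450 + 5000 + 1300.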
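The RANGE BETWEEN clause in the query above is called the framing clause in the ANSI standard. The framing clause defines a changing "sub-window" of the data, and the aggregation is computed over that sub-window.
For example, consider the following query: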

select deptno,
       ename,
       hiredate, 
       sal,
       sum(sal) over(order by hiredate
                    range between unbounded preceding
                    and current row) as run_total1,
       sum(sal) over(order by hiredate
                    range between 1 preceding
                    and current row) as run_total2,
       sum(sal) over(order by hiredate
                    range between current row
                    and unbounded following) as run_total3,
       sum(sal) over(order by hiredate
                    range between current row
                    and 1 following) as run_total4
    from emp
where deptno = 10;
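To see how each frame changes the rows being summed, here is a small sketch in Python's sqlite3 module. Note one deliberate swap: the query above uses RANGE offsets on a date column (1 PRECEDING, 1 FOLLOWING), whereas this sketch uses ROWS offsets, which count physical rows rather than date intervals, because the sample dates are stored as text. The sample rows and column aliases are assumptions for illustration.

```python
import sqlite3

# Illustrative sample data (assumption): department 10 only, ISO-formatted dates.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (ename text, hiredate text, sal integer)")
conn.executemany("insert into emp values (?, ?, ?)", [
    ("CLARK", "1981-06-09", 2450),
    ("KING", "1981-11-17", 5000),
    ("MILLER", "1982-01-23", 1300),
])

frames = conn.execute("""
    select ename,
           -- everything up to and including the current row
           sum(sal) over(order by hiredate
                         range between unbounded preceding
                         and current row)         as up_to_here,
           -- the previous row plus the current row (ROWS, not RANGE)
           sum(sal) over(order by hiredate
                         rows between 1 preceding
                         and current row)         as prev_and_cur,
           -- the current row and everything after it
           sum(sal) over(order by hiredate
                         range between current row
                         and unbounded following) as from_here_on,
           -- the current row plus the next row (ROWS, not RANGE)
           sum(sal) over(order by hiredate
                         rows between current row
                         and 1 following)         as cur_and_next
      from emp
     order by hiredate
""").fetchall()

for row in frames:
    print(row)
```

For KING, the middle row, the four frames cover CLARK and KING, CLARK and KING, KING and MILLER, and KING and MILLER respectively, so the sums are 7450, 7450, 6300, and 6300.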

select 
    ename,
    sal,
    min(sal)over(order by sal) min1,
    max(sal)over(order by sal) max1,
    min(sal)over(order by sal
                range between unbounded preceding
                and unbounded following) min2,
    max(sal)over(order by sal
                range between unbounded preceding
                and unbounded following) max2,
    min(sal)over(order by sal
                range between current row
                and current row) min3,
    max(sal)over(order by sal
                range between current row
                and current row) max3,
    max(sal)over(order by sal
                rows between 3 preceding
                and 3 following) max4
 from emp;

SQL Cookbook, Appendix A