SQLCookbook中文版:第12章 报表和数据仓库运算

本章将介绍几种查询,它们对创建报表非常有帮助。其代表性的内容有针对报表格式的 一些因素以及不同层的聚集。

本章的另一个重点是转置变换结果集、把行转换为列。转置变换对解决许多问题是一种非常有用的技巧。

经转置变换会使人感觉更舒服,除本章所介绍的之外,无疑还有其他用途。

12.1 将结果集转置为一行

每个部门的员工数目

select deptno, count(*) as cnt from emp group by deptno
deptno cnt
10 3
20 5
30 6

python 实现:

import mysql.connector
cnx = mysql.connector.connect(user='root', password='',
                              host='127.0.0.1',
                              database='sqlcookbook')
cursor = cnx.cursor()

sql = 'select deptno, count(*) as cnt from emp group by deptno;'
cursor.execute(sql)
results = cursor.fetchall()

cursor.close()
cnx.close()

depts = []
cnts = []
for r in results:
    depts.append(str(r[0]))
    cnts.append(str(r[1]))

for d in depts:
    print("{:9}".format('deptno_' + d),end=' ')

print('\t')

for i in range(len(depts)):
    print("{:9}".format('---------'),end=' ')

print('\t')

for c in cnts:
    print("{:^9}".format(c),end=' ')

输出:

deptno_10 deptno_20 deptno_30   
--------- --------- ---------   
    3         5         6     

12.2 把结果集转置为多行

select job,ename from emp order by job
job ename
ANALYST SCOTT
ANALYST FORD
CLERK JAMES
CLERK ADAMS
CLERK SMITH
CLERK MILLER
MANAGER BLAKE
MANAGER CLARK
MANAGER JONES
PRESIDENT KING
SALESMAN MARTIN
SALESMAN TURNER
SALESMAN WARD
SALESMAN ALLEN

希望将各job作为列名,返回数据. python 实现

import mysql.connector
cnx = mysql.connector.connect(user='root', password='',
                              host='127.0.0.1',
                              database='sqlcookbook')
cursor = cnx.cursor()

sql = 'select job,ename from emp order by job;'
cursor.execute(sql)
results = cursor.fetchall()

cursor.close()
cnx.close()

dict_job_ename = dict()

for r in results:
    if r[0] in dict_job_ename:
        temp = dict_job_ename[r[0]]
        temp.append(r[1])
        dict_job_ename[r[0]] = temp
    else:
        dict_job_ename[r[0]] = [r[1]]

# max ename len, num of lines
max_ename_len = 0
for k in dict_job_ename.keys():
    if len(dict_job_ename[k]) > max_ename_len:
        max_ename_len = len(dict_job_ename[k])

# print 
keys = list(dict_job_ename.keys())

for k in keys:
    print("{:12}".format(k),end=' ')
print('\t')

for k in keys:
    print("{:12}".format('------------'),end=' ')
print('\t')

for i in range(max_ename_len): 
    for k in keys:
        try:
            print("{:12}".format(dict_job_ename[k][i]),end=' ')
        except:
            print('            ',end= ' ')
    print('\t')
ANALYST      CLERK        MANAGER      PRESIDENT    SALESMAN
------------ ------------ ------------ ------------ ------------
SCOTT        JAMES        BLAKE        KING         MARTIN
FORD         ADAMS        CLARK                     TURNER
             SMITH        JONES                     WARD
             MILLER                                 ALLEN

12.3 反向转置结果集

(略)

12.4 将结果集反向转置为一列

(略)

12.5 抑制结果集中的重复值

select deptno,ename from emp order by deptno;
deptno ename
10 MILLER
10 KING
10 CLARK
20 FORD
20 ADAMS
20 SCOTT
20 JONES
20 SMITH
30 BLAKE
30 MARTIN
30 TURNER
30 WARD
30 JAMES
30 ALLEN

希望 deptno 只显示一次, python实现:

import mysql.connector
cnx = mysql.connector.connect(user='root', password='',
                              host='127.0.0.1',
                              database='sqlcookbook')
cursor = cnx.cursor()

sql = 'select deptno,ename from emp order by deptno;'
cursor.execute(sql)
results = cursor.fetchall()

cursor.close()
cnx.close()

# print 
print("{:12}".format('deptno'),end=' ')
print("{:12}".format('ename'),end=' ')
print('\t')

print("{:12}".format('-'*12),end=' ')
print("{:12}".format('-'*12),end=' ')
print('\t')

printed_deptno = []

for r in results:
    if r[0] in printed_deptno:
        print("{:12}".format(''),end=' ')
    else:
        print("{:12}".format(r[0]),end=' ')
        printed_deptno.append(r[0])

    print("{:12}".format(r[1]),end=' ')
    print('\t')
deptno       ename
------------ ------------
          10 MILLER
             KING
             CLARK
          20 FORD
             ADAMS
             SCOTT
             JONES
             SMITH
          30 BLAKE
             MARTIN
             TURNER
             WARD
             JAMES
             ALLEN

12.6 转置结果集以利于跨行计算

(略)

12.7 创建固定大小的数据桶

每个桶的数据量大小固定

(略)

12.8 创建预定数目的桶

指定数据桶的数量

(略)

12.9 创建横向直方图

select deptno,count(*) as cnt from emp group by deptno order by deptno;
deptno cnt
10 3
20 5
30 6

python 实现

import mysql.connector
cnx = mysql.connector.connect(user='root', password='',
                              host='127.0.0.1',
                              database='sqlcookbook')
cursor = cnx.cursor()

sql = 'select deptno,count(*) as cnt from emp group by deptno order by deptno;'
cursor.execute(sql)
results = cursor.fetchall()

cursor.close()
cnx.close()

# print 
print("{:12}".format('deptno'),end=' ')
print("{:12}".format('cnt'),end=' ')
print('\t')

print("{:12}".format('-'*12),end=' ')
print("{:12}".format('-'*12),end=' ')
print('\t')

for r in results:
    print("{:12}".format(r[0]),end=' ')
    print("{:12}".format(r[1]*'*'),end=' ')
    print('\t')
deptno       cnt
------------ ------------
          10 ***
          20 *****
          30 ******

12.9 创建纵向直方图

python实现:

import mysql.connector
cnx = mysql.connector.connect(user='root', password='',
                              host='127.0.0.1',
                              database='sqlcookbook')
cursor = cnx.cursor()

sql = 'select deptno,count(*) as cnt from emp group by deptno order by deptno;'
cursor.execute(sql)
results = cursor.fetchall()

cursor.close()
cnx.close()

# print head
for r in results:
    print("{:^12}".format('dept' + str(r[0])),end=' ')
print('\t')

for r in results:
    print("{:12}".format('-'*12),end=' ')
print('\t')

# num of lines
num_of_lines = max([r[1] for r in results])

# print *
for i in range(num_of_lines):
    for r in results:
        if r[1] >=(num_of_lines-i):
            print("{:^12}".format('*'),end=' ')
        else:
            print("{:^12}".format(''),end=' ')
    print('\t')
   dept10       dept20       dept30
------------ ------------ ------------
                               *
                  *            *
                  *            *
     *            *            *
     *            *            *
     *            *            *

12.11 返回未包含在GROUP BY 中的列

(略)

12.12 计算简单的小计

MySQL: with rollup

select JOB, sum(sal) as SAL from emp  group by job  with rollup
JOB SAL
ANALYST 6000.00
CLERK 4150.00
MANAGER 8275.00
PRESIDENT 5000.00
SALESMAN 5600.00
NULL 29025.00

12.13 计算所有表达式组合的小计

MySQL,PG: group + union all

select deptno,job,'TOTAL_BY_DEPT_JOB' as category,sum(sal) as sal 
from emp group by deptno,job
union all
select null,job,'TOTAL_BY_JOB' as category,sum(sal)  
from emp group by job
union all
select null,deptno,'TOTAL_BY_DEPT' as category,sum(sal)  
from emp group by deptno
union all
select null,null,'TOTAL' as category,sum(sal)  
from emp 
deptno job category sal
10 CLERK TOTAL_BY_DEPT_JOB 1300.00
10 MANAGER TOTAL_BY_DEPT_JOB 2450.00
10 PRESIDENT TOTAL_BY_DEPT_JOB 5000.00
20 ANALYST TOTAL_BY_DEPT_JOB 6000.00
20 CLERK TOTAL_BY_DEPT_JOB 1900.00
20 MANAGER TOTAL_BY_DEPT_JOB 2975.00
30 CLERK TOTAL_BY_DEPT_JOB 950.00
30 MANAGER TOTAL_BY_DEPT_JOB 2850.00
30 SALESMAN TOTAL_BY_DEPT_JOB 5600.00
NULL ANALYST TOTAL_BY_JOB 6000.00
NULL CLERK TOTAL_BY_JOB 4150.00
NULL MANAGER TOTAL_BY_JOB 8275.00
NULL PRESIDENT TOTAL_BY_JOB 5000.00
NULL SALESMAN TOTAL_BY_JOB 5600.00
NULL 10 TOTAL_BY_DEPT 8750.00
NULL 20 TOTAL_BY_DEPT 10875.00
NULL 30 TOTAL_BY_DEPT 9400.00
NULL NULL TOTAL 29025.00

12.14 判别非小计的行

(略)

12.15 使用case表达式给行做标记

使用case的前提是知道取值列表,并没有python脚本灵活.

select ename,job, 
    case when job='CLERK' then 1 else 0 
    end as is_clerk,
    case when job='SALESMAN' then 1 else 0 
    end as is_salesman,
    case when job='MANAGER' then 1 else 0 
    end as is_manager,
    case when job='ANALYST' then 1 else 0 
    end as is_analyst,
    case when job='PRESIDENT' then 1 else 0 
    end as is_president
from emp order by job
ename   job is_clerk    is_salesman is_manager  is_analyst  is_president
SCOTT   ANALYST     0   0   0   1   0
FORD    ANALYST     0   0   0   1   0
JAMES   CLERK       1   0   0   0   0
ADAMS   CLERK       1   0   0   0   0
SMITH   CLERK       1   0   0   0   0
MILLER  CLERK       1   0   0   0   0
BLAKE   MANAGER     0   0   1   0   0
CLARK   MANAGER     0   0   1   0   0
JONES   MANAGER     0   0   1   0   0
KING    PRESIDENT   0   0   0   0   1
MARTIN  SALESMAN    0   1   0   0   0
TURNER  SALESMAN    0   1   0   0   0
WARD    SALESMAN    0   1   0   0   0
ALLEN   SALESMAN    0   1   0   0   0

12.16 创建稀疏矩阵

(略)

12.17 按时间单位给行分组

(略)

12.18 对不同组/分区同时实现聚集

(略)

12.19 对移动范围的值进行聚集

(略)

12.20 转置带小计的结果集

(略)

第13章 分层查询

本章介绍如何表示数据的层次关系。当处理分层数据时 ,检索和显示这些数据比存储它们更难。由于 SQL 不太灵活 (SQL 的非递归性质) ,这种现象更为突出。当处理分层查询时,绝对要利用RDBMS 提供的方法实现这些操作,否则可能会被为处理分层数据而编写低效的查询和构建令人费解的数据模型搞得焦头烂额。

本章将利用每个RDBMS 提供的函数, 介绍如何拆开数据的层次结构。

(略)

第14章 若干另类目标

(略)

附录A:窗口函数补充

(略)

附录B:回顾Rozenshtein

(略)

© Licensed under CC BY-NC-SA 4.0

通往地狱的路,都是由善意铺成的——哈耶克

发表我的评论
取消评论
表情

Hi,您需要填写昵称和邮箱!