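A Common Table Expression (CTE) is an important feature introduced in MySQL 8.0. The WITH clause defines a temporary named result set that can be referenced several times within a single statement. Compared with subqueries, CTEs are easier to read and maintain, especially in complex queries.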
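Note: a CTE exists only for the duration of the statement that defines it. Unlike a temporary table, it is not kept for the rest of the session, although MySQL may hold a materialized CTE in an internal temporary table while the statement runs.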
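The core value of a CTE is readability, maintainability, and the ability to reuse one named intermediate result several times in the same query.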
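On performance: the optimizer either merges a CTE into the outer query, in which case it costs no more than the equivalent subquery, or materializes it into an internal temporary table. Complex recursive CTEs deserve particular attention to query efficiency.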
The basic syntax of a non-recursive CTE is:
```sql
WITH cte_name AS (
    SELECT column1, column2...
    FROM table_name
    WHERE conditions...
)
SELECT * FROM cte_name;
```
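This structure is well suited to cases where the same subquery result is needed more than once. In sales analysis, for example, we might first compute each product's total sales and then use that intermediate result in several places.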
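The product-totals scenario just described can be sketched as follows. This is a minimal illustration rather than a production query: the inline `sales` rows and the names `product_totals` and `pct_of_all` are hypothetical, chosen so the statement runs on MySQL 8.0 without any existing tables.

```sql
-- Hypothetical inline rows standing in for a real sales table.
WITH sales (product, amount) AS (
    SELECT 'A', 100 UNION ALL
    SELECT 'A', 50  UNION ALL
    SELECT 'B', 30  UNION ALL
    SELECT 'C', 20
),
product_totals AS (
    -- Intermediate result: one row per product.
    SELECT product, SUM(amount) AS total
    FROM sales
    GROUP BY product
)
SELECT
    t.product,
    t.total,
    -- Second reference to product_totals: the grand total across all products.
    ROUND(100 * t.total / (SELECT SUM(total) FROM product_totals), 1) AS pct_of_all
FROM product_totals t
ORDER BY t.total DESC;
-- A | 150 | 75.0
-- B |  30 | 15.0
-- C |  20 | 10.0
```

Without the CTE, the aggregation would have to be written out twice, once in the FROM clause and once in the scalar subquery.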
```sql
WITH dept_stats AS (
    SELECT
        department_id,
        AVG(salary) AS avg_salary,
        MAX(salary) AS max_salary,
        MIN(salary) AS min_salary
    FROM employees
    GROUP BY department_id
)
SELECT
    e.employee_id,
    e.name,
    e.salary,
    d.avg_salary,
    CASE
        WHEN e.salary > d.avg_salary THEN 'Above average'
        ELSE 'At or below average'
    END AS salary_status
FROM employees e
JOIN dept_stats d ON e.department_id = d.department_id;
```
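This query first computes salary statistics for each department, then compares each employee's salary with the department average in the main query. The CTE keeps the logic clear and avoids restating the department-average calculation.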
```sql
WITH sales_summary AS (
    SELECT
        salesperson_id,
        SUM(amount) AS total_sales,
        COUNT(*) AS transaction_count
    FROM sales
    WHERE sale_date BETWEEN '2023-01-01' AND '2023-12-31'
    GROUP BY salesperson_id
),
ranked_sales AS (
    SELECT
        salesperson_id,
        total_sales,
        transaction_count,
        RANK() OVER (ORDER BY total_sales DESC) AS sales_rank,
        DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_sales_rank
    FROM sales_summary
)
SELECT * FROM ranked_sales WHERE sales_rank <= 10;
```
This example chains CTEs: the first computes the sales summary, the second ranks it, and the final query keeps the salespeople ranked in the top 10.
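The two ranking columns in that query differ only when totals tie. The sketch below makes the difference visible; the three inline rows in `totals` are made-up values, not data from the article.

```sql
-- Hypothetical totals with a tie between salespeople 1 and 2.
WITH totals (salesperson_id, total_sales) AS (
    SELECT 1, 500 UNION ALL
    SELECT 2, 500 UNION ALL
    SELECT 3, 300
)
SELECT
    salesperson_id,
    total_sales,
    RANK()       OVER (ORDER BY total_sales DESC) AS sales_rank,
    DENSE_RANK() OVER (ORDER BY total_sales DESC) AS dense_sales_rank
FROM totals
ORDER BY total_sales DESC, salesperson_id;
-- 1 | 500 | 1 | 1
-- 2 | 500 | 1 | 1
-- 3 | 300 | 3 | 2   (RANK skips 2 after the tie, DENSE_RANK does not)
```

Because tied rows share a rank, a filter such as `sales_rank <= 10` can return more than ten rows.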
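To make MySQL materialize an intermediate result, use the NO_MERGE optimizer hint; MySQL has no MATERIALIZED keyword for CTEs.
A recursive CTE is built from three key parts: an initial (non-recursive) query, a recursive query that references the CTE itself, and a condition that ends the recursion.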
Basic syntax:
```sql
WITH RECURSIVE cte_name AS (
    -- Initial query (non-recursive part)
    SELECT initial_columns
    FROM initial_table
    WHERE initial_conditions
    UNION [ALL]
    -- Recursive query part
    SELECT recursive_columns
    FROM cte_name
    JOIN some_table ON join_conditions
    WHERE recursive_conditions
)
SELECT * FROM cte_name;
```
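As a concrete instance of this skeleton, here is about the smallest recursive CTE that runs on its own: a counter from 1 to 5. The name `counter` and the bound of 5 are arbitrary choices for illustration.

```sql
WITH RECURSIVE counter (n) AS (
    -- Initial query: a single seed row.
    SELECT 1
    UNION ALL
    -- Recursive query: reads the rows produced by the previous step.
    SELECT n + 1
    FROM counter
    WHERE n < 5          -- termination condition
)
SELECT n FROM counter;
-- 1
-- 2
-- 3
-- 4
-- 5
```

The recursion stops when the recursive query returns no rows, which here happens once n reaches 5.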
Suppose we have a departments table with a five-level hierarchy:
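```sql
WITH RECURSIVE org_hierarchy AS (
    -- Initial query: top-level departments
    SELECT
        id,
        name,
        parent_id,
        1 AS level,
        -- The initial query fixes each column's type, so widen path up front;
        -- otherwise longer concatenated paths may not fit.
        CAST(name AS CHAR(1000)) AS path
    FROM departments
    WHERE parent_id IS NULL
    UNION ALL
    -- Recursive query: child departments
    SELECT
        d.id,
        d.name,
        d.parent_id,
        h.level + 1,
        CONCAT(h.path, ' > ', d.name) AS path
    FROM departments d
    JOIN org_hierarchy h ON d.parent_id = h.id
    WHERE h.level < 10 -- safety guard against infinite recursion
)
SELECT * FROM org_hierarchy
ORDER BY path;
```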
Besides showing each department's level, the query builds a full path string in the path column, such as "Headquarters > Engineering > Backend Development Team".
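To try the path-building idea without an existing table, the sketch below supplies three hypothetical departments inline. It also widens the path column with `CAST(... AS CHAR(200))`, because MySQL takes a recursive CTE's column types from the initial query alone; the width of 200 is an assumption for this sample data.

```sql
WITH RECURSIVE
departments (id, name, parent_id) AS (
    -- Hypothetical sample rows, not the article's data.
    SELECT 1, 'Headquarters', NULL UNION ALL
    SELECT 2, 'Engineering', 1 UNION ALL
    SELECT 3, 'Backend Development Team', 2
),
org_hierarchy (id, level, path) AS (
    -- Initial query: the root department, with path widened up front.
    SELECT id, 1, CAST(name AS CHAR(200))
    FROM departments
    WHERE parent_id IS NULL
    UNION ALL
    -- Recursive query: append each child's name to its parent's path.
    SELECT d.id, h.level + 1, CONCAT(h.path, ' > ', d.name)
    FROM departments d
    JOIN org_hierarchy h ON d.parent_id = h.id
)
SELECT level, path
FROM org_hierarchy
ORDER BY path;
-- 1 | Headquarters
-- 2 | Headquarters > Engineering
-- 3 | Headquarters > Engineering > Backend Development Team
```

Sorting by path lists every parent directly above its descendants, which is what makes the output read like a tree.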
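```sql
WITH RECURSIVE bom_explosion AS (
    -- Initial query: direct components of the top-level assembly
    SELECT
        component_id,
        parent_id,
        quantity AS total_quantity, -- column names come from the initial query
        1 AS level
    FROM bom
    WHERE parent_id = 'TOP-ASSEMBLY-001'
    UNION ALL
    -- Recursive query: expand lower-level components
    SELECT
        b.component_id,
        b.parent_id,
        b.quantity * be.total_quantity, -- per-parent quantity times the parent's cumulative quantity
        be.level + 1
    FROM bom b
    JOIN bom_explosion be ON b.parent_id = be.component_id
)
SELECT * FROM bom_explosion;
```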
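This query computes the cumulative quantity of every component in a multi-level bill of materials, which suits material requirements calculations in manufacturing.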
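The arithmetic is easier to check on a tiny example. The sketch below uses a hypothetical three-row `bom` (a bike with 2 wheels, each wheel with 32 spokes and 1 tire); the part names and quantities are invented for illustration.

```sql
WITH RECURSIVE
bom (parent_id, component_id, quantity) AS (
    -- Hypothetical bill of materials.
    SELECT 'BIKE',  'WHEEL', 2  UNION ALL
    SELECT 'WHEEL', 'SPOKE', 32 UNION ALL
    SELECT 'WHEEL', 'TIRE',  1
),
bom_explosion (component_id, total_quantity, level) AS (
    -- Initial query: direct components of the top assembly.
    SELECT component_id, quantity, 1
    FROM bom
    WHERE parent_id = 'BIKE'
    UNION ALL
    -- Recursive query: multiply by the parent's cumulative quantity.
    SELECT b.component_id, b.quantity * be.total_quantity, be.level + 1
    FROM bom b
    JOIN bom_explosion be ON b.parent_id = be.component_id
)
SELECT component_id, total_quantity, level
FROM bom_explosion
ORDER BY level, component_id;
-- WHEEL |  2 | 1
-- SPOKE | 64 | 2   (32 per wheel x 2 wheels)
-- TIRE  |  2 | 2
```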
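Adjust the cte_max_recursion_depth system variable (1000 by default) to raise or lower the recursion limit.
A single query can define multiple CTEs, each of which may reference those defined before it: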
```sql
WITH
sales_data AS (
    SELECT product_id, SUM(quantity) AS total_quantity
    FROM sales
    GROUP BY product_id
),
inventory_status AS (
    SELECT
        p.product_id,
        p.product_name,
        p.stock_quantity,
        sd.total_quantity,
        p.stock_quantity - sd.total_quantity AS remaining
    FROM products p
    JOIN sales_data sd ON p.product_id = sd.product_id
),
reorder_list AS (
    SELECT *
    FROM inventory_status
    WHERE remaining < (SELECT AVG(total_quantity) FROM sales_data) * 0.3
)
SELECT
    product_id,
    product_name,
    remaining,
    CASE
        WHEN remaining < 0 THEN 'Out of stock'
        ELSE 'Reorder needed'
    END AS status
FROM reorder_list;
```
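Through three CTEs, this query works step by step from sales data to inventory status and finally to the list of products that need restocking.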
```sql
WITH monthly_sales AS (
    SELECT
        salesperson_id,
        DATE_FORMAT(sale_date, '%Y-%m') AS month,
        SUM(amount) AS monthly_amount
    FROM sales
    GROUP BY salesperson_id, month
),
sales_stats AS (
    SELECT
        salesperson_id,
        month,
        monthly_amount,
        SUM(monthly_amount) OVER (PARTITION BY salesperson_id ORDER BY month) AS cumulative_amount,
        monthly_amount - LAG(monthly_amount, 1) OVER (PARTITION BY salesperson_id ORDER BY month) AS monthly_change
    FROM monthly_sales
)
SELECT * FROM sales_stats
WHERE monthly_change IS NOT NULL
ORDER BY salesperson_id, month;
```
This query combines CTEs with window functions to compute each salesperson's monthly sales, cumulative sales, and month-over-month change.
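A reduced version shows what the two window expressions produce. It is a sketch for a single salesperson with three hypothetical months; the column name `sale_month` and the named window `w` are choices made here, not part of the original query.

```sql
-- Hypothetical monthly totals for one salesperson.
WITH monthly_sales (sale_month, monthly_amount) AS (
    SELECT '2023-01', 100 UNION ALL
    SELECT '2023-02', 150 UNION ALL
    SELECT '2023-03', 120
)
SELECT
    sale_month,
    monthly_amount,
    -- Running total: with ORDER BY, the default frame ends at the current row.
    SUM(monthly_amount) OVER w AS cumulative_amount,
    -- Difference from the previous month; NULL when there is no previous row.
    monthly_amount - LAG(monthly_amount, 1) OVER w AS monthly_change
FROM monthly_sales
WINDOW w AS (ORDER BY sale_month)
ORDER BY sale_month;
-- 2023-01 | 100 | 100 | NULL
-- 2023-02 | 150 | 250 |   50
-- 2023-03 | 120 | 370 |  -30
```

The NULL in the first row is why the full query filters on `monthly_change IS NOT NULL`: the first month of each salesperson has nothing to compare against.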
```sql
WITH raw_data AS (
    SELECT
        id,
        TRIM(name) AS cleaned_name,
        CASE
            WHEN email REGEXP '^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,4}$' THEN email
            ELSE NULL
        END AS valid_email,
        CAST(REGEXP_REPLACE(phone, '[^0-9]', '') AS UNSIGNED) AS numeric_phone
    FROM customer_input
),
duplicate_check AS (
    SELECT
        cleaned_name,
        numeric_phone,
        COUNT(*) AS dup_count
    FROM raw_data
    GROUP BY cleaned_name, numeric_phone
    HAVING COUNT(*) > 1
)
SELECT
    r.*,
    IF(d.dup_count IS NULL, 0, 1) AS is_duplicate
FROM raw_data r
LEFT JOIN duplicate_check d ON r.cleaned_name = d.cleaned_name AND r.numeric_phone = d.numeric_phone;
```
This example uses CTEs for data cleaning, formatting, and duplicate detection.
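The same clean-then-flag pattern can be tried on a few inline rows. In this sketch the three `customer_input` rows are hypothetical, the email check is left out for brevity, and the phone number is kept as a digit string (`digits`) instead of being cast to a number, which is a deliberate departure from the query above.

```sql
WITH customer_input (id, name, phone) AS (
    -- Hypothetical raw input: rows 1 and 2 are the same customer.
    SELECT 1, '  Alice ', '555-0101'   UNION ALL
    SELECT 2, 'Alice',    '(555) 0101' UNION ALL
    SELECT 3, 'Bob',      '555 0199'
),
cleaned AS (
    SELECT
        id,
        TRIM(name) AS cleaned_name,
        REGEXP_REPLACE(phone, '[^0-9]', '') AS digits   -- strip every non-digit
    FROM customer_input
),
duplicate_check AS (
    -- Name/phone pairs that occur more than once after cleaning.
    SELECT cleaned_name, digits
    FROM cleaned
    GROUP BY cleaned_name, digits
    HAVING COUNT(*) > 1
)
SELECT
    c.id,
    c.cleaned_name,
    c.digits,
    d.cleaned_name IS NOT NULL AS is_duplicate
FROM cleaned c
LEFT JOIN duplicate_check d
       ON c.cleaned_name = d.cleaned_name AND c.digits = d.digits
ORDER BY c.id;
-- 1 | Alice | 5550101 | 1
-- 2 | Alice | 5550101 | 1
-- 3 | Bob   | 5550199 | 0
```

Keeping the phone as a string preserves leading zeros, which a cast to UNSIGNED would drop.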
Use EXPLAIN to analyze the execution plan of a CTE query:
```sql
EXPLAIN WITH my_cte AS (...)
SELECT * FROM my_cte;
```
Pay particular attention to whether each CTE is merged into the outer query or materialized.
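MySQL 8.0 lets you control whether a CTE is merged or materialized with the MERGE and NO_MERGE optimizer hints:
```sql
WITH
    cte1 AS (SELECT ...),
    cte2 AS (SELECT ...)
SELECT /*+ MERGE(cte1) NO_MERGE(cte2) */ ...;
```
MERGE: merge the CTE into the main query (the default where merging is possible)
NO_MERGE: force the CTE result to be materialized
Give CTEs descriptive names such as sales_summary rather than t1.
Problem 1: A recursive CTE puts the server under heavy load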
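One safeguard against runaway recursion is the cte_max_recursion_depth system variable mentioned earlier: MySQL aborts a recursive CTE with an error once it recurses more levels than this value. The sketch below lowers it for the current session so a faulty query fails fast; the limit of 100 and the `counter` CTE are illustrative assumptions, not recommended settings.

```sql
-- Lower the limit for this session only.
SET SESSION cte_max_recursion_depth = 100;

WITH RECURSIVE counter (n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1
    FROM counter
    WHERE n < 50         -- finishes well inside the limit of 100
)
SELECT MAX(n) AS deepest FROM counter;
-- 50

-- Reset the session value to the current global setting.
SET SESSION cte_max_recursion_depth = DEFAULT;
```

If the WHERE condition were missing or wrong, the statement would stop with an error at the limit instead of running on and consuming server resources.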
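Use a LIMIT clause to cap the number of rows returned (the recursive part of a CTE accepts LIMIT as of MySQL 8.0.19)
Problem 2: CTE query performance drops suddenly
Problem 3: A recursive CTE returns duplicate rows
Check whether the query uses UNION ALL rather than UNION; UNION removes duplicate rows
In real projects, CTEs are particularly useful for report queries, data analysis pipelines, and complex business logic. Used well, they make SQL code markedly more readable and maintainable, but recursion depth and result-set size need to be kept under control to avoid performance problems.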