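MySQL is one of the most widely used relational databases in backend development, and how well you query it largely determines how efficiently you can process data. In my projects I have found that roughly 80% of database work relies on about 20% of the basic query syntax. This article walks through that core syntax and shares the query techniques I have found most effective in real development.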
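Many beginners assume that a SQL statement is evaluated in the order it is written, which can lead to both wrong results and performance problems. Logically, MySQL processes the clauses of a query in this order:
```text
FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
```
This order explains why, for example, a column alias defined in SELECT cannot be referenced in WHERE but can be used in ORDER BY, and why conditions on aggregates belong in HAVING rather than WHERE.
Tip: Understanding this order is the foundation for writing efficient SQL, and it helps you avoid many performance traps in complex queries.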
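To make the effect of this order concrete, here is a minimal sketch. The table `demo_orders` and its rows are hypothetical and exist only for illustration.

```sql
-- Hypothetical demo table, not part of the article's schema
CREATE TABLE demo_orders (
    id INT PRIMARY KEY,
    user_id INT NOT NULL,
    amount DECIMAL(10, 2) NOT NULL
);
INSERT INTO demo_orders (id, user_id, amount) VALUES
    (1, 1, 30.00), (2, 1, 90.00), (3, 2, 20.00);

-- Fails with an "Unknown column 'total'" error:
-- WHERE is evaluated before SELECT, so the alias does not exist yet.
-- SELECT user_id, SUM(amount) AS total FROM demo_orders WHERE total > 100 GROUP BY user_id;

-- Works: HAVING filters the groups after aggregation,
-- and ORDER BY runs after SELECT, so it can see the alias.
SELECT user_id, SUM(amount) AS total
FROM demo_orders
GROUP BY user_id
HAVING SUM(amount) > 100
ORDER BY total DESC;
```

With the three sample rows, only user 1 (total 120.00) passes the HAVING filter.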
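Basic queries look simple, but a few optimizations matter in practice:
```sql
-- Not recommended: fetches every column, including ones you do not need
SELECT * FROM users;
-- Recommended: list the columns explicitly
SELECT id, username, email FROM users;
```
Why is an explicit column list better than *? It transfers less data, it lets MySQL answer the query from a covering index when one exists, and it keeps the result stable when columns are later added to the table.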
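When to use table aliases:
```sql
-- Multi-table joins: qualify columns with an alias
-- (required whenever a column name exists in more than one table)
SELECT u.name, o.amount
FROM users u
JOIN orders o ON u.id = o.user_id;
-- Single-table queries: the alias can be omitted
SELECT name FROM users;
```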
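Whether a WHERE condition can use an index matters far more than where it appears in the clause. The optimizer is free to reorder AND-ed conditions, so the difference between the two queries below comes from the predicates themselves:
```sql
-- Good: both conditions can use an index (assuming one exists on status and create_time)
SELECT * FROM users
WHERE status = 1 AND create_time > '2023-01-01';
-- Poor: the leading wildcard rules out an index range scan on email
SELECT * FROM users
WHERE email LIKE '%@gmail.com' AND status = 1;
```
Measured case: on a user table with millions of rows, the optimized query dropped from 2.3s to 0.02s.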
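Comparisons with NULL are among the easiest things to get wrong in SQL:
```sql
-- Wrong: raises no error, but never matches any row
SELECT * FROM users WHERE phone = NULL;
-- Correct
SELECT * FROM users WHERE phone IS NULL;
```
NULL is special because it stands for an unknown value: an ordinary comparison with NULL evaluates to NULL rather than true or false, and aggregate functions such as COUNT(column) skip NULL values.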
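A few NULL behaviors can be checked directly in the MySQL client. The sketch below uses literal values, plus the article's `users` table with its `phone` column for the last query.

```sql
-- Ordinary comparisons with NULL yield NULL (unknown), never true
SELECT NULL = NULL;         -- NULL
SELECT NULL <> 1;           -- NULL

-- IS NULL and MySQL's NULL-safe equality operator <=> return 1 or 0
SELECT NULL IS NULL;        -- 1
SELECT NULL <=> NULL;       -- 1

-- NOT IN can never be true when the list contains NULL
SELECT 1 NOT IN (2, NULL);  -- NULL

-- COUNT(column) skips NULLs, COUNT(*) does not
SELECT COUNT(*) AS all_rows, COUNT(phone) AS rows_with_phone
FROM users;
```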
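The common way to paginate has a serious performance problem:
```sql
-- Inefficient: very slow at large offsets, because MySQL reads and discards the skipped rows
SELECT * FROM users LIMIT 10000, 20;
```
Optimization 1: find the starting id with a covering index scan
```sql
SELECT * FROM users
WHERE id >= (SELECT id FROM users ORDER BY id LIMIT 10000, 1)
ORDER BY id
LIMIT 20;
```
Optimization 2: remember the last id of the previous page
```sql
SELECT * FROM users
WHERE id > ?  -- id of the last row shown on the previous page
ORDER BY id
LIMIT 20;
```
Measured data: once the offset reaches 100,000, the optimized versions are more than 100 times faster than the traditional one.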
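Remembering only the last id works when pages are sorted by id. When the sort column is not unique, such as `create_time`, the cursor needs a tie-breaker. The following is a sketch under assumptions: the user variables stand in for values taken from the last row of the previous page, and an index on `(create_time, id)` is assumed to exist.

```sql
-- Values from the last row of the previous page (illustrative)
SET @last_time = '2023-06-01 12:00:00';
SET @last_id = 1042;

-- Written as OR/AND rather than a row comparison such as
-- (create_time, id) < (@last_time, @last_id), which the MySQL
-- optimizer may not turn into an index range scan
SELECT id, username, create_time
FROM users
WHERE create_time < @last_time
   OR (create_time = @last_time AND id < @last_id)
ORDER BY create_time DESC, id DESC
LIMIT 20;
```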
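The implicit sorting of GROUP BY is often overlooked. In MySQL 5.7 and earlier, GROUP BY also sorts the result by the grouping columns; MySQL 8.0 removed this behavior:
```sql
-- MySQL 5.7 and earlier: rows are implicitly sorted by city (extra cost)
SELECT city, COUNT(*) FROM users GROUP BY city;
-- MySQL 5.7 and earlier: state explicitly that no sorting is needed
-- (unnecessary in MySQL 8.0+, where GROUP BY no longer sorts implicitly)
SELECT city, COUNT(*) FROM users
GROUP BY city
ORDER BY NULL;
```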
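Another common mistake is selecting a non-aggregated column that is not in the GROUP BY clause:
```sql
-- Wrong: rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5);
-- otherwise name comes from an arbitrary row of each group
SELECT name, city, COUNT(*) FROM users GROUP BY city;
-- Correct
SELECT city, COUNT(*) FROM users GROUP BY city;
```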
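If a name per city really is needed, say explicitly which one. The sketch below shows two options using the article's `users` table; the column aliases are illustrative.

```sql
-- Check whether ONLY_FULL_GROUP_BY is active in this session
SELECT @@SESSION.sql_mode;

-- Option 1: pick a well-defined value per group with an aggregate
SELECT city, MIN(name) AS first_name_alphabetically, COUNT(*) AS user_count
FROM users
GROUP BY city;

-- Option 2 (MySQL 5.7+): explicitly accept an arbitrary value per group
SELECT city, ANY_VALUE(name) AS sample_name, COUNT(*) AS user_count
FROM users
GROUP BY city;
```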
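Leftmost prefix rule: for a composite index (a,b,c), in general only conditions that start from the leftmost column can use the index for lookups, namely conditions on a, on a and b, or on a, b, and c. A condition on b or c alone cannot, and a condition on a and c can use only the a part.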
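The leftmost prefix rule can be explored with EXPLAIN. This is a sketch on a hypothetical table `demo_prefix`; the `payload` column is there only so the index does not cover every column.

```sql
-- Hypothetical table with a composite index on (a, b, c)
CREATE TABLE demo_prefix (
    id INT AUTO_INCREMENT PRIMARY KEY,
    a INT,
    b INT,
    c INT,
    payload VARCHAR(100),
    KEY idx_abc (a, b, c)
);

-- Can use idx_abc: the conditions start at the leftmost column
EXPLAIN SELECT * FROM demo_prefix WHERE a = 1;
EXPLAIN SELECT * FROM demo_prefix WHERE a = 1 AND b = 2;
EXPLAIN SELECT * FROM demo_prefix WHERE a = 1 AND b = 2 AND c = 3;

-- Can use only the a part: b is skipped, so c cannot narrow the lookup range
EXPLAIN SELECT * FROM demo_prefix WHERE a = 1 AND c = 3;

-- Cannot use idx_abc for a lookup: the leftmost column is missing
EXPLAIN SELECT * FROM demo_prefix WHERE b = 2 AND c = 3;
```

On a populated table, compare the `key` and `key_len` columns across these plans; `key_len` grows as more index columns take part in the lookup. On an empty or tiny table the optimizer may choose differently.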
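Common operations that prevent an index from being used, and that you should avoid: applying a function or arithmetic to the indexed column, comparing a string column with a number (implicit type conversion), and LIKE patterns with a leading wildcard.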
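Two typical index-defeating patterns and their fixes are sketched below. The table `demo_contacts` and its indexes are hypothetical.

```sql
-- Hypothetical table: phone is a string column with its own index
CREATE TABLE demo_contacts (
    id INT AUTO_INCREMENT PRIMARY KEY,
    phone VARCHAR(20),
    created_at DATETIME,
    KEY idx_phone (phone),
    KEY idx_created_at (created_at)
);

-- Implicit conversion: comparing a string column with a number converts
-- phone for every row, so idx_phone cannot be used for a lookup
EXPLAIN SELECT id FROM demo_contacts WHERE phone = 13800000000;
-- Same value as a string: idx_phone is usable
EXPLAIN SELECT id FROM demo_contacts WHERE phone = '13800000000';

-- Function on the column: idx_created_at cannot be used for a lookup
EXPLAIN SELECT id FROM demo_contacts WHERE DATE(created_at) = '2023-01-01';
-- Equivalent range on the bare column: idx_created_at is usable
EXPLAIN SELECT id FROM demo_contacts
WHERE created_at >= '2023-01-01' AND created_at < '2023-01-02';
```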
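Use EXPLAIN to inspect the execution plan:
```sql
EXPLAIN SELECT * FROM users WHERE status = 1;
```
Key fields to read: type (the access method), key (the index actually chosen), rows (the estimated number of rows examined), and Extra (notes such as Using index, Using filesort, or Using temporary).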
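Beyond the default table output, EXPLAIN has other forms that are often useful. A brief sketch against the article's `users` table, with version requirements noted in the comments:

```sql
-- Classic tabular output
EXPLAIN SELECT id, username FROM users WHERE status = 1;

-- JSON output with more detail than the tabular form (MySQL 5.6+)
EXPLAIN FORMAT=JSON SELECT id, username FROM users WHERE status = 1;

-- EXPLAIN ANALYZE actually runs the query and reports measured
-- timings and row counts per step (MySQL 8.0.18+)
EXPLAIN ANALYZE SELECT id, username FROM users WHERE status = 1;
```

Because EXPLAIN ANALYZE executes the statement, use it with care on slow queries in production.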
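Pitfalls when joining multiple tables:
```sql
-- Error-prone: comma join with the condition in WHERE; forgetting the
-- condition silently produces a Cartesian product
SELECT * FROM users, orders
WHERE users.id = orders.user_id;
-- Preferred: an explicit JOIN keeps the join condition next to the join
-- (MySQL normally produces the same plan for both forms)
SELECT * FROM users
JOIN orders ON users.id = orders.user_id;
```
The execution plan for a join query should show: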
```sql
-- Fetch the latest N rows by time
SELECT * FROM articles
ORDER BY publish_time DESC
LIMIT 10;
```
Optimization points:
```sql
-- New users per month
SELECT
    DATE_FORMAT(create_time, '%Y-%m') AS month,
    COUNT(*) AS new_users
FROM users
GROUP BY month
ORDER BY month;
```
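```sql
-- Inefficient: counts every matching row
SELECT COUNT(*) FROM users WHERE email = 'test@example.com';
-- Efficient: stops at the first match, which is all an existence check needs
SELECT 1 FROM users WHERE email = 'test@example.com' LIMIT 1;
```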
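An alternative form of the existence check wraps the lookup in EXISTS. This is a sketch; the alias `email_exists` is illustrative.

```sql
-- Returns 1 if at least one matching row exists, otherwise 0
SELECT EXISTS (
    SELECT 1 FROM users WHERE email = 'test@example.com'
) AS email_exists;
```

This form always returns exactly one row, whereas the LIMIT 1 version returns an empty result when nothing matches. That can simplify the calling code.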
```sql
-- Wrong: aggregate function used in WHERE
SELECT city, COUNT(*) FROM users
WHERE COUNT(*) > 100
GROUP BY city;
-- Correct: filter groups with HAVING
SELECT city, COUNT(*) FROM users
GROUP BY city
HAVING COUNT(*) > 100;
```
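```sql
-- Inefficient: the larger the offset, the slower the query
SELECT * FROM large_table LIMIT 1000000, 10;
-- Optimized: find the starting id with a covering index scan
SELECT * FROM large_table
WHERE id >= (SELECT id FROM large_table ORDER BY id LIMIT 1000000, 1)
ORDER BY id
LIMIT 10;
```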
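```sql
-- MySQL 5.7 and earlier: GROUP BY hides a sort (often a filesort)
SELECT city, COUNT(*) FROM users GROUP BY city;
-- Cancel the sort explicitly (not needed in MySQL 8.0+)
SELECT city, COUNT(*) FROM users GROUP BY city ORDER BY NULL;
```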
Original query (execution time 4.8s):
```sql
SELECT * FROM users
WHERE status = 1
ORDER BY last_login DESC
LIMIT 20;
```
Optimization steps:
Optimized query (execution time 0.02s):
```sql
SELECT id, username, last_login
FROM users
WHERE status = 1
ORDER BY last_login DESC
LIMIT 20;
```
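A query of this shape usually benefits from an index that matches both the filter and the sort. The index below is one plausible choice and an assumption on my part, not necessarily what was used in this case; the index name is hypothetical.

```sql
-- Hypothetical index: equality column first, then the sort column
CREATE INDEX idx_users_status_last_login ON users (status, last_login);

-- With such an index MySQL can read the first 20 entries for status = 1
-- in index order instead of sorting every matching row
EXPLAIN
SELECT id, username, last_login
FROM users
WHERE status = 1
ORDER BY last_login DESC
LIMIT 20;
```

If the index is used as intended, the Extra column of the plan should no longer show Using filesort.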
Original complex query:
```sql
SELECT
    department,
    COUNT(*) AS total,
    SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) AS male,
    SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) AS female
FROM employees
GROUP BY department;
```
Optimization: for very large datasets, this can be split into:
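```sql
-- Slowest: leading wildcard, cannot use an index range scan on name
SELECT * FROM products WHERE name LIKE '%apple%';
-- Medium: trailing wildcard (prefix match), can use an index range scan
SELECT * FROM products WHERE name LIKE 'apple%';
-- Fastest: exact match
SELECT * FROM products WHERE name = 'apple';
```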
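```sql
-- Inefficient: wrapping the column in a function prevents index use
SELECT * FROM logs WHERE YEAR(create_time) = 2023;
-- Efficient: range condition on the bare column
SELECT * FROM logs
WHERE create_time >= '2023-01-01'
  AND create_time < '2024-01-01';
```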
```sql
-- Rank salaries within each department (window functions require MySQL 8.0+)
SELECT
    name,
    department,
    salary,
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank
FROM employees;
```
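A common follow-up is keeping only the top N rows per group. Window function results cannot be filtered in the WHERE clause of the same query, so this sketch wraps the ranking in a derived table; the alias `ranked` and the limit of 3 are illustrative.

```sql
-- Top 3 earners per department (MySQL 8.0+)
SELECT name, department, salary
FROM (
    SELECT
        name,
        department,
        salary,
        ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS rn
    FROM employees
) AS ranked
WHERE rn <= 3;
```

ROW_NUMBER() returns at most three rows per department even when salaries tie. Using RANK() instead would keep every row tied within the top three positions.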
```sql
-- Walk the organization tree recursively (recursive CTEs require MySQL 8.0+)
WITH RECURSIVE org_tree AS (
    SELECT * FROM departments WHERE parent_id IS NULL
    UNION ALL
    SELECT d.* FROM departments d
    JOIN org_tree ot ON d.parent_id = ot.id
)
SELECT * FROM org_tree;
```
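A variation that is often handy tracks the depth of each node and guards against cycles in the data. This is a sketch: the `name` column and the depth limit of 10 are assumptions, not part of the original schema.

```sql
-- Assumes departments has id, parent_id, and name (name is an assumption)
WITH RECURSIVE org_tree AS (
    -- Anchor: root departments start at depth 1
    SELECT id, parent_id, name, 1 AS depth
    FROM departments
    WHERE parent_id IS NULL
    UNION ALL
    -- Recursive step: each child is one level deeper than its parent
    SELECT d.id, d.parent_id, d.name, ot.depth + 1
    FROM departments d
    JOIN org_tree ot ON d.parent_id = ot.id
    WHERE ot.depth < 10  -- safety limit in case the data contains a cycle
)
SELECT id, parent_id, name, depth
FROM org_tree
ORDER BY depth, id;
```

MySQL also stops runaway recursion on its own through the `cte_max_recursion_depth` system variable, which defaults to 1000.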
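Dangerous:
```java
// User input is concatenated into the SQL text, which allows SQL injection
String sql = "SELECT * FROM users WHERE id = " + userInput;
```
Safe:
```java
// The value is bound as a parameter and is never parsed as SQL
PreparedStatement stmt = conn.prepareStatement(
    "SELECT * FROM users WHERE id = ?");
stmt.setInt(1, userId);
```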
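The same parameter binding can be tried directly in the MySQL client with server-side prepared statements. A minimal sketch; the statement name `find_user` and the value 42 are illustrative.

```sql
-- The ? placeholder is bound at EXECUTE time and is never parsed as SQL text
PREPARE find_user FROM 'SELECT id, username FROM users WHERE id = ?';
SET @user_id = 42;  -- illustrative value
EXECUTE find_user USING @user_id;
DEALLOCATE PREPARE find_user;
```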
```sql
-- Select only the columns you need, so sensitive fields are not exposed
SELECT id, username FROM users;
-- rather than
SELECT * FROM users;
```
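```sql
-- Nested subquery: can be slow, especially on older MySQL versions (before 5.6)
SELECT * FROM products
WHERE category_id IN (
    SELECT id FROM categories WHERE type = 'electronics'
);
-- JOIN: returns the same rows, assuming categories.id is unique
SELECT p.* FROM products p
JOIN categories c ON p.category_id = c.id
WHERE c.type = 'electronics';
```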
```sql
-- Use a derived table to narrow the rows before aggregating
SELECT t.department, AVG(t.salary)
FROM (
    SELECT department, salary
    FROM employees
    WHERE hire_date > '2020-01-01'
) t
GROUP BY t.department;
```
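```ini
# my.cnf configuration (server options belong in the [mysqld] section)
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
# Log statements that take longer than 1 second
long_query_time = 1
log_queries_not_using_indexes = 1
```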
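The same settings can also be changed at runtime without restarting the server. A sketch, assuming an account with sufficient privileges:

```sql
-- Values set this way are lost on restart unless they are also in my.cnf
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- applies to connections opened after this statement
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Confirm the current settings and the log file location
SHOW GLOBAL VARIABLES LIKE 'slow_query%';
SHOW GLOBAL VARIABLES LIKE 'long_query_time';
```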
```sql
-- Find the statements that consume the most total execution time
SELECT * FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
```
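Since SELECT * on this table returns dozens of columns, a narrower version is usually easier to read. The sketch below picks a few columns of the digest summary table; the output aliases are illustrative.

```sql
-- Timer columns are in picoseconds; divide by 1e12 to get seconds
SELECT
    DIGEST_TEXT,
    COUNT_STAR AS exec_count,
    ROUND(SUM_TIMER_WAIT / 1e12, 3) AS total_seconds,
    ROUND(AVG_TIMER_WAIT / 1e12, 3) AS avg_seconds,
    SUM_ROWS_EXAMINED,
    SUM_ROWS_SENT
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;
```

A large gap between SUM_ROWS_EXAMINED and SUM_ROWS_SENT often points to a missing or unused index.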
```sql
-- Does not suit a sharded (distributed) setup
SELECT * FROM large_table LIMIT 1000000, 10;
-- Shard-aware query: route by the shard key
SELECT * FROM large_table
WHERE shard_key = ?
ORDER BY id
LIMIT 10;
```
Avoid:
```sql
SELECT * FROM db1.users u
JOIN db2.orders o ON u.id = o.user_id;
```
Recommended:
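From real projects, I have distilled a few key principles:
A typical optimization case: a report query went from 12 seconds to 0.3 seconds, mainly by:
A final reminder: every optimization has to be grounded in the actual business scenario, and there is no silver bullet that works everywhere. Verify the effect in a test environment first, then apply it to production.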