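Slow queries are the first place to look when tuning database performance. As a MySQL DBA I have worked through hundreds of performance cases, and in more than 80% of them optimizing slow queries alone brought a significant improvement. Below is the field-tested methodology I use to locate slow queries: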
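Configure the following parameters in my.cnf (I recommend keeping them enabled permanently in production):

    slow_query_log = 1
    slow_query_log_file = /var/log/mysql/mysql-slow.log
    long_query_time = 1  # in seconds; start at 1 and tighten gradually
    log_queries_not_using_indexes = 1  # also log queries that do not use an index
    log_throttle_queries_not_using_indexes = 10  # cap such entries at 10 per minute

Important: the partition holding the log file needs enough free space. I have seen a production incident where the slow log filled the disk and the database hung.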
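Because a full log partition can hang the server, it can help to check free space from cron. The following is a minimal sketch, not a hardened monitor: the function names `partition_usage_pct` and `usage_exceeds`, the `LOG_DIR` default, and the 80% threshold are all illustrative assumptions.

```shell
#!/bin/bash
# Sketch: warn when the partition holding the slow log is nearly full.
# Function names, LOG_DIR and the 80% threshold are illustrative assumptions.

# Print the used-space percentage (integer, no % sign) of the filesystem containing $1.
partition_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return 0 (true) when usage $1 is at or above threshold $2.
usage_exceeds() {
    [ "$1" -ge "$2" ]
}

LOG_DIR=${LOG_DIR:-/var/log/mysql}   # directory of slow_query_log_file
THRESHOLD=${THRESHOLD:-80}

# Only check when the directory exists on this host.
if [ -d "$LOG_DIR" ]; then
    pct=$(partition_usage_pct "$LOG_DIR")
    if usage_exceeds "$pct" "$THRESHOLD"; then
        echo "WARNING: $LOG_DIR partition is ${pct}% full" >&2
    fi
fi
```

Run from cron, the warning goes to stderr, so it can be mailed or forwarded by whatever alerting is already in place.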
SHOW PROCESSLIST

    SELECT * FROM information_schema.processlist
    WHERE COMMAND <> 'Sleep'  -- skip idle connections, whose TIME is idle time
      AND TIME > 5            -- custom threshold, in seconds
    ORDER BY TIME DESC;
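A similar filter can be applied on the command line to captured `SHOW FULL PROCESSLIST` output, additionally skipping idle (Sleep) connections. This is a sketch that assumes tab-separated batch output with a header row and the usual column order (Id, User, Host, db, Command, Time, State, Info); `long_running` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: filter tab-separated SHOW FULL PROCESSLIST output read from stdin.
# Assumed columns: Id, User, Host, db, Command, Time, State, Info (with a header row).
# long_running is a hypothetical helper name.

# Usage: long_running MIN_SECONDS
# Prints "Id<TAB>Time<TAB>Info" for non-idle statements running longer than MIN_SECONDS.
long_running() {
    awk -F'\t' -v min="$1" '
        NR > 1 && $5 != "Sleep" && $6 + 0 > min + 0 {
            print $1 "\t" $6 "\t" $8
        }
    '
}
```

It could be fed with something like `mysql -B -e 'SHOW FULL PROCESSLIST' | long_running 5`, where the connection options are left to the reader's environment.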
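performance_schema monitoring (requires the events_statements_history_long consumer to be enabled in performance_schema.setup_consumers)

    SELECT * FROM performance_schema.events_statements_history_long
    WHERE SQL_TEXT IS NOT NULL
    ORDER BY TIMER_WAIT DESC LIMIT 10;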
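sys schema shortcut

    -- The x$ view exposes raw picosecond values, so ORDER BY sorts numerically;
    -- the plain view returns formatted strings such as "1.23 ms".
    SELECT * FROM sys.x$statement_analysis
    ORDER BY avg_latency DESC LIMIT 10;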
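| Tool | Analysis dimension | Strengths | Use case |
|---|---|---|---|
| mysqldumpslow | Execution statistics | Ships with MySQL, nothing to install | Quick overview |
| pt-query-digest | Multi-dimensional analysis | Advanced aggregation and filtering | In-depth optimization |
| Anemometer | Visual trends | Graphs historical changes | Long-term monitoring |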
I usually start with mysqldumpslow to find the top SQL quickly, then use pt-query-digest for in-depth analysis:

    pt-query-digest --limit=10 --filter='$event->{arg} =~ m/^select/i' mysql-slow.log
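For a quick look before running the heavier tools, the per-entry header of the slow log can be scanned directly. This is a rough sketch that assumes the standard `# Query_time: ... Lock_time: ... Rows_sent: ... Rows_examined: ...` header line; `slow_entries` is a hypothetical helper name, and it does not replace the fingerprint-based aggregation that pt-query-digest performs.

```shell
#!/bin/bash
# Sketch: list slow-log entries whose Query_time is at or above a threshold.
# Assumes the standard header line, for example:
#   # Query_time: 2.500000  Lock_time: 0.000100 Rows_sent: 10  Rows_examined: 100000
# slow_entries is a hypothetical helper name.

# Usage: slow_entries MIN_SECONDS < mysql-slow.log
# Prints "query_time rows_examined rows_sent", slowest first.
slow_entries() {
    awk -v min="$1" '
        /^# Query_time:/ {
            # Walk the "Name: value" pairs on the header line.
            for (i = 2; i < NF; i++) {
                if ($i == "Query_time:")    qt = $(i + 1)
                if ($i == "Rows_sent:")     rs = $(i + 1)
                if ($i == "Rows_examined:") re = $(i + 1)
            }
            if (qt + 0 >= min + 0) print qt, re, rs
        }
    ' | sort -k1,1nr
}
```

For example, `slow_entries 2 < /var/log/mysql/mysql-slow.log | head` shows the slowest entries. A Rows_examined value far larger than Rows_sent is often a hint that an index is missing.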
Focus on the following fields:
Use performance_schema to get more precise execution data:

    SELECT * FROM performance_schema.events_statements_summary_by_digest
    ORDER BY SUM_TIMER_WAIT DESC LIMIT 5;

Key metrics:
Use the following SQL to check index selectivity:

    SELECT
      COUNT(DISTINCT column_name) / COUNT(*) AS selectivity
    FROM table_name;

Columns with selectivity below 0.1 are usually poor candidates for a standalone index.
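To make the ratio concrete: a column with 2 distinct values across 4 rows has selectivity 2/4 = 0.5, while a unique column has 1.0. The sketch below does the same arithmetic on values piped in from the shell; `selectivity` is a hypothetical helper name, and for large tables the SQL above is the better tool.

```shell
#!/bin/bash
# Sketch: compute distinct/total for one column value per line on stdin.
# selectivity is a hypothetical helper name.

selectivity() {
    # uniq -c emits one line per distinct value, prefixed with its count.
    sort | uniq -c | awk '
        { distinct++; total += $1 }
        END { if (total > 0) printf "%.4f\n", distinct / total }
    '
}

# Two distinct values over four rows.
printf 'paid\nnew\npaid\nnew\n' | selectivity
```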
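Anti-pattern:

    SELECT * FROM large_table LIMIT 1000000, 10;

Optimized version (assumes id is the primary key):

    SELECT * FROM large_table
    -- >= keeps the row at offset 1000000, so the page matches the original query
    WHERE id >= (SELECT id FROM large_table ORDER BY id LIMIT 1000000, 1)
    ORDER BY id LIMIT 10;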
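When the application can remember the last id of the previous page, the offset can be dropped altogether. This is a different technique from the subquery rewrite above, commonly called keyset (or seek) pagination. The following minimal sketch only builds the SQL text; `next_page_sql` is a hypothetical helper, and `id` is assumed to be an indexed, monotonically increasing key.

```shell
#!/bin/bash
# Sketch: build a keyset-pagination query from the last id already shown.
# next_page_sql is a hypothetical helper; the table name is interpolated as-is,
# so it must come from trusted input only.

# Usage: next_page_sql TABLE LAST_ID PAGE_SIZE
next_page_sql() {
    printf 'SELECT * FROM %s WHERE id > %d ORDER BY id LIMIT %d;\n' "$1" "$2" "$3"
}

# The previous page ended at id 1000000; fetch the next 10 rows.
next_page_sql large_table 1000000 10
```

The trade-off is that this form can only step to the next page from a known position; it cannot jump to an arbitrary page number.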
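Original query:

    SELECT * FROM orders
    JOIN customers ON orders.customer_id = customers.id
    WHERE customers.reg_date > '2023-01-01';

Rewritten version, which filters customers first (MySQL 5.7 and later usually merge such a derived table back into the outer query, so confirm with EXPLAIN that the plan actually changes):

    SELECT * FROM orders
    JOIN (
      SELECT id FROM customers
      WHERE reg_date > '2023-01-01'
    ) AS filtered_customers
    ON orders.customer_id = filtered_customers.id;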
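Common problem:

    SELECT * FROM users WHERE phone = 13800138000;  -- phone is a VARCHAR column

Comparing a string column with a number forces an implicit type conversion, which prevents the index on phone from being used. Quote the value instead:

    SELECT * FROM users WHERE phone = '13800138000';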
Core metrics to monitor:
I recommend building the dashboards with Prometheus and Grafana and configuring an alert rule such as:

    - alert: HighSlowQueries
      expr: rate(mysql_global_status_slow_queries[1m]) > 5
      for: 5m
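Note that `rate()` yields a per-second value, so the rule as written fires only when slow queries stay above 5 per second (300 per minute) for 5 minutes; the threshold should be adjusted to the workload. The same arithmetic can be reproduced from two samples of the `Slow_queries` status counter. This is a small sketch, and `slow_query_rate` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: per-second slow query rate from two samples of the Slow_queries counter.
# slow_query_rate is a hypothetical helper name.

# Usage: slow_query_rate PREVIOUS_COUNT CURRENT_COUNT INTERVAL_SECONDS
slow_query_rate() {
    awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.2f\n", (b - a) / s }'
}

# 600 new slow queries in 60 seconds is 10 per second, above the threshold of 5.
slow_query_rate 1000 1600 60
```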
The daily check script I use:

    #!/bin/bash
    # Daily slow query analysis report
    set -eu
    LOG_FILE=/var/log/mysql/mysql-slow.log
    REPORT_DIR=/opt/mysql_reports
    mkdir -p "$REPORT_DIR"
    pt-query-digest "$LOG_FILE" --limit=10 > "$REPORT_DIR/$(date +%F).log"
    mysqldumpslow -s t "$LOG_FILE" | head -20 > "$REPORT_DIR/$(date +%F)-summary.log"
Capture the optimizer's decision process:

    SET optimizer_trace="enabled=on";
    SELECT * FROM users WHERE...;  -- run the target SQL
    SELECT * FROM information_schema.optimizer_trace;
    SET optimizer_trace="enabled=off";
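Use a flame graph to locate performance bottlenecks (one possible pipeline, assuming Percona Toolkit's pt-pmp and the stackcollapse-gdb.pl and flamegraph.pl scripts from the FlameGraph project are on PATH):

    # Collect gdb stack samples from mysqld; attaching gdb briefly pauses the server
    pt-pmp --pid "$(pidof mysqld)" --iterations 10 --save-samples gdb-samples.txt > pmp-summary.txt
    stackcollapse-gdb.pl gdb-samples.txt | flamegraph.pl > profile.svg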
Enable advanced monitoring:

    SET GLOBAL innodb_monitor_enable = '%';
    SELECT * FROM information_schema.INNODB_METRICS
    WHERE `COUNT` > 0 ORDER BY `COUNT` DESC;
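After years of practice, my golden rule for slow query optimization is: locate first, then analyze, then verify. Every optimization must be validated under real load to avoid "lab optimization", where performance improves in the test environment but gets worse in production.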