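Handling time-series data is a common requirement in the database world. Imagine you run an IoT platform where millions of devices upload sensor data every day, or you manage a large application that produces gigabytes of logs every hour. This data is usually timestamped and keeps accumulating over time. I once took over a project in which a single table passed 1 billion rows within three months; query latency degraded from milliseconds to minutes, and routine maintenance became extremely difficult.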
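PostgreSQL's table partitioning is like installing a smart shelving system in a data warehouse. By spreading rows across partitions by date range, the database only needs to scan the partitions that cover the relevant time period (partition pruning), much as finding a book by category in a library is far faster than walking the entire stacks. Creating partitions by hand, however, is like rearranging the shelves manually every day: tedious and error-prone. This is especially true for businesses that span time zones. I once had a full day of data written to the wrong partition because of a time zone conversion mistake, and tracking it down was painful.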
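To make the shelving analogy concrete, here is a minimal sketch of a range-partitioned table with two monthly partitions. The table name `demo_logs`, the `message` column, and the `timestamptz` type are assumptions made for illustration; only the column names `log_date` and `device_id` come from the functions discussed in this article.

```sql
-- Hypothetical demo table, partitioned by its timestamp column
CREATE TABLE IF NOT EXISTS demo_logs (
    log_date  timestamptz NOT NULL,
    device_id bigint      NOT NULL,
    message   text
) PARTITION BY RANGE (log_date);

-- FROM is inclusive and TO is exclusive, so adjacent months can share a
-- boundary value without overlapping. Explicit UTC offsets keep the bounds
-- independent of the session's TimeZone setting.
CREATE TABLE IF NOT EXISTS demo_logs_2024_01 PARTITION OF demo_logs
    FOR VALUES FROM ('2024-01-01 00:00:00+00') TO ('2024-02-01 00:00:00+00');
CREATE TABLE IF NOT EXISTS demo_logs_2024_02 PARTITION OF demo_logs
    FOR VALUES FROM ('2024-02-01 00:00:00+00') TO ('2024-03-01 00:00:00+00');

INSERT INTO demo_logs VALUES ('2024-01-31 23:59:59+00', 1, 'last second of January');
INSERT INTO demo_logs VALUES ('2024-02-01 00:00:00+00', 1, 'first instant of February');

-- tableoid reveals which partition each row was routed to
SELECT tableoid::regclass AS routed_to, log_date, device_id
FROM demo_logs
ORDER BY log_date;

-- With partition pruning, the plan should reference only demo_logs_2024_01
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM demo_logs
WHERE log_date >= '2024-01-10 00:00:00+00'
  AND log_date <  '2024-01-20 00:00:00+00';
```

Writing the bounds with an explicit offset is one way to sidestep the kind of time zone mix-up described above, because a bare `'2024-02-01'` literal on a `timestamptz` key is interpreted in whatever time zone the creating session happens to use.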
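The core idea of the trigger approach is to create partitions on demand, as rows arrive. Below is the enhanced trigger function I actually used in a logging system: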
-- Caveat: with declarative partitioning, PostgreSQL routes a row to its
-- partition before row-level triggers fire, and it rejects CREATE TABLE ...
-- PARTITION OF on a table the running statement is already using.
-- Test the attachment point of this trigger before relying on it.
CREATE OR REPLACE FUNCTION create_monthly_partition()
RETURNS TRIGGER AS $$
DECLARE
    partition_name  TEXT;
    partition_start DATE;
    partition_end   DATE;
    date_format     TEXT := 'YYYY_MM';
BEGIN
    -- Compute the first day of the row's month and of the following month
    partition_start := DATE_TRUNC('month', NEW.log_date)::date;
    partition_end   := (partition_start + INTERVAL '1 month')::date;
    partition_name  := 'logs_' || TO_CHAR(partition_start, date_format);

    -- Create the partition only if it does not exist yet
    IF NOT EXISTS (
        SELECT 1 FROM pg_catalog.pg_class c
        JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
        WHERE n.nspname = 'public' AND c.relname = partition_name
    ) THEN
        EXECUTE format('
            CREATE TABLE %I PARTITION OF logs
            FOR VALUES FROM (%L) TO (%L)
            WITH (parallel_workers=4)
        ', partition_name, partition_start, partition_end);

        -- Create an index on the new partition; %I quotes the index name safely
        EXECUTE format('
            CREATE INDEX %I ON %I (device_id)
        ', 'idx_' || partition_name || '_device_id', partition_name);

        RAISE NOTICE 'Created new partition %', partition_name;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
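Compared with a naive trigger, this version makes a few key improvements: it checks the system catalog before creating a partition, builds the device_id index on each new partition, and logs a notice whenever a partition is created.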
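Tests on an AWS r5.large instance (16 GB of memory) showed that once concurrent inserts reach 1000 QPS, the trigger overhead is amortized and the performance difference becomes negligible. In bursty write scenarios, however, latency spikes to 50-100 ms the first time a partition is created.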
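Bursts also raise a concurrency question: several sessions may notice the same missing partition at once. The sketch below shows one possible way to make partition creation idempotent and serialized, using a transaction-level advisory lock plus `CREATE TABLE IF NOT EXISTS`. The names `demo_readings` and `ensure_monthly_partition` are hypothetical, and `hashtext()` is an internal, undocumented PostgreSQL function; any stable text-to-integer hash would serve the same purpose.

```sql
-- Hypothetical parent table for the demonstration
CREATE TABLE IF NOT EXISTS demo_readings (
    log_date  date   NOT NULL,
    device_id bigint NOT NULL
) PARTITION BY RANGE (log_date);

-- Creates the monthly partition containing p_day if it is missing and
-- returns the partition name. Safe to call repeatedly.
CREATE OR REPLACE FUNCTION ensure_monthly_partition(p_day date)
RETURNS text AS $$
DECLARE
    month_start date := date_trunc('month', p_day::timestamp)::date;
    month_end   date := (month_start + interval '1 month')::date;
    part_name   text := 'demo_readings_' || to_char(month_start, 'YYYY_MM');
BEGIN
    -- Serialize concurrent callers that target the same partition name;
    -- the lock is released automatically at transaction end.
    PERFORM pg_advisory_xact_lock(hashtext(part_name));

    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF demo_readings
         FOR VALUES FROM (%L) TO (%L)',
        part_name, month_start, month_end);

    RETURN part_name;
END;
$$ LANGUAGE plpgsql;

SELECT ensure_monthly_partition(DATE '2024-03-15');  -- returns demo_readings_2024_03
SELECT ensure_monthly_partition(DATE '2024-03-31');  -- same partition, emits a NOTICE only
```

Because the function is called outside the INSERT statement (from application code or a scheduled job), it does not run into the restriction on creating a partition of a table that the current statement is using.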
For write-heavy production systems, I prefer to pre-create partitions with the pg_cron extension. Here is our standard configuration:
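-- Install the extension (pg_cron must also be listed in shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pg_cron;

-- On the 25th of each month, create next month's partition.
-- The job body uses tagged dollar quotes ($job$, $do$) so that the inner
-- DO block does not terminate the outer string literal.
SELECT cron.schedule(
    'create_next_month_partition',
    '0 2 25 * *',  -- 02:00 on the 25th of every month
    $job$
    DO $do$
    DECLARE
        next_month_start DATE := DATE_TRUNC('month', NOW() + INTERVAL '1 month');
        next_month_end   DATE := next_month_start + INTERVAL '1 month';
        partition_name   TEXT := 'logs_' || TO_CHAR(next_month_start, 'YYYY_MM');
    BEGIN
        IF NOT EXISTS (
            SELECT 1 FROM pg_tables
            WHERE schemaname = 'public'
              AND tablename = partition_name
        ) THEN
            EXECUTE format('
                CREATE TABLE %I PARTITION OF logs
                FOR VALUES FROM (%L) TO (%L)
                TABLESPACE fast_ssd
            ', partition_name, next_month_start, next_month_end);
            -- Other initialization steps...
        END IF;
    END $do$;
    $job$
);

-- Weekly maintenance job
SELECT cron.schedule(
    'weekly_partition_maintenance',
    '0 4 * * 0',  -- 04:00 every Sunday
    -- VACUUM ANALYZE refreshes statistics; add further maintenance as needed
    $job$ VACUUM (ANALYZE) logs $job$
);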
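A scheduled job can fail silently, so it helps to verify that the partition it was supposed to create actually exists. The queries below are a minimal sketch that assumes the `logs` parent table in the `public` schema and the `logs_YYYY_MM` naming convention used above; they read only the system catalogs and return an empty result or `false` if nothing matches.

```sql
-- List the partitions of "logs" together with their bounds
SELECT child.relname AS partition_name,
       pg_get_expr(child.relpartbound, child.oid) AS bounds
FROM pg_inherits AS i
JOIN pg_class AS parent ON parent.oid = i.inhparent
JOIN pg_class AS child  ON child.oid  = i.inhrelid
WHERE parent.relname = 'logs'
ORDER BY child.relname;

-- Check whether next month's partition is already in place.
-- to_regclass() returns NULL instead of raising an error when the name is unknown.
SELECT to_regclass(
           format('public.%I',
                  'logs_' || to_char(now() + interval '1 month', 'YYYY_MM'))
       ) IS NOT NULL AS next_month_ready;
```

Running the second query from an alerting system a few days after the 25th leaves time to create the partition by hand before the month rolls over.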
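In day-to-day operations we also built several enhancements on top of this, such as the space-monitoring function below: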
-- Space-monitoring function example
-- The partitioned parent has no storage of its own, so each partition's share
-- is computed against the sum of all partitions rather than the parent's size.
CREATE OR REPLACE FUNCTION check_partition_usage()
RETURNS TABLE(partition_name text, total_size text, used_percent numeric) AS $$
BEGIN
    RETURN QUERY
    SELECT child.relname::text,
           pg_size_pretty(pg_total_relation_size(child.oid)),
           round(pg_total_relation_size(child.oid) * 100.0
                 / NULLIF(sum(pg_total_relation_size(child.oid)) OVER (), 0), 2)
    FROM pg_inherits
    JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
    JOIN pg_class child  ON pg_inherits.inhrelid = child.oid
    WHERE parent.relname = 'logs'
    ORDER BY child.relname;
END;
$$ LANGUAGE plpgsql;
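For business-critical systems, I recommend combining the strengths of both approaches.
With this architecture, roughly 99% of partitions are created ahead of time by the scheduled job, and the trigger handles only the small remainder of edge cases (such as cross-time-zone data), while also recording these anomalies for later analysis.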
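-- Safety-net trigger for the hybrid approach
-- create_missing_partition() is a helper that is not shown in this article.
CREATE OR REPLACE FUNCTION partition_safety_net()
RETURNS TRIGGER AS $$
BEGIN
    BEGIN
        -- Attempt the normal insert path
        RETURN NEW;
    EXCEPTION WHEN check_violation THEN
        -- check_violation is SQLSTATE 23514, the code PostgreSQL raises when
        -- no partition matches the row; any other error propagates unchanged.
        -- Note: the handler only sees errors raised inside this inner block.
        PERFORM create_missing_partition(NEW.log_date);
        RETURN NEW;
    END;
END;
$$ LANGUAGE plpgsql;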
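An exception handler can only catch an error raised inside its own block, and the "no partition found" error is raised by the INSERT statement itself. One way to get retry-on-missing-partition behavior is therefore to wrap the INSERT in a function. The following is a minimal sketch under that assumption, not the setup described above; `demo_events` and `insert_demo_event` are hypothetical names, and the table carries only the two columns known from this article.

```sql
-- Hypothetical parent table with no partitions yet
CREATE TABLE IF NOT EXISTS demo_events (
    log_date  date   NOT NULL,
    device_id bigint NOT NULL
) PARTITION BY RANGE (log_date);

CREATE OR REPLACE FUNCTION insert_demo_event(p_log_date date, p_device_id bigint)
RETURNS void AS $$
DECLARE
    month_start date := date_trunc('month', p_log_date::timestamp)::date;
    part_name   text := 'demo_events_' || to_char(month_start, 'YYYY_MM');
BEGIN
    -- Fast path: the partition normally exists already
    INSERT INTO demo_events (log_date, device_id)
    VALUES (p_log_date, p_device_id);
EXCEPTION WHEN check_violation THEN
    -- SQLSTATE 23514 covers "no partition of relation found for row"
    -- (and ordinary CHECK constraint failures, which would fail again below).
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF demo_events
         FOR VALUES FROM (%L) TO (%L)',
        part_name, month_start, (month_start + interval '1 month')::date);

    -- Leave a trace so the anomaly can be analyzed later
    RAISE WARNING 'partition % was created on demand', part_name;

    -- Retry once; a second failure propagates to the caller
    INSERT INTO demo_events (log_date, device_id)
    VALUES (p_log_date, p_device_id);
END;
$$ LANGUAGE plpgsql;

SELECT insert_demo_event(DATE '2024-05-17', 42);
SELECT tableoid::regclass AS routed_to, log_date, device_id FROM demo_events;
```

The `EXCEPTION` clause makes PL/pgSQL run the block in a subtransaction, which has a cost of its own, so this fits best as a fallback behind pre-created partitions rather than as the primary write path.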
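During rollout we distilled these golden rules:
Partition granularity:
Index strategy:
Common pitfalls:
The TO bound is exclusive: a value equal to the upper bound is not accepted by that partition, and the insert fails unless another partition covers it.
Monitoring metrics: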
-- Inspect statistics for the partitions
-- (the underscore is escaped because a bare _ is a single-character wildcard in LIKE)
SELECT * FROM pg_stat_user_tables
WHERE relname LIKE 'logs\_%'
ORDER BY seq_scan DESC;

-- Check for lock waits
SELECT blocked_locks.pid  AS blocked_pid,
       blocking_locks.pid AS blocking_pid
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_locks blocking_locks
    ON blocking_locks.locktype = blocked_locks.locktype
    AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
    AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
    AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
    AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
    AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
    AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
    AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
    AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
    AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
    AND blocking_locks.pid != blocked_locks.pid
-- Only report sessions that are actually waiting for their lock
WHERE NOT blocked_locks.granted;
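For a quicker look at lock waits, PostgreSQL 9.6 and later provide the built-in `pg_blocking_pids()` function, which avoids the self-join on `pg_locks`. The query below is a small sketch of that alternative; the 80-character truncation of the query text is an arbitrary choice for readability.

```sql
-- One row per session that is currently blocked, with the PIDs blocking it
SELECT a.pid                    AS blocked_pid,
       pg_blocking_pids(a.pid)  AS blocking_pids,
       a.wait_event_type,
       left(a.query, 80)        AS blocked_query
FROM pg_stat_activity AS a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0;
```

This is handy while partitions are being created or dropped, since those statements take locks on the parent table and can queue up behind long-running queries.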
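With the declarative partitioning improvements in PostgreSQL 14 and the further partition-related query improvements in 15, automated partition management matters more than ever. We are currently experimenting along these lines: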
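-- Hash partitioning example (available since PostgreSQL 11)
CREATE TABLE sensor_data (
    sensor_id bigint,
    ts        timestamptz,
    value     numeric
) PARTITION BY HASH (sensor_id);

-- Create the 8 hash partitions.
-- A plain SELECT format(...) would only print the statements, so they are
-- executed in a loop here.
DO $$
BEGIN
    FOR i IN 0..7 LOOP
        EXECUTE format(
            'CREATE TABLE %I PARTITION OF sensor_data FOR VALUES WITH (MODULUS 8, REMAINDER %s)',
            'sensor_data_' || i, i);
    END LOOP;
END $$;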
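One of the PostgreSQL 14 partitioning improvements is `DETACH PARTITION ... CONCURRENTLY`, which is useful when retiring old time-range partitions. The sketch below uses a hypothetical `demo_retention` table to show the sequence; it assumes PostgreSQL 14 or later, and the statements must run outside an explicit transaction block.

```sql
-- Hypothetical table with one expired monthly partition
CREATE TABLE IF NOT EXISTS demo_retention (
    log_date date NOT NULL
) PARTITION BY RANGE (log_date);

CREATE TABLE IF NOT EXISTS demo_retention_2023_01 PARTITION OF demo_retention
    FOR VALUES FROM ('2023-01-01') TO ('2023-02-01');

-- Detach with a weaker lock on the parent than a plain DETACH takes.
-- Not allowed inside a transaction block, nor when the parent has a
-- default partition.
ALTER TABLE demo_retention DETACH PARTITION demo_retention_2023_01 CONCURRENTLY;

-- The detached table is now an ordinary table: archive it or drop it
DROP TABLE demo_retention_2023_01;
```

Dropping a whole partition this way removes a month of data without the row-by-row `DELETE` and the vacuum work that follows it.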