Doris adopts the classic MPP (Massively Parallel Processing) architecture, consisting of two node types: Frontend (FE) and Backend (BE). The FE handles metadata management, query parsing, and scheduling; the BE handles data storage and computation. This separation gives Doris good scalability while keeping metadata consistent.

FE nodes run in a leader-follower arrangement and achieve high availability through the Berkeley DB Java Edition (BDB JE) replication protocol, a Paxos-style consensus. A cluster typically contains:

- one Leader, elected from among the Followers, which owns metadata writes;
- several Followers, which replicate metadata and participate in leader election;
- optional Observers, which replicate metadata asynchronously and serve reads only.

In production, deploy at least 3 Follower nodes so that a new Leader can be elected quickly if the current one fails. Observer nodes can be scaled out horizontally according to read load.
Doris offers three data models for different scenarios:

**Aggregate model** — rows with the same key columns are pre-aggregated on import:

```sql
CREATE TABLE sales (
    dt DATE,
    product VARCHAR(50),
    city VARCHAR(20),
    sales_amount BIGINT SUM DEFAULT "0"
) ENGINE=OLAP
AGGREGATE KEY(dt, product, city)
DISTRIBUTED BY HASH(dt) BUCKETS 10;
```

**Unique model** — rows with the same key are deduplicated, keeping the latest version:

```sql
CREATE TABLE users (
    user_id BIGINT,
    name VARCHAR(50)
) ENGINE=OLAP
UNIQUE KEY(user_id)
DISTRIBUTED BY HASH(user_id) BUCKETS 10;
```

**Duplicate model** — all rows are kept as-is; the key columns only define the sort order:

```sql
CREATE TABLE click_log (
    ts DATETIME,
    user_id BIGINT,
    page_url VARCHAR(200)
) ENGINE=OLAP
DUPLICATE KEY(ts, user_id)
DISTRIBUTED BY HASH(user_id) BUCKETS 10;
```

**Partition** is the logical division, usually by time range for ease of management. **Bucket** is the physical shard and determines how data is distributed across BE nodes.
```sql
-- Typical partitioning and bucketing configuration
CREATE TABLE user_behavior (
    dt DATE,
    user_id BIGINT,
    action VARCHAR(20)
) ENGINE=OLAP
PARTITION BY RANGE(dt) (
    PARTITION p202301 VALUES LESS THAN ('2023-02-01'),
    PARTITION p202302 VALUES LESS THAN ('2023-03-01')
)
DISTRIBUTED BY HASH(user_id) BUCKETS 32
PROPERTIES (
    "replication_num" = "3",
    "storage_medium" = "SSD",
    -- storage_cooldown_time takes an absolute timestamp, not a duration
    "storage_cooldown_time" = "2023-03-08 00:00:00"
);
```
Bucketing strategy recommendations:

- Choose a high-cardinality column (such as a user or order ID) as the bucket key, to avoid data skew.
- Size the bucket count so that each tablet holds roughly 1–10 GB of data.
- Do not change the bucket key lightly: it determines physical data placement and affects Colocation Join.
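The effect of these choices can be sketched with a toy model. The snippet below is illustrative only: `bucket_of` uses CRC32 as a stand-in for Doris's internal bucketing hash, and the ~5 GB target tablet size in `recommended_buckets` is a rule-of-thumb assumption, not an engine limit.

```python
import zlib

def bucket_of(key: str, num_buckets: int) -> int:
    """Toy stand-in for hash bucketing: hash the key, take it modulo
    the bucket count (the real engine uses its own hash function)."""
    return zlib.crc32(key.encode()) % num_buckets

def recommended_buckets(table_size_gb: float, target_tablet_gb: float = 5.0) -> int:
    """Bucket count so each tablet lands near the target size (rule of thumb)."""
    return max(1, round(table_size_gb / target_tablet_gb))

# A high-cardinality key such as user_id spreads rows across buckets evenly:
counts = [0] * 32
for user_id in range(100_000):
    counts[bucket_of(str(user_id), 32)] += 1
skew = max(counts) / min(counts)
print(f"skew ratio across 32 buckets: {skew:.2f}")

# A 320 GB table at ~5 GB per tablet:
print(recommended_buckets(320))   # -> 64
```

A low-cardinality key (say, a city column with 20 distinct values) would leave most of the 32 buckets empty and pile data onto a few BE nodes, which is exactly the skew the recommendations above guard against.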
Big-data components are highly sensitive to swap, so it must be disabled:

```bash
# Disable swap now, and comment out swap entries so it stays off after reboot
sudo swapoff -a
sudo sed -i '/swap/s/^/#/' /etc/fstab

# Lower vm.swappiness as an extra safeguard
echo 'vm.swappiness = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```
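A quick way to verify the result is to check `SwapTotal` in `/proc/meminfo`. The helper below is a small sketch; it parses meminfo-format text so it can be exercised against a sample string.

```python
def swap_disabled(meminfo_text: str) -> bool:
    """Return True if the meminfo snapshot reports zero swap capacity."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            kb = int(line.split()[1])   # value is reported in kB
            return kb == 0
    raise ValueError("SwapTotal not found in meminfo text")

sample = """MemTotal:       65536000 kB
SwapTotal:             0 kB
SwapFree:              0 kB"""
print(swap_disabled(sample))   # -> True

# On a live host:
#   with open("/proc/meminfo") as f:
#       print(swap_disabled(f.read()))
```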
Emergency measures when memory runs short:

- adjust `mem_limit` and `query_mem_limit` in `be.conf`;
- lower the query memory limit at runtime: `SET global_query_mem_limit = 8589934592;` (8 GB).

The prefix index is Doris's default indexing mechanism: a sparse index built on the first 36 bytes of the sort columns specified at table creation. Designing the sort-column order well can dramatically improve query performance:
```sql
-- Before (inefficient): id leads the sort key, but queries rarely filter on it
CREATE TABLE logs (
    id BIGINT,
    ts DATETIME,
    user_id BIGINT
) ENGINE=OLAP
DUPLICATE KEY(id, ts, user_id);

-- After (efficient): put high-frequency filter columns first
CREATE TABLE logs (
    ts DATETIME,
    user_id BIGINT,
    id BIGINT
) ENGINE=OLAP
DUPLICATE KEY(ts, user_id, id);
```
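Why the leading sort column matters can be sketched with a tiny sparse index: one entry per block of sorted rows, so an equality filter on the leading key only scans the matching block. This is a conceptual model (unique keys assumed, block size invented for illustration), not Doris's actual on-disk format.

```python
import bisect

BLOCK = 1000  # rows per sparse-index entry (illustrative; real granularity differs)

def build_index(sorted_keys):
    """Sparse index: keep only the first sort key of every block."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), BLOCK)]

def scan_range(index, n_rows, key):
    """Row range an equality filter on the leading sort column must scan
    (assumes unique keys, so the match lies inside a single block)."""
    b = bisect.bisect_right(index, key) - 1  # last block starting at or before key
    if b < 0:
        return (0, 0)  # key sorts before all rows: nothing to scan
    start = b * BLOCK
    return (start, min(start + BLOCK, n_rows))

# Table sorted by the filtered column: the index narrows a 100k-row scan to one block.
keys = list(range(100_000))
index = build_index(keys)
print(scan_range(index, len(keys), 42_000))   # -> (42000, 43000)
```

If the filtered column is not the leading sort column (as in the "before" table above), the sparse index gives no pruning and every block must be scanned.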
Materialized view example:

```sql
-- Create an hourly-rollup materialized view (async MV syntax, Doris 2.1+)
CREATE MATERIALIZED VIEW mv_sales_hourly
BUILD IMMEDIATE
REFRESH AUTO ON SCHEDULE EVERY 1 HOUR
DISTRIBUTED BY HASH(product) BUCKETS 10
AS SELECT
    DATE_TRUNC(dt, 'hour') AS hour,
    product,
    SUM(sales_amount) AS total_sales
FROM sales
GROUP BY 1, 2;

-- Matching queries on the base table can be transparently rewritten to the view;
-- the view can also be queried directly:
SELECT product, SUM(total_sales)
FROM mv_sales_hourly
WHERE hour BETWEEN '2023-01-01 00:00:00' AND '2023-01-01 23:59:59'
GROUP BY product;
```
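In plain terms, the view maintains the result of the grouped query so it does not have to be recomputed per query. A small Python sketch of the same hourly rollup (illustrative: naive truncate-and-sum, sample data invented):

```python
from collections import defaultdict
from datetime import datetime

def rollup_hourly(rows):
    """What the view maintains: sales_amount summed per (hour, product)."""
    agg = defaultdict(int)
    for ts, product, amount in rows:
        hour = ts.replace(minute=0, second=0, microsecond=0)  # truncate to hour
        agg[(hour, product)] += amount
    return dict(agg)

rows = [
    (datetime(2023, 1, 1, 10, 5), "book", 30),
    (datetime(2023, 1, 1, 10, 55), "book", 70),
    (datetime(2023, 1, 1, 11, 0), "book", 50),
]
for (hour, product), total in rollup_hourly(rows).items():
    print(hour, product, total)
# 2023-01-01 10:00:00 book 100
# 2023-01-01 11:00:00 book 50
```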
Join performance tiers (best to worst):

**Colocation Join**

```sql
-- Tables in the same colocation group must share the bucket key type,
-- bucket count, and replica count
CREATE TABLE orders (...) DISTRIBUTED BY HASH(user_id) BUCKETS 32
PROPERTIES ("colocate_with" = "group_user");
CREATE TABLE users (...) DISTRIBUTED BY HASH(user_id) BUCKETS 32
PROPERTIES ("colocate_with" = "group_user");

-- Joins on the bucket key use Colocation Join automatically
SELECT * FROM orders JOIN users ON orders.user_id = users.user_id;
```

**Broadcast Join**

```sql
-- Triggered automatically for small tables (roughly < 100 MB)
SELECT * FROM large_table l JOIN small_table s ON l.id = s.id;
-- Or specified manually
SELECT * FROM large_table l JOIN [broadcast] small_table s ON l.id = s.id;
```

**Shuffle Join**

```sql
SELECT * FROM table1 JOIN [shuffle] table2 ON table1.id = table2.id;
```
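The broadcast-versus-shuffle trade-off comes down to how many rows cross the network. The sketch below is a simplified cost model (row counts as a proxy for bytes; real planners weigh sizes, statistics, and memory), not the actual planner logic.

```python
def broadcast_cost(small_rows: int, num_be: int) -> int:
    """Broadcast: ship the entire small table to every BE node."""
    return small_rows * num_be

def shuffle_cost(left_rows: int, right_rows: int) -> int:
    """Shuffle: repartition both sides by the join key (each row moves once)."""
    return left_rows + right_rows

def pick_strategy(left_rows: int, right_rows: int, num_be: int) -> str:
    small = min(left_rows, right_rows)
    if broadcast_cost(small, num_be) < shuffle_cost(left_rows, right_rows):
        return "broadcast"
    return "shuffle"

# A 1B-row fact table joined to a 10k-row dimension table on 10 BE nodes:
print(pick_strategy(1_000_000_000, 10_000, 10))        # -> broadcast
# Two large tables: copying one to every node would be far worse:
print(pick_strategy(1_000_000_000, 800_000_000, 10))   # -> shuffle
```

Colocation Join beats both because the matching buckets already live on the same node, so the network term drops to zero.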
FE deployment essentials: run an odd number of Follower nodes (typically 3) and keep the metadata directory (`meta_dir`) on reliable, low-latency disks.

BE deployment essentials:

```ini
# Typical BE ports (be.conf)
brpc_port = 8060
webserver_port = 8040
heartbeat_service_port = 9050
be_port = 9060
```
Key monitoring metrics: query latency (e.g. P99), BE memory usage, compaction score, and tablet replica health.

Performance tuning parameters:

```sql
-- Session-level tuning
SET exec_mem_limit = 8589934592;               -- per-query memory limit (8 GB)
SET parallel_fragment_exec_instance_num = 8;   -- parallelism
SET enable_profile = true;                     -- enable query profiling

-- Global-level tuning
SET global_query_mem_limit = 34359738368;      -- global query memory limit (32 GB)
SET disable_colocate_join = false;             -- keep Colocation Join enabled
```
Slow-query diagnosis workflow:

```sql
SET enable_profile = true;
SELECT /*+ SET_VAR(enable_profile=true) */ * FROM ...;
```

In the resulting profile, focus on:

- **OlapScanner**: rows scanned / scan time
- **HashJoinNode**: join time / memory usage
- **ExchangeNode**: volume of shuffled data

Handling data skew:

```sql
-- List tablets and compare their data sizes to spot skew
SHOW TABLETS FROM table_name;

-- Increase the bucket count; this affects newly created partitions only
ALTER TABLE table_name MODIFY DISTRIBUTION DISTRIBUTED BY HASH(user_id) BUCKETS 64;
```
Tiered storage configuration:

```sql
CREATE TABLE logs (
    ts DATETIME,
    data VARCHAR(2000)
) ENGINE=OLAP
PARTITION BY RANGE(ts) (
    -- storage_cooldown_time takes an absolute timestamp, not a duration
    PARTITION p202301 VALUES LESS THAN ('2023-02-01') (
        "storage_medium" = "SSD",
        "storage_cooldown_time" = "2023-02-08 00:00:00"
    ),
    PARTITION p202302 VALUES LESS THAN ('2023-03-01') (
        "storage_medium" = "HDD"
    )
);
```
Automated partition maintenance:

```sql
CREATE TABLE dynamic_table (
    dt DATE,
    ...
) PARTITION BY RANGE(dt) (
    PARTITION p_prev VALUES LESS THAN ('2023-01-01'),
    PARTITION p_current VALUES LESS THAN ('2023-02-01')
)
PROPERTIES (
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "MONTH",
    "dynamic_partition.start" = "-3",   -- drop partitions older than 3 months
    "dynamic_partition.end" = "3",      -- pre-create partitions up to 3 months ahead
    "dynamic_partition.prefix" = "p_",
    "dynamic_partition.buckets" = "32"
);
```
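The partition window these properties maintain can be sketched as follows. The naming here (prefix + `yyyyMM`) is illustrative of the `MONTH` time unit; the exact names the scheduler generates may differ.

```python
from datetime import date

def month_shift(d: date, k: int) -> date:
    """First day of the month k months away from d's month."""
    m = d.year * 12 + (d.month - 1) + k
    return date(m // 12, m % 12 + 1, 1)

def dynamic_partitions(today: date, start: int, end: int, prefix: str = "p_"):
    """Partition window kept by dynamic_partition.start/.end with
    time_unit = MONTH: everything older than `start` months is dropped,
    everything up to `end` months ahead is pre-created."""
    return [prefix + month_shift(today, k).strftime("%Y%m")
            for k in range(start, end + 1)]

print(dynamic_partitions(date(2023, 2, 15), -3, 3))
# -> ['p_202211', 'p_202212', 'p_202301', 'p_202302',
#     'p_202303', 'p_202304', 'p_202305']
```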
HDFS external table example (on newer Doris releases, the Multi-Catalog feature is the recommended way to access external data sources):

```sql
CREATE EXTERNAL RESOURCE "hdfs_resource"
PROPERTIES (
    "type" = "hdfs",
    "fs.defaultFS" = "hdfs://namenode:8020",
    "hadoop.username" = "doris"
);

CREATE EXTERNAL TABLE ext_hdfs_table (
    id BIGINT,
    name VARCHAR(50)
) ENGINE=HDFS
PROPERTIES (
    "resource" = "hdfs_resource",
    "path" = "/path/to/data/",
    "format" = "parquet"
);
```
In production, our team once hit a classic performance problem: a report query suddenly degraded from 2 seconds to 2 minutes. Profile analysis showed that newly ingested data had skewed the buckets; one BE node's tablets held 5x the data of the other nodes. The fix was to redistribute the data via ALTER TABLE MODIFY DISTRIBUTION and switch the bucket key to a business field with a more uniform distribution. The case drove home that bucketing strategy needs to be re-evaluated periodically as the business grows.
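A skew check like the one that caught this incident can be automated. The sketch below flags BE nodes whose tablet data volume exceeds a threshold multiple of the cluster mean; the node names and sizes are hypothetical values modeled on the incident.

```python
def find_skew(be_tablet_gb: dict, threshold: float = 2.0) -> dict:
    """Return {node: ratio-to-mean} for BE nodes holding more than
    `threshold` times the mean tablet data volume."""
    mean = sum(be_tablet_gb.values()) / len(be_tablet_gb)
    return {be: gb / mean
            for be, gb in be_tablet_gb.items()
            if gb > threshold * mean}

# Hypothetical per-BE tablet sizes (GB); be4 holds ~5x the others' data
sizes = {"be1": 100, "be2": 110, "be3": 95, "be4": 500}
print(find_skew(sizes))   # flags be4
```

Feeding such a check with per-tablet sizes from `SHOW TABLETS` output and alerting on it turns the periodic re-evaluation described above into a routine health check.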