Spring Boot Large-File Chunked Upload and Resumable Transfer in Practice

陳子浩

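1. Pain Points of Large-File Upload and the Advantages of Chunked Upload

File upload is a near-universal requirement in web applications. Once a file grows beyond roughly 100MB, however, the traditional single-request upload starts to show serious problems:

1.1 Three Critical Flaws of Traditional Upload

Unstable network transfer: a single HTTP request stays open for a long time, so network jitter or a switch between networks can break the connection. I once saw a user upload an 800MB design file and lose everything at 95% because the signal dropped while switching cells on the subway.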

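Server memory pressure: a long-running request ties up a connection and a worker thread, and any code path that reads the whole upload into memory (for example calling getBytes() on a multi-gigabyte part) makes heap usage grow in proportion to the number of concurrent uploads. One e-commerce platform hit an OOM (out-of-memory) error for exactly this reason during a promotion, and its entire file service went down.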

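High cost of failure: a traditional upload that fails must restart from the beginning. For gigabyte-scale files sent across borders, that design is a disaster for user experience.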

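1.2 How Chunked Upload Breaks the Deadlock

Chunked upload splits a large file into many small pieces (typically 1-10MB each) and addresses the problems above through the following mechanisms:

Chunked transfer: each chunk is an independent request, which greatly reduces the impact of any single request failing
Memory optimization: the server handles one small chunk at a time, so memory usage stays bounded
Resumable upload: progress is tracked by chunk index, so an interrupted upload can continue from the last successful chunk
Parallel acceleration: the browser can upload several chunks concurrently (typically 4-6) to make full use of the bandwidth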

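Measured data: on a 100Mbps link, chunked upload of a 10GB file was 3-5x faster than the traditional approach, and memory usage dropped from 10GB to a steady 50MB or so.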

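2. Architecture of Chunked Upload

2.1 Core Flow

A complete chunked upload has three key phases:

Initialization phase:
- The frontend computes a unique fingerprint of the file (commonly MD5)
- The backend creates a temporary directory and returns an upload session ID
Chunk upload phase:
- The frontend slices the file and uploads the chunks concurrently
- The server verifies and stores each chunk
Merge phase:
- The frontend sends a merge request
- The server assembles the chunks in index order into the complete file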

graph TD
    A[Frontend] -->|1. Initialize upload| B(Backend)
    A -->|2. Upload chunks| C[Chunk storage]
    A -->|3. Merge request| B
    B -->|Merge chunks| D[Final file]
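
The three phases imply that the server knows, per session, which chunk indexes have arrived. The sketch below is one possible way to track that in memory; the class name UploadSessionState and its methods are hypothetical and not part of the article's code, and a real deployment would probably persist this state rather than keep it on the heap.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical helper: remembers which chunk indexes of one session were stored. */
final class UploadSessionState {
    private final int totalChunks;
    private final BitSet received;

    UploadSessionState(int totalChunks) {
        if (totalChunks <= 0) {
            throw new IllegalArgumentException("totalChunks must be positive");
        }
        this.totalChunks = totalChunks;
        this.received = new BitSet(totalChunks);
    }

    /** Chunk upload phase: record a stored chunk. Re-sending the same index is harmless. */
    synchronized void markReceived(int chunkIndex) {
        if (chunkIndex < 0 || chunkIndex >= totalChunks) {
            throw new IndexOutOfBoundsException("chunkIndex out of range: " + chunkIndex);
        }
        received.set(chunkIndex);
    }

    /** Resume support: the indexes the client still has to send, in ascending order. */
    synchronized List<Integer> missingChunks() {
        List<Integer> missing = new ArrayList<>();
        for (int i = received.nextClearBit(0); i < totalChunks; i = received.nextClearBit(i + 1)) {
            missing.add(i);
        }
        return missing;
    }

    /** Merge phase precondition: every chunk must be present. */
    synchronized boolean readyToMerge() {
        return received.cardinality() == totalChunks;
    }
}
```

A merge request can then be rejected early whenever readyToMerge() is false, instead of discovering a gap halfway through assembling the file.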

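2.2 Key Technical Choices

Chunking strategy:

Fixed-size chunks (5MB recommended): easy progress calculation and concurrency control
Dynamic chunks: sized according to network conditions; more complex to implement but more flexible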
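
With fixed-size chunks, the chunk count and the byte range of each chunk follow from simple arithmetic. Here is a minimal sketch of that calculation; ChunkPlanner is a hypothetical name, and the 5MB default simply mirrors the recommendation above.

```java
/** Hypothetical helper for fixed-size chunk arithmetic. */
final class ChunkPlanner {
    static final long DEFAULT_CHUNK_SIZE = 5L * 1024 * 1024;

    /** Number of chunks needed to cover fileSize bytes (ceiling division). */
    static int chunkCount(long fileSize, long chunkSize) {
        if (fileSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and chunkSize > 0");
        }
        return Math.toIntExact((fileSize + chunkSize - 1) / chunkSize);
    }

    /** Byte range {start, endExclusive} of the chunk at the given index. */
    static long[] range(long fileSize, long chunkSize, int index) {
        int count = chunkCount(fileSize, chunkSize);
        if (index < 0 || index >= count) {
            throw new IndexOutOfBoundsException("chunk index out of range: " + index);
        }
        long start = index * chunkSize;
        long end = Math.min(fileSize, start + chunkSize); // the last chunk may be shorter
        return new long[] {start, end};
    }
}
```

For a 12MB file and 5MB chunks this gives three chunks, the last one covering bytes 10MB to 12MB. The same arithmetic lets the server check that the chunk count reported by a client is plausible for the declared file size.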

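Chunk identification scheme:

Generated by the frontend: use "fileMD5_chunkIndex" as the unique identifier
Generated by the server: more secure, but requires an extra API round trip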
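
When the identifier comes from the frontend, the server should treat it as untrusted input, especially if it ends up in a file path. The following sketch formats and strictly parses the "fileMD5_chunkIndex" form; the ChunkId class is hypothetical, and it assumes the MD5 is sent as 32 lowercase hex digits.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value type for identifiers of the form "<fileMd5>_<chunkIndex>". */
final class ChunkId {
    // 32 lowercase hex digits, an underscore, then at most 9 decimal digits (fits in an int)
    private static final Pattern FORMAT = Pattern.compile("([0-9a-f]{32})_(\\d{1,9})");

    final String fileMd5;
    final int index;

    private ChunkId(String fileMd5, int index) {
        this.fileMd5 = fileMd5;
        this.index = index;
    }

    /** Builds an identifier, rejecting malformed MD5 values and negative indexes. */
    static String format(String fileMd5, int index) {
        String id = fileMd5 + "_" + index;
        parse(id); // validates both parts
        return id;
    }

    /** Parses an identifier; anything that is not exactly the expected shape is rejected. */
    static ChunkId parse(String id) {
        Matcher m = FORMAT.matcher(id);
        if (!m.matches()) {
            throw new IllegalArgumentException("Malformed chunk id: " + id);
        }
        return new ChunkId(m.group(1), Integer.parseInt(m.group(2)));
    }
}
```

Because the pattern only admits hex digits, one underscore and decimal digits, a value such as "../../etc/passwd_0" never reaches the file system layer.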

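Merge approaches compared:

Approach	Pros	Cons	Suitable for
In-memory merge	Simple to implement	High memory usage	Files <1GB
Disk I/O merge	Memory friendly	Needs temporary files	Files of 1-50GB
Cloud storage API	No local storage needed	Depends on a cloud service	Cloud-native architectures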

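3. Spring Boot Server-Side Implementation

3.1 Basic Environment Setup

Dependencies:

<!-- Required -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

<!-- File utilities -->
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.11.0</version>
</dependency>

<!-- DigestUtils and Hex used for the checksums below (version managed by Spring Boot) -->
<dependency>
    <groupId>commons-codec</groupId>
    <artifactId>commons-codec</artifactId>
</dependency>

<!-- Optional: magic-number based file type detection (used in section 7.1) -->
<dependency>
    <groupId>com.j256.simplemagic</groupId>
    <artifactId>simplemagic</artifactId>
    <version>1.17</version>
</dependency>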

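Configuration:

# Keep multipart parsing enabled (the default). Setting this to false would
# stop Spring from binding the MultipartFile parameters used below.
spring.servlet.multipart.enabled=true

# Size limits for a single chunk request (adjust to your chunk size)
spring.servlet.multipart.max-file-size=10MB
spring.servlet.multipart.max-request-size=10MB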

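3.2 Core Controller Implementation

Initialization endpoint:

// The handlers in this section are assumed to live in a @RestController mapped to "/upload"
@PostMapping("/init")
public ResponseEntity<UploadSession> initUpload(
        @RequestParam String fileName,
        @RequestParam String fileMd5,
        @RequestParam Long fileSize) {

    // fileMd5 becomes part of a directory name, so accept only 32 hex digits
    if (!fileMd5.matches("[0-9a-f]{32}")) {
        return ResponseEntity.badRequest().build();
    }

    // Skip the upload if the same file is already stored
    if (fileStorageService.exists(fileMd5)) {
        return ResponseEntity.ok(
            new UploadSession("EXISTED", null, fileMd5));
    }

    // Create the temporary chunk directory
    String sessionId = UUID.randomUUID().toString();
    Path chunkDir = Paths.get("uploads/chunks", fileMd5 + "_" + sessionId);

    try {
        Files.createDirectories(chunkDir);
        return ResponseEntity.ok(
            new UploadSession(sessionId, chunkDir.toString(), fileMd5));
    } catch (IOException e) {
        log.error("Failed to create chunk directory", e);
        return ResponseEntity.internalServerError().build();
    }
}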

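Chunk upload endpoint:

@PostMapping("/chunk")
public ResponseEntity<String> uploadChunk(
        @RequestParam MultipartFile chunk,
        @RequestParam String sessionId,
        @RequestParam String fileMd5,
        @RequestParam Integer chunkIndex,
        @RequestParam String chunkMd5) {

    // 1. Parameter validation (these values are used to build a file path)
    if (chunk.isEmpty()) {
        return ResponseEntity.badRequest().body("Empty chunk");
    }
    if (chunkIndex < 0 || !fileMd5.matches("[0-9a-f]{32}")
            || !sessionId.matches("[0-9a-fA-F-]{36}")) {
        return ResponseEntity.badRequest().body("Invalid parameters");
    }

    // 2. Chunk integrity check, streamed so the chunk is not copied onto the heap
    try (InputStream in = chunk.getInputStream()) {
        String actualMd5 = DigestUtils.md5Hex(in);
        if (!chunkMd5.equalsIgnoreCase(actualMd5)) {
            // 400 Bad Request; 416 is reserved for unsatisfiable Range headers
            return ResponseEntity.badRequest()
                    .body("Chunk checksum mismatch");
        }
    } catch (IOException e) {
        return ResponseEntity.internalServerError()
                .body("Failed to read chunk");
    }

    // 3. Store the chunk as "chunk_<index>" inside the session directory
    Path chunkPath = Paths.get("uploads/chunks",
            fileMd5 + "_" + sessionId, "chunk_" + chunkIndex);
    try {
        chunk.transferTo(chunkPath);
        return ResponseEntity.ok("Chunk uploaded");
    } catch (IOException e) {
        log.error("Failed to store chunk", e);
        return ResponseEntity.internalServerError()
                .body("Failed to save chunk");
    }
}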
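
The integrity check and the write to disk can also be done in a single pass over the data, so the chunk is read only once and never held in memory as a whole. The sketch below uses only JDK classes; StreamingMd5 and copyAndDigest are hypothetical names, not something the article's controller already contains.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: hash a stream while copying it. */
final class StreamingMd5 {

    /** Copies in to out and returns the lowercase hex MD5 of the copied bytes. */
    static String copyAndDigest(InputStream in, OutputStream out) throws IOException {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory algorithm on every Java platform
            throw new IllegalStateException(e);
        }
        try (DigestInputStream din = new DigestInputStream(in, md5)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = din.read(buffer)) != -1) {
                out.write(buffer, 0, n); // the digest is updated as a side effect of read()
            }
        }
        // %032x keeps leading zeros so the result is always 32 characters
        return String.format("%032x", new BigInteger(1, md5.digest()));
    }
}
```

One way to use it would be to copy the chunk to a temporary file, compare the returned digest with the chunkMd5 parameter, and only then move the file to its final "chunk_<index>" name.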

3.3 High-Performance File Merging

Basic merge method:

public void mergeFiles(Path outputFile, Path chunkDir) throws IOException {
    // Collect the chunk files and sort them by numeric index
    File[] chunks = chunkDir.toFile().listFiles(file ->
            file.getName().startsWith("chunk_"));
    if (chunks == null) {
        // listFiles returns null when the directory is missing or unreadable
        throw new NoSuchFileException(chunkDir.toString());
    }
    Arrays.sort(chunks, Comparator.comparingInt(f ->
            Integer.parseInt(f.getName().split("_")[1])));

    try (BufferedOutputStream out =
            new BufferedOutputStream(Files.newOutputStream(outputFile))) {

        // Append the chunks in order
        byte[] buffer = new byte[8192];
        for (File chunk : chunks) {
            try (BufferedInputStream in =
                    new BufferedInputStream(Files.newInputStream(chunk.toPath()))) {
                int bytesRead;
                while ((bytesRead = in.read(buffer)) != -1) {
                    out.write(buffer, 0, bytesRead);
                }
            }
        }
    }
}
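
A merge is a good place to verify the whole file against the fingerprint sent at initialization. The following sketch is a variant under stated assumptions, not the method above: it expects the caller to know the total chunk count, iterates by index instead of sorting a directory listing (so a missing chunk fails fast), and returns the MD5 of the merged bytes. VerifiedMerge is a hypothetical name.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: merge chunk_0 .. chunk_{n-1} and hash the result in one pass. */
final class VerifiedMerge {

    /** Returns the lowercase hex MD5 of the merged file; fails if any chunk is missing. */
    static String merge(Path chunkDir, int totalChunks, Path outputFile) throws IOException {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
        try (OutputStream out = new DigestOutputStream(Files.newOutputStream(outputFile), md5)) {
            for (int i = 0; i < totalChunks; i++) {
                // Files.copy throws NoSuchFileException when chunk i was never uploaded
                Files.copy(chunkDir.resolve("chunk_" + i), out);
            }
        } catch (IOException e) {
            Files.deleteIfExists(outputFile); // do not leave a truncated file behind
            throw e;
        }
        return String.format("%032x", new BigInteger(1, md5.digest()));
    }
}
```

The caller can compare the returned value with the fileMd5 from the init request and discard the output on mismatch.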

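Optimized approach for very large files:

public void mergeLargeFiles(Path outputFile, Path chunkDir)
        throws IOException, InterruptedException {
    File[] chunks = chunkDir.toFile().listFiles(file ->
            file.getName().startsWith("chunk_"));
    if (chunks == null) {
        throw new NoSuchFileException(chunkDir.toString());
    }
    Arrays.sort(chunks, Comparator.comparingInt(f ->
            Integer.parseInt(f.getName().split("_")[1])));

    ExecutorService executor = Executors.newFixedThreadPool(4);
    try (RandomAccessFile raf =
            new RandomAccessFile(outputFile.toFile(), "rw")) {

        // Size the output up front; transferFrom writes nothing beyond the current file size
        long totalSize = Arrays.stream(chunks).mapToLong(File::length).sum();
        raf.setLength(totalSize);
        FileChannel outChannel = raf.getChannel();

        // Parallel merge: positional writes to disjoint regions do not touch
        // the channel's own position, so one shared channel is safe
        List<Future<Void>> futures = new ArrayList<>();
        long position = 0;
        for (File chunk : chunks) {
            final long startPos = position;
            final long size = chunk.length();
            // A Callable (note the "return null") may throw checked exceptions; a Runnable may not
            Callable<Void> task = () -> {
                try (FileChannel in = FileChannel.open(chunk.toPath(), StandardOpenOption.READ)) {
                    long copied = 0;
                    while (copied < size) { // transferFrom may copy fewer bytes than requested
                        long n = outChannel.transferFrom(in, startPos + copied, size - copied);
                        if (n <= 0) {
                            throw new EOFException("Unexpected end of " + chunk);
                        }
                        copied += n;
                    }
                }
                return null;
            };
            futures.add(executor.submit(task));
            position += size;
        }

        // Wait for every chunk to finish
        for (Future<Void> future : futures) {
            try {
                future.get();
            } catch (ExecutionException e) {
                throw new IOException("Chunk merge failed", e.getCause());
            }
        }
    } finally {
        executor.shutdownNow();
    }
}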

4. Key Frontend Implementation Details

4.1 Splitting the File into Chunks

// Compute the file MD5 incrementally (uses the spark-md5 library)
function calculateFileMD5(file, chunkSize = 5 * 1024 * 1024) {
    return new Promise((resolve, reject) => {
        const chunks = Math.ceil(file.size / chunkSize);
        const spark = new SparkMD5.ArrayBuffer();
        const fileReader = new FileReader();

        let currentChunk = 0;

        fileReader.onload = function(e) {
            spark.append(e.target.result);
            currentChunk++;

            if (currentChunk < chunks) {
                loadNext();
            } else {
                resolve(spark.end());
            }
        };
        // Without this handler a read error would leave the promise pending forever
        fileReader.onerror = () => reject(fileReader.error);

        function loadNext() {
            const start = currentChunk * chunkSize;
            const end = Math.min(file.size, start + chunkSize);
            fileReader.readAsArrayBuffer(file.slice(start, end));
        }

        loadNext();
    });
}

// MD5 of a single chunk; called by UploadManager.uploadChunk in section 4.2
async function calculateChunkMD5(blob) {
    return SparkMD5.ArrayBuffer.hash(await blob.arrayBuffer());
}

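4.2 Concurrent Upload Control

class UploadManager {
    constructor(file, options = {}) {
        this.file = file;
        this.chunkSize = options.chunkSize || 5 * 1024 * 1024;
        this.maxConcurrent = options.maxConcurrent || 3;
        this.retryCount = options.retryCount ?? 3; // ?? keeps an explicit 0
        this.onProgress = options.onProgress || (() => {});
        this.sessionId = null;
        this.fileMd5 = null;
    }

    async start() {
        // 1. Compute the file fingerprint
        this.fileMd5 = await calculateFileMD5(this.file, this.chunkSize);

        // 2. Reuse the session of an interrupted attempt, or open a new one.
        //    The server issues a fresh sessionId on every /init call, so the id
        //    has to be remembered client-side for a resume to find old chunks.
        const storageKey = `upload-session:${this.fileMd5}`;
        this.sessionId = localStorage.getItem(storageKey);
        if (!this.sessionId) {
            // Sent as query parameters to match the @RequestParam handler
            const { data: session } = await axios.post('/upload/init', null, {
                params: {
                    fileName: this.file.name,
                    fileMd5: this.fileMd5,
                    fileSize: this.file.size
                }
            });
            if (session.sessionId === 'EXISTED') {
                return session; // the server already has this file
            }
            this.sessionId = session.sessionId;
            localStorage.setItem(storageKey, this.sessionId);
        }

        // 3. Fetch the chunks already uploaded (resume support).
        //    Assumes UploadProgress serializes its list as "uploadedChunks".
        const { data: progress } = await axios.get(
            `/upload/progress/${this.fileMd5}/${this.sessionId}`
        );
        const uploaded = new Set(progress.uploadedChunks || []);

        // 4. Build the queue of chunks still to be sent
        const totalChunks = Math.ceil(this.file.size / this.chunkSize);
        const chunks = Array.from({ length: totalChunks }, (_, i) => i)
            .filter(i => !uploaded.has(i));

        // 5. Upload with bounded concurrency (p-queue) and per-chunk retries
        const queue = new PQueue({ concurrency: this.maxConcurrent });
        const results = await queue.addAll(chunks.map(chunkIndex =>
            async () => {
                let retry = 0;
                while (retry <= this.retryCount) {
                    try {
                        await this.uploadChunk(chunkIndex);
                        return { success: true, chunkIndex };
                    } catch (err) {
                        if (++retry > this.retryCount) {
                            return { success: false, chunkIndex, error: err };
                        }
                    }
                }
            }
        ));

        // 6. Check the results, then merge
        const failed = results.filter(r => !r.success);
        if (failed.length > 0) {
            throw new Error(`${failed.length} chunk(s) failed to upload`);
        }

        const merged = await this.merge();
        localStorage.removeItem(storageKey);
        return merged;
    }

    async uploadChunk(index) {
        const start = index * this.chunkSize;
        const end = Math.min(this.file.size, start + this.chunkSize);
        const chunkBlob = this.file.slice(start, end);

        const formData = new FormData();
        formData.append('chunk', chunkBlob, `chunk_${index}`);
        formData.append('chunkIndex', index);
        formData.append('sessionId', this.sessionId);
        formData.append('fileMd5', this.fileMd5);

        // Compute the chunk MD5
        const chunkMd5 = await calculateChunkMD5(chunkBlob);
        formData.append('chunkMd5', chunkMd5);

        // No explicit Content-Type: the browser adds the multipart boundary itself
        return axios.post('/upload/chunk', formData, {
            onUploadProgress: progress => {
                const percent = Math.round(
                    (progress.loaded / progress.total) * 100
                );
                this.onProgress({
                    type: 'chunk',
                    index,
                    percent,
                    loaded: progress.loaded,
                    total: progress.total
                });
            }
        });
    }

    // Assumes a POST /upload/merge endpoint (not shown in this article) that
    // runs the server-side merge for this session
    merge() {
        return axios.post('/upload/merge', null, {
            params: { fileMd5: this.fileMd5, sessionId: this.sessionId }
        });
    }
}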

5. Enterprise-Grade Enhancements

5.1 Resumable Upload

Server-side progress endpoint:

@GetMapping("/progress/{fileMd5}/{sessionId}")
public ResponseEntity<UploadProgress> getUploadProgress(
        @PathVariable String fileMd5,
        @PathVariable String sessionId) {

    Path chunkDir = Paths.get("uploads/chunks", fileMd5 + "_" + sessionId);
    if (!Files.exists(chunkDir)) {
        return ResponseEntity.ok(new UploadProgress(0, new ArrayList<>()));
    }

    // Files.list keeps a directory handle open, so close it with try-with-resources
    try (Stream<Path> files = Files.list(chunkDir)) {
        // Chunks are stored as "chunk_<index>" by the upload endpoint
        List<Integer> uploadedChunks = files
                .map(p -> p.getFileName().toString())
                .filter(name -> name.matches("chunk_\\d+"))
                .map(name -> Integer.parseInt(name.substring("chunk_".length())))
                .sorted()
                .collect(Collectors.toList());

        // Total number of bytes received so far
        long totalSize = 0;
        for (int i : uploadedChunks) {
            totalSize += Files.size(chunkDir.resolve("chunk_" + i));
        }

        return ResponseEntity.ok(
                new UploadProgress(totalSize, uploadedChunks));
    } catch (IOException e) {
        return ResponseEntity.internalServerError().build();
    }
}

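5.2 Chunk Security Verification

HMAC signature scheme:

// Server-side verification
@PostMapping("/chunk")
public ResponseEntity<?> uploadChunk(
        @RequestParam MultipartFile chunk,
        @RequestParam String sign,
        @RequestParam String sessionId,
        @RequestParam Integer chunkIndex) {

    try {
        // Verify with the application secret
        String secret = System.getenv("UPLOAD_SECRET");

        // Sign the raw chunk bytes followed by sessionId and chunkIndex.
        // (Concatenating a byte[] with a String would only sign the array's identity string.)
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(
                secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        mac.update(chunk.getBytes());
        mac.update((sessionId + chunkIndex).getBytes(StandardCharsets.UTF_8));
        String serverSign = Hex.encodeHexString(mac.doFinal()); // commons-codec

        // Constant-time comparison avoids leaking how many characters matched
        if (!MessageDigest.isEqual(
                serverSign.getBytes(StandardCharsets.UTF_8),
                sign.getBytes(StandardCharsets.UTF_8))) {
            log.warn("Chunk signature verification failed: {}", chunkIndex);
            return ResponseEntity.status(403).body("Invalid signature");
        }

        // ...process the chunk
        return ResponseEntity.ok().build();
    } catch (Exception e) {
        return ResponseEntity.internalServerError().build();
    }
}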

5.3 Cloud Storage Integration

MinIO configuration example:

@Configuration
public class MinioConfig {

    @Value("${minio.endpoint}")
    private String endpoint;

    @Value("${minio.access-key}")
    private String accessKey;

    @Value("${minio.secret-key}")
    private String secretKey;

    @Bean
    public MinioClient minioClient() {
        MinioClient client = MinioClient.builder()
                .endpoint(endpoint)
                .credentials(accessKey, secretKey)
                .build();
        // Important: raise the timeouts for large uploads. They are set on the
        // built client as connect, write and read timeouts in milliseconds
        // (the 10-minute read timeout is an assumed value).
        client.setTimeout(
                Duration.ofMinutes(3).toMillis(),
                Duration.ofMinutes(10).toMillis(),
                Duration.ofMinutes(10).toMillis());
        return client;
    }
}

@Slf4j
@Service
@RequiredArgsConstructor
public class MinioUploadService {

    private final MinioClient minioClient;

    public void uploadChunk(String bucket, String objectName,
            InputStream inputStream, long size) throws Exception {

        minioClient.putObject(
            PutObjectArgs.builder()
                .bucket(bucket)
                .object(objectName)
                .stream(inputStream, size, -1)  // -1 lets the SDK choose the part size
                .contentType("application/octet-stream")
                .build());
    }

    public String composeFile(String bucket, List<String> chunkNames,
            String targetObject) throws Exception {

        // Build the list of source objects.
        // Note: every source except the last must be at least 5 MiB.
        List<ComposeSource> sources = chunkNames.stream()
                .map(name -> ComposeSource.builder()
                        .bucket(bucket)
                        .object(name)
                        .build())
                .collect(Collectors.toList());

        // Compose the final object on the server side
        minioClient.composeObject(
            ComposeObjectArgs.builder()
                .bucket(bucket)
                .object(targetObject)
                .sources(sources)
                .build());

        // Clean up the temporary chunk objects
        chunkNames.forEach(name -> {
            try {
                minioClient.removeObject(
                    RemoveObjectArgs.builder()
                        .bucket(bucket)
                        .object(name)
                        .build());
            } catch (Exception e) {
                log.error("Failed to delete chunk: {}", name, e);
            }
        });

        return targetObject;
    }
}

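6. Performance Tuning and Monitoring

6.1 Server-Side Performance Tuning

Tomcat configuration:

# Longer connection timeout (for slow uploads)
server.tomcat.connection-timeout=10m

# Maximum worker threads (adjust to the number of CPU cores)
server.tomcat.threads.max=200
server.tomcat.threads.min-spare=20

# Disable HTTP response compression (not needed for uploads)
server.compression.enabled=false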

JVM parameter tuning:

# JVM settings for a file upload service
-XX:+UseG1GC 
-XX:MaxGCPauseMillis=200 
-Xms512m 
-Xmx2g 
-XX:MaxDirectMemorySize=1g

6.2 Collecting Monitoring Metrics

Prometheus monitoring example:

@RestController
public class UploadMetricsController {

    // static: a metric name can be registered with the default registry only once
    private static final Counter uploadCounter = Counter.build()
            .name("upload_requests_total")
            .help("Total upload requests")
            .register();

    private static final Summary uploadSizeSummary = Summary.build()
            .name("upload_size_bytes")
            .help("Upload file size distribution")
            .quantile(0.5, 0.05)
            .quantile(0.9, 0.01)
            .register();

    @PostMapping("/upload")
    public ResponseEntity<?> handleUpload(
            @RequestParam MultipartFile file) {

        // Record the metrics
        uploadCounter.inc();
        uploadSizeSummary.observe(file.getSize());

        // ...handle the upload
        return ResponseEntity.ok().build();
    }
}

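6.3 Load Test Data

The chunked upload scheme was tested with JMeter (10 concurrent users):

File size	Chunk size	Average time	Throughput	Error rate
100MB	1MB	12.3s	8.1MB/s	0%
1GB	5MB	98.7s	10.4MB/s	0.2%
10GB	10MB	1024s	10.0MB/s	1.5%

Key findings:

A chunk size of 5-10MB performs best in most scenarios
3-5 concurrent uploads are recommended (depending on client bandwidth)
Server memory usage stays within 200MB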

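7. Production Considerations

7.1 Security Measures

1. File type whitelist:

private static final Set<String> ALLOWED_TYPES = Set.of(
        "image/jpeg", "image/png", "application/pdf");

public boolean isAllowedType(MultipartFile file) {
    // Detect the real type from the file's magic number (simplemagic)
    try (InputStream in = file.getInputStream()) {
        ContentInfo info = new ContentInfoUtil().findMatch(in);
        // findMatch returns null when no known signature matches
        return info != null && ALLOWED_TYPES.contains(info.getMimeType());
    } catch (IOException e) {
        return false;
    }
}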
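
To show what magic-number detection does under the hood, here is a JDK-only sketch that recognizes the three whitelisted types from the first bytes of a file. MagicNumbers is a hypothetical helper that covers only these signatures; it is an illustration, not a replacement for a full detection library.

```java
import java.util.Arrays;
import java.util.Optional;

/** Hypothetical helper: maps a few well-known file signatures to MIME types. */
final class MagicNumbers {
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};

    /** Returns the MIME type whose signature prefixes header, or empty if none matches. */
    static Optional<String> detect(byte[] header) {
        if (startsWith(header, JPEG)) return Optional.of("image/jpeg");
        if (startsWith(header, PNG)) return Optional.of("image/png");
        if (startsWith(header, PDF)) return Optional.of("application/pdf");
        return Optional.empty();
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(data, 0, prefix.length, prefix, 0, prefix.length);
    }
}
```

In a chunked upload only the first chunk carries the file header, so a check like this would run on chunk 0 or on the merged file rather than on every chunk.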

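2. Virus scanning integration:

public void scanForVirus(Path file) throws VirusDetectedException {
    try {
        // Discard the scanner output so a full pipe buffer cannot block the process
        Process process = new ProcessBuilder(
                "clamscan", "--no-summary", file.toString())
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();

        // clamscan exit codes: 0 = clean, 1 = virus found, 2 = scan error
        int exitCode = process.waitFor();
        if (exitCode == 1) {
            throw new VirusDetectedException(
                    "File contains malicious content; upload rejected");
        }
        if (exitCode != 0) {
            throw new VirusDetectedException(
                    "Scan failed with exit code " + exitCode);
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        throw new VirusDetectedException("Scan interrupted: " + e.getMessage());
    } catch (IOException e) {
        throw new VirusDetectedException("Scan failed: " + e.getMessage());
    }
}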

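7.2 Storage Recommendations

Chunk storage options compared:

Storage	Pros	Cons	Suitable for
Local disk	Simple to deploy	Poor scalability	Small applications
Shared NFS	Centralized management	Single point of failure	Small to medium clusters
Object storage	Virtually unlimited scaling	Higher cost	Large systems
Distributed FS	High performance	Complex to maintain	Very large scale

Recommended directory layout:

/uploads/
   ├── chunks/          # temporary chunk directories
   │   ├── {fileMd5}_{sessionId}/
   │       ├── chunk_0
   │       └── ...
   ├── final/           # merged files
   └── trash/           # files awaiting cleanup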
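
Abandoned sessions leave chunk directories behind, so some periodic cleanup is needed. Below is a minimal sketch that deletes session directories not modified within a given age. StaleChunkCleaner is a hypothetical helper, and it assumes the file system updates a directory's modification time whenever a chunk is added to it.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: removes chunk session directories that have gone stale. */
final class StaleChunkCleaner {

    /** Deletes session directories under chunksRoot older than maxAge; returns how many were removed. */
    static int purge(Path chunksRoot, Duration maxAge, Instant now) throws IOException {
        Instant cutoff = now.minus(maxAge);
        // Collect first, so nothing is deleted while the directory is being iterated
        List<Path> stale = new ArrayList<>();
        try (DirectoryStream<Path> sessions = Files.newDirectoryStream(chunksRoot)) {
            for (Path session : sessions) {
                if (Files.isDirectory(session)
                        && Files.getLastModifiedTime(session).toInstant().isBefore(cutoff)) {
                    stale.add(session);
                }
            }
        }
        for (Path session : stale) {
            deleteRecursively(session);
        }
        return stale.size();
    }

    private static void deleteRecursively(Path dir) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directory
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

A scheduled job could call purge on the chunks directory, for example with a 24-hour maxAge; moving directories to trash/ first instead of deleting them directly would be an equally reasonable design.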

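7.3 Troubleshooting Common Problems

Problem 1: merge fails after the chunks are uploaded

Symptom: the merge reports a "file corrupted" error

Troubleshooting steps:

Check that chunk MD5 verification is enabled
Confirm that the chunk sorting logic is correct
Verify that there is enough disk space
Check file permissions

Problem 2: upload speed drops suddenly

Possible causes:

Disk I/O bottleneck on the server
Network bandwidth consumed by other applications
High CPU load on the client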

Solution:

// Limit the I/O rate while merging large files.
// getSortedChunks stands for the list-and-sort logic from section 3.3;
// ThrottledInputStream is not a JDK class and stands for any InputStream
// wrapper that caps read throughput at bytesPerSecond.
public void rateLimitedMerge(Path output, Path chunkDir, 
        long bytesPerSecond) throws IOException {
    
    try (OutputStream out = new BufferedOutputStream(
            new FileOutputStream(output.toFile()))) {
        
        File[] chunks = getSortedChunks(chunkDir);
        byte[] buffer = new byte[8192];
        
        for (File chunk : chunks) {
            try (InputStream in = new ThrottledInputStream(
                    new FileInputStream(chunk), bytesPerSecond)) {
                
                int bytesRead;
                while ((bytesRead = in.read(buffer)) != -1) {
                    out.write(buffer, 0, bytesRead);
                }
            }
        }
    }
}
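
The ThrottledInputStream used above is not part of the JDK. One possible implementation is sketched here as an assumption about what such a wrapper could look like: it tracks the total bytes read and sleeps whenever the stream is ahead of the configured average rate.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InterruptedIOException;

/** Hypothetical throttling wrapper: caps the average read rate of the wrapped stream. */
final class ThrottledInputStream extends FilterInputStream {
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long totalRead;

    ThrottledInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) throttle(1);
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) throttle(n);
        return n;
    }

    /** Sleeps until the elapsed time matches what totalRead bytes should take at the target rate. */
    private void throttle(int justRead) throws IOException {
        totalRead += justRead;
        // double arithmetic avoids long overflow for multi-gigabyte streams
        long expectedNanos = (long) (totalRead * (1e9 / bytesPerSecond));
        long sleepNanos = expectedNanos - (System.nanoTime() - startNanos);
        if (sleepNanos > 0) {
            try {
                Thread.sleep(sleepNanos / 1_000_000, (int) (sleepNanos % 1_000_000));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("Interrupted while throttling");
            }
        }
    }
}
```

Because the limit is an average since construction, a stream that sat idle for a while can burst briefly afterwards; a token bucket would give tighter control at the cost of more code.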

Problem 3: the chunk upload endpoint times out

Adjustments:

# Spring Boot configuration
spring.mvc.async.request-timeout=30m
spring.servlet.multipart.max-request-size=20MB

8. Further Thoughts and Future Directions

8.1 Advanced Optimizations for Chunked Upload

Adaptive chunking strategy:

// Adjust the chunk size dynamically according to network conditions
function getDynamicChunkSize(networkSpeed) {
    const baseSize = 1 * 1024 * 1024; // 1MB
    const maxSize = 10 * 1024 * 1024; // 10MB
    
    // networkSpeed is measured in MB/s
    const idealSize = Math.min(
        maxSize, 
        baseSize * Math.ceil(networkSpeed / 2)
    );
    
    return Math.max(baseSize, idealSize);
}

P2P chunk transfer:

Exchange chunks between clients using WebRTC
Particularly suited to distributing large files inside an enterprise

8.2 Integrating with Cloud-Native Architecture

Kubernetes deployment recommendation:

# Excerpt from a StatefulSet configuration
volumeClaimTemplates:
- metadata:
    name: upload-volume
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: "fast-ssd"
    resources:
      requests:
        storage: 100Gi

Serverless architecture:

Frontend → API Gateway → Lambda (receives chunks) → S3 → 
triggers merge Lambda → produces the final file

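8.3 Future Technical Directions

QUIC support: use multiplexing to make chunk transfer more efficient
WebTransport: enable lower-level, streaming chunk transfer
AI-driven predictive upload: predict the best chunking strategy from historical data

After adopting chunked upload in a real project, our system handled terabytes of design-file uploads per day, server resource consumption fell by 70%, and user complaints dropped by 90%. The approach is especially well suited to enterprise applications that deal with large media files, dataset backups, or engineering drawings.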