Linux File Metadata Queries: stat() and fstat() in Detail

jean luo

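1. File Status Query Basics: An Overview of stat() and fstat()

In Unix/Linux systems programming, retrieving file metadata is a basic requirement of almost any file operation. The stat() and fstat() system calls work like a "health report" for a file: without reading the file's contents, they return its type, permissions, size, timestamps, and other key attributes. Both are declared in the <sys/stat.h> header, but they serve different situations:

stat() looks a file up by path, which suits files whose path is known but which have not been opened
fstat() looks a file up by file descriptor, which suits files that are already open

Both calls store their result in a struct stat, which works like a form with a dozen or so fields describing the file. In real projects I use them constantly for existence checks, permission validation, size monitoring, and similar tasks. A backup tool that needs a file's last modification time, or a web server that checks the permissions of a static file, both depend on these basic APIs.

Note: lstat() belongs to the same family but handles symbolic links differently; this article focuses on the core usage of stat() and fstat().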

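2. Core Data Structure and Function Prototypes

2.1 The struct stat Structure in Detail

This structure is the container for everything the calls return. The exact fields and their order vary slightly between systems and versions, but the key fields below are common to them (using Linux with a 5.x kernel as the example):

struct stat {
    dev_t     st_dev;     /* ID of the device containing the file */
    ino_t     st_ino;     /* inode number */
    mode_t    st_mode;    /* file type and permissions */
    nlink_t   st_nlink;   /* number of hard links */
    uid_t     st_uid;     /* owner's user ID */
    gid_t     st_gid;     /* owner's group ID */
    dev_t     st_rdev;    /* device ID (for special files) */
    off_t     st_size;    /* file size in bytes */
    blksize_t st_blksize; /* preferred block size for filesystem I/O */
    blkcnt_t  st_blocks;  /* number of 512-byte blocks allocated */
    struct timespec st_atim;  /* time of last access */
    struct timespec st_mtim;  /* time of last modification */
    struct timespec st_ctim;  /* time of last status change */
};

The timestamp fields are struct timespec values with nanosecond resolution, an improvement over the second-resolution time_t fields of older systems. In real projects I often use st_size to compute file transfer progress and st_mtim to tell whether a file has been modified.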
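To make the relationship between st_size and st_blocks concrete, here is a minimal sketch. The helper names allocated_bytes and looks_sparse are made up for illustration and are not part of any library. The sparse check is only a heuristic: filesystems that compress data or store small files inline can also report fewer allocated bytes than the logical size.

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: bytes actually allocated for `path`, or -1 on error.
 * st_blocks is counted in 512-byte units, independent of st_blksize. */
long long allocated_bytes(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long long)sb.st_blocks * 512;
}

/* Hypothetical helper: 1 if fewer bytes are allocated than the logical
 * size (a hint that the file may be sparse), 0 if not, -1 on error. */
int looks_sparse(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1)
        return -1;
    return (long long)sb.st_blocks * 512 < (long long)sb.st_size;
}
```

Comparing allocated_bytes(path) with st_size is roughly what distinguishes the "disk usage" reported by du from the "size" reported by ls -l.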

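2.2 Function Prototypes and Parameters

#include <sys/stat.h>

int stat(const char *pathname, struct stat *statbuf);
int fstat(int fd, struct stat *statbuf);

pathname: the file path as a string, either relative or absolute
fd: an open file descriptor (obtained from open() or a similar call)
statbuf: output parameter that receives the file information

Return value: 0 on success; on failure, -1 with errno set. In my own code I always check the return value, because failures such as insufficient permissions or a wrong path are common.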
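The two prototypes can be exercised side by side. The sketch below is only an illustration, and the function name same_file_via_stat_and_fstat is my own invention: it calls stat() on a path and fstat() on a descriptor opened from the same path, then compares the (st_dev, st_ino) pair, which identifies a file on a running system.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if stat(path) and fstat(fd) describe the
 * same file, 0 if they differ, -1 if any call fails. */
int same_file_via_stat_and_fstat(const char *path)
{
    struct stat by_path, by_fd;
    int fd, same;

    if (stat(path, &by_path) == -1) {
        perror("stat");
        return -1;
    }

    fd = open(path, O_RDONLY);
    if (fd == -1) {
        perror("open");
        return -1;
    }

    if (fstat(fd, &by_fd) == -1) {
        perror("fstat");
        close(fd);
        return -1;
    }

    /* Device ID plus inode number uniquely identify the file. */
    same = by_path.st_dev == by_fd.st_dev && by_path.st_ino == by_fd.st_ino;
    close(fd);
    return same;
}
```

A result of 0 would mean the path was re-pointed at a different file between the two calls, which is the situation section 7.1 is concerned with.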

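3. Usage Scenarios and Practical Techniques

3.1 Detecting File Types the Right Way

The st_mode field encodes both the file type and the permissions as bit fields. To test the file type, use the following macros:

S_ISREG(m)  /* regular file */
S_ISDIR(m)  /* directory */
S_ISCHR(m)  /* character device */
S_ISBLK(m)  /* block device */
S_ISFIFO(m) /* pipe or FIFO */
S_ISLNK(m)  /* symbolic link (stat() follows links, so use lstat() to see this) */
S_ISSOCK(m) /* socket */

A typical example: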

struct stat sb;
if (stat("example.txt", &sb) == -1) {
    perror("stat");
    exit(EXIT_FAILURE);
}

printf("File type: ");
switch (sb.st_mode & S_IFMT) {
    case S_IFREG:  printf("regular file\n"); break;
    case S_IFDIR:  printf("directory\n");    break;
    default:       printf("other\n");        break;
}

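Pitfall: do not compare the raw st_mode value against a type constant directly. Mask it with S_IFMT first or use the S_ISxxx macros, because st_mode also carries the permission bits. I once hit a bug where a direct comparison misclassified device files.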
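As an illustration of the macros and of why masking matters, the following sketch renders st_mode in the style of the first column of ls -l. The function names file_type_char and mode_to_string are hypothetical, and the sketch deliberately ignores the setuid, setgid, and sticky bits.

```c
#define _XOPEN_SOURCE 700   /* expose S_ISLNK, S_ISSOCK, S_IFMT under strict modes */
#include <sys/stat.h>

/* Hypothetical helper: one-character type code in the style of `ls -l`. */
char file_type_char(mode_t m)
{
    if (S_ISREG(m))  return '-';
    if (S_ISDIR(m))  return 'd';
    if (S_ISCHR(m))  return 'c';
    if (S_ISBLK(m))  return 'b';
    if (S_ISFIFO(m)) return 'p';
    if (S_ISLNK(m))  return 'l';
    if (S_ISSOCK(m)) return 's';
    return '?';
}

/* Hypothetical helper: fills `out` with e.g. "drwxr-xr-x".
 * Special bits (setuid, setgid, sticky) are not rendered. */
void mode_to_string(mode_t m, char out[11])
{
    static const char rwx[] = "rwxrwxrwx";
    int i;

    out[0] = file_type_char(m);
    for (i = 0; i < 9; i++) {
        /* Permission bits run from 0400 (owner read) down to 0001 (other execute). */
        out[i + 1] = (m & (1u << (8 - i))) ? rwx[i] : '-';
    }
    out[10] = '\0';
}
```

Passing sb.st_mode from a successful stat() call to mode_to_string gives a compact string for logging; the type character comes from the macros, never from comparing the raw value.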

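3.2 Advanced Timestamp Handling

Modern systems use struct timespec to provide nanosecond resolution:

struct timespec {
    time_t tv_sec;   /* seconds */
    long   tv_nsec;  /* nanoseconds */
};

Converting a timestamp for display:

char timestr[100];
struct tm *tm_info;

tm_info = localtime(&sb.st_mtim.tv_sec);
strftime(timestr, sizeof(timestr), "%Y-%m-%d %H:%M:%S", tm_info);
printf("Last modified: %s.%09ld\n", timestr, sb.st_mtim.tv_nsec);

When developing a file synchronization tool, I compare both st_mtim.tv_sec and st_mtim.tv_nsec of the two files to decide precisely which version is newer.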
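The comparison described above can be sketched as follows. The names timespec_cmp and is_newer are illustrative assumptions, not an existing API; seconds are compared first, and nanoseconds only break ties.

```c
#define _XOPEN_SOURCE 700   /* expose st_mtim under strict compiler modes */
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: returns <0, 0, or >0 if a is earlier than,
 * equal to, or later than b. */
int timespec_cmp(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec ? -1 : 1;
    if (a->tv_nsec != b->tv_nsec)
        return a->tv_nsec < b->tv_nsec ? -1 : 1;
    return 0;
}

/* Hypothetical helper: 1 if path_a was modified after path_b,
 * 0 if not, -1 if either stat() call fails. */
int is_newer(const char *path_a, const char *path_b)
{
    struct stat sa, sb;

    if (stat(path_a, &sa) == -1 || stat(path_b, &sb) == -1)
        return -1;
    return timespec_cmp(&sa.st_mtim, &sb.st_mtim) > 0;
}
```

Keep in mind that the effective resolution depends on the filesystem; some store timestamps more coarsely than a nanosecond, so equal values do not prove the files are identical.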

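4. Performance Optimization and Error Handling

4.1 Avoiding Unnecessary stat Calls

Every stat call crosses from user space into the kernel, so in high-frequency code paths it can become a performance bottleneck. Optimizations that have worked for me:

Cache results for static files: if you know a file will not change, cache its stat result
Batch the work: collect all the paths that need checking first, then process them in one pass
Use inotify: watch directories with the inotify API instead of polling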
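The caching idea can be sketched with a single-entry cache. Everything here is an assumption made for illustration: the struct stat_cache_entry type, the cached_stat function, the 2-second lifetime, and the 256-byte path limit. A real cache would need more entries and an invalidation strategy suited to the workload.

```c
#include <string.h>
#include <sys/stat.h>
#include <time.h>

#define CACHE_TTL_SECONDS 2     /* assumed lifetime of a cached result */
#define CACHE_PATH_MAX    256   /* assumed maximum cacheable path length */

/* Hypothetical single-entry cache; zero-initialize before first use. */
struct stat_cache_entry {
    char          path[CACHE_PATH_MAX];
    struct stat   sb;
    time_t        fetched_at;
    int           valid;
    unsigned long misses;       /* number of times stat() was really called */
};

/* Like stat(), but reuses the previous result for the same path while it
 * is younger than CACHE_TTL_SECONDS. Returns 0 on success, -1 on error. */
int cached_stat(struct stat_cache_entry *e, const char *path, struct stat *out)
{
    time_t now = time(NULL);

    if (e->valid && strcmp(e->path, path) == 0 &&
        now - e->fetched_at < CACHE_TTL_SECONDS) {
        *out = e->sb;           /* cache hit: no system call */
        return 0;
    }

    e->misses++;
    e->valid = 0;
    if (strlen(path) >= sizeof(e->path))
        return stat(path, out); /* too long to cache, fall through to stat() */
    if (stat(path, &e->sb) == -1)
        return -1;

    strcpy(e->path, path);      /* safe: length was checked above */
    e->fetched_at = now;
    e->valid = 1;
    *out = e->sb;
    return 0;
}
```

The trade-off is staleness: within the lifetime window the caller can see outdated metadata, so this only fits files that are known to change rarely.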

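4.2 Common Error Codes and How to Handle Them

Error code	Meaning	Typical handling
ENOENT	A component of the path does not exist	Check the path spelling, including every parent directory
EACCES	Permission denied	Check search (execute) permission on the directories in the path, or run with more privileges
ELOOP	Too many symbolic links (possibly a loop)	Check the chain of symbolic links
ENOMEM	Out of (kernel) memory	Reduce concurrency or optimize the program

A robust error-handling example: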

if (stat(path, &sb) == -1) {
    switch (errno) {
        case ENOENT:
            fprintf(stderr, "%s does not exist\n", path);
            break;
        case EACCES:
            fprintf(stderr, "No permission to access %s\n", path);
            break;
        default:
            perror("stat error");
    }
    exit(EXIT_FAILURE);
}

5. Real-World Case: Monitoring a File for Changes

Below is the core snippet I used in a log-monitoring project, showing stat() in practical use:

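#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

#define MONITOR_INTERVAL 5  /* seconds between checks */

void monitor_file(const char *path) {
    struct stat prev_sb, curr_sb;
    time_t last_mod = 0;
    off_t last_size = 0;

    /* Record the initial state; give up if the file cannot be examined. */
    if (stat(path, &prev_sb) == -1) {
        perror("Initial stat failed");
        return;
    }

    last_mod = prev_sb.st_mtim.tv_sec;
    last_size = prev_sb.st_size;

    while (1) {
        sleep(MONITOR_INTERVAL);

        /* A failed check should not stop monitoring; try again next round. */
        if (stat(path, &curr_sb) == -1) {
            perror("Monitor stat failed");
            continue;
        }

        if (curr_sb.st_mtim.tv_sec != last_mod ||
            curr_sb.st_size != last_size) {
            /* ctime() returns 24 characters plus '\n'; print only the 24.
             * off_t has no portable printf format, so cast to long long. */
            printf("[%.24s] File changed! Size: %lld -> %lld\n",
                   ctime(&curr_sb.st_mtim.tv_sec),
                   (long long)last_size, (long long)curr_sb.st_size);

            last_mod = curr_sb.st_mtim.tv_sec;
            last_size = curr_sb.st_size;
        }
    }
}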

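A few design points in this implementation are worth noting:

It compares only the second-resolution timestamp and the file size, trading precision for simplicity; a change within the same second that leaves the size unchanged goes unnoticed
After an error it keeps monitoring instead of exiting
It polls at a fixed interval rather than using inotify, which keeps it portable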

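6. Cross-Platform Compatibility

The definition of struct stat differs between UNIX-like systems:

6.1 Main Platform Differences

Field / Platform	Linux	macOS	FreeBSD
Time resolution	nanoseconds	nanoseconds	nanoseconds
Time field	st_atim	st_atimespec	st_atimespec
Block size	st_blksize	st_blksize	st_blksize
Device number	st_dev	st_dev	st_dev

6.2 Compatibility Wrapper Macros

In cross-platform projects I use the following macros to access the time fields uniformly: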

#if defined(__APPLE__)
#define ST_ATIME(st) ((st).st_atimespec.tv_sec)
#define ST_MTIME(st) ((st).st_mtimespec.tv_sec)
#elif defined(__linux__)
#define ST_ATIME(st) ((st).st_atim.tv_sec)
#define ST_MTIME(st) ((st).st_mtim.tv_sec)
#else
#define ST_ATIME(st) ((st).st_atime)
#define ST_MTIME(st) ((st).st_mtime)
#endif

Usage example:

time_t mtime = ST_MTIME(sb);

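7. Security Considerations and Edge Cases

7.1 TOCTOU Race Conditions

Time-of-Check to Time-of-Use (TOCTOU) is a common security problem in stat-based code. A typical sequence:

The program checks the file's permissions and the check passes
An attacker replaces the file (for example, with a symbolic link pointing to a sensitive file)
The program then uses the file

Defenses:

For critical files, use open() followed by fstat(), so that the check and the use refer to the same open file
Check whether the file's inode and device ID have changed
In privileged programs, open with the O_NOFOLLOW flag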
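The open() plus fstat() defense can be sketched as below. The function name open_regular_file is hypothetical, and the choice to accept only regular files is an assumption made for the example. Because the check runs on the descriptor rather than on the path, swapping the path afterwards cannot change which file the program reads.

```c
#define _XOPEN_SOURCE 700   /* expose O_NOFOLLOW under strict compiler modes */
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: opens `path` read-only, refusing a symbolic link as
 * the final component, and verifies through the descriptor that it is a
 * regular file. Returns the descriptor, or -1 on any failure. */
int open_regular_file(const char *path)
{
    struct stat sb;
    /* O_NONBLOCK keeps open() from blocking if the path names a FIFO. */
    int fd = open(path, O_RDONLY | O_NOFOLLOW | O_NONBLOCK);

    if (fd == -1)
        return -1;

    /* The check uses the already-open file, not the path. */
    if (fstat(fd, &sb) == -1 || !S_ISREG(sb.st_mode)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Note that O_NOFOLLOW only guards the last path component; symbolic links in parent directories are still followed.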

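7.2 Handling Symbolic Links

stat() follows symbolic links, which may not be what you expect. In security-sensitive code you should:

Call lstat() first to check whether the path is a link
Either handle the link target explicitly or reject links

struct stat sb;
if (lstat(path, &sb) == -1) {
    /* handle the error */
}

if (S_ISLNK(sb.st_mode)) {
    /* handle the symbolic link case */
} else {
    if (stat(path, &sb) == -1) {
        /* handle the error */
    }
    /* handle the non-link case */
}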

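8. Debugging Techniques and Tools

8.1 Tracing Calls with strace

strace -e trace=file your_program

This shows the system calls that take a path argument, including stat() (which may appear as newfstatat on recent systems), together with their arguments and return values. fstat() takes a descriptor rather than a path, so add the desc class (-e trace=file,desc) to see it as well.

8.2 Inspecting the Contents of struct stat

The debug print function I usually use: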

#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

void print_stat(const struct stat *sb) {
    /* The stat types have no portable printf formats, so cast explicitly. */
    printf("Device: %lu\n", (unsigned long)sb->st_dev);
    printf("Inode: %lu\n", (unsigned long)sb->st_ino);
    printf("Mode: %o\n", (unsigned int)sb->st_mode);
    printf("Links: %lu\n", (unsigned long)sb->st_nlink);
    printf("Size: %lld\n", (long long)sb->st_size);
    printf("Blocks: %lld\n", (long long)sb->st_blocks);

    /* Format the modification time, then append the nanoseconds. */
    char timestr[100];
    strftime(timestr, sizeof(timestr), "%F %T",
            localtime(&sb->st_mtim.tv_sec));
    printf("Modify: %s.%09ld\n", timestr, sb->st_mtim.tv_nsec);
}

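8.3 Testing Edge Cases

I recommend testing the following special cases:

Very large files (over 2 GB; 32-bit builds need _FILE_OFFSET_BITS=64, otherwise stat() fails with EOVERFLOW)
Special device files
Files with restricted permissions
Files that are being written to
Files on network filesystems

9. Further Applications

9.1 A Simple File Type Counter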

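#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

void file_stats(const char *dirpath) {
    DIR *dir;
    struct dirent *entry;
    struct stat sb;
    /* Counters are named n_* so they do not clash with the DIR handle. */
    int n_reg = 0, n_dir = 0, n_other = 0;

    if (!(dir = opendir(dirpath))) {
        perror("opendir");
        return;
    }

    /* Note: "." and ".." are returned by readdir() and counted as directories. */
    while ((entry = readdir(dir)) != NULL) {
        char path[PATH_MAX];
        snprintf(path, sizeof(path), "%s/%s", dirpath, entry->d_name);

        /* stat() follows symbolic links, so a link counts as its target's type. */
        if (stat(path, &sb) == -1) {
            perror("stat");
            continue;
        }

        if (S_ISREG(sb.st_mode)) n_reg++;
        else if (S_ISDIR(sb.st_mode)) n_dir++;
        else n_other++;
    }

    closedir(dir);
    printf("Regular: %d\nDirectories: %d\nOther: %d\n", n_reg, n_dir, n_other);
}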