A Study Guide to Data Structures and Algorithms: From Fundamentals to Practice

鲸晚好梦

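1. A Study Guide to Data Structures and Algorithms: From Zero Background to Systematic Mastery

As a computer science student, I know how much data structures and algorithms matter. Last year I worked through Zhejiang University's MOOC course on data structures and algorithms and compiled these detailed study notes. They record the core content of the course and add my own understanding from practice. I hope they help fellow learners avoid some detours.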

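2. Basic Concepts of Data Structures

2.1 What Is a Data Structure

A data structure is, in essence, the way data objects are organized in a computer. Picture the shelves of a library: books can be arranged in order of publication date (a linear structure), or grouped by subject and then sorted by author (a tree structure). Data structures address the problem of how to organize and store data efficiently.

A data structure has two core aspects:

Logical structure: the abstract relationships among data elements, including linear, tree, and graph structures
Physical storage structure: how the data is actually laid out in memory, for example as an array or a linked list

2.2 Abstract Data Types (ADT)

An abstract data type is an important tool for describing a data structure. It consists of:

A set of data objects: the collection of data to be processed
A set of operations: the operations that can be performed on that data

The key word is "abstract": we care only about what each operation does, not about how it is implemented. For example, the stack ADT defines push and pop, but does not say whether the stack is built on an array or a linked list.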

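3. Algorithm Basics and Complexity Analysis

3.1 Definition of an Algorithm and Evaluation Criteria

An algorithm is a finite set of instructions that solves a specific problem. A good algorithm should have:

Correctness: it solves the problem correctly
Readability: it is easy to understand and maintain
Robustness: it handles abnormal input
Efficiency: its time and space complexity are low

3.2 Complexity Analysis

Complexity analysis is central to algorithm design. We usually focus on worst-case complexity and use big-O notation to describe how an algorithm's cost grows with the input size.

Common complexity classes:

O(1): constant time, e.g. random access into an array
O(log n): logarithmic time, e.g. binary search
O(n): linear time, e.g. traversing an array
O(n²): quadratic time, e.g. simple sorting algorithms such as bubble sort
O(2ⁿ): exponential time, usually infeasible except for small n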
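To make the O(log n) entry concrete, here is a minimal sketch of binary search on a sorted int array. The function name binarySearch and its signature are my own choices for illustration, not something taken from the course.

```c
// Returns the index of target in the sorted array a[0..n-1], or -1 if absent.
// Each iteration halves the search range, so the loop runs at most about
// log2(n) + 1 times: O(log n) time, O(1) extra space.
int binarySearch(const int a[], int n, int target) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // avoids overflow of (low + high)
        if (a[mid] == target) {
            return mid;
        } else if (a[mid] < target) {
            low = mid + 1;   // target can only be in the right half
        } else {
            high = mid - 1;  // target can only be in the left half
        }
    }
    return -1;
}
```

Doubling the array length adds only one more iteration, which is the practical meaning of logarithmic growth.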

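3.3 Recursion vs. Iteration

Recursive code is concise, but it carries extra overhead:

The function call stack consumes memory
Frequent function calls cost time

Take printing the numbers from 1 to N as an example:

#include <stdio.h>

// Recursive version - O(N) space for the call stack
void printN_recursive(int n) {
    if (n > 0) {  // n > 0 rather than n != 0, so a negative n cannot recurse forever
        printN_recursive(n - 1);
        printf("%d\n", n);
    }
}

// Iterative version - O(1) space
void printN_iterative(int n) {
    for (int i = 1; i <= n; i++) {
        printf("%d\n", i);
    }
}

When N is large (say 100000), the recursive version may overflow the stack, depending on the platform's stack size, while the iterative version will not.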

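4. Linear Structures in Detail

4.1 Comparing List Implementations

The linear list is the most basic data structure. It has two main implementations:

Property	Sequential list (array)	Linked list
Storage	Contiguous block of memory	Scattered nodes connected by pointers
Access	O(1) random access	O(n) sequential access
Insert/delete	O(n), elements must be shifted	O(1) pointer update once the position is known (finding it is O(n))
Space allocation	Fixed size chosen in advance	Allocated dynamically as needed
Cache-friendly	Yes	No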

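4.2 Applications of Stacks and Queues

A stack is a LIFO (last in, first out) structure. Typical applications:

Function call stack
Expression evaluation
Bracket matching
Backtracking algorithms (e.g. solving a maze)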
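As an illustration of the bracket-matching use case, here is a small sketch that keeps the currently open brackets on a stack. The name isBalanced and the MAX_DEPTH limit are assumptions I made for this example.

```c
#include <stdbool.h>

#define MAX_DEPTH 256  // assumed nesting limit for this sketch

// Returns true if every bracket in s is closed by the matching bracket
// in the correct order. Non-bracket characters are ignored.
bool isBalanced(const char *s) {
    char stack[MAX_DEPTH];
    int top = -1;  // -1 means the stack is empty

    for (; *s != '\0'; s++) {
        char c = *s;
        if (c == '(' || c == '[' || c == '{') {
            if (top == MAX_DEPTH - 1) return false;  // nesting too deep for this sketch
            stack[++top] = c;
        } else if (c == ')' || c == ']' || c == '}') {
            if (top == -1) return false;  // closing bracket with nothing open
            char opener = stack[top--];
            if ((c == ')' && opener != '(') ||
                (c == ']' && opener != '[') ||
                (c == '}' && opener != '{')) {
                return false;  // wrong kind of bracket
            }
        }
    }
    return top == -1;  // leftover openers mean the string is unbalanced
}
```

The most recently opened bracket must be the first one closed, which is exactly the LIFO order a stack provides.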

A queue is a FIFO (first in, first out) structure. Typical applications:

Task scheduling
Message queues
Breadth-first search
Print job management
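These notes do not include a queue implementation, so here is one possible sketch: a fixed-capacity circular queue of int. The type IntQueue, the function names, and QUEUE_CAPACITY are all hypothetical names chosen for this example.

```c
#include <stdbool.h>

#define QUEUE_CAPACITY 8  // assumed capacity for this sketch

typedef struct {
    int data[QUEUE_CAPACITY];
    int front;  // index of the oldest element
    int count;  // number of stored elements
} IntQueue;

void queueInit(IntQueue *q) {
    q->front = 0;
    q->count = 0;
}

bool queueIsEmpty(const IntQueue *q) {
    return q->count == 0;
}

// Adds item at the rear. Returns false if the queue is full.
bool queueEnqueue(IntQueue *q, int item) {
    if (q->count == QUEUE_CAPACITY) return false;
    int rear = (q->front + q->count) % QUEUE_CAPACITY;  // wrap around the array
    q->data[rear] = item;
    q->count++;
    return true;
}

// Removes the oldest item into *out. Returns false if the queue is empty.
bool queueDequeue(IntQueue *q, int *out) {
    if (q->count == 0) return false;
    *out = q->data[q->front];
    q->front = (q->front + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Tracking count alongside front avoids the usual ambiguity between a full and an empty circular buffer, at the cost of one extra field.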

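4.2.1 Two Ways to Implement a Stack

Array implementation:

#include <stdio.h>

#define MAXSIZE 100
typedef struct {
    int data[MAXSIZE];
    int top;  // index of the top element; must be set to -1 for an empty stack
} Stack;

void push(Stack *s, int item) {
    if (s->top == MAXSIZE - 1) {
        printf("Stack overflow\n");
        return;
    }
    s->data[++(s->top)] = item;
}

int pop(Stack *s) {
    if (s->top == -1) {
        printf("Stack underflow\n");
        return -1;  // sentinel; indistinguishable from a stored -1
    }
    return s->data[(s->top)--];
}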

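Linked-list implementation:

#include <stdio.h>
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *next;
} Node;

// *top points to the top node; NULL means the stack is empty
void push(Node **top, int item) {
    Node *newNode = malloc(sizeof(Node));
    if (newNode == NULL) {
        printf("Out of memory\n");
        return;
    }
    newNode->data = item;
    newNode->next = *top;
    *top = newNode;
}

int pop(Node **top) {
    if (*top == NULL) {
        printf("Stack underflow\n");
        return -1;  // sentinel; indistinguishable from a stored -1
    }
    Node *temp = *top;
    int item = temp->data;
    *top = (*top)->next;
    free(temp);
    return item;
}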

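5. Tree Structures in Depth

5.1 Binary Tree Traversal

A binary tree has three basic traversal orders, each of which can be implemented recursively or iteratively:

Preorder: root → left → right
Inorder: left → root → right
Postorder: left → right → root

The recursive versions are concise but can overflow the stack on very deep trees; the iterative versions avoid that risk at the cost of more complex code. Here is an iterative inorder traversal:

// Assumes a TreeNode with val, left and right fields, and a stack of
// TreeNode* with initStack/push/pop/isEmpty helpers (none of these are shown here).
void inorderTraversal(TreeNode *root) {
    Stack s;
    initStack(&s);
    TreeNode *curr = root;

    while (curr != NULL || !isEmpty(&s)) {
        // Push the whole left spine onto the stack
        while (curr != NULL) {
            push(&s, curr);
            curr = curr->left;
        }

        // Visit the node on top of the stack
        curr = pop(&s);
        printf("%d ", curr->val);

        // Move on to the right subtree
        curr = curr->right;
    }
}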
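For comparison with the iterative code, here is a sketch of the three recursive traversals. The TreeNode layout is my assumption (it matches the val, left and right fields used above), and the functions write into an output array instead of printing so the visiting order is easy to check.

```c
#include <stddef.h>

// Assumed node layout, matching the val/left/right fields used above
typedef struct TreeNode {
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

// Each function appends the visited values to out[] and advances *n.
// The caller must make out[] large enough for every node in the tree.
void preorder(const TreeNode *root, int out[], int *n) {
    if (root == NULL) return;
    out[(*n)++] = root->val;         // root
    preorder(root->left, out, n);    // left
    preorder(root->right, out, n);   // right
}

void inorder(const TreeNode *root, int out[], int *n) {
    if (root == NULL) return;
    inorder(root->left, out, n);     // left
    out[(*n)++] = root->val;         // root
    inorder(root->right, out, n);    // right
}

void postorder(const TreeNode *root, int out[], int *n) {
    if (root == NULL) return;
    postorder(root->left, out, n);   // left
    postorder(root->right, out, n);  // right
    out[(*n)++] = root->val;         // root
}
```

The three functions differ only in where the visit happens relative to the two recursive calls.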

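5.2 Binary Search Tree (BST) Operations

A BST maintains the property: values in the left subtree < root value < values in the right subtree. Search, insertion, and deletion therefore take O(h) time, where h is the height of the tree: O(log n) when the tree is balanced, but O(n) in the worst case, when it degenerates into a chain.

Deleting a node from a BST has three cases:

Leaf node: delete it directly
One child: replace the node with its child
Two children: replace the node's value with the largest value in the left subtree or the smallest value in the right subtree, then delete that node

// findMin (not shown) is assumed to return the leftmost node of a non-empty subtree
TreeNode* deleteNode(TreeNode* root, int key) {
    if(!root) return NULL;
    
    if(key < root->val) {
        root->left = deleteNode(root->left, key);
    } else if(key > root->val) {
        root->right = deleteNode(root->right, key);
    } else {
        // Found the node to delete; these two branches also cover the leaf case
        if(!root->left) {
            TreeNode* temp = root->right;
            free(root);
            return temp;
        } else if(!root->right) {
            TreeNode* temp = root->left;
            free(root);
            return temp;
        }
        
        // Two children: copy the smallest value of the right subtree, then delete that node
        TreeNode* temp = findMin(root->right);
        root->val = temp->val;
        root->right = deleteNode(root->right, temp->val);
    }
    return root;
}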
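The deletion code relies on a findMin helper and on the tree having been built somehow. Below is one way search, insertion, and findMin might look. The TreeNode layout, the names bstSearch and bstInsert, and the choice to ignore duplicate keys are my assumptions.

```c
#include <stdlib.h>

// Assumed node layout, matching the val/left/right fields used above
typedef struct TreeNode {
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
} TreeNode;

// Returns the node holding key, or NULL if key is not in the tree.
TreeNode *bstSearch(TreeNode *root, int key) {
    while (root != NULL && root->val != key) {
        root = (key < root->val) ? root->left : root->right;
    }
    return root;
}

// Inserts key and returns the root of the updated subtree.
// Duplicate keys are ignored in this sketch.
TreeNode *bstInsert(TreeNode *root, int key) {
    if (root == NULL) {
        TreeNode *node = malloc(sizeof(TreeNode));
        if (node == NULL) return NULL;  // allocation failed; subtree stays empty
        node->val = key;
        node->left = node->right = NULL;
        return node;
    }
    if (key < root->val) {
        root->left = bstInsert(root->left, key);
    } else if (key > root->val) {
        root->right = bstInsert(root->right, key);
    }
    return root;
}

// Returns the leftmost (smallest) node of a non-empty subtree.
TreeNode *findMin(TreeNode *root) {
    while (root->left != NULL) {
        root = root->left;
    }
    return root;
}
```

All three follow a single root-to-leaf path, which is why their cost is bounded by the height of the tree.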

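5.3 Balanced Binary Trees (AVL Trees)

An AVL tree uses rotations to stay balanced, which keeps its height at O(log n). There are four cases, named after the position of the subtree that caused the imbalance:

Left-left (LL): a single right rotation
Right-right (RR): a single left rotation
Left-right (LR): a left rotation on the left child, then a right rotation on the node
Right-left (RL): a right rotation on the right child, then a left rotation on the node

At its core, a rotation just re-links node pointers so that balance is restored while the BST property is preserved.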
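The four cases can be sketched with two primitive rotations. The AvlNode struct, the stored height field, and all function names here are assumptions for illustration; a full AVL insert would also need to compute balance factors and decide which case applies.

```c
#include <stddef.h>

typedef struct AvlNode {
    int val;
    int height;  // height of the subtree rooted here; a leaf has height 1
    struct AvlNode *left;
    struct AvlNode *right;
} AvlNode;

static int nodeHeight(const AvlNode *n) { return n ? n->height : 0; }

static void updateHeight(AvlNode *n) {
    int hl = nodeHeight(n->left), hr = nodeHeight(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

// Right rotation (LL case): y's left child x becomes the new subtree root.
AvlNode *rotateRight(AvlNode *y) {
    AvlNode *x = y->left;
    y->left = x->right;  // x's right subtree lies between x and y, so it moves under y
    x->right = y;
    updateHeight(y);     // y is now below x, so update it first
    updateHeight(x);
    return x;
}

// Left rotation (RR case): x's right child y becomes the new subtree root.
AvlNode *rotateLeft(AvlNode *x) {
    AvlNode *y = x->right;
    x->right = y->left;
    y->left = x;
    updateHeight(x);
    updateHeight(y);
    return y;
}

// LR case: left-rotate the left child, then right-rotate the node itself.
AvlNode *rotateLeftRight(AvlNode *node) {
    node->left = rotateLeft(node->left);
    return rotateRight(node);
}

// RL case: right-rotate the right child, then left-rotate the node itself.
AvlNode *rotateRightLeft(AvlNode *node) {
    node->right = rotateRight(node->right);
    return rotateLeft(node);
}
```

Each function returns the new subtree root, so the caller re-attaches it with something like parent->left = rotateRight(parent->left).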

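6. Heaps and Priority Queues

6.1 Basic Heap Operations

A heap is a complete binary tree that comes in two flavors, the max-heap and the min-heap. It supports two basic operations:

Insert: place the new element at the end, then sift it up
Delete: remove the root, move the last element to the root, then sift it down

// Both functions use a 1-based array: heap[1] is the root and heap[0] is unused,
// so the children of node i are at 2*i and 2*i+1.

// Sift-up for a max-heap
void siftUp(int heap[], int pos) {
    int temp = heap[pos];
    while(pos > 1 && heap[pos/2] < temp) {
        heap[pos] = heap[pos/2];  // pull the smaller parent down
        pos /= 2;
    }
    heap[pos] = temp;
}

// Sift-down for a max-heap; size is the index of the last element
void siftDown(int heap[], int pos, int size) {
    int temp = heap[pos];
    while(2*pos <= size) {
        int child = 2*pos;
        if(child < size && heap[child] < heap[child+1]) {
            child++;  // pick the larger of the two children
        }
        if(temp >= heap[child]) break;
        heap[pos] = heap[child];
        pos = child;
    }
    heap[pos] = temp;
}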
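The insert and delete steps described above can be sketched as a small priority queue. The MaxHeap struct, HEAP_CAPACITY, and the function names are hypothetical; the sifting loops follow the same 1-based layout as the code above but are inlined so the example stands on its own.

```c
#include <stdbool.h>

#define HEAP_CAPACITY 100  // assumed capacity for this sketch

typedef struct {
    int data[HEAP_CAPACITY + 1];  // 1-based: data[0] is unused
    int size;                     // number of stored elements; set to 0 before use
} MaxHeap;

// Insert: put item at the end, then sift it up. Returns false if the heap is full.
bool heapInsert(MaxHeap *h, int item) {
    if (h->size == HEAP_CAPACITY) return false;
    int pos = ++h->size;
    while (pos > 1 && h->data[pos / 2] < item) {
        h->data[pos] = h->data[pos / 2];  // pull the smaller parent down
        pos /= 2;
    }
    h->data[pos] = item;
    return true;
}

// Delete: take the root, move the last element to the root, then sift it down.
// Stores the removed maximum in *out. Returns false if the heap is empty.
bool heapDeleteMax(MaxHeap *h, int *out) {
    if (h->size == 0) return false;
    *out = h->data[1];
    int last = h->data[h->size--];
    int pos = 1;
    while (2 * pos <= h->size) {
        int child = 2 * pos;
        if (child < h->size && h->data[child] < h->data[child + 1]) {
            child++;  // pick the larger child
        }
        if (last >= h->data[child]) break;
        h->data[pos] = h->data[child];
        pos = child;
    }
    h->data[pos] = last;
    return true;
}
```

Both operations walk a single path between the root and a leaf, so each costs O(log n).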

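6.2 Heap Sort

Heap sort uses the heap property to sort efficiently:

Build the heap: turn the unsorted array into a heap
Sort: repeatedly remove the top of the heap and restore the heap structure

// Sorts arr[1..n] in ascending order (1-based; arr[0] is unused).
// Uses siftDown from section 6.1. swap(int *a, int *b), which exchanges
// two ints, is assumed and not shown here.
void heapSort(int arr[], int n) {
    // Build a max-heap, starting from the last non-leaf node
    for(int i = n/2; i >= 1; i--) {
        siftDown(arr, i, n);
    }
    
    // Move the current maximum to the end, then restore the heap on the rest
    for(int i = n; i > 1; i--) {
        swap(&arr[1], &arr[i]);
        siftDown(arr, 1, i-1);
    }
}

Heap sort runs in O(n log n) time with O(1) extra space, and it is not a stable sort.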

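7. Graph Algorithms

7.1 Representing a Graph

Adjacency matrix: suited to dense graphs, O(V²) space

// MAX is the maximum number of vertices (assumed to be defined elsewhere)
int adjMatrix[MAX][MAX];  // 1 means there is an edge, 0 means no edge

Adjacency list: suited to sparse graphs, O(V+E) space

typedef struct Node {
    int vertex;
    struct Node* next;
} Node;

Node* adjList[MAX];  // one linked list per vertex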
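To show how such an adjacency list gets filled in, here is a minimal sketch of adding edges. The value of MAX and the names addEdge and addUndirectedEdge are assumptions for this example.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX 100  // assumed maximum number of vertices

typedef struct Node {
    int vertex;
    struct Node* next;
} Node;

Node* adjList[MAX];  // one linked list per vertex; all NULL initially

// Adds the directed edge u -> v by pushing v onto the front of u's list.
// Returns false if a vertex is out of range or allocation fails.
bool addEdge(int u, int v) {
    if (u < 0 || u >= MAX || v < 0 || v >= MAX) return false;
    Node* node = malloc(sizeof(Node));
    if (node == NULL) return false;
    node->vertex = v;
    node->next = adjList[u];
    adjList[u] = node;
    return true;
}

// For an undirected graph, store the edge in both directions.
bool addUndirectedEdge(int u, int v) {
    return addEdge(u, v) && addEdge(v, u);
}
```

Inserting at the head of the list keeps addEdge O(1); the trade-off is that neighbors come back in reverse insertion order.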

7.2 Graph Traversal

Depth-first search (DFS):

#include <stdbool.h>
#include <stdio.h>

// Uses the Node type from section 7.1; visited[] must be all false before the first call
void DFS(int v, bool visited[], Node* adjList[]) {
    visited[v] = true;
    printf("%d ", v);
    
    Node* curr = adjList[v];
    while(curr != NULL) {
        if(!visited[curr->vertex]) {
            DFS(curr->vertex, visited, adjList);
        }
        curr = curr->next;
    }
}

Breadth-first search (BFS):

// Assumes a queue of int with initQueue/enqueue/dequeue/isEmpty helpers (not shown here)
void BFS(int start, Node* adjList[]) {
    bool visited[MAX] = {false};
    Queue q;
    initQueue(&q);
    
    visited[start] = true;
    enqueue(&q, start);
    
    while(!isEmpty(&q)) {
        int v = dequeue(&q);
        printf("%d ", v);
        
        Node* curr = adjList[v];
        while(curr != NULL) {
            if(!visited[curr->vertex]) {
                // Mark on enqueue so no vertex enters the queue twice
                visited[curr->vertex] = true;
                enqueue(&q, curr->vertex);
            }
            curr = curr->next;
        }
    }
}

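7.3 Shortest Paths

Dijkstra's algorithm (single-source shortest paths; edge weights must be non-negative):

#include <limits.h>
#include <stdbool.h>

// graph[u][v] is the weight of edge u -> v, with 0 meaning "no edge".
// minDistance and printSolution are helpers that are not shown here.
void dijkstra(int graph[MAX][MAX], int src, int V) {
    int dist[V];     // shortest known distance from src to each vertex
    bool sptSet[V];  // true once a vertex's distance is final
    
    for(int i = 0; i < V; i++) {
        dist[i] = INT_MAX;
        sptSet[i] = false;
    }
    
    dist[src] = 0;
    
    for(int count = 0; count < V-1; count++) {
        // Pick the unfinalized vertex with the smallest distance
        int u = minDistance(dist, sptSet, V);
        sptSet[u] = true;
        
        // Relax every edge out of u; the INT_MAX check avoids integer overflow
        for(int v = 0; v < V; v++) {
            if(!sptSet[v] && graph[u][v] && dist[u] != INT_MAX 
               && dist[u] + graph[u][v] < dist[v]) {
                dist[v] = dist[u] + graph[u][v];
            }
        }
    }
    
    printSolution(dist, V);
}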
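The dijkstra function calls two helpers that these notes do not define. Here is one plausible sketch of them; the exact behavior (for example how ties are broken, or how unreachable vertices are printed) is my assumption, not something the course specifies.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

// Returns the index of the vertex with the smallest dist[] value among those
// not yet finalized (sptSet[v] == false), or -1 if every vertex is finalized.
// Using <= means an unreachable vertex (dist == INT_MAX) can still be returned
// when nothing better is left, so the result is a valid index in that case too.
int minDistance(const int dist[], const bool sptSet[], int V) {
    int min = INT_MAX;
    int minIndex = -1;
    for (int v = 0; v < V; v++) {
        if (!sptSet[v] && dist[v] <= min) {
            min = dist[v];
            minIndex = v;
        }
    }
    return minIndex;
}

// Prints one line per vertex with its distance from the source.
void printSolution(const int dist[], int V) {
    for (int v = 0; v < V; v++) {
        if (dist[v] == INT_MAX) {
            printf("%d: unreachable\n", v);
        } else {
            printf("%d: %d\n", v, dist[v]);
        }
    }
}
```

With this linear scan, each of the V rounds costs O(V), which gives the array-based algorithm its O(V²) running time.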

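8. Sorting Algorithms Compared

8.1 Performance of Common Sorting Algorithms

Algorithm	Average time	Worst-case time	Space	Stable
Bubble sort	O(n²)	O(n²)	O(1)	Yes
Selection sort	O(n²)	O(n²)	O(1)	No
Insertion sort	O(n²)	O(n²)	O(1)	Yes
Shell sort	Depends on the gap sequence	O(n²) with Shell's original gaps	O(1)	No
Merge sort	O(n log n)	O(n log n)	O(n)	Yes
Quick sort	O(n log n)	O(n²)	O(log n) average, O(n) worst	No
Heap sort	O(n log n)	O(n log n)	O(1)	No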

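8.2 Implementing Quick Sort

Quick sort is a classic application of divide and conquer:

// Exchange two ints
void swap(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}

// Lomuto partition: uses arr[high] as the pivot and returns the pivot's final index.
// Defined before quickSort so that it is declared at the point of the call.
int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;  // last index of the "smaller than pivot" region
    
    for(int j = low; j <= high-1; j++) {
        if(arr[j] < pivot) {
            i++;
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i+1], &arr[high]);
    return i+1;
}

// Sorts arr[low..high] in place
void quickSort(int arr[], int low, int high) {
    if(low < high) {
        int pi = partition(arr, low, high);
        
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}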

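9. Advanced Data Structures and Applications

9.1 Hash Table Design and Collision Resolution

A hash table uses a hash function to map keys to storage locations. The central problem is how to handle collisions:

Open addressing:
- Linear probing
- Quadratic probing
- Double hashing
Separate chaining: each bucket stores its colliding elements in a linked list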
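As a small illustration of open addressing, here is a sketch of linear probing. The Slot struct, TABLE_SIZE, and the function names are hypothetical, and deletion is left out on purpose because it would need tombstone markers to keep probe sequences intact.

```c
#include <stdbool.h>

#define TABLE_SIZE 11  // assumed table size for this sketch

typedef struct {
    int key;
    int value;
    bool used;  // true once the slot holds an entry
} Slot;

Slot table[TABLE_SIZE];  // zero-initialized, so every slot starts unused

static int hashKey(int key) {
    return ((key % TABLE_SIZE) + TABLE_SIZE) % TABLE_SIZE;  // non-negative index
}

// Inserts key, or updates its value if it is already present.
// Returns false if the table is full.
bool probeInsert(int key, int value) {
    int start = hashKey(key);
    for (int i = 0; i < TABLE_SIZE; i++) {
        int idx = (start + i) % TABLE_SIZE;  // linear probing: try the next slot
        if (!table[idx].used || table[idx].key == key) {
            table[idx].key = key;
            table[idx].value = value;
            table[idx].used = true;
            return true;
        }
    }
    return false;
}

// Looks up key. Stores its value in *out and returns true if found.
bool probeLookup(int key, int *out) {
    int start = hashKey(key);
    for (int i = 0; i < TABLE_SIZE; i++) {
        int idx = (start + i) % TABLE_SIZE;
        if (!table[idx].used) return false;  // an unused slot ends the probe sequence
        if (table[idx].key == key) {
            *out = table[idx].value;
            return true;
        }
    }
    return false;
}
```

Keys 1, 12 and 23 all hash to slot 1 here, so they end up in slots 1, 2 and 3. Quadratic probing and double hashing change only how the next idx is computed.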

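Separate chaining implementation:

#include <stdlib.h>

#define SIZE 10

typedef struct HashNode {
    int key;
    int value;
    struct HashNode* next;
} HashNode;

HashNode* hashTable[SIZE];  // every bucket starts out NULL (static storage)

void insert(int key, int value) {
    int hash = ((key % SIZE) + SIZE) % SIZE;  // keep the index non-negative for negative keys
    
    // If the key is already in the chain, update its value instead of adding a duplicate
    HashNode* curr = hashTable[hash];
    HashNode* tail = NULL;
    while(curr != NULL) {
        if(curr->key == key) {
            curr->value = value;
            return;
        }
        tail = curr;
        curr = curr->next;
    }
    
    HashNode* newNode = malloc(sizeof(HashNode));
    if(newNode == NULL) return;  // allocation failed; nothing is inserted
    newNode->key = key;
    newNode->value = value;
    newNode->next = NULL;
    
    // Append the new node to the end of the chain
    if(tail == NULL) {
        hashTable[hash] = newNode;
    } else {
        tail->next = newNode;
    }
}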

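9.2 Implementing Union-Find (Disjoint Sets)

Union-find supports two operations:

Find: determine which set an element belongs to
Union: merge two sets

Optimizations:

Path compression: makes the trees flatter
Union by rank: attach the tree with the smaller rank under the root of the tree with the larger rank

int parent[MAX];
int rank[MAX];  // upper bound on the height of the tree rooted at each element

void makeSet(int x) {
    parent[x] = x;
    rank[x] = 0;
}

int find(int x) {
    if(parent[x] != x) {
        parent[x] = find(parent[x]);  // path compression
    }
    return parent[x];
}

void unionSets(int x, int y) {
    int xRoot = find(x);
    int yRoot = find(y);
    
    if(xRoot == yRoot) return;  // already in the same set
    
    // Union by rank
    if(rank[xRoot] < rank[yRoot]) {
        parent[xRoot] = yRoot;
    } else if(rank[xRoot] > rank[yRoot]) {
        parent[yRoot] = xRoot;
    } else {
        // Equal ranks: pick either root and increase its rank
        parent[yRoot] = xRoot;
        rank[xRoot]++;
    }
}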
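One common use of union-find is counting the connected components of an undirected graph. The sketch below is self-contained, so it repeats a compact version of the structure under different names (parentOf, rankOf, findRoot, unite); these names, MAX_ELEMS, and the edge-list format are assumptions for this example.

```c
#define MAX_ELEMS 100  // assumed upper bound on the number of elements

static int parentOf[MAX_ELEMS];
static int rankOf[MAX_ELEMS];

static int findRoot(int x) {
    if (parentOf[x] != x) {
        parentOf[x] = findRoot(parentOf[x]);  // path compression
    }
    return parentOf[x];
}

// Merges the sets of x and y.
// Returns 1 if two different sets were merged, 0 if they were already the same set.
static int unite(int x, int y) {
    int a = findRoot(x), b = findRoot(y);
    if (a == b) return 0;
    if (rankOf[a] < rankOf[b]) { int t = a; a = b; b = t; }  // make a the higher-rank root
    parentOf[b] = a;
    if (rankOf[a] == rankOf[b]) rankOf[a]++;
    return 1;
}

// Counts the connected components of an undirected graph with n vertices
// (n <= MAX_ELEMS) and m edges, where edge i joins edges[i][0] and edges[i][1].
int countComponents(int n, int edges[][2], int m) {
    for (int i = 0; i < n; i++) {
        parentOf[i] = i;  // every vertex starts in its own set
        rankOf[i] = 0;
    }
    int components = n;
    for (int i = 0; i < m; i++) {
        if (unite(edges[i][0], edges[i][1])) {
            components--;  // each successful merge removes one component
        }
    }
    return components;
}
```

An edge whose endpoints already share a root closes a cycle, so the same unite return value can also be used for cycle detection, as in Kruskal's algorithm.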