"递增子序列II"是算法领域的一个经典变种问题,它建立在基础递增子序列问题之上,但增加了额外的约束条件。我们先明确几个关键概念:
在基础版本中,我们可能只需要找出最长递增子序列的长度。而"II"版本往往会增加如下约束之一:
Typical application scenarios include:
In stock-price series analysis: finding combinations of days with sustained price increases, used to gauge how persistent a trend is.
In DNA sequence analysis: searching for base arrangements that match a particular pattern.
In user-behavior analysis: mining activity time series for behavioral patterns.
This is the classic approach to increasing-subsequence problems, with a typical time complexity of O(n²).
```python
def countIncreasingSubsequences(nums):
    n = len(nums)
    dp = [1] * n  # every element forms a subsequence on its own
    count = 0
    for i in range(n):
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] += dp[j]
        count += dp[i]
    return count
```
Key points:
- dp[i] counts the increasing subsequences that end at index i; initializing dp to 1 accounts for the single-element subsequence.
- Whenever nums[i] > nums[j] (with j < i), every subsequence ending at j can be extended by nums[i], so dp[j] is added to dp[i].
- The answer is the sum of all dp[i], i.e. the number of non-empty increasing subsequences.
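To see the recurrence at work, here is a hand-trace of the dp array on a small input (a standalone sketch that re-runs the same loop inline):

```python
# Hand-trace of the dp recurrence on nums = [1, 3, 2].
nums = [1, 3, 2]
dp = [1] * len(nums)          # each element alone
total = 0
for i in range(len(nums)):
    for j in range(i):
        if nums[i] > nums[j]:
            dp[i] += dp[j]
    total += dp[i]

# dp[0] = 1  -> [1]
# dp[1] = 2  -> [3], [1,3]
# dp[2] = 2  -> [2], [1,2]   (3 > 2, so dp[1] does not contribute)
assert dp == [1, 2, 2]
assert total == 5
```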
When the sequence is long (n > 10⁴), a Fenwick tree (binary indexed tree) can bring the complexity down to O(n log n).
```python
class FenwickTree:
    def __init__(self, size):
        self.size = size
        self.tree = [0] * (self.size + 1)

    def update(self, index, delta):
        while index <= self.size:
            self.tree[index] += delta
            index += index & -index

    def query(self, index):
        res = 0
        while index > 0:
            res += self.tree[index]
            index -= index & -index
        return res

def countIncreasingSubsequences(nums):
    sorted_nums = sorted(set(nums))
    rank = {v: i + 1 for i, v in enumerate(sorted_nums)}  # coordinate compression, 1-based
    ft = FenwickTree(len(sorted_nums))
    res = 0
    for num in nums:
        r = rank[num]
        count = ft.query(r - 1) + 1  # +1: the element forms a subsequence by itself
        res += count
        ft.update(r, count)
    return res
```
How the optimization works:
- Coordinate compression maps each value to a 1-based rank so values can index the tree.
- For each element, query(r - 1) returns the total count of increasing subsequences seen so far that end in a strictly smaller value; adding 1 accounts for the element standing alone.
- update(r, count) records those subsequences at rank r so that later, larger elements can extend them. Each element costs O(log n), giving O(n log n) overall.
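As a sanity check, the Fenwick-tree version can be cross-checked against the O(n²) DP on random inputs; the sketch below restates both compactly so it runs standalone:

```python
import random

class FenwickTree:
    def __init__(self, size):
        self.size = size
        self.tree = [0] * (size + 1)
    def update(self, i, delta):
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i
    def query(self, i):
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i
        return s

def count_fast(nums):
    # O(n log n): coordinate compression + Fenwick tree
    rank = {v: i + 1 for i, v in enumerate(sorted(set(nums)))}
    ft = FenwickTree(len(rank))
    res = 0
    for num in nums:
        r = rank[num]
        c = ft.query(r - 1) + 1
        res += c
        ft.update(r, c)
    return res

def count_slow(nums):
    # O(n^2) reference implementation
    dp = [1] * len(nums)
    for i in range(len(nums)):
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] += dp[j]
    return sum(dp)

random.seed(0)
for _ in range(200):
    a = [random.randint(0, 20) for _ in range(random.randint(0, 30))]
    assert count_fast(a) == count_slow(a)
```

Both functions count positionally distinct, strictly increasing subsequences, so they agree even when the input contains duplicate values.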
When the sequence contains duplicate values, extra care is needed to avoid counting the same value-sequence more than once. An improved version:
```python
def countDistinctIncreasingSubsequences(nums):
    n = len(nums)
    dp = [1] * n
    last_occurrence = {}  # last index at which each value appeared
    for i in range(n):
        if nums[i] in last_occurrence:
            j = last_occurrence[nums[i]]
            dp[i] -= dp[j]  # cancel subsequences already counted at j
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] += dp[j]
        last_occurrence[nums[i]] = i
    return sum(dp)
```
De-duplication logic:
- Every value-sequence ending in nums[i] that was countable at the previous occurrence j is countable again at i, so subtracting dp[j] before the usual accumulation cancels those duplicates exactly.
- Only the most recent occurrence needs to be subtracted, because dp[j] itself already absorbed the corrections for earlier occurrences of the same value.
- The result, sum(dp), counts increasing subsequences that are distinct as value sequences.
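To sanity-check the subtraction step, the de-duplicated count can be compared with brute-force enumeration on small inputs (a standalone verification sketch; brute_force is a helper introduced here):

```python
from itertools import combinations

def countDistinctIncreasingSubsequences(nums):
    n = len(nums)
    dp = [1] * n
    last_occurrence = {}
    for i in range(n):
        if nums[i] in last_occurrence:
            dp[i] -= dp[last_occurrence[nums[i]]]
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] += dp[j]
        last_occurrence[nums[i]] = i
    return sum(dp)

def brute_force(nums):
    # enumerate every non-empty subsequence, keep the strictly
    # increasing ones, and de-duplicate by value tuple
    seen = set()
    for k in range(1, len(nums) + 1):
        for combo in combinations(nums, k):
            if all(a < b for a, b in zip(combo, combo[1:])):
                seen.add(combo)
    return len(seen)

for case in ([], [1], [1, 1], [1, 2, 2], [2, 1, 2], [1, 3, 2, 3]):
    assert countDistinctIncreasingSubsequences(case) == brute_force(case)
```

Note that itertools.combinations yields index-ordered selections, so it enumerates exactly the positional subsequences; the set of value tuples then collapses duplicates.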
For specific scenarios, preprocessing the data can improve performance:
For extremely long sequences (n > 10⁶), divide-and-conquer combined with parallelism can be considered. This is only a sketch: process_chunk and merge_results are left unimplemented, and a correct merge must account for subsequences that cross chunk boundaries.

```python
from multiprocessing import Pool

def process_chunk(args):
    start, end, nums = args
    # compute this chunk's partial result here; subsequences crossing
    # chunk boundaries still need to be accounted for during the merge
    return partial_result

def parallel_count(nums, chunk_size=10000):
    chunks = [(i, min(i + chunk_size, len(nums)), nums)
              for i in range(0, len(nums), chunk_size)]
    with Pool() as p:
        results = p.map(process_chunk, chunks)
    return merge_results(results)
```
Common pitfalls:
Mishandled boundary conditions: an empty input should return 0 and a single element should return 1; make sure the loops and the final sum behave for n = 0 and n = 1.
Integer overflow: the counts grow exponentially with n. Python integers are arbitrary precision, but in fixed-width languages (C++, Java) you need 64-bit integers or, as many judges require, counting modulo a large prime.
Duplicate counting: with repeated values, the basic DP counts positionally distinct subsequences; use the de-duplicated variant above when value-distinct subsequences are required.
```python
assert countIncreasingSubsequences([]) == 0
assert countIncreasingSubsequences([1]) == 1
assert countIncreasingSubsequences([1, 2, 3]) == 7  # [1],[2],[3],[1,2],[1,3],[2,3],[1,2,3]
```
```python
import pdb; pdb.set_trace()
```
```python
import cProfile
cProfile.run('countIncreasingSubsequences(large_list)')
```
This variant asks for the number of increasing subsequences of length exactly k. The solution adds one state dimension:
```python
def countKLengthIncreasing(nums, k):
    n = len(nums)
    # dp[i][l]: number of increasing subsequences of length l ending at nums[i]
    dp = [[0] * (k + 1) for _ in range(n)]
    for i in range(n):
        dp[i][1] = 1  # single-element subsequence
        for j in range(i):
            if nums[i] > nums[j]:
                for l in range(2, k + 1):
                    dp[i][l] += dp[j][l - 1]
    return sum(dp[i][k] for i in range(n))
```
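A quick standalone check of the length-k counter (the function is restated so the snippet runs by itself):

```python
def countKLengthIncreasing(nums, k):
    n = len(nums)
    dp = [[0] * (k + 1) for _ in range(n)]
    for i in range(n):
        dp[i][1] = 1
        for j in range(i):
            if nums[i] > nums[j]:
                for l in range(2, k + 1):
                    dp[i][l] += dp[j][l - 1]
    return sum(dp[i][k] for i in range(n))

# In a strictly increasing array every k-subset is an increasing
# subsequence, so the answer equals C(n, k).
assert countKLengthIncreasing([1, 2, 3, 4], 3) == 4   # C(4, 3)
assert countKLengthIncreasing([1, 2, 3], 2) == 3      # [1,2],[1,3],[2,3]
assert countKLengthIncreasing([3, 2, 1], 2) == 0      # no increasing pairs
```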
Each element carries a weight, and the goal is the maximum total weight of an increasing subsequence:
```python
def maxWeightIncreasingSubsequence(nums, weights):
    n = len(nums)
    dp = weights.copy()  # start with each element's own weight
    for i in range(n):
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] = max(dp[i], dp[j] + weights[i])
    return max(dp)
```
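A small standalone example of the weighted variant, showing that the heaviest chain need not be the longest one (the function is restated here):

```python
def maxWeightIncreasingSubsequence(nums, weights):
    dp = weights.copy()
    for i in range(len(nums)):
        for j in range(i):
            if nums[i] > nums[j]:
                dp[i] = max(dp[i], dp[j] + weights[i])
    return max(dp)

# [1, 2] has weight 1 + 100 = 101, beating the single heavy
# element [3] (weight 10); no longer increasing chain exists.
assert maxWeightIncreasingSubsequence([3, 1, 2], [10, 1, 100]) == 101
assert maxWeightIncreasingSubsequence([1], [5]) == 5
```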
This variant requires the difference between adjacent chosen elements to stay within a bound:
```python
def countBoundedIncreasingSubsequences(nums, delta):
    n = len(nums)
    dp = [1] * n
    for i in range(n):
        for j in range(i):
            if 0 < nums[i] - nums[j] <= delta:
                dp[i] += dp[j]
    return sum(dp)
```
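A standalone check of the bounded variant (function restated; with a large enough delta it degenerates to the plain count):

```python
def countBoundedIncreasingSubsequences(nums, delta):
    dp = [1] * len(nums)
    for i in range(len(nums)):
        for j in range(i):
            if 0 < nums[i] - nums[j] <= delta:
                dp[i] += dp[j]
    return sum(dp)

# With delta = 1 only steps of exactly 1 qualify: besides the four
# singletons, just [1,2] and [5,6] (2 -> 5 jumps by 3).
assert countBoundedIncreasingSubsequences([1, 2, 5, 6], 1) == 6
# With a huge delta the bound is inactive: all 2^4 - 1 = 15 non-empty
# subsequences of a strictly increasing 4-element array count.
assert countBoundedIncreasingSubsequences([1, 2, 5, 6], 100) == 15
```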
```python
# Rolling arrays replace the full 2-D DP table
prev_dp = [1] * n
curr_dp = [1] * n
# compute layer by layer...
```
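Filling in the fragment above, here is a layer-by-layer version of the length-k counter that keeps only two 1-D arrays instead of the n × (k+1) table (a sketch; countKLengthIncreasingRolling is a name introduced here):

```python
def countKLengthIncreasingRolling(nums, k):
    n = len(nums)
    if k <= 0 or n == 0:
        return 0
    prev_dp = [1] * n  # layer 1: length-1 subsequences ending at each i
    for _ in range(2, k + 1):
        curr_dp = [0] * n  # next layer, built only from the previous one
        for i in range(n):
            for j in range(i):
                if nums[i] > nums[j]:
                    curr_dp[i] += prev_dp[j]
        prev_dp = curr_dp
    return sum(prev_dp)

assert countKLengthIncreasingRolling([1, 2, 3], 2) == 3
assert countKLengthIncreasingRolling([1, 2, 3, 4], 3) == 4
```

This trades the O(n·k) table for O(n) extra memory; the time complexity stays O(n²·k).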
```python
# Straightforward order: pull contributions into dp[i]
for i in range(n):
    for j in range(i):
        ...  # dp[i] += dp[j] when nums[i] > nums[j]

# Alternative order: push dp[j] outward once it is final; both orders
# give identical results, since dp[j] is complete before it is read in
# either scheme, but this one reads each dp[j] in a single pass, which
# can be friendlier to the cache
for j in range(n):
    for i in range(j + 1, n):
        ...  # same update
```
```python
def adaptive_count(nums):
    # dispatch on input size: quadratic DP for small inputs,
    # Fenwick-tree version for large ones
    if len(nums) < 1000:
        return basic_dp(nums)
    else:
        return optimized_tree(nums)
```
```python
# Build a value -> indices map
from collections import defaultdict

value_indices = defaultdict(list)
for idx, num in enumerate(nums):
    value_indices[num].append(idx)
```
When working on real business scenarios, I have found that the most effective optimizations usually come from a deep understanding of the specific problem. In financial time-series analysis, for example, prices typically fluctuate within a limited range, and this property can be exploited to design more effective pruning strategies. Another practical trick: when you only need the count rather than the subsequences themselves, a more compact state representation can often cut memory usage by an order of magnitude.