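Last Wednesday at 2 a.m., our ERP system suddenly raised an inventory alert. Investigation showed that after three cash-on-delivery orders were rejected, the script that automatically returns their stock failed repeatedly. Worse, because nobody was on duty overnight, actual stock ended up 87 items off from the system's records, which directly affected stocking for the next day's big promotion. Inventory sync problems caused by rejected orders like this are very typical in e-commerce backend development.
The reverse flow for cash-on-delivery (COD) orders is far more complex than for ordinary orders. When a customer rejects a package, the logistics status change triggers at least three key operations: the order status changes to "rejected", the payment status is rolled back (if a pre-authorization was made), and the stock quantity is returned. The stock return is the step most likely to go wrong, because it involves data-consistency checks across multiple systems.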
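The three operations triggered by a rejection can be sketched as one pure planning function. This is only an illustration under assumptions: the array shape, the `preauthorized` payment status, and the helper name `planRejection` are hypothetical and do not come from our codebase.

```php
<?php
declare(strict_types=1);

/**
 * Plans the changes a COD rejection should trigger.
 * Hypothetical helper: all field names are assumptions for illustration.
 *
 * @param array{status: string, payment_status: string, items: list<array{product_id: int, quantity: int}>} $order
 * @return array{status: string, payment_status: string, stock_deltas: array<int, int>}
 */
function planRejection(array $order): array
{
    // 1. The order status moves to "rejected".
    $status = 'rejected';

    // 2. The payment is rolled back only if a pre-authorization exists.
    $paymentStatus = $order['payment_status'] === 'preauthorized'
        ? 'voided'
        : $order['payment_status'];

    // 3. Collect the quantity to return per product.
    $deltas = [];
    foreach ($order['items'] as $item) {
        $id = $item['product_id'];
        $deltas[$id] = ($deltas[$id] ?? 0) + $item['quantity'];
    }

    return [
        'status' => $status,
        'payment_status' => $paymentStatus,
        'stock_deltas' => $deltas,
    ];
}

$plan = planRejection([
    'status' => 'shipped',
    'payment_status' => 'preauthorized',
    'items' => [
        ['product_id' => 7, 'quantity' => 2],
        ['product_id' => 9, 'quantity' => 1],
        ['product_id' => 7, 'quantity' => 3],
    ],
]);
```

Keeping the plan separate from its execution makes it easy to log or review what a rejection is about to change before any row is touched.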
Our original stock-return code followed roughly this logic:
```php
try {
    $order = Order::find($orderId);
    if ($order->status === 'rejected') {
        foreach ($order->items as $item) {
            $product = Product::find($item->product_id);
            $product->increment('stock', $item->quantity); // add the stock back directly
        }
        $order->logStatus('stock_returned');
    }
} catch (Exception $e) {
    Log::error($e);
}
```
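This code has three fatal flaws:
A subtler problem is that our system did not save a stock snapshot when the order was created. Suppose:
Under high concurrency, a read-modify-write sequence like this one can overwrite concurrent updates and lead to overselling: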
```php
$stock = $product->stock;              // read the current stock
$stock += $returnQty;                  // compute the new stock
$product->update(['stock' => $stock]); // write the new stock
```
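The lost update behind this pattern can be reproduced without a database. The sketch below uses an in-memory array as a stand-in for the products table (an assumption made for the demo) and interleaves two workers by hand.

```php
<?php
declare(strict_types=1);

// In-memory stand-in for a products row (assumption for the demo).
$db = ['stock' => 10];

// Two workers each read the current stock before either one writes.
$readA = $db['stock']; // worker A sees 10
$readB = $db['stock']; // worker B sees 10

// Each computes a new value from its stale read and writes it back.
$db['stock'] = $readA + 3; // A returns 3 units
$db['stock'] = $readB + 2; // B returns 2 units and overwrites A's write
$lostUpdateResult = $db['stock'];

// Applying deltas to the stored value avoids the stale read. This is what a
// single "UPDATE ... SET stock = stock + ?" statement does in the database.
$db = ['stock' => 10];
foreach ([3, 2] as $delta) {
    $db['stock'] += $delta;
}
$deltaResult = $db['stock'];

printf("read-modify-write: %d, delta-based: %d\n", $lostUpdateResult, $deltaResult);
// prints "read-modify-write: 12, delta-based: 15"
```

The three returned units from worker A vanish in the first variant. Once other writers are deducting stock at the same time, the same interleaving can push the recorded count above what is physically on the shelf.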
The reworked core logic should include:
```php
DB::transaction(function () use ($orderId) {
    // Lock the order row so concurrent return jobs serialize on it.
    $order = Order::with('items')->lockForUpdate()->findOrFail($orderId);
    if ($order->status !== 'rejected') {
        throw new BusinessException('Order status does not allow a stock return');
    }

    $snapshot = json_decode($order->inventory_snapshot, true) ?? [];
    foreach ($order->items as $item) {
        // Lock the product row too, so the check and the increment see the same stock.
        $product = Product::lockForUpdate()->findOrFail($item->product_id);
        if (!$product->is_active) {
            throw new BusinessException("Product {$product->id} has been delisted");
        }

        // Stock level recorded when the order was created.
        $expectedStock = $snapshot[$item->product_id] ?? 0;
        $currentStock = $product->stock;
        if ($currentStock + $item->quantity > $expectedStock) {
            throw new BusinessException('Returning this stock would exceed the snapshot level');
        }

        $product->increment('stock', $item->quantity);
    }

    $order->update([
        'inventory_status' => 'returned',
        'returned_at' => now(),
    ]);

    InventoryLog::create([
        'order_id' => $order->id,
        'type' => 'return',
        'details' => $order->items->pluck('quantity', 'product_id'),
    ]);
});
```
Save the stock baseline as soon as the order is created:
```php
// At order creation
$snapshot = $order->items->mapWithKeys(function ($item) {
    return [$item->product_id => $item->product->stock];
});
$order->update([
    'inventory_snapshot' => $snapshot->toJson(),
]);
```
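To make the role of the snapshot concrete, here is the comparison from the return transaction reduced to a pure function with a few worked cases. The helper name `returnStaysWithinSnapshot` and the sample numbers are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Returns true when returning $quantity keeps stock at or below the level
 * recorded in the order's snapshot. Hypothetical helper that mirrors the
 * check in the return transaction.
 *
 * @param array<int, int> $snapshot product_id => stock at order creation
 */
function returnStaysWithinSnapshot(array $snapshot, int $productId, int $currentStock, int $quantity): bool
{
    // A product missing from the snapshot gets a baseline of 0,
    // so any return for it is refused.
    $expected = $snapshot[$productId] ?? 0;

    return $currentStock + $quantity <= $expected;
}

// Snapshot taken at order creation: product 42 had 100 units.
$snapshot = [42 => 100];

$ok      = returnStaysWithinSnapshot($snapshot, 42, 95, 5); // 95 + 5 = 100, allowed
$tooMany = returnStaysWithinSnapshot($snapshot, 42, 98, 5); // 98 + 5 = 103, refused
$unknown = returnStaysWithinSnapshot($snapshot, 77, 0, 1);  // no baseline, refused
```

One consequence of this rule is worth keeping in mind: if stock was replenished after the order was created, current stock may already sit above the snapshot, and a legitimate return will be refused. Whether that should raise an error or go to manual review is a business decision.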
Operations that may fail need a retry mechanism:
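```php
// Inside the queue job
public function handle()
{
    try {
        $this->returnStock();
    } catch (Throwable $e) {
        // Retry for up to 3 attempts, 5 minutes apart.
        if ($this->attempts() < 3) {
            $this->release(now()->addMinutes(5));
            return;
        }
        // Out of attempts: mark the job as failed and alert.
        $this->failAndNotify($e);
    }
}
```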
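Retrying is only safe if running the job twice does not return the stock twice. A minimal in-memory sketch of that guard follows, assuming the `inventory_status` field from the return transaction marks a completed return. The arrays stand in for database rows and the helper `returnStockOnce` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Applies a stock return at most once per order.
 * Hypothetical in-memory version: $orders and $stock stand in for database rows.
 *
 * @param array<int, array{inventory_status: string, items: array<int, int>}> $orders
 * @param array<int, int> $stock product_id => quantity on hand
 */
function returnStockOnce(array &$orders, array &$stock, int $orderId): bool
{
    $order = &$orders[$orderId];

    // A previous attempt already succeeded, so a retry must do nothing.
    if ($order['inventory_status'] === 'returned') {
        return false;
    }

    foreach ($order['items'] as $productId => $quantity) {
        $stock[$productId] = ($stock[$productId] ?? 0) + $quantity;
    }
    $order['inventory_status'] = 'returned';

    return true;
}

$orders = [1001 => ['inventory_status' => 'pending', 'items' => [7 => 2]]];
$stock  = [7 => 10];

$first  = returnStockOnce($orders, $stock, 1001); // applies the return
$second = returnStockOnce($orders, $stock, 1001); // the retry is a no-op
```

Against a real database, the status check and the stock update would need to happen inside the same transaction with the order row locked. Otherwise two workers could both pass the check.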
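Use JMeter to simulate the following scenarios:
Key metrics to monitor:
The monitoring dashboard should include:
Symptom: the admin panel shows stock available, but checkout reports the item as out of stock
Troubleshooting steps:
Fix: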
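```sql
-- Example stock correction script
-- stock = initial quantity minus the quantity reserved by orders.
-- The aggregate runs directly in the correlated subquery; wrapping it in a
-- derived table that references p.id fails on MySQL versions before 8.0.14.
UPDATE products p
SET stock = (
    SELECT
        MAX(CASE WHEN type = 'initial' THEN quantity ELSE 0 END)
        - SUM(CASE WHEN type = 'order' THEN quantity ELSE 0 END)
    FROM inventory_logs
    WHERE product_id = p.id
)
WHERE p.track_inventory = true;
```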
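Before running a correction like this against production, it can help to do a dry run that only lists the products whose recorded stock disagrees with the logs. The sketch below applies the same rule as the SQL (initial quantity minus ordered quantity) to plain arrays. The row shape and the helper name `expectedStockFromLogs` are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Recomputes expected stock per product from log rows, using the same rule
 * as the SQL script: initial quantity minus the sum of ordered quantities.
 * Products without an 'initial' row are skipped in this sketch.
 *
 * @param list<array{product_id: int, type: string, quantity: int}> $logs
 * @return array<int, int>
 */
function expectedStockFromLogs(array $logs): array
{
    $initial = [];
    $reserved = [];
    foreach ($logs as $row) {
        $id = $row['product_id'];
        if ($row['type'] === 'initial') {
            $initial[$id] = max($initial[$id] ?? 0, $row['quantity']);
        } elseif ($row['type'] === 'order') {
            $reserved[$id] = ($reserved[$id] ?? 0) + $row['quantity'];
        }
    }

    $expected = [];
    foreach ($initial as $id => $quantity) {
        $expected[$id] = $quantity - ($reserved[$id] ?? 0);
    }

    return $expected;
}

$logs = [
    ['product_id' => 1, 'type' => 'initial', 'quantity' => 50],
    ['product_id' => 1, 'type' => 'order',   'quantity' => 4],
    ['product_id' => 1, 'type' => 'order',   'quantity' => 6],
    ['product_id' => 2, 'type' => 'initial', 'quantity' => 20],
];
$recorded = [1 => 43, 2 => 20];

$expected = expectedStockFromLogs($logs);

// Products whose recorded stock differs from the value derived from the logs.
$mismatches = array_diff_assoc($recorded, $expected);
```

Here product 1 is recorded at 43 but the logs imply 40, so it is the only entry in `$mismatches`.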
Error log:
```
Deadlock found when trying to get lock; try restarting transaction
```
Optimization:
```php
public function attemptReturnStock()
{
    $retry = 0;
    $maxRetry = 3;
    do {
        try {
            return $this->performReturn();
        } catch (QueryException $e) {
            // Only deadlocks are retried; any other query error is rethrown.
            if (str_contains($e->getMessage(), 'Deadlock')) {
                usleep(random_int(100, 500) * 1000); // random delay of 100-500 ms
                $retry++;
                continue;
            }
            throw $e;
        }
    } while ($retry < $maxRetry);

    throw new Exception('Maximum retry count exceeded');
}
```
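Matching on the message text is fragile, since wording can change between server versions or locales. A possible alternative, sketched below, is to inspect the driver error info: MySQL reports a deadlock as SQLSTATE `40001` with error code `1213`. The helper names `isDeadlock` and `backoffMs` are hypothetical, and the exceptions are built by hand only to exercise the check.

```php
<?php
declare(strict_types=1);

/**
 * Detects a MySQL deadlock from the driver error info instead of the message.
 * MySQL reports deadlocks as SQLSTATE 40001 with driver error code 1213.
 */
function isDeadlock(PDOException $e): bool
{
    $info = $e->errorInfo ?? [];

    return (string) ($info[0] ?? '') === '40001'
        && (int) ($info[1] ?? 0) === 1213;
}

/**
 * Backoff in milliseconds for a 1-based retry number:
 * doubles from $baseMs and adds random jitter of up to $baseMs.
 */
function backoffMs(int $retry, int $baseMs = 100): int
{
    return $baseMs * (2 ** max(0, $retry - 1)) + random_int(0, $baseMs);
}

// Hand-built exceptions, only to exercise the classifier.
$deadlock = new PDOException('Deadlock found when trying to get lock; try restarting transaction');
$deadlock->errorInfo = ['40001', 1213, 'Deadlock found when trying to get lock'];

$other = new PDOException('Duplicate entry');
$other->errorInfo = ['23000', 1062, 'Duplicate entry'];

$retryDeadlock = isDeadlock($deadlock); // true
$retryOther    = isDeadlock($other);    // false
$thirdDelay    = backoffMs(3);          // between 400 and 500 ms
```

Laravel's `QueryException` extends `PDOException`, so the same check should apply to it. Laravel's `DB::transaction()` also accepts a second argument for the number of attempts when a deadlock occurs, which may remove the need for a hand-written loop in simple cases.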
Abstract stock operations into a dedicated service:
```php
class InventoryService
{
    public function returnStock(Order $order): void
    {
        // Only rejected orders whose payment was voided or refunded qualify.
        Validator::make($order->toArray(), [
            'status' => 'required|in:rejected',
            'payment_status' => 'required|in:voided,refunded',
        ])->validate();

        $this->guardAgainstConcurrentModification($order);

        DB::transaction(function () use ($order) {
            // Core stock-return logic
        });
    }

    protected function guardAgainstConcurrentModification(Order $order)
    {
        // getOriginal() returns the attributes as of the last load or save, so this
        // passes only if the order was 'shipped' before the in-memory change to 'rejected'.
        $original = $order->getOriginal();
        if ($original['status'] !== 'shipped') {
            throw new ConcurrentModificationException(
                "Order status changed from {$original['status']} to {$order->status}"
            );
        }
    }
}
```
Record every stock change as an event stream:
```php
event(new InventoryChanged(
    productId: $product->id,
    delta: $item->quantity,
    type: 'return',
    source: ['order' => $order->id],
    metadata: [
        'operator' => 'system',
        'reason' => 'cod_rejected',
    ]
));
```
Rebuild the current stock through a projection:
```php
class CurrentStockProjection
{
    // Current stock is the sum of all recorded deltas for the product.
    public function rebuild($productId)
    {
        return InventoryEvent::where('product_id', $productId)
            ->sum('delta');
    }
}
```
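The same fold can be tried on plain arrays to see how a rejected COD order cancels out in the event stream. The event shape and the function name `projectStock` are assumptions made for this sketch.

```php
<?php
declare(strict_types=1);

/**
 * Folds a list of inventory events into current stock per product.
 * Plain-array version of the projection; the event shape is an assumption.
 *
 * @param list<array{product_id: int, delta: int, type: string}> $events
 * @return array<int, int>
 */
function projectStock(array $events): array
{
    $stock = [];
    foreach ($events as $event) {
        $id = $event['product_id'];
        $stock[$id] = ($stock[$id] ?? 0) + $event['delta'];
    }

    return $stock;
}

$events = [
    ['product_id' => 7, 'delta' => 100, 'type' => 'initial'],
    ['product_id' => 7, 'delta' => -3,  'type' => 'order'],
    ['product_id' => 7, 'delta' => 3,   'type' => 'return'], // COD rejection
    ['product_id' => 9, 'delta' => 40,  'type' => 'initial'],
    ['product_id' => 9, 'delta' => -1,  'type' => 'order'],
];

$current = projectStock($events); // [7 => 100, 9 => 39]
```

Because the events are never modified, a suspicious stock figure can always be traced back to the exact order and reason that produced each delta.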
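After rolling out the full solution, our stock discrepancy rate dropped from 0.17% to 0.002%, and the average time spent handling overnight incidents fell by 82%. The most valuable lesson: in a distributed system, state rollback must be designed as a business process in its own right, not treated as the forward flow run in reverse.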