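Last year, while building an automated testing tool, I ran into a thorny problem: how do you accurately simulate the complete flow of a human operating a browser? The automation tools on the market are either too limited or have a steep learning curve. After many attempts, I ended up building this browser automation solution on the Openclaw framework. Today I'll share the implementation details of the third phase.
Openclaw (internal codename "Lobster") is a browser automation control framework that our team developed in-house. Its core advantages are:
The following environment is recommended: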
```bash
# Base environment (prerequisites, not commands):
#   Python 3.8+
#   Node.js 14+
#   Java 8 (required by some drivers)

# Core dependencies
pip install openclaw-core==2.3.1
pip install selenium-wire
npm install puppeteer-cluster
```
Note: each browser needs a matching WebDriver version. The following call manages them automatically:
```python
from openclaw.driver import DriverManager

DriverManager.install_all()
```
Create `config/browser.yaml`:
```yaml
default:
  headless: false
  timeout: 30
  retry_times: 3
  viewport:
    width: 1366
    height: 768
chrome_options:
  args:
    - --disable-infobars
    - --disable-extensions
  prefs:
    profile.default_content_setting_values.notifications: 2
```
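One way such a file might be consumed is to layer per-run overrides on top of the `default` profile. The sketch below is an assumption, not Openclaw's actual loader: `merge_config` is a hypothetical helper, and the `DEFAULTS` dict simply mirrors the `default` section of the YAML so the example runs without a YAML parser.

```python
import copy


def merge_config(defaults, overrides):
    """Return a new dict with `overrides` layered recursively over `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested sections (e.g. viewport) are merged key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Mirrors the `default` section of config/browser.yaml.
DEFAULTS = {
    "headless": False,
    "timeout": 30,
    "retry_times": 3,
    "viewport": {"width": 1366, "height": 768},
}

# Hypothetical CI profile: headless, wider viewport, everything else inherited.
ci_config = merge_config(DEFAULTS, {"headless": True, "viewport": {"width": 1920}})
print(ci_config["viewport"])
```

Merging into a copy leaves the defaults untouched, so several profiles can be derived from the same base in one process.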
Traditional locator strategies often break when the DOM changes dynamically. Our improved approach:
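```python
import time

from selenium.common.exceptions import (
    NoSuchElementException,
    StaleElementReferenceException,
)
from selenium.webdriver.common.by import By


def smart_locate(selector, timeout=10):
    """Smart element locator: poll until the element is found or `timeout` expires."""
    # `driver` is the module-level WebDriver instance.
    start_time = time.time()
    while time.time() - start_time < timeout:
        try:
            # Try the regular CSS lookup first.
            elem = driver.find_element(By.CSS_SELECTOR, selector)
            if elem.is_displayed():
                return elem
        except (NoSuchElementException, StaleElementReferenceException):
            pass
        # Fallback: XPath lookup by id (visual matching is not shown in this snippet).
        if '#' in selector:
            xpath = f'//*[@id="{selector.split("#")[1]}"]'
            elems = driver.find_elements(By.XPATH, xpath)
            if elems:
                return elems[0]
        time.sleep(0.5)
    raise TimeoutError(f"Element {selector} not found")
```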
Implementing a human-like mouse movement path:
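```python
from selenium.webdriver.common.action_chains import ActionChains


def human_move_to(element):
    """Simulate a human moving the mouse to the center of `element`."""
    # Assumes the pointer starts at the viewport origin, as in a fresh session;
    # move_by_offset is relative to the current pointer position.
    start_x, start_y = 0, 0
    end_x = element.location['x'] + element.size['width'] / 2
    end_y = element.location['y'] + element.size['height'] / 2
    # Generate the Bezier-curve path (generate_bezier is a project helper).
    points = generate_bezier(
        start=(start_x, start_y),
        end=(end_x, end_y),
        control_points=3,
        deviation=30
    )
    # Move in segments, converting absolute points into relative offsets.
    action = ActionChains(driver)
    prev_x, prev_y = start_x, start_y
    for x, y in points:
        dx, dy = round(x - prev_x), round(y - prev_y)
        action.move_by_offset(dx, dy)
        prev_x, prev_y = prev_x + dx, prev_y + dy
    action.perform()
```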
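The `generate_bezier` helper is not shown in this article. A minimal sketch of what it could look like follows, assuming it returns a list of absolute `(x, y)` points. The `steps` and `rng` parameters, and the way control points are jittered, are my own assumptions for illustration rather than the framework's implementation.

```python
import random


def generate_bezier(start, end, control_points=3, deviation=30, steps=20, rng=None):
    """Return `steps` points on a Bezier curve from `start` to `end`.

    Hypothetical sketch: control points are spread evenly along the straight
    line and jittered by up to `deviation` pixels on each axis.
    """
    if rng is None:
        rng = random.Random()
    (x0, y0), (x1, y1) = start, end
    ctrl = [start]
    for i in range(1, control_points + 1):
        t = i / (control_points + 1)
        ctrl.append((
            x0 + (x1 - x0) * t + rng.uniform(-deviation, deviation),
            y0 + (y1 - y0) * t + rng.uniform(-deviation, deviation),
        ))
    ctrl.append(end)

    def point_at(t):
        # De Casteljau's algorithm: repeatedly interpolate between neighbours.
        pts = ctrl
        while len(pts) > 1:
            pts = [
                ((1 - t) * ax + t * bx, (1 - t) * ay + t * by)
                for (ax, ay), (bx, by) in zip(pts, pts[1:])
            ]
        return pts[0]

    # Skip t=0 (the pointer is already there) and finish exactly at t=1.
    return [point_at(i / steps) for i in range(1, steps + 1)]


# A seeded generator makes the path reproducible, which helps when debugging.
path = generate_bezier((0, 0), (400, 300), rng=random.Random(42))
```

Because the curve stays inside the convex hull of its control points, `deviation` bounds how far the path can stray from the straight line.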
```python
import time


class PageStateDetector:
    def __init__(self, driver):
        self.driver = driver
        self.baseline = None

    def capture_state(self):
        """Record the current page's characteristics."""
        # The _get_* helpers and _compare_states are not shown in this article.
        return {
            'dom_hash': self._get_dom_hash(),
            'network': self._get_active_requests(),
            'performance': self._get_performance_metrics()
        }

    def is_stable(self, timeout=5, interval=0.5):
        """Check whether the page has stabilized."""
        if not self.baseline:
            # First call: record a baseline and report "not yet stable".
            self.baseline = self.capture_state()
            return False
        start = time.time()
        while time.time() - start < timeout:
            current = self.capture_state()
            if self._compare_states(self.baseline, current):
                return True
            time.sleep(interval)
            # Compare the next snapshot against the most recent one.
            self.baseline = current
        return False
```
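The private helpers are left out above. As a rough sketch of the idea, the two functions below show one way `_get_dom_hash` and `_compare_states` could work. I am assuming that `network` holds the number of in-flight requests and that the HTML string would come from something like `driver.page_source`. The names `dom_hash` and `states_match` are hypothetical.

```python
import hashlib


def dom_hash(html):
    """Fingerprint the serialized DOM so two snapshots compare cheaply."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def states_match(baseline, current):
    """Treat the page as unchanged when the DOM is identical and nothing is loading."""
    return (
        baseline["dom_hash"] == current["dom_hash"]
        and current["network"] == 0  # assumed: count of in-flight requests
    )


# Two snapshots taken while the page is still rendering.
loading = {"dom_hash": dom_hash("<p>loading</p>"), "network": 2}
loaded = {"dom_hash": dom_hash("<p>done</p>"), "network": 0}

print(states_match(loading, loaded))       # DOM changed between snapshots
print(states_match(loaded, dict(loaded)))  # identical DOM, no pending requests
```

Hashing keeps each snapshot small, at the cost of treating any DOM change as instability, including a blinking cursor or a ticking clock.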
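```python
import functools
import logging


def auto_recover(func):
    """Decorator that recovers from failures in an automation step and retries it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        retry = config.get('retry_times', 3)
        while retry > 0:
            try:
                return func(*args, **kwargs)
            except Exception as e:
                logging.warning(f"Operation failed: {e}")
                handle_exception(e)
                retry -= 1
                if retry == 0:
                    # Out of attempts: re-raise the last error.
                    raise
                refresh_context()
    return wrapper
```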
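To see the retry behavior in isolation, the sketch below re-creates the decorator with its collaborators passed in as arguments. `make_auto_recover`, `flaky_click`, and `count_refresh` are hypothetical names. The callbacks stand in for `handle_exception` and `refresh_context`, whose real implementations are not shown here.

```python
import functools


def make_auto_recover(retry_times=3, handle_exception=None, refresh_context=None):
    """Build a retry decorator; the callbacks stand in for the framework hooks."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, retry_times + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    if handle_exception:
                        handle_exception(exc)
                    if attempt == retry_times:
                        raise  # out of attempts: surface the last error
                    if refresh_context:
                        refresh_context()
        return wrapper
    return decorator


calls = {"attempts": 0, "refreshes": 0}


def count_refresh():
    calls["refreshes"] += 1


@make_auto_recover(retry_times=3, refresh_context=count_refresh)
def flaky_click():
    """Fail twice, then succeed, like an element that renders late."""
    calls["attempts"] += 1
    if calls["attempts"] < 3:
        raise RuntimeError("element not ready")
    return "clicked"


result = flaky_click()
print(result, calls)
```

The context is refreshed only between attempts, never after the final failure, which matches the order of operations in the decorator above.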
```python
@auto_recover
def purchase_flow(url, item_id, address):
    """Complete order flow on an e-commerce site."""
    # Initialize the detector
    detector = PageStateDetector(driver)

    # Open the product page
    driver.get(url)
    wait_until_stable(detector)

    # Select the product variant
    select_item(item_id)
    human_move_to(find('add_to_cart'))
    smart_click('add_to_cart')

    # Checkout
    wait_until_stable(detector)
    smart_click('checkout_btn')
    fill_shipping_info(address)
    submit_payment()

    # Order confirmation
    return get_order_number()
```
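`wait_until_stable` is used in the flow above but not defined in this article. A minimal sketch, assuming it simply polls `detector.is_stable()` until the page settles, could look like the following. The `timeout` and `interval` parameters and the `FakeDetector` test double are my own additions for illustration.

```python
import time


def wait_until_stable(detector, timeout=30, interval=0.5):
    """Poll detector.is_stable() until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if detector.is_stable():
            return True
        time.sleep(interval)
    raise TimeoutError(f"Page did not stabilize within {timeout}s")


class FakeDetector:
    """Test double: reports stable on the third poll."""

    def __init__(self):
        self.polls = 0

    def is_stable(self):
        self.polls += 1
        return self.polls >= 3


fake = FakeDetector()
stable = wait_until_stable(fake, timeout=2, interval=0.01)
print(stable, fake.polls)
```

Polling in a loop also absorbs the first `is_stable()` call, which only records a baseline and returns `False`.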
```python
import concurrent.futures

import requests


def preload_resources(urls):
    """Preload key resources."""
    # Read the browser's user agent once instead of once per URL.
    user_agent = driver.execute_script("return navigator.userAgent")
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = [
            executor.submit(requests.head, url, headers={'User-Agent': user_agent})
            for url in urls
        ]
        concurrent.futures.wait(futures)
```
```javascript
// Page optimization script to inject
function enableDOMCache() {
  const originalCreateElement = document.createElement.bind(document);
  document.createElement = function (tagName, options) {
    const elem = originalCreateElement(tagName, options);
    // In an HTML document elem.tagName is upper-case, whatever case the caller passed.
    if (['DIV', 'SPAN', 'A'].includes(elem.tagName)) {
      elem.dataset.cached = 'true';
    }
    return elem;
  };
}
```
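| Error code | Meaning | Solution |
|---|---|---|
| E404 | Element not found | Check the selector or increase the wait time |
| E405 | Interaction intercepted | Disable browser extensions or adjust the click position |
| E408 | Page timeout | Check the network or raise the timeout threshold |
| E500 | Internal error | Restart the browser instance |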
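The table can also be kept in code so that log messages carry the suggested fix. The sketch below is only an illustration: `ErrorCode`, `REMEDIES`, and `remedy_for` are hypothetical names, not part of the Openclaw API.

```python
from enum import Enum


class ErrorCode(Enum):
    ELEMENT_NOT_FOUND = "E404"
    INTERACTION_INTERCEPTED = "E405"
    PAGE_TIMEOUT = "E408"
    INTERNAL_ERROR = "E500"


# Suggested fixes, copied from the table above.
REMEDIES = {
    ErrorCode.ELEMENT_NOT_FOUND: "Check the selector or increase the wait time",
    ErrorCode.INTERACTION_INTERCEPTED: "Disable browser extensions or adjust the click position",
    ErrorCode.PAGE_TIMEOUT: "Check the network or raise the timeout threshold",
    ErrorCode.INTERNAL_ERROR: "Restart the browser instance",
}


def remedy_for(code):
    """Return the suggested fix for a code such as 'E404', or None if unknown."""
    try:
        return REMEDIES[ErrorCode(code)]
    except ValueError:
        return None


print(remedy_for("E408"))
```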
Recommended logging configuration:
```python
import logging
import time

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s [%(levelname)s] %(message)s',
    handlers=[
        logging.FileHandler('openclaw.log'),
        logging.StreamHandler()
    ]
)


# Log marker for key operations
def log_step(action, selector=None):
    logging.info(f"[ACTION] {action}" + (f" on {selector}" if selector else ""))
    take_screenshot(f"step_{int(time.time())}.png")
```
Creating a plugin from a template:
```python
from openclaw.plugins import BasePlugin
from selenium.webdriver.common.action_chains import ActionChains


class CustomPlugin(BasePlugin):
    PLUGIN_NAME = "my_plugin"

    def __init__(self, driver):
        super().__init__(driver)
        self.register_action('triple_click', self.triple_click)

    def triple_click(self, selector):
        elem = self.find_element(selector)
        # Three clicks 0.1 s apart.
        ActionChains(self.driver)\
            .click(elem).pause(0.1)\
            .click(elem).pause(0.1)\
            .click(elem).perform()
```
Strategy for handling browser differences:
```python
def adapt_selector(selector):
    """Adapt a selector to the target browser."""
    browser = config.get('browser', 'chrome')
    if browser == 'firefox':
        if selector.startswith('.'):
            return f'css|{selector}'
        elif selector.startswith('#'):
            return f'id|{selector[1:]}'
    return selector
```
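The adapted string has to be split back into a lookup strategy and a value somewhere downstream. That code is not shown in this article, so the sketch below is an assumption about how the `strategy|value` format might be read. `split_strategy` and `KNOWN_STRATEGIES` are hypothetical names.

```python
# Prefixes produced by adapt_selector for Firefox.
KNOWN_STRATEGIES = {"css", "id"}


def split_strategy(adapted):
    """Split 'strategy|value' into (strategy, value); default to CSS."""
    prefix, sep, rest = adapted.partition("|")
    if sep and prefix in KNOWN_STRATEGIES:
        return prefix, rest
    # No recognized prefix: treat the whole string as a CSS selector.
    return "css", adapted


print(split_strategy("id|login"))
print(split_strategy("css|.btn-primary"))
print(split_strategy("[lang|=en]"))
```

Checking the prefix against a known set matters because `|` is legal inside CSS, for example in the attribute selector `[lang|=en]`.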
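In real projects this solution has handled more than 200,000 browser operations, with a success rate that has stayed above 99.7%. Best of all, every component can be used on its own: you can adopt just the smart waiting mechanism, for example, or just the exception recovery system. For first-time use, I recommend validating it on a small scenario first and then expanding step by step to more complex flows.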