fix: file-worker task status update requests arriving out of order cause occasional distribution failures for third-party source files #2434

Closed
jsonwan opened this issue Sep 6, 2023 · 0 comments
Assignees
Labels
done (released to production and accepted) kind/bug (program fault: bug or vulnerability)

Comments

@jsonwan
Collaborator

jsonwan commented Sep 6, 2023

Version / Branch / tag
3.7.x

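What Happened?
Distribution of third-party source files occasionally fails:
image

Analysis:
image
企业微信截图_16939696651799
This is a concurrency problem. file-worker sends two requests at almost the same time, but the later request reaches file-gateway first. Request handling is not atomic, so the interleaved access of the two requests to the same data causes the failure.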
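
The interleaving described in the analysis can be replayed deterministically with a small sketch. This is not the bk-job code: the `FileSourceTask` model, the status strings, and every function name below are hypothetical, and the handler is reduced to "update one subtask, count the subtask statuses, apply the result to the parent task". The unsafe handler is written as a generator so that the gap between counting and applying can be forced open. The locked handler shows one way to close that gap, in the spirit of the commit note on this issue about locking while subtask completion is counted.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class FileSourceTask:
    # Hypothetical model: a parent task plus the status of each subtask.
    subtask_status: dict
    status: str = "RUNNING"
    lock: threading.Lock = field(default_factory=threading.Lock)


def summarize(statuses):
    """Derive the parent status from the current subtask statuses."""
    values = list(statuses.values())
    if any(v == "FAILED" for v in values):
        return "FAILED"
    if all(v == "SUCCESS" for v in values):
        return "SUCCESS"
    return "RUNNING"


def handle_update_unsafe_steps(task, subtask_id, new_status):
    """Non-atomic handler, split at the point where another request can run."""
    task.subtask_status[subtask_id] = new_status
    counted = summarize(task.subtask_status)  # step 1: count
    yield                                     # another request may run here
    task.status = counted                     # step 2: apply a possibly stale count


def handle_update_locked(task, subtask_id, new_status):
    """Update, count, and apply under one lock so the steps cannot interleave."""
    with task.lock:
        task.subtask_status[subtask_id] = new_status
        task.status = summarize(task.subtask_status)


def replay_interleaving():
    """Replay: request 1 counts, request 2 runs fully, request 1 applies."""
    task = FileSourceTask({"a": "RUNNING", "b": "RUNNING"})
    first = handle_update_unsafe_steps(task, "a", "SUCCESS")
    second = handle_update_unsafe_steps(task, "b", "SUCCESS")
    next(first)          # request 1 counts while "b" is still RUNNING
    for _ in second:     # request 2 updates "b", counts, and applies SUCCESS
        pass
    for _ in first:      # request 1 overwrites the parent with its stale count
        pass
    return task


def run_locked():
    """Run the same two updates concurrently through the locked handler."""
    task = FileSourceTask({"a": "RUNNING", "b": "RUNNING"})
    threads = [
        threading.Thread(target=handle_update_locked, args=(task, key, "SUCCESS"))
        for key in ("a", "b")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return task
```

In the replay, both subtasks end as `SUCCESS` while the parent stays `RUNNING`, which is the kind of stuck state that would only resolve through a timeout or reschedule. With the locked handler, whichever request takes the lock last sees both updates, so the parent ends as `SUCCESS` regardless of arrival order.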

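How to reproduce?
Create a dial-test task that distributes third-party source files. After it has run for a long time, a small number of failed distributions appear among the large number of tasks.
Characteristics: the task takes a long time (about 30 minutes), its final status is execution failed, the status of the individual host is waiting for execution, and the source upload log shows that one reschedule was triggered.

What you expect?
Distribution succeeds.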

@jsonwan jsonwan added kind/bug (program fault: bug or vulnerability) backlog (initial state, awaiting product evaluation) labels Sep 6, 2023
@jsonwan jsonwan self-assigned this Sep 6, 2023
wangyu096 added a commit that referenced this issue Sep 14, 2023
fix: file-worker task status update requests arriving out of order cause occasional distribution failures for third-party source files #2434
jsonwan added a commit to jsonwan/bk-job that referenced this issue Sep 15, 2023
Take a shared lock while counting subtask completion status, so the data cannot be modified while the count result is being applied
jsonwan added a commit to jsonwan/bk-job that referenced this issue Sep 15, 2023
Take a lock while counting subtask completion status, so the data cannot be modified while the count result is being applied
jsonwan added a commit that referenced this issue Sep 15, 2023
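fix: file-worker task status update requests arriving out of order cause occasional distribution failures for third-party source files #2434
@bkjob-bot bkjob-bot added for test (ready for acceptance in the test environment) done (released to production and accepted) and removed backlog (initial state, awaiting product evaluation) for test (ready for acceptance in the test environment) labels Sep 21, 2023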
jsonwan added a commit to jsonwan/bk-job that referenced this issue Sep 26, 2023
jsonwan added a commit to jsonwan/bk-job that referenced this issue Sep 26, 2023
jsonwan added a commit to jsonwan/bk-job that referenced this issue Sep 26, 2023
1. Fix broken links in Trace data;
2. Optimize the rescheduling logic and make its parameters configurable;
3. Refactor some code.
wangyu096 added a commit that referenced this issue Sep 26, 2023
fix: file-worker task status update requests arriving out of order cause occasional distribution failures for third-party source files #2434
2 participants