Found a data race in GCWorker #42405

Open
winkyao opened this issue Mar 20, 2023 · 11 comments
Labels
affects-7.1 affects-7.5 affects-8.1 may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.

Comments

@winkyao
Contributor

winkyao commented Mar 20, 2023

Bug Report

Original report: https://ask.pingcap.com/t/data-race-found-in-version/474

Application environment: 3 TiKV nodes, 3 PD nodes, 2 TiDB nodes in a local cluster
PoC environment
TiDB version: 6.6.0 (commit b417ad0)
Reproduction method: Brought the environment up and ran a simple bank account transaction workload on the Antithesis testing platform.

Problem: 3 Data Races observed:
First DATA RACE:

WARNING: DATA RACE
Read at 0x00c001e86958 by goroutine 9152:
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).resolveLocks()
      /tidb/store/gcworker/gc_worker.go:1169 +0xb2
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).runGCJob()
      /tidb/store/gcworker/gc_worker.go:767 +0x117
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).leaderTick.func1()
      /tidb/store/gcworker/gc_worker.go:444 +0x6c

Previous write at 0x00c001e86958 by goroutine 1007:
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).checkLeader()
      /tidb/store/gcworker/gc_worker.go:1920 +0x1cb
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).tick()
      /tidb/store/gcworker/gc_worker.go:286 +0x64
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).start()
      /tidb/store/gcworker/gc_worker.go:229 +0x659
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).Start.func1()
      /tidb/store/gcworker/gc_worker.go:120 +0x64

Goroutine 9152 (running) created at:
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).leaderTick()
      /tidb/store/gcworker/gc_worker.go:443 +0xb5a
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).tick()
      /tidb/store/gcworker/gc_worker.go:293 +0x8f
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).start()
      /tidb/store/gcworker/gc_worker.go:229 +0x659
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).Start.func1()
      /tidb/store/gcworker/gc_worker.go:120 +0x64

Goroutine 1007 (running) created at:
  github.com/pingcap/tidb/store/gcworker.(*GCWorker).Start()
      /tidb/store/gcworker/gc_worker.go:120 +0x1f7
  github.com/pingcap/tidb/store/driver.(*tikvStore).StartGCWorker()
      /tidb/store/driver/tikv_driver.go:320 +0x98
  github.com/pingcap/tidb/session.BootstrapSession()
      /tidb/session/session.go:3467 +0x1aaa
  github.com/pingcap/tidb/domain.(*Domain).rebuildSysVarCache()
      /tidb/domain/sysvar_cache.go:147 +0x8d0
  github.com/pingcap/tidb/sessionctx/variable.glob..func174()
      /tidb/sessionctx/variable/sysvar.go:904 +0x5e
  github.com/pingcap/tidb/sessionctx/variable.(*SysVar).ValidateWithRelaxedValidation()
      /tidb/sessionctx/variable/variable.go:361 +0x1c7
  github.com/pingcap/tidb/domain.(*Domain).rebuildSysVarCache()
      /tidb/domain/sysvar_cache.go:146 +0x844
  github.com/pingcap/tidb/domain.(*Domain).LoadSysVarCacheLoop()
      /tidb/domain/domain.go:1448 +0xa8
  github.com/pingcap/tidb/session.BootstrapSession()
      /tidb/session/session.go:3350 +0x6d3
  github.com/pingcap/tidb/domain.(*Domain).GetSessionCache()
      /tidb/domain/sysvar_cache.go:62 +0x59
  github.com/pingcap/tidb/session.(*session).loadCommonGlobalVariablesIfNeeded()
      /tidb/session/session.go:3669 +0x104
  github.com/pingcap/tidb/session.(*session).ExecuteStmt()
      /tidb/session/session.go:2148 +0x145
  github.com/pingcap/tidb/session.(*session).ExecuteInternal()
      /tidb/session/session.go:1678 +0x31b
  github.com/pingcap/tidb/domain.(*Domain).LoadPrivilegeLoop()
      /tidb/domain/domain.go:1392 +0x130
  github.com/pingcap/tidb/session.BootstrapSession()
      /tidb/session/session.go:3343 +0x684
  main.createStoreAndDomain()
      /tidb/tidb-server/main.go:351 +0x304
  main.main()

Data race detected.

@winkyao winkyao added the type/bug The issue is confirmed as a bug. label Mar 20, 2023
@davidsearle-antithesis

This is Dave from Antithesis who originally logged this in the TiDB forum.

@ti-chi-bot ti-chi-bot added may-affects-4.0 This bug maybe affects 4.0.x versions. may-affects-5.0 This bug maybe affects 5.0.x versions. may-affects-5.1 This bug maybe affects 5.1.x versions. may-affects-5.2 This bug maybe affects 5.2.x versions. may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 labels Mar 21, 2023
@zyguan
Contributor

zyguan commented Mar 21, 2023

@3pointer @joccau PTAL, there is a data race on logBackupEnabled.
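
For readers unfamiliar with this class of bug: the trace shows one goroutine writing the field in checkLeader() while another reads it in resolveLocks(), with no synchronization between them. One common way to make such a flag safe is an atomic boolean. The following is only a minimal sketch under that assumption; the `gcWorker` type and the `run` helper are hypothetical stand-ins rather than the real GCWorker code, and this is not necessarily how the project resolved the issue.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// gcWorker is a hypothetical stand-in for GCWorker; only the flag is modelled.
type gcWorker struct {
	// atomic.Bool (Go 1.19+) makes concurrent Load/Store race-free.
	logBackupEnabled atomic.Bool
}

// checkLeader mimics the writer side from the trace (the tick goroutine).
func (w *gcWorker) checkLeader(enabled bool) {
	w.logBackupEnabled.Store(enabled)
}

// resolveLocks mimics the reader side from the trace (the GC job goroutine).
func (w *gcWorker) resolveLocks() bool {
	return w.logBackupEnabled.Load()
}

// run exercises the writer and the reader concurrently and returns the
// final value of the flag once both goroutines have finished.
func run() bool {
	w := &gcWorker{}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			w.checkLeader(true)
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			_ = w.resolveLocks()
		}
	}()
	wg.Wait()
	return w.resolveLocks()
}

func main() {
	fmt.Println(run())
}
```

Running this sketch with `go run -race` should produce no race report, whereas the same program with a plain `bool` field would be expected to trigger a warning like the one above.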

@davidsearle-antithesis

davidsearle-antithesis commented Mar 21, 2023 via email

@zyguan
Contributor

zyguan commented Mar 21, 2023

> I have two other data races I can share with you. Who can I talk to, to show how I'm finding these....?

Thanks, @davidsearle-antithesis. Can you post the stacks of the other two data races here? In most cases, the stack info is enough to investigate data race issues.

@davidsearle-antithesis

davidsearle-antithesis commented Mar 21, 2023 via email

@mjonss
Contributor

mjonss commented May 4, 2023

Just as a note, Daniël and I had a meeting with David yesterday and will follow up from there regarding possible collaborations.

@BornChanger
Contributor

@davidsearle-antithesis can you please tell me whether the problem still occurs on the current TiDB version?

@3pointer
Contributor

3pointer commented Mar 25, 2024

This should be fixed in #40759; the racy code was removed in that PR.

@cfzjywxk
Contributor

cfzjywxk commented May 6, 2024

@MyonKeminta
Could you help verify whether it still exists?

@MyonKeminta
Contributor

@3pointer Did you encounter this issue again? Are there any more details?

@davidsearle-antithesis

davidsearle-antithesis commented May 6, 2024 via email
