Lots of PreCommit1 sectors unscheduled to idle workers #4955

Closed
dodohack opened this issue Nov 21, 2020 · 7 comments

@dodohack

I'm running the lotus v1.2.1 release on mainnet.

Many of my sectors have been stuck in PreCommit1 (PC1) for a long time. They are not being scheduled to worker3960x or worker7402, both of which should be able to take several more PC1 jobs.

Below are the sector list, the worker list, and the jobs my workers are currently running.

Is there a way to manually schedule these PC1 sectors?

root@storager:/home/aries# lotus-miner sectors list
ID  State       OnChain  Active  Expiration                    Deals  
0   Proving     YES      YES     1800659 (in 1 year 24 weeks)  CC     
1   Proving     YES      YES     1800659 (in 1 year 24 weeks)  CC     
2   PreCommit1  NO       NO      n/a                           CC     
3   PreCommit1  NO       NO      n/a                           CC     
4   Proving     YES      YES     1800659 (in 1 year 24 weeks)  CC     
6   Proving     YES      YES     1800659 (in 1 year 24 weeks)  CC     
7   PreCommit1  NO       NO      n/a                           CC     
8   Proving     YES      YES     1800659 (in 1 year 24 weeks)  CC     
9   Proving     YES      YES     1800659 (in 1 year 24 weeks)  CC     
10  PreCommit1  NO       NO      n/a                           CC     
11  PreCommit1  NO       NO      n/a                           CC     
12  PreCommit2  NO       NO      n/a                           CC     
13  Removing    NO       NO      n/a                           CC     
14  PreCommit2  NO       NO      n/a                           CC     
15  Proving     YES      NO      1806419 (in 1 year 24 weeks)  CC     
16  Proving     YES      NO      1806419 (in 1 year 24 weeks)  CC     
17  Proving     YES      NO      1806419 (in 1 year 24 weeks)  CC     
18  PreCommit1  NO       NO      n/a                           CC     
19  PreCommit1  NO       NO      n/a                           CC     
20  Committing  NO       NO      n/a                           CC     
21  Proving     YES      NO      1806419 (in 1 year 24 weeks)  CC     
22  PreCommit1  NO       NO      n/a                           CC     
23  PreCommit1  NO       NO      n/a                           CC     
24  PreCommit1  NO       NO      n/a                           CC     
25  PreCommit1  NO       NO      n/a                           CC     
26  PreCommit1  NO       NO      n/a                           CC     
27  PreCommit1  NO       NO      n/a                           CC     
28  PreCommit1  NO       NO      n/a                           CC     
29  WaitDeals   NO       NO      n/a                           2      
root@storager:/home/aries# lotus mpool pending --local
root@storager:/home/aries# lotus-miner sealing workers
Worker 08689063-5321-4c99-9317-00c2c8f84fd2, host worker7402
	CPU:  [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||      ] 44/48 core(s) in use
	RAM:  [|||                                                             ] 7% 35.42 GiB/503.6 GiB
	VMEM: [||||||||||                                                      ] 16% 155.4 GiB/950.7 GiB
	GPU: GeForce RTX 2080 Ti, used
Worker 11d06504-ac19-40cc-ae0e-a4dc54c81afd, host worker7402
	CPU:  [||||                                                            ] 3/48 core(s) in use
	RAM:  [|||||||||||||||||||||                                           ] 34% 173.4 GiB/503.6 GiB
	VMEM: [||||||||||||                                                    ] 20% 197.4 GiB/950.7 GiB
	GPU: GeForce RTX 2080 Ti, not used
Worker 79e425bf-e5de-4c50-a242-ce67d652fc77, host storager
	CPU:  [                                                                ] 0/56 core(s) in use
	RAM:  [||||                                                            ] 7% 18.69 GiB/251.8 GiB
	VMEM: [||||                                                            ] 7% 18.69 GiB/251.8 GiB
Worker 9c80ef74-4e8d-4139-be6f-1f0bef315bfc, host worker3960x
	CPU:  [                                                                ] 0/48 core(s) in use
	RAM:  [|                                                               ] 1% 4.13 GiB/251.7 GiB
	VMEM: [                                                                ] 0% 4.13 GiB/512.5 GiB
	GPU: Quadro RTX 4000, not used
Worker cb169b1c-c05b-40d1-b25e-cbb9798f8ff7, host worker3960x
	CPU:  [                                                                ] 0/48 core(s) in use
	RAM:  [|||||||||||||||||                                               ] 27% 69.16 GiB/251.7 GiB
	VMEM: [||||||||                                                        ] 13% 69.16 GiB/512.5 GiB
	GPU: Quadro RTX 4000, not used
root@storager:/home/aries# lotus-miner sealing jobs
ID        Sector  Worker    Hostname    Task  State    Time
8ebfe1d7  18      11d06504  worker7402  PC1   running  3h55m20.8s
e566929b  19      11d06504  worker7402  PC1   running  3h28m59.8s
69ddfbd5  22      11d06504  worker7402  PC1   running  3h5m58.7s
eb497301  20      08689063  worker7402  C2    running  34m31.9s
root@storager:/home/aries# 
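
For anyone triaging a similar situation, here is a minimal sketch that cross-references the two listings above to find sectors that are in PreCommit1 but have no PC1 job on any worker. It only parses pasted text from `lotus-miner sectors list` and `lotus-miner sealing jobs`; the function names and the abbreviated sample data are illustrative assumptions, not part of lotus, and the column layout is assumed to match the v1.2.1 output shown here.

```python
# Abbreviated copies of the command output shown above (not the full listing).
SECTORS_LIST = """\
ID  State       OnChain  Active  Expiration                    Deals
0   Proving     YES      YES     1800659 (in 1 year 24 weeks)  CC
2   PreCommit1  NO       NO      n/a                           CC
3   PreCommit1  NO       NO      n/a                           CC
18  PreCommit1  NO       NO      n/a                           CC
19  PreCommit1  NO       NO      n/a                           CC
20  Committing  NO       NO      n/a                           CC
22  PreCommit1  NO       NO      n/a                           CC
23  PreCommit1  NO       NO      n/a                           CC
"""

SEALING_JOBS = """\
ID        Sector  Worker    Hostname    Task  State    Time
8ebfe1d7  18      11d06504  worker7402  PC1   running  3h55m20.8s
e566929b  19      11d06504  worker7402  PC1   running  3h28m59.8s
69ddfbd5  22      11d06504  worker7402  PC1   running  3h5m58.7s
eb497301  20      08689063  worker7402  C2    running  34m31.9s
"""


def sectors_in_state(sectors_text, state):
    """Return the set of sector IDs whose State column equals `state`."""
    ids = set()
    for line in sectors_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 2 and fields[0].isdigit() and fields[1] == state:
            ids.add(int(fields[0]))
    return ids


def sectors_with_jobs(jobs_text, task):
    """Return the set of sector IDs that have a job of type `task`."""
    ids = set()
    for line in jobs_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 5 and fields[1].isdigit() and fields[4] == task:
            ids.add(int(fields[1]))
    return ids


def unscheduled_sectors(sectors_text, jobs_text, state, task):
    """Sectors in `state` that no worker is running a `task` job for."""
    waiting = sectors_in_state(sectors_text, state)
    running = sectors_with_jobs(jobs_text, task)
    return sorted(waiting - running)


print(unscheduled_sectors(SECTORS_LIST, SEALING_JOBS, "PreCommit1", "PC1"))
```

Applied to the full listing above, the same comparison would flag sectors 2, 3, 7, 10, 11, and 23 through 28: fourteen sectors are in PreCommit1, but only 18, 19, and 22 have a PC1 job.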
@dodohack
Author

Restarting the miner solves the issue. However, is there a way to reschedule those sectors without restarting the miner?

@dodohack
Author

However, after I restarted the miner, the workers did start sealing sectors, but the sealing process quickly failed with the error described in issue #4865.

2020-11-21T16:23:25.820Z	WARN	sectors	storage-sealing/fsm.go:507	sector 10 got error event sealing.SectorSealPreCommit1Failed: ticket expired error: ticket expired: seal height: 247559, head: 255646
2020-11-21T16:23:25.820Z	INFO	sectors	storage-sealing/states_failed.go:26	SealPreCommit1Failed(10), waiting 59.17906322s before retrying
2020-11-21T16:23:25.908Z	WARN	sectors	storage-sealing/fsm.go:507	sector 3 got error event sealing.SectorSealPreCommit1Failed: ticket expired error: ticket expired: seal height: 247463, head: 255646
2020-11-21T16:23:25.909Z	INFO	sectors	storage-sealing/states_failed.go:26	SealPreCommit1Failed(3), waiting 59.09033129s before retrying
2020-11-21T16:23:25.942Z	WARN	sectors	storage-sealing/fsm.go:507	sector 7 got error event sealing.SectorSealPreCommit1Failed: ticket expired error: ticket expired: seal height: 247474, head: 255646
2020-11-21T16:23:25.943Z	INFO	sectors	storage-sealing/states_failed.go:26	SealPreCommit1Failed(7), waiting 59.056359648s before retrying
2020-11-21T16:23:29.001Z	DEBUG	advmgr	sector-storage/sched.go:354	SCHED 1 queued; 10 open windows
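
To make the "ticket expired" lines easier to read, the sketch below extracts the seal height and chain head from each warning and reports how old each ticket was. It is only a log-parsing aid, not lotus code: the regular expression is written against the message format shown above, and the 30-second epoch duration is an assumption (the mainnet block time) used solely to convert epochs to hours.

```python
import re

# Assumption: mainnet epochs are 30 seconds long.
EPOCH_SECONDS = 30

# The three WARN lines from the log above, with tabs replaced by spaces.
LOG = """\
2020-11-21T16:23:25.820Z WARN sectors storage-sealing/fsm.go:507 sector 10 got error event sealing.SectorSealPreCommit1Failed: ticket expired error: ticket expired: seal height: 247559, head: 255646
2020-11-21T16:23:25.908Z WARN sectors storage-sealing/fsm.go:507 sector 3 got error event sealing.SectorSealPreCommit1Failed: ticket expired error: ticket expired: seal height: 247463, head: 255646
2020-11-21T16:23:25.942Z WARN sectors storage-sealing/fsm.go:507 sector 7 got error event sealing.SectorSealPreCommit1Failed: ticket expired error: ticket expired: seal height: 247474, head: 255646
"""

# Matches one "ticket expired" warning and captures sector, seal height, head.
TICKET_EXPIRED = re.compile(
    r"sector (\d+) got error event .*?"
    r"ticket expired: seal height: (\d+), head: (\d+)"
)


def ticket_ages(log_text):
    """Map sector ID -> ticket age in epochs (head minus seal height)."""
    ages = {}
    for match in TICKET_EXPIRED.finditer(log_text):
        sector, seal_height, head = (int(g) for g in match.groups())
        ages[sector] = head - seal_height
    return ages


def epochs_to_hours(epochs, epoch_seconds=EPOCH_SECONDS):
    """Convert a number of epochs to hours under the assumed epoch length."""
    return epochs * epoch_seconds / 3600


for sector, age in sorted(ticket_ages(LOG).items()):
    print(f"sector {sector}: ticket is {age} epochs old "
          f"(about {epochs_to_hours(age):.1f} hours)")
```

Under that assumption, the tickets for sectors 3, 7, and 10 were each more than 8,000 epochs (roughly 67 to 68 hours) behind the chain head when PC1 was finally attempted, which fits sectors that sat unscheduled for days before the restart.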

@jennijuju
Member

> However, after I restarted the miner, the workers did start sealing sectors, but the sealing process quickly failed with the error described in issue #4865.

Please try this #4876

@jennijuju
Member

> Restarting the miner solves the issue. However, is there a way to reschedule those sectors without restarting the miner?

Was your worker running when you were updating the node?

@jennijuju added the support and need/author-input (Hint: Needs Author Input) labels on Nov 23, 2020
@dodohack
Author

> > Restarting the miner solves the issue. However, is there a way to reschedule those sectors without restarting the miner?
>
> Was your worker running when you were updating the node?

I stopped everything (lotus, the miner, and the workers) when upgrading to v1.2.0/v1.2.1.

@github-actions

Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 24 hours.

@github-actions

This issue was closed because it is missing author input.

@TippyFlitsUK removed the need/author-input (Hint: Needs Author Input) label on Mar 30, 2022