-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
allow change resumission times #503
Conversation
- add option: "strategy": {"customized_script_header_template_file": ""}, - add option: `sge_pe_name`
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Update pbs.py
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
improve path check (deepmodeling#501)
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #503 +/- ##
==========================================
- Coverage 60.19% 60.17% -0.03%
==========================================
Files 39 39
Lines 3859 3859
==========================================
- Hits 2323 2322 -1
- Misses 1536 1537 +1 ☔ View full report in Codecov by Sentry. |
📝 WalkthroughWalkthroughThe changes in this pull request focus on the Changes
Sequence Diagram(s)sequenceDiagram
participant Job
participant Machine
Job->>Machine: Check current retry count
Machine-->>Job: Return retry count
Job->>Job: Set retry_count = current retry count
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
dpdispatcher/submission.py (1)
Line range hint
851-859
: LGTM! Consider minor improvement to error message formatting.The changes to the error-raising condition and the inclusion of the remote error message improve the error handling and provide more context for debugging. This is a good improvement.
Consider adding a newline character before the "For further actions" part of the error message to improve readability:
raise RuntimeError(err_msg) - self.submit_job() - if self.job_state != JobStatus.unsubmitted: - dlog.info( - f"job:{self.job_hash} re-submit after terminated; new job_id is {self.job_id}" - ) - time.sleep(0.2) - self.get_job_state() - dlog.info( - f"job:{self.job_hash} job_id:{self.job_id} after re-submitting; the state now is {repr(self.job_state)}" + raise RuntimeError( + f"{err_msg}\n" + f"For further actions, run the following command with proper flags: dpdisp submission {self.submission_hash}" + ) + self.submit_job() + if self.job_state != JobStatus.unsubmitted: + dlog.info( + f"job:{self.job_hash} re-submit after terminated; new job_id is {self.job_id}" + ) + time.sleep(0.2) + self.get_job_state() + dlog.info( + f"job:{self.job_hash} job_id:{self.job_id} after re-submitting; the state now is {repr(self.job_state)}"🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 848-848: dpdispatcher/submission.py#L848
Added line #L848 was not covered by tests
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- dpdispatcher/submission.py (1 hunks)
🧰 Additional context used
🪛 GitHub Check: codecov/patch
dpdispatcher/submission.py
[warning] 848-848: dpdispatcher/submission.py#L848
Added line #L848 was not covered by tests
🔇 Additional comments (2)
dpdispatcher/submission.py (2)
Line range hint
847-859
: Overall improvements to retry mechanism and error handlingThe changes in this pull request focus on enhancing the retry mechanism and error handling in the Job class. The modifications simplify the logic, provide more flexibility in configuring retry behavior, and improve error messages for better debugging. These are all positive improvements to the codebase.
Key improvements:
- Simplified retry logic using the machine's retry_count directly.
- More flexible configuration of retry behavior through the machine object.
- Enhanced error messages with additional context for debugging.
These changes will make the code more maintainable and easier to debug. Good job!
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 848-848: dpdispatcher/submission.py#L848
Added line #L848 was not covered by tests
847-849
: LGTM! Consider adding a comment to explain the retry mechanism.The changes to the retry logic simplify the code and improve flexibility by using the machine's retry_count directly. This is a good improvement.
Consider adding a brief comment to explain the retry mechanism, for example:
+ # Use the machine's retry_count to determine the maximum number of retries if hasattr(self.machine, "retry_count"): retry_count = self.machine.retry_count
To ensure the new line is properly tested, please run the following command:
If no tests are found, consider adding a new test case to cover this scenario.
✅ Verification successful
Verification Complete: Retry Mechanism is Properly Tested
The presence of
test_handle_unexpected_job_state
intests/test_class_job.py
confirms that the retry mechanism is adequately tested.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check if there are any tests covering the Job.handle_unexpected_job_state method rg -i "def test.*handle_unexpected_job_state" tests/Length of output: 126
🧰 Tools
🪛 GitHub Check: codecov/patch
[warning] 848-848: dpdispatcher/submission.py#L848
Added line #L848 was not covered by tests
What is the reason to change it? |
To able to set/ disable resubmit fail jobs |
It seems to me that this PR just changes the definition of |
solve issue #502
hi @njzjz, please "test" this
Summary by CodeRabbit
Bug Fixes
Chores