Fix sympy version for specific instance #273

Closed
SmartManoj wants to merge 1 commit

@SmartManoj (Contributor) commented Dec 10, 2024

Fixes swe-bench#265

Modify the `build_instance_image` function in `swebench/harness/docker_build.py` to check for the instance ID `sympy__sympy-21612` and install `antlr4-python3-runtime==4.7.2` if it matches.

* Add a check for the specific instance ID `sympy__sympy-21612`
* Append the installation command for `antlr4-python3-runtime==4.7.2` to the `repo_script_list` if the instance ID matches

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/swe-bench/SWE-bench/issues/265?shareId=XXXX-XXXX-XXXX-XXXX).
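
The change described above can be sketched roughly as follows. This is a minimal illustration, not the actual diff: the body of `build_instance_image` is not shown in this thread, so the helper name `with_instance_pins`, the `INSTANCE_PIP_PINS` mapping, and the exact install command string are assumptions. Only the instance ID, the package pin, and the `repo_script_list` name come from the description.

```python
# Hypothetical mapping from instance ID to extra pinned requirements.
# Only this single entry is named in the PR description.
INSTANCE_PIP_PINS = {
    "sympy__sympy-21612": ["antlr4-python3-runtime==4.7.2"],
}


def with_instance_pins(instance_id, repo_script_list):
    """Return a copy of repo_script_list with instance-specific installs appended.

    Hypothetical helper: in the PR, the equivalent check is said to live
    inside build_instance_image in swebench/harness/docker_build.py.
    """
    commands = list(repo_script_list)  # copy, so the caller's list is not mutated
    for requirement in INSTANCE_PIP_PINS.get(instance_id, []):
        # Assumed command form; the PR text does not quote the exact string.
        commands.append(f"python -m pip install {requirement}")
    return commands
```

Any other instance ID leaves the script list unchanged, which matches the intent of a fix scoped to one instance.
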
@john-b-yang
Member

This fix is not necessary. The gold patch prediction works just fine.

$ ./test.sh
/opt/miniconda3/envs/sweb/lib/python3.10/runpy.py:126: RuntimeWarning: 'swebench.harness.run_evaluation' found in sys.modules after import of package 'swebench.harness', but prior to execution of 'swebench.harness.run_evaluation'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Using gold predictions - ignoring predictions_path
Running 1 unevaluated instances...
Base image sweb.base.x86_64:latest already exists, skipping build.
Base images built successfully.
No environment images need to be built.
Found 1 existing instance images. Will reuse them.
Running 1 instances...
1 ran successfully, 0 failed: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:19<00:00, 19.55s/it]
All instances run.
Cleaning cached images...
Removed 0 images.
Total instances: 1
Instances submitted: 1
Instances completed: 1
Instances incomplete: 0
Instances resolved: 1
Instances unresolved: 0
Instances with empty patches: 0
Instances with errors: 0
Unstopped containers: 0
Unremoved images: 1
Report written to gold.gold.json

$ cat test.sh
python -m swebench.harness.run_evaluation \
    --predictions_path gold \
    --max_workers 1 \
    --run_id gold \
    --split test \
    --dataset_name princeton-nlp/SWE-bench \
    --instance_ids sympy__sympy-21612 \

In the future, if you have an issue, please provide a log like this one to prove that there is an issue with the instance. I appreciate your contributions, but please make sure to do an appropriate amount of auditing such that I can focus more on resolving the problem.
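
One way to capture such a log is sketched below. It assembles the same command as `test.sh` and writes the combined output to a file. The function names and the idea of saving to a log file are assumptions made for illustration; the flags are copied from the script above.

```python
import subprocess
import sys
from pathlib import Path


def build_eval_command(instance_id, run_id="gold"):
    """Build the gold-patch evaluation command for one instance (flags as in test.sh)."""
    return [
        sys.executable, "-m", "swebench.harness.run_evaluation",
        "--predictions_path", "gold",
        "--max_workers", "1",
        "--run_id", run_id,
        "--split", "test",
        "--dataset_name", "princeton-nlp/SWE-bench",
        "--instance_ids", instance_id,
    ]


def run_and_log(instance_id, log_path):
    """Run the evaluation and save stdout and stderr together for attaching to an issue."""
    result = subprocess.run(
        build_eval_command(instance_id),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so warnings appear in order
        text=True,
        check=False,  # keep the log even when the run fails
    )
    Path(log_path).write_text(result.stdout)
    return result.returncode
```

For example, `run_and_log("sympy__sympy-21612", "gold_eval.log")` would produce a file that can be pasted into an issue (the log file name is arbitrary).
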

@SmartManoj
Contributor Author

Trajectory link

The `antlr4-python3-runtime` package was not installed in the environment by default, and its latest version is not compatible.
[Image: parsed traj]

Solved traj

@john-b-yang
Member

OpenHands maintains execution environments that are separate from SWE-bench's.

Again, please please please run the SWE-bench code. If you are seeing an issue, please verify that this problem is actually reproducible with SWE-bench before posting.

Development

Successfully merging this pull request may close these issues.

Regarding package version