**What problem or use case are you trying to solve?**

Right now the `evaluation` directory structure is very flat, and it is hard to tell which subdirectories are utilities for implementing benchmarks or running basic tests for OpenHands (`utils`, `integration_tests`, `regression`, `static`), and which are actual benchmarks from the ML literature (everything else).

To make this clearer, we can move all benchmarks to live under an `evaluation/benchmarks/` directory. All other files related to evaluation (documentation, GitHub workflows, etc.) will then need to be checked and updated to stay consistent.

While we do this, we can also add the benchmarks that are currently missing from the `evaluation/README.md` documentation.