Migrate imitation envs to seals #58
…ng base POMDP to tabular env
Codecov Report
@@            Coverage Diff            @@
##            master      #58    +/-   ##
==========================================
  Coverage   100.00%  100.00%
==========================================
  Files           24       26     +2
  Lines          752      982   +230
==========================================
+ Hits           752      982   +230
Reviewed everything except `imitation_examples.py`, which I only skimmed. At a high level the design seems good, and I definitely agree these are more at home in `seals` than `imitation`.
`imitation_examples.py` probably shouldn't be called that -- a user doesn't care that it used to be in `imitation`, do they? You might want to put the random matrix one and CliffWorld in different files; in fact, `diagnostics/` has stuck to one file per environment, and the other environments in `seals` are just lightweight wrappers around existing environments.
It definitely needs more tests. We were lax in `imitation` because it was just example code. But in `seals`, environments are a key deliverable. We've been maintaining 100% test coverage so far -- although comprehensiveness of tests matters more than raw line coverage.
One bug (?) which is hurting test coverage a lot is that nothing in `imitation_examples.py` is actually being registered. This should be pretty obvious from the file having 0% code coverage. If the environment isn't registered, our tests won't pick up on it. If it is registered and under `seals/`, it gets run automatically. I expect that'll get you to 80-90% coverage on that file for free, and if you write a few manual tests as well you'll be in good shape.
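To illustrate why registration matters for coverage, here is a minimal sketch using a toy registry in place of Gym's real one. `REGISTRY`, `register`, `discover_env_ids`, the placeholder classes, and the environment ID are all assumptions made for illustration, not the actual `seals` code. The point is only that a test suite which parametrizes over registered IDs never sees a class that was defined but not registered.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for Gym's global registry: env ID -> constructor.
REGISTRY: Dict[str, Callable[[], object]] = {}


def register(env_id: str, entry_point: Callable[[], object]) -> None:
    """Record an environment so ID-based test discovery can find it."""
    REGISTRY[env_id] = entry_point


class CliffWorld:
    """Placeholder class; the real environment lives in the PR."""


class RandomTransitionEnv:
    """Placeholder class (hypothetical name) that is never registered below."""


# Only one of the two environments gets registered (the ID is an assumption).
register("seals/CliffWorld-v0", CliffWorld)


def discover_env_ids(prefix: str = "seals/") -> List[str]:
    """Mimic a test suite that parametrizes over registered IDs."""
    return sorted(env_id for env_id in REGISTRY if env_id.startswith(prefix))


# The unregistered class is invisible to discovery, so its code is never run.
discovered = discover_env_ids()
```

In this sketch `discovered` contains only the registered ID, which is the same mechanism by which an unregistered file ends up at 0% coverage.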
From a quick glance at CodeCov (worth taking a more detailed look yourself), in `base_envs.py` it seems like `TabularModelPOMDP` is totally untested (nothing using `obs_from_space`), so that's one area to improve, though again you might get some coverage there from fixing the above, but it's not a bad idea to have some tests targeted at `base_envs` directly.
When you've addressed these issues please request another review, and I can go over `imitation_examples.py` at that point, but it's probably best for me to hold off until you relocate it/split it up/test it as that might introduce a lot of changes anyway.
src/seals/base_envs.py
reward_matrix=reward_matrix,
horizon=horizon,
initial_state_dist=initial_state_dist,
observation_matrix=np.eye(transition_matrix.shape[0]),
Something seems off here. We're basically giving a dummy observation matrix to the code, that never gets used because of the `obs_from_state` override, and doesn't even produce the same values (one-hot coded vs integer index).
If we make the change I suggested above to remove `observation_matrix` from `BaseTabularModelPOMDP`, you could just switch this to inherit directly from `BaseTabularModelPOMDP` and get rid of the `observation_matrix` here entirely. You'd probably need to make `BaseTabularModelPOMDP` take `observation_space` as an argument (specifying `obs_dim` and `obs_dtype` won't cut it if you want it to be discrete...), but that seems like a reasonable choice, then just move the current `construct_obs_space` logic into `TabularModelPOMDP`.
I'm not 100% satisfied with that, as it does seem like we'd probably want `TabularModelMDP` to be a `TabularModelPOMDP`, but it doesn't seem like a major problem if they're both concrete classes and so specialized in different ways.
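A rough sketch of the hierarchy proposed above, to make the suggestion concrete. The class names come from this PR, but the constructor signatures are simplified assumptions, and plain tuples stand in for real Gym space objects so the snippet stays self-contained. It is not the actual `seals` implementation.

```python
import numpy as np


class BaseTabularModelPOMDP:
    """Sketch: the base class no longer knows about observation_matrix."""

    def __init__(self, transition_matrix, reward_matrix, observation_space):
        self.transition_matrix = transition_matrix
        self.reward_matrix = reward_matrix
        # The subclass decides what the observation space is.
        self.observation_space = observation_space

    def obs_from_state(self, state):
        raise NotImplementedError


class TabularModelPOMDP(BaseTabularModelPOMDP):
    """Observations are rows of an explicit observation matrix."""

    def __init__(self, transition_matrix, reward_matrix, observation_matrix):
        # Stand-in for the construct_obs_space logic (a tuple, not a Gym space).
        obs_space = ("box", observation_matrix.shape[1], observation_matrix.dtype)
        super().__init__(transition_matrix, reward_matrix, obs_space)
        self.observation_matrix = observation_matrix

    def obs_from_state(self, state):
        return self.observation_matrix[state]


class TabularModelMDP(BaseTabularModelPOMDP):
    """Fully observable: the observation is the integer state index."""

    def __init__(self, transition_matrix, reward_matrix):
        n_states = transition_matrix.shape[0]
        super().__init__(transition_matrix, reward_matrix, ("discrete", n_states))

    def obs_from_state(self, state):
        return state


# Tiny illustrative model: 3 states, 2 actions, uniform transitions.
n_states, n_actions = 3, 2
transition = np.full((n_states, n_actions, n_states), 1.0 / n_states)
reward = np.zeros(n_states)

mdp = TabularModelMDP(transition, reward)
pomdp = TabularModelPOMDP(transition, reward, np.eye(n_states))
```

With this layout the MDP carries no dummy matrix at all, at the cost that `TabularModelMDP` is a sibling of `TabularModelPOMDP` rather than a subclass of it.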
Alternatively if you wanted to keep the current hierarchy, you could just make `observation_matrix` not bogus. Either keep it as-is and delete `obs_from_state` (you get one hot vectors, which is OK) or change `np.eye` to `np.arange` (the observation space would be a bit weird there though).
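For reference, a small check of what the two options produce when the matrix is indexed by a state. The variable names are illustrative, and this assumes observations are obtained by indexing `observation_matrix` with the integer state.

```python
import numpy as np

n_states = 4
state = 2

# Option 1: identity matrix, so row i is the one-hot vector for state i.
one_hot_matrix = np.eye(n_states)
obs_one_hot = one_hot_matrix[state]

# Option 2: a 1-D range, so entry i is just the integer i.
index_matrix = np.arange(n_states)
obs_index = index_matrix[state]
```

Here `obs_one_hot` is a length-4 float vector while `obs_index` is a zero-dimensional integer, which is presumably why the observation space derived from `np.arange` would look odd.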
…I/seals into imitation-envs-to-seals
On test coverage: you're just missing a single line in [...]. Should probably test [...]. I think we can make [...]. I think with those fixed we'll basically be at 100% coverage again :) Do let me know once everything is addressed and I'll re-review.
LGTM apart from one comment to add a helper function in `diagnostics/__init__.py` to avoid polluting module namespace.
I assume that you moved `imitation_examples.py` to `cliff_world.py` and `random_trans.py` without any modifications -- I didn't re-review those, let me know if there were any changes I should take a closer look at.
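As a sketch of the helper-function suggestion for `diagnostics/__init__.py`: wrapping the registration loop in a function keeps its loop variables out of the module namespace. The function name, the `REGISTERED` dict (a stand-in for the Gym registry), the environment names, and the entry point strings are all hypothetical.

```python
from typing import Dict, List

# Hypothetical stand-in for the Gym registry: env ID -> entry point string.
REGISTERED: Dict[str, str] = {}


def _register_diagnostic_envs() -> List[str]:
    """Register diagnostic envs; loop variables stay local to this function."""
    diag_names = ["CliffWorld", "RandomTrans"]  # illustrative names only
    for diag_name in diag_names:
        diag_env_id = f"seals/{diag_name}-v0"
        # Entry point path is an assumption, not the real module layout.
        REGISTERED[diag_env_id] = f"seals.diagnostics:{diag_name}"
    return sorted(REGISTERED)


registered_ids = _register_diagnostic_envs()

# None of the helper's local names leak into the module namespace.
leaked = [n for n in ("diag_names", "diag_name", "diag_env_id") if n in globals()]
```

Had the loop been written at module level, `diag_name` and `diag_env_id` would remain as importable attributes of the package after import.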
This PR migrates imitation environments to seals, in order to solve HumanCompatibleAI/imitation#501.