test(rjy): add discrete pendulum env #395

Merged
merged 7 commits into from
Jun 28, 2022

Conversation

nighood
Collaborator

@nighood nighood commented Jun 22, 2022

Description

Related Issue

TODO

Check List

  • merge the latest version source branch/repo, and resolve all the conflicts
  • pass style check
  • pass all the tests

@nighood nighood changed the title from "test(rjy):add discrete pendulum env" to "(WIP)test(rjy): add discrete pendulum env" Jun 23, 2022
@nighood nighood added the test Test(unittest, performance, efficiency, compatibility) label Jun 23, 2022
@@ -11,7 +11,7 @@
@ENV_REGISTRY.register('pendulum')
class PendulumEnv(BaseEnv):

- def __init__(self, cfg: dict) -> None:
+ def __init__(self, cfg: dict, is_continue=True) -> None:
Member

Don't use an is_xxx or use_xxx prefix; just continuous is fine.

if self._act_scale:
action = affine_transform(action, min_val=self._env.action_space.low, max_val=self._env.action_space.high)
obs, rew, done, info = self._env.step(action)
self._final_eval_reward += rew
obs = to_ndarray(obs).astype(np.float32)
- rew = to_ndarray([rew]).astype(np.float32) # wrapped to be transfered to a array with shape (1,)
+ # rew = to_ndarray([rew]).astype(np.float32) # wrapped to be transfered to a array with shape (1,)
+ rew = np.array([rew], dtype=np.float32)
Member

Why use this implementation? It is no different from the original version.

Collaborator Author

Because I ran into this error when testing: AttributeError: 'list' object has no attribute 'shape'. I guessed that the problem was here, but later found that it came from the parameters passed in earlier.
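
The error quoted above is what Python raises when code reads .shape on a plain list instead of a NumPy array. A minimal reproduction follows, assuming a hypothetical helper reward_shape that stands in for whatever downstream code reads the attribute; it is not taken from the PR.

```python
import numpy as np


def reward_shape(rew):
    # Hypothetical stand-in for downstream code that reads `.shape`
    # on whatever object it receives.
    return rew.shape


raw_reward = -0.5

# Wrapping the scalar in a NumPy array gives an object with shape (1,).
wrapped = np.array([raw_reward], dtype=np.float32)
assert reward_shape(wrapped) == (1,)

# A plain Python list has no `.shape`, so the same call fails.
try:
    reward_shape([raw_reward])
    message = None
except AttributeError as err:
    message = str(err)

print(message)
```

This is consistent with the author's conclusion: the array-wrapping line in step was not at fault, since the failure only appears when a list reaches code that expects an array.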

@@ -24,6 +24,9 @@ def __init__(self, cfg: dict) -> None:
self._reward_space = gym.spaces.Box(
low=-1 * (3.14 * 3.14 + 0.1 * 8 * 8 + 0.001 * 2 * 2), high=0.0, shape=(1, ), dtype=np.float32
)
# require discrete env
self._is_continue=is_continue
Member

You should modify self._action_space when continuous=False, so that other parts, such as the random_action method, can use a consistent interface.
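
A rough sketch of this suggestion: choose the action space once in __init__, so that random_action needs no branch on the flag. DiscreteSpace and BoxSpace are hypothetical pure-Python stand-ins for gym.spaces.Discrete and a one-dimensional gym.spaces.Box, PendulumLikeEnv is an illustrative class rather than the PR's PendulumEnv, and the default of 11 discrete actions is an assumption.

```python
import random
from dataclasses import dataclass


@dataclass
class DiscreteSpace:
    """Hypothetical stand-in for gym.spaces.Discrete: integers 0..n-1."""
    n: int

    def sample(self) -> int:
        return random.randrange(self.n)


@dataclass
class BoxSpace:
    """Hypothetical stand-in for a 1-D gym.spaces.Box: floats in [low, high]."""
    low: float
    high: float

    def sample(self) -> float:
        return random.uniform(self.low, self.high)


class PendulumLikeEnv:
    """Illustrative env skeleton; only the action-space handling is shown."""

    def __init__(self, continuous: bool = True, discrete_action_num: int = 11) -> None:
        self._continuous = continuous
        # Pick the action space once, so callers never branch on the flag.
        if continuous:
            self._action_space = BoxSpace(low=-2.0, high=2.0)
        else:
            self._action_space = DiscreteSpace(n=discrete_action_num)

    def random_action(self):
        # Same code path for both modes.
        return self._action_space.sample()
```

With this layout, PendulumLikeEnv(continuous=False).random_action() returns an integer index, while the default continuous env returns a float in [-2, 2].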

@@ -58,12 +61,17 @@ def seed(self, seed: int, dynamic_seed: bool = True) -> None:

def step(self, action: np.ndarray) -> BaseEnvTimestep:
assert isinstance(action, np.ndarray), type(action)
# if require discrete env, convert actions to [-2 ~ 2] float actions
if not self._is_continue:
action=(action-(self._discrete_action_num-1)/2)/((self._discrete_action_num-1)/4)
Member

You need to transform it to [-1, 1] here; affine_transform will then map it to [-2, 2].
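
The two-stage mapping described here can be sketched as follows. index_to_unit is one way to write the normalization, and scale_to_range is a hypothetical stand-in for an affine rescale from [-1, 1] to [low, high]; it is not DI-engine's affine_transform, whose exact behavior is not shown in this PR.

```python
def index_to_unit(action: int, n: int) -> float:
    """Map a discrete index in {0, ..., n-1} to an evenly spaced value in [-1, 1]."""
    return action / (n - 1) * 2 - 1


def scale_to_range(x: float, low: float, high: float) -> float:
    """Hypothetical affine rescale from [-1, 1] to [low, high]."""
    return (high - low) * (x + 1) / 2 + low


n = 5  # illustrative number of discrete actions
units = [index_to_unit(a, n) for a in range(n)]
torques = [scale_to_range(u, -2.0, 2.0) for u in units]

print(units)    # [-1.0, -0.5, 0.0, 0.5, 1.0]
print(torques)  # [-2.0, -1.0, 0.0, 1.0, 2.0]
```

Keeping the discrete branch in [-1, 1] means a single rescaling step serves both the continuous and the discrete case.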

@@ -58,12 +61,17 @@ def seed(self, seed: int, dynamic_seed: bool = True) -> None:

def step(self, action: np.ndarray) -> BaseEnvTimestep:
assert isinstance(action, np.ndarray), type(action)
# if require discrete env, convert actions to [-2 ~ 2] float actions
if not self._is_continue:
action=(action-(self._discrete_action_num-1)/2)/((self._discrete_action_num-1)/4)
Member

Why not just (action / (self._discrete_action_num - 1)) * 2 - 1?
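
For comparison, the expression in the diff and the suggested one differ only by a factor of 2. Writing a for the action index and n for self._discrete_action_num (shorthand introduced here for illustration):

```latex
\frac{a - \frac{n-1}{2}}{\frac{n-1}{4}}
  = \frac{4a}{n-1} - 2
  = 2\left(\frac{2a}{n-1} - 1\right)
```

For a in {0, ..., n-1}, the expression in the diff therefore spans [-2, 2], while the suggested one spans [-1, 1] and leaves the final scaling to a later step.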

@PaParaZz1 PaParaZz1 changed the title from "(WIP)test(rjy): add discrete pendulum env" to "test(rjy): add discrete pendulum env" Jun 27, 2022
@codecov

codecov bot commented Jun 28, 2022

Codecov Report

Merging #395 (6b7f526) into main (63029a4) will increase coverage by 0.13%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #395      +/-   ##
==========================================
+ Coverage   85.61%   85.74%   +0.13%     
==========================================
  Files         524      524              
  Lines       41108    41499     +391     
==========================================
+ Hits        35195    35584     +389     
- Misses       5913     5915       +2     
Flag Coverage Δ
unittests 85.74% <ø> (+0.13%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
ding/framework/message_queue/tests/test_nng.py 42.30% <0.00%> (-57.70%) ⬇️
ding/interaction/tests/test_utils/stream.py 75.00% <0.00%> (-2.78%) ⬇️
ding/utils/tests/test_k8s_launcher.py 88.67% <0.00%> (-1.71%) ⬇️
ding/interaction/base/network.py 96.34% <0.00%> (-1.25%) ⬇️
ding/framework/tests/test_parallel.py 96.26% <0.00%> (-0.71%) ⬇️
ding/envs/env_manager/subprocess_env_manager.py 76.78% <0.00%> (-0.65%) ⬇️
ding/framework/middleware/collector.py 87.03% <0.00%> (-0.47%) ⬇️
ding/worker/replay_buffer/advanced_buffer.py 92.77% <0.00%> (-0.32%) ⬇️
ding/entry/serial_entry_preference_based_irl.py 78.57% <0.00%> (-0.31%) ⬇️
ding/interaction/master/master.py 90.16% <0.00%> (-0.30%) ⬇️
... and 219 more

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 63029a4...6b7f526.

@PaParaZz1 PaParaZz1 added env Questions about RL environment and removed test Test(unittest, performance, efficiency, compatibility) labels Jun 28, 2022
@PaParaZz1 PaParaZz1 merged commit b89d477 into opendilab:main Jun 28, 2022
Labels
env Questions about RL environment
2 participants