feature(zc): add bcq algorithm #640

Merged
merged 11 commits into from
May 30, 2023

Conversation

Super1ce
Contributor

@Super1ce Super1ce commented Apr 10, 2023

Description

Add the BCQ (Batch-Constrained deep Q-learning) algorithm.

TODO

Add algorithm and config

BCQ paper: https://arxiv.org/pdf/1812.02900.pdf
Reference implementation: https://github.com/sfujim/BCQ

Check List

  • merge the latest version source branch/repo, and resolve all the conflicts
  • pass style check
  • pass all the tests

@PaParaZz1 PaParaZz1 changed the title from "feature(zc):add bcq" to "feature(zc): add bcq" on Apr 11, 2023
@PaParaZz1 PaParaZz1 added the "algo" label (Add new algorithm or improve old one) on Apr 11, 2023
@PaParaZz1
Member

Please add the paper link and a training curve figure to the PR.

from ding.utils import set_pkg_seed
from dizoo.d4rl.envs import D4RLEnv
from dizoo.d4rl.config.halfcheetah_medium_bcq_config import main_config, create_config
# from dizoo.d4rl.config.halfcheetah_medium_expert_edac_config import main_config,create_config
Collaborator

Remove the unused code.

vae_hidden_dims: List = [750, 750],
phi: float = 0.05
) -> None:
super(BCQ, self).__init__()
Collaborator

Add docstring notes for the arguments.
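
One way such notes could look, following the `Arguments:` docstring layout used elsewhere in this PR. This is a minimal sketch, not the merged code: the class name `BCQArgsSketch` and the `obs_shape`/`action_shape` parameters are hypothetical, and the description of `phi` follows the BCQ paper, where it bounds the perturbation applied to actions sampled from the VAE.

```python
from typing import List, Optional


class BCQArgsSketch:
    """Hypothetical stand-in used only to illustrate argument notes."""

    def __init__(
            self,
            obs_shape: int,
            action_shape: int,
            vae_hidden_dims: Optional[List[int]] = None,
            phi: float = 0.05,
    ) -> None:
        """
        Overview:
            Store the hyperparameters of a BCQ-style model.
        Arguments:
            - obs_shape (:obj:`int`): Observation dimension (assumed parameter).
            - action_shape (:obj:`int`): Action dimension (assumed parameter).
            - vae_hidden_dims (:obj:`Optional[List[int]]`): Hidden layer sizes of
              the generative VAE, ``[750, 750]`` when omitted.
            - phi (:obj:`float`): Bound on the perturbation applied to actions
              sampled from the VAE, as described in the BCQ paper.
        """
        self.obs_shape = obs_shape
        self.action_shape = action_shape
        # Build the default list per instance instead of sharing one mutable default.
        self.vae_hidden_dims = [750, 750] if vae_hidden_dims is None else list(vae_hidden_dims)
        self.phi = phi
```

Using `None` as the default here avoids the shared mutable default that a literal `[750, 750]` in the signature would create; whether that matters for the real model depends on whether the list is ever mutated.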

different computation graph, including ``compute_actor`` and ``compute_critic`` in QAC.
Mode compute_actor:
Arguments:
- inputs (:obj:`torch.Tensor`): Observation data, defaults to tensor.
Collaborator

Should the documented type of ``inputs`` be Union[torch.Tensor, Dict[str, torch.Tensor]]?
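
If both input forms were to be accepted, the type annotation and the code path could be made explicit as below. This is only a sketch: `extract_obs` and the `'obs'` key are assumptions for illustration, and a generic type variable stands in for `torch.Tensor` so the snippet runs without PyTorch.

```python
from typing import Dict, TypeVar, Union

# Stand-in for torch.Tensor so the example has no third-party dependency.
T = TypeVar('T')


def extract_obs(inputs: Union[T, Dict[str, T]], key: str = 'obs') -> T:
    """Return the observation whether it is passed bare or wrapped in a dict."""
    if isinstance(inputs, dict):
        # Dict form: the observation is stored under an assumed key.
        return inputs[key]
    # Bare form: the input already is the observation.
    return inputs
```

If only one of the two forms is actually supported by the mode, documenting that single type would be simpler than accepting both.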


def train(args):
# launch from anywhere
config = Path(__file__).absolute().parent.parent / 'config' / args.config
Collaborator

Could this import from the dizoo config file directly?

Contributor Author

Yes, I copied this from the other d4rl_*_main.py files.
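
The two options discussed in this thread can be compared side by side. The sketch below is illustrative only: `resolve_config_path` mirrors the path-based lookup in the `train` snippet under review, while `load_configs` shows a direct import by module name. Both function names are hypothetical, and the config module is assumed to expose `main_config` and `create_config`, as in the dizoo import quoted earlier in this PR.

```python
import importlib
from pathlib import Path
from typing import Any, Tuple


def resolve_config_path(script_file: str, config_name: str) -> Path:
    # Path-based lookup: <script dir>/../config/<config_name>,
    # independent of the current working directory.
    return Path(script_file).absolute().parent.parent / 'config' / config_name


def load_configs(module_name: str) -> Tuple[Any, Any]:
    # Direct import: the config module is resolved by the normal import system.
    module = importlib.import_module(module_name)
    return module.main_config, module.create_config
```

With DI-engine installed, `load_configs('dizoo.d4rl.config.halfcheetah_medium_bcq_config')` would correspond to the import statement quoted in the first review comment.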

@codecov

codecov bot commented Apr 24, 2023

Codecov Report

Merging #640 (9162870) into main (7eb342c) will increase coverage by 0.15%.
The diff coverage is 25.88%.

❗ Current head 9162870 differs from pull request most recent head 4d4c997. Consider uploading reports for the commit 4d4c997 to get more accurate results

@@            Coverage Diff             @@
##             main     #640      +/-   ##
==========================================
+ Coverage   82.34%   82.49%   +0.15%     
==========================================
  Files         584      583       -1     
  Lines       47270    47698     +428     
==========================================
+ Hits        38924    39350     +426     
- Misses       8346     8348       +2     
| Flag | Coverage Δ |
|---|---|
| unittests | 82.49% <25.88%> (+0.15%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
|---|---|
| ding/policy/bcq.py | 22.58% <22.58%> (ø) |
| ding/model/template/bcq.py | 25.37% <25.37%> (ø) |
| ding/model/template/__init__.py | 100.00% <100.00%> (ø) |
| ding/policy/__init__.py | 100.00% <100.00%> (ø) |
| ding/policy/command_mode_policy_instance.py | 93.60% <100.00%> (ø) |

... and 276 files with indirect coverage changes
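
The headline percentages in the report can be rechecked from the Hits and Lines rows. The sketch below assumes the percentages are truncated (not rounded to nearest) at two decimals; that assumption is chosen because it reproduces the reported 82.49%, whereas rounding to nearest would give 82.50%. The helper name `coverage_percent` is made up for this example.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated to two decimal places."""
    return math.floor(hits / lines * 10000) / 100


# Figures taken from the Coverage Diff table above.
base = coverage_percent(38924, 47270)   # main
head = coverage_percent(39350, 47698)   # this PR
delta = round(head - base, 2)

print(base, head, delta)  # 82.34 82.49 0.15
```

The Misses row is consistent as well: 47270 - 38924 = 8346 and 47698 - 39350 = 8348.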

@PaParaZz1 PaParaZz1 changed the title feature(zc): add bcq feature(zc): add bcq algorithm May 23, 2023
@PaParaZz1 PaParaZz1 merged commit 6029beb into opendilab:main May 30, 2023
@Super1ce Super1ce deleted the bcq branch November 28, 2023 07:29
Labels
algo Add new algorithm or improve old one
3 participants