🐔
vacation
  • Tokyo

Organizations

@benchmarking-rl


PARL

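Former head of reinforcement learning at Baidu; led RLHF research and development in the ERNIE Bot (文心一言) model group.

Besides supporting business needs inside and outside the company, the team also works on frontier research. Its lines of work include, but are not limited to:

  • Developing PARL, a high-performance parallel RL framework (https://github.com/PaddlePaddle/PARL, 3.2k stars)
  • Competing in international RL competitions (our team has taken first place in NeurIPS RL competitions three years in a row)
  • Submitting academic papers
  • Robot control (autonomous driving and drones, quadruped robot dog control)

In-company business support includes very-large-scale LLM alignment, feed recommendation, search engine, Baidu Maps, ad ranking, and Baidu AI Cloud (energy dispatch, traffic signal control).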

Pinned

  1. PaddlePaddle/PARL (Public)

    A high-performance distributed training framework for Reinforcement Learning

    Python, 3.3k stars, 818 forks

  2. tensorgo (Public)

    Using the tensorgo API for TensorFlow Async Model Parallel

    Python, 36 stars, 7 forks