RLHFlow

Code for the Workflow of Reinforcement Learning from Human Feedback (RLHF)

Popular repositories

  1. RLHF-Reward-Modeling Public

    Recipes for training reward models for RLHF.

    Python · 1.2k stars · 82 forks

  2. Online-RLHF Public

    A recipe for online RLHF and online iterative DPO.

    Python · 474 stars · 46 forks

  3. Directional-Preference-Alignment Public

    Directional Preference Alignment

    54 stars · 3 forks

  4. RAFT Public

    This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or rejection sampling fine-tuning.

    Python · 22 stars · 3 forks

  5. RLHFlow.github.io Public

    Webpage for RLHFlow

    HTML · 9 stars

  6. Online-DPO-R1 Public

    Codebase for Iterative DPO Using Rule-based Rewards

    Python · 5 stars

Repositories

Showing 7 of 7 repositories
  • Online-DPO-R1 Public

    Codebase for Iterative DPO Using Rule-based Rewards

    Python · 5 stars · 0 forks · 0 issues · 0 pull requests · Updated Feb 12, 2025
  • RLHF-Reward-Modeling Public

    Recipes for training reward models for RLHF.

    Python · 1,154 stars · Apache-2.0 · 82 forks · 15 issues · 2 pull requests · Updated Feb 9, 2025
  • RLHFlow.github.io Public

    Webpage for RLHFlow

    HTML · 9 stars · 0 forks · 0 issues · 0 pull requests · Updated Feb 1, 2025
  • Online-RLHF Public

    A recipe for online RLHF and online iterative DPO.

    Python · 474 stars · 46 forks · 12 issues · 0 pull requests · Updated Dec 28, 2024
  • Directional-Preference-Alignment Public

    Directional Preference Alignment

    54 stars · Apache-2.0 · 3 forks · 2 issues · 0 pull requests · Updated Sep 23, 2024
  • RAFT Public

    This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or rejection sampling fine-tuning.

    Python · 22 stars · 3 forks · 0 issues · 0 pull requests · Updated Sep 22, 2024
  • .github Public
    0 stars · 0 forks · 0 issues · 0 pull requests · Updated May 26, 2024

Top languages

Python, HTML
