Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision".
[ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling"
Source code for the paper "Understanding Impacts of Human Feedback via Influence Functions"