preference-learning

Here are 32 public repositories matching this topic...

allenai / reward-bench

RewardBench: the first evaluation tool for reward models.

preference-learning rlhf

Updated Oct 15, 2024
Python

tournesol-app / tournesol

Free and open-source code of the https://tournesol.app platform. Meet the community on Discord: https://discord.gg/WvcSG55Bf3

python youtube django reactjs django-rest-framework dataset recommendation-engine preference-learning social-choice ai-ethics bradley-terry-model golden-ratio-optimization preference-aggregation

Updated Oct 17, 2024
Python

IAAR-Shanghai / ICSFSurvey

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation💤.

decoding self-improvement knowledge-distillation data-augmentation reasoning self-consistency preference-learning hallucination self-correction attention-head large-language-models chain-of-thought large-language-model internal-consistency self-feedback self-refine self-correct

Updated Oct 13, 2024
Jupyter Notebook

qxcv / magical

The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)

reinforcement-learning imitation-learning preference-learning reinforcement-learning-environments

Updated Dec 5, 2023
Python

SMARTlab-Purdue / SAN-NaviSTAR

This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refer to our project website at https://sites.google.com/view/san-navistar.

machine-learning reinforcement-learning transformer preference-learning robot-navigation socially-aware-navigation

Updated Jun 30, 2024
Python

sail-sg / dice

Official implementation of Bootstrapping Language Models via DPO Implicit Rewards

alignment preference-learning large-language-models rlhf

Updated Jul 29, 2024
Python

JanoschMenke / metis

Python-based GUI to collect feedback from chemists on molecules

machine-learning drug-discovery human-in-the-loop preference-learning de-novo-drug-design generative-ai

Updated Oct 15, 2024
Python

gao-g / prelude

Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".

transformers alignment user-feedback edits interpretability preference-learning gpt4 llm llms human-feedback

Updated Oct 17, 2024
Python

typoverflow / WiseRL

PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms

reinforcement-learning pytorch preference-learning

Updated Sep 28, 2024
Python

vicgalle / configurable-safety-tuning

Data and models for the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data"

alignment safety preference-learning dpo llm

Updated Jul 27, 2024
Python

julilien / PLDepth

Code for "Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model" as published at CVPR 2021.

machine-learning deep-learning learning-to-rank cvpr weakly-supervised-learning preference-learning monocular-depth monocular-depth-estimation plackett-luce cvpr2021 relative-depth

Updated Feb 3, 2024
Python

SMARTlab-Purdue / SAN-FAPL

This repository contains the source code for our paper: "Feedback-efficient Active Preference Learning for Socially Aware Robot Navigation", accepted to IROS-2022. For more details, please refer to our project website at https://sites.google.com/view/san-fapl.

machine-learning reinforcement-learning learning-from-demonstration preference-learning robot-navigation socially-aware-navigation

Updated Oct 17, 2022
Python

aaronpmishkin / gaussian_processes

Preference Learning with Gaussian Processes and Bayesian Optimization

machine-learning gaussian-processes bayesian-optimization preference-learning value-charts

Updated Aug 10, 2017
Python

Intelligent-Systems-Group / jpl-framework

Java framework for Preference Learning

machine-learning collaborative-filtering ranking preference-learning label-ranking object-ranking

Updated Mar 5, 2018
Java

98k-bot / GAN-Assisted-Preference-Based-Learning

A paper under AAAI-20 review

gan reinforcement-learning-algorithms preference-learning

Updated Aug 27, 2019
Python

makgyver / PRL

[P]reference and [R]ule [L]earning algorithm implementation for Python 3 (https://arxiv.org/abs/1812.07895)

machine-learning algorithm game-theory preference-learning

Updated Mar 17, 2019
Python

albiboni / User-RecSys

Code for the project: "Analysis of Recommendation-systems based on User Preferences".

preferences booking user preferences-learning reccomender booking-system user-preferences preference-learning reccommendation reccomendersystem

Updated Mar 6, 2018
Python

aleksa-sukovic / iclr2024-reward-design-for-justifiable-rl

Code for the paper "Reward Design for Justifiable Sequential Decision-Making"; ICLR 2024

reinforcement-learning alignment preference-learning reward-design preference-based-reinforcement-learning

Updated Feb 27, 2024
Jupyter Notebook

LemurPwned / bradley-terry-ui

UI for straightforward Bradley-Terry feedback loop

ui alignment preference-learning bradley-terry-model bradley-terry

Updated Aug 24, 2024
Python

Rahgooy / MDFT

In this project, we design a recurrent neural network (RNN) to simulate a cognitive model of decision-making called Multialternative Decision Field Theory (MDFT). We train this RNN to learn the parameters of MDFT.

machine-learning decision-making recurrent-neural-networks preference-learning cognitive-models

Updated Jul 25, 2024
Python
