mnoukhov

Pinned

  1. async_rlhf Public

    Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models

    Python, 19 stars, 1 fork

  2. elastic-reset Public

    Code and Experiments for "Language Model Alignment with Elastic Reset" (NeurIPS 2023)

    Python, 5 stars

  3. vwxyzjn/summarize_from_feedback_details Public

    Python, 120 stars, 16 forks

  4. emergent-compete Public

    Code for "Emergent Communication under Competition" (AAMAS 2021)

    Jupyter Notebook, 10 stars, 1 fork

  5. huggingface/trl Public

    Train transformer language models with reinforcement learning.

    Python, 10.4k stars, 1.3k forks

  6. lecture-notes Public

    LaTeX lecture notes for CS/ML courses at the University of Waterloo and Universite de Montreal

    TeX, 10 stars, 8 forks