    Repositories list

    • GAMABench

      Public
      Benchmarking LLMs' Gaming Ability in Multi-Agent Environments
      Jupyter Notebook
      GNU General Public License v3.0
      04200 Updated Nov 25, 2024
    • Benchmarking LLMs' Psychological Portrayal
      Python
      GNU General Public License v3.0
      26801 Updated Nov 21, 2024
    • Benchmarking LLMs' Emotional Alignment with Humans
      Python
      GNU General Public License v3.0
      46911 Updated Sep 25, 2024
    • Code and Results of the Paper Titled: Revisiting the Reliability of Psychological Scales on Large Language Models
      Python
      02900 Updated Sep 24, 2024
    • Code and data for our paper "On the Resilience of Multi-Agent Systems with Malicious Agents"
      Python
      GNU General Public License v3.0
      01300 Updated Aug 5, 2024
    • ECHO

      Public
      Evaluating AI Chatbots’ Role-Play Ability
      Python
      GNU General Public License v3.0
      0200 Updated Apr 30, 2024
    • HTML
      2100 Updated Feb 13, 2023
    • Python
      3100 Updated Jan 29, 2023
    • AEON

      Public
      An automated tool to evaluate the quality of textual adversarial examples.
      Python
      MIT License
      1800 Updated Jul 19, 2022
    • A collection of datasets for machine learning for big code
      MIT License
      54600 Updated Oct 8, 2021