Sea AI Lab

69 repositories

oat
Public
🌾 OAT: Online AlignmenT for LLMs
thompson-sampling alignment distributed-training dueling-bandits dpo distributed-rl llm rlhf llm-aligment online-alignment
Python
•
Apache License 2.0
•5•37•2•1•Updated Nov 28, 2024
sailcompass
Public
Python
•0•3•0•0•Updated Nov 28, 2024
InfNeRF
Public
InfNeRF: Towards Infinite Scale NeRF Rendering with O(log n) Space Complexity
Python
•
Apache License 2.0
•0•4•0•0•Updated Nov 27, 2024
optim4rl
Public
Optim4RL is a JAX framework for learning to optimize in reinforcement learning.
reinforcement-learning optimization optimizer reinforcement-learning-algorithms optimization-algorithms meta-learning jax learning-to-learn optimizers meta-learning-algorithms
Python
•
Apache License 2.0
•2•24•0•0•Updated Nov 27, 2024
zero-bubble-pipeline-parallelism
Public
Zero Bubble Pipeline Parallelism
Python
•
Other
•2.4k•285•18•0•Updated Nov 14, 2024
Meta-Unlearning
Public
Python
•1•17•1•0•Updated Nov 12, 2024
VocabularyParallelism
Public
Vocabulary Parallelism
Python
•
Other
•2.4k•10•0•0•Updated Nov 11, 2024
sdft
Public
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
language-model self-distillation supervised-finetuning
Shell
•4•100•4•0•Updated Nov 2, 2024
Cheating-LLM-Benchmarks
Public
[SafeGenAi @ NeurIPS 2024] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
Jupyter Notebook
•
MIT License
•0•63•0•0•Updated Oct 23, 2024
P-DoS
Public
[ArXiv 2024] Denial-of-Service Poisoning Attacks on Large Language Models
Python
•2•14•0•0•Updated Oct 22, 2024
CPO
Public
[NeurIPS 2024] The official implementation of the paper "Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs".
Python
•2•67•2•1•Updated Oct 18, 2024
SimLayerKV
Public
The official implementation of the paper "SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction".
Python
•0•40•2•0•Updated Oct 18, 2024
Attention-Sink
Public
[ATTRIB @ NeurIPS 2024 Oral] When Attention Sink Emerges in Language Models: An Empirical View
language-model attention-mechanism large-language-models attention-sink
Python
•
MIT License
•1•30•0•0•Updated Oct 17, 2024
closer-look-LLM-unlearning
Public
The official code of the paper "A Closer Look at Machine Unlearning for Large Language Models".
Python
•1•13•0•0•Updated Oct 11, 2024
regmix
Public
🧬 RegMix: Data Mixture as Regression for Language Model Pre-training
Jupyter Notebook
•
MIT License
•4•90•0•0•Updated Oct 3, 2024
scaling-with-vocab
Public
[NeurIPS-2024] 📈 Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies https://arxiv.org/abs/2407.13623
Python
•4•71•1•0•Updated Sep 26, 2024
sailor-llm
Public
[EMNLP-2024] ⚓️ Sailor: Open Language Models for South-East Asia
indonesia thai language-model sea vietnam lao malay
Python
•
MIT License
•9•114•0•0•Updated Sep 4, 2024
envpool
Public
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
robotics gym high-performance-computing cpp17 box2d vizdoom parallel-processing threadpool pybind11 atari-games
C++
•
Apache License 2.0
•100•1.1k•61•10•Updated Aug 12, 2024
I-FSJ
Public
Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024)
Python
•
MIT License
•7•49•0•0•Updated Aug 3, 2024
dice
Public
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
alignment preference-learning large-language-models rlhf
Python
•
MIT License
•2•39•0•0•Updated Jul 29, 2024
lorahub
Public
[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Python
•
MIT License
•36•599•3•1•Updated Jul 22, 2024
sailcraft
Public
🚢 Data Toolkit for Sailor Language Models
data-deduplication data-cleaning
Python
•8•82•0•0•Updated Jul 11, 2024
Adan
Public
Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
deep-learning optimizer pytorch artificial-intelligence moe resnet vit diffusion mae fairseq
Python
•
Apache License 2.0
•64•762•3•0•Updated Jul 2, 2024
zero-bubble-megatron-deepspeed
Public archive
Zero Bubble Pipeline Parallelism implemented on Megatron-Deepspeed
Python
•
Other
•2.4k•3•0•0•Updated Jun 27, 2024
metaformer
Public
MetaFormer Baselines for Vision (TPAMI 2024)
transformer metaformer starrelu
Python
•
Apache License 2.0
•28•424•6•0•Updated Jun 1, 2024
poolformer
Public
PoolFormer: MetaFormer Is Actually What You Need for Vision (CVPR 2022 Oral)
transformer image-classification mlp pooling pytorch
Python
•
Apache License 2.0
•117•1.3k•12•2•Updated Jun 1, 2024
d4ft
Public
A JAX library for Density Functional Theory.
Python
•
Apache License 2.0
•5•42•16•0•Updated May 4, 2024
finetune-fair-diffusion
Public
Code for the paper "Finetuning Text-to-Image Diffusion Models for Fairness"
text-to-image fairness diffusion-models trustworthy-ai
Python
•
MIT License
•2•39•1•0•Updated Apr 26, 2024
MDT
Public
Masked Diffusion Transformer is the SOTA for image synthesis. (ICCV 2023)
Python
•
Apache License 2.0
•38•528•17•1•Updated Apr 23, 2024
CLoT
Public
[CVPR 2024] Official codebase for the paper "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
association multimodal-deep-learning humor-generation large-language-models leap-of-thought
Python
•15•299•16•1•Updated Apr 13, 2024