
Baseten

Machine learning infrastructure for developers

Welcome to Baseten

Baseten is an AI infrastructure platform. We combine applied performance research, distributed multi-cloud infrastructure, and developer tooling to run models of all modalities in production.

Get started:

  • Deploy an open-source model in two clicks from the model library.
  • Read our docs to package and serve a fine-tuned or custom model (a minimal Truss sketch follows this list).
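
For the second path, Truss (listed in the repositories below) packages a model as a small Python class plus a config file. The following is a minimal sketch assuming Truss's standard `Model` interface, with a `load()` hook and a `predict()` method; the Hugging Face pipeline and model name are illustrative placeholders, not a Baseten-specific recipe.

```python
# model/model.py: minimal Truss model sketch; the pipeline and model name are placeholders.
from transformers import pipeline


class Model:
    def __init__(self, **kwargs):
        # Truss passes config and secrets in via kwargs; this sketch ignores them.
        self._pipeline = None

    def load(self):
        # Runs once at server startup: load weights and build the pipeline here.
        self._pipeline = pipeline(
            "text-classification",
            model="distilbert-base-uncased-finetuned-sst-2-english",
        )

    def predict(self, model_input):
        # Runs per request: model_input is the parsed JSON body of the request.
        return self._pipeline(model_input["text"])
```

Scaffolding and deployment go through the Truss CLI: `truss init <dir>` generates this layout and `truss push` deploys it to a Baseten account (see the truss repository below for the documented commands).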

Repositories

Showing 10 of 45 repositories
  • truss Public

    The simplest way to serve AI/ML models in production

    Python · MIT license · 936 stars · 76 forks · 62 open issues (5 need help) · Updated Jan 18, 2025
  • .github Public

    Updated Jan 13, 2025
  • truss-examples Public

    Examples of models deployable with Truss

    Python · MIT license · 148 stars · 37 forks · 11 open issues · Updated Jan 13, 2025
  • autoscaler Public Forked from kubernetes/autoscaler

    Autoscaling components for Kubernetes

    Go · Apache-2.0 license · Updated Dec 11, 2024
  • axolotl Public Forked from axolotl-ai-cloud/axolotl

    Go ahead and axolotl questions

    Python · Apache-2.0 license · Updated Nov 7, 2024
  • HackMIT-2024 Public
    Jupyter Notebook · 2 stars · 1 fork · Updated Sep 14, 2024
  • Workshop-TRT-LLM Public

    Python · 16 stars · 11 forks · Updated Jun 26, 2024
  • gpu-operator Public Forked from NVIDIA/gpu-operator

    NVIDIA GPU Operator creates/configures/manages GPUs atop Kubernetes

    Go · Apache-2.0 license · Updated Apr 19, 2024
  • TensorRT-LLM Public Forked from NVIDIA/TensorRT-LLM

    TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines. (A brief usage sketch follows this repository list.)

    C++ · Apache-2.0 license · Updated Apr 2, 2024
  • triton-inference-server Public Forked from triton-inference-server/server

    The Triton Inference Server provides an optimized cloud and edge inferencing solution.

    Python · BSD-3-Clause license · Updated Jan 11, 2024
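
As a rough illustration of the TensorRT-LLM description above, recent releases ship a high-level `LLM` Python API that builds a TensorRT engine from a Hugging Face checkpoint and runs inference on it. This snippet is a sketch rather than an exact quickstart; the model name and sampling settings are placeholders and the API surface varies by release.

```python
# Illustrative sketch of TensorRT-LLM's high-level LLM API (recent releases);
# the model name and sampling parameters are placeholders.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # builds a TensorRT engine on first use
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

outputs = llm.generate(["What does an inference server do?"], sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```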
