Skip to content

Latest commit

 

History

History
253 lines (187 loc) · 13 KB

README.md

File metadata and controls

253 lines (187 loc) · 13 KB

A cloud-native distributed storage system

English | 简体中文

📄 Documents || 🌐 Official Website || 🏠 Forum

✨ Contents

About Curve

Curve is a modern storage system developed by netease, currently supporting file storage(CurveFS) and block storage(CurveBS). Now it's hosted at CNCF as a sandbox project.

The core application scenarios of CurveBS mainly include:

  • the performance, mixed, capacity cloud disk or persistent volume of virtual machine/container, and remote disks of physical machines
  • high-performance separation of storage and computation architecture: high-performance and low latency architecture based on RDMA+SPDK, supporting the separation deployment structure of various databases such as MySQL and Kafka

The core application scenarios of CurveFS mainly include:

  • the cost-effective storage in AI training scene
  • the hot and cold data automation layered storage in big data scenarios
  • the cost-effective shared file storage on the public cloud: It can be used for business scenarios such as AI, big data, file sharing
  • Hybrid storage: Hot data is stored in the local IDC, cold data is stored in public cloud
High Performance | More stable | Easy Operation | Cloud Native
  • High Performance : CurveBS vs CephBS

    CurveBS: v1.2.0

    CephBS: L/N Performance: CurveBS random read and write performance far exceeds CephBS in the block storage scenario.

    Environment:3 replicas on a 6-node cluster, each node has 20xSATA SSD, 2xE5-2660 v4 and 256GB memory.

    Single Vol:

    Multi Vols:

  • More stable

    • The stability of the common abnormal Curve is better than that of Ceph in the block storage scenario.
      Fault Case One Disk Failure Slow Disk Detect One Server Failure Server Suspend Animation
      CephBS jitter 7s Continuous io jitter jitter 7s unrecoverable
      CurveBS jitter 4s no effect jitter 4s jitter 4s
  • Easy Operation

    • We have developed CurveAdm to help O&M staff.
      tools CephAdm CurveAdm
      easy Installation ✔️ ✔️
      easy Deployment ❌(slightly more steps) ✔️
      playground ✔️
      Multi-Cluster Management ✔️
      easy Expansion ❌(slightly more steps) ✔️
      easy Upgrade ✔️ ✔️
      easy to stop service ✔️
      easy Cleaning ✔️
      Deployment environment testing ✔️
      Operational audit ✔️
      Peripheral component deployment ✔️
      easy log reporting ✔️
      Cluster status statistics reporting ✔️
      Error code classification and solutions ✔️
    • Ops CurveBS ops is more friendly than CephBS in the block storage scenario.
      Ops scenarios Upgrade clients Balance
      CephBS do not support live upgrade via plug-in with IO influence
      CurveBS support live upgrade with second jitter auto with no influence on IO
  • Cloud Native

Docking OpenStack
Docking Kubernates
  • Use Curve CSI Driver, The plugin implements the Container Storage Interface(CSI) between Container Orchestrator(CO) and Curve cluster. It allows dynamically provisioning curve volumes and attaching them to workloads.
  • For details of the documentation, see CSI Curve Driver Doc.
Docking PolarDB | PG
  • It serves as the underlying storage base for polardb for postgresql in the form of storage and computation separation, providing data consistency assurance for upper layer database applications, extreme elasticity scaling, and high performance HTAP.

  • Deployment details can be found at PolarDB | PG Advanced Deployment(CurveBS).

More...
  • Curve can also be used as cloud storage middleware using S3-compatible object storage as the data storage engine, providing cost-effective shared file storage for public cloud users.

Curve Architecture

Curve on Hybrid Cloud

Curve supports deployment in private and public cloud environments, and can also be used in a hybrid cloud:

One of them, CurveFS shared file storage system, can be elasticly scaled to public cloud storage, which can provide users with greater capacity elasticity, lower cost, and better performance experience.

Curve on Public Cloud

In a public cloud environment, users can deploy CurveFS clusters to replace the shared file storage system provided by cloud vendors and use cloud disks for acceleration, which can greatly reduce business costs, with the following deployment architecture:

Design Documentation

CurveBS quick start

In order to improve the operation and maintenance convenience of Curve, we designed and developed the CurveAdm project, which is mainly used for deploying and managing Curve clusters. Currently, it supports the deployment of CurveBS & CurveFS (scaleout, upgrade and other functions are under development), please refer to the CurveAdm User Manual for related documentation, and install the CurveAdm tool according to the manual before deploying the Curve cluster.

Deploy an All-in-one experience environment

Please refer to the CurveBS cluster deployment steps in the CurveAdm user manual. For standalone experience, please use the "Cluster Topology File - Standalone Deployment" template.

The command tools' instructions

FIO Curve block storage engine

Fio Curve engine is added, you can clone https://github.com/opencurve/fio and compile the fio tool with our engine(depend on nebd lib), fio command line example:

$ ./fio --thread --rw=randwrite --bs=4k --ioengine=nebd --nebd=cbd:pool//pfstest_test_ --iodepth=10 --runtime=120 --numjobs=10 --time_based --group_reporting --name=curve-fio-test

If you have any questions during performance testing, please check the Curve block storage performance tuning guide.

CurveFS quick start

Please use CurveAdm tool to deploy CurveFS,see CurveFS Deployment Process, and the CurveFS Command Instructions.

Test environment configuration

Please refer to the Test environment configuration

Practical

Governance

See Governance.

Contribute us

Participation in the Curve project is described in the Curve Open Source Community Guidelines and is subject to a contributor contract. We welcome your contribution!

Code of Conduct

Curve follows the CNCF Code of Conduct.

LICENSE

Curve is distributed under the Apache 2.0 LICENSE.

Release Cycle

  • CURVE release cycle:Half a year for major version, 1~2 months for minor version

  • Versioning format: We use a sequence of three digits and a suffix (x.y.z{-suffix}), x is the major version, y is the minor version, and z is for bugfix. The suffix is for distinguishing beta (-beta), RC (-rc) and GA version (without any suffix). Major version x will increase 1 every half year, and y will increase every 1~2 months. After a version is released, number z will increase if there's any bugfix.

Branch

All the developments will be done under master branch. If there's any new version to establish, a new branch release-x.y will be pulled from the master, and the new version will be released from this branch.

Feedback & Contact

  • Github Issues:You are sincerely welcomed to issue any bugs you came across or any suggestions through Github issues. If you have any question you can refer to our FAQ or join our user group for more details.
  • FAQ:Frequently asked question in our user group, and we'll keep working on it.
  • User group:We use Wechat group currently.
  • Double Week Meetings: We have an online community meeting every two weeks which talk about what Curve is doing and planning to do. The time and links of the meeting are public in the user group and Double Week Meetings.