gix is a command-line interface (CLI) to access git repositories. It's written to optimize the user experience, and to perform as well as or better than the canonical implementation.
Furthermore it provides an easy and safe to use API in the form of various small crates for implementing your own tools in a breeze. Please see 'Development Status' for a listing of all crates and their capabilities.
Please note that from 2020-09-17, the development speed will be reduced greatly. I will do my best to create at least one commit per day and ramp it up from there to eventually arrive at a new baseline velocity. It will be lower than what it was before, and I hope 1/2 to 2/3 of that will be realistic.
This is entirely unrelated to the project and I still can't wait to use gitoxide on a daily basis once the first high-level commands become available.
- please note that all functionality comes from the gitoxide-core library, which mirrors these capabilities and itself relies on all git-* crates.
- limit the number of threads used in operations that support it.
- choose between 'human' and 'json' output formats
- the gix program - convenient and for humans
  - init - initialize a new non-bare repository with a main branch
  - clone - initialize a local copy of a remote repository
- the gixp program (plumbing) - lower level commands for use in automation
  - pack
    - pack verify
    - pack index verify including each object sha1 and statistics
    - pack explode, useful for transforming packs into loose objects for inspection or restoration
      - verify written objects (by reading them back from disk)
  - pack-receive - receive a whole pack produced by pack-send or git-upload-pack, useful for clone like operations.
  - pack-send - create a pack and send it using the pack protocol to stdout, similar to 'git-upload-pack', for consumption by pack-receive or git-receive-pack
  - pack-index
    - index from data - create an index file by streaming a pack file as done during clone
      - support for thin packs (as needed for fetch/pull)
  - commit-graph
    - verify - assure that a commit-graph is consistent
  - remote-ref-list
    - list all (or given) references from a remote at the given URL
- types to represent hash digests to identify git objects.
- used to abstract over different kinds of hashes, like SHA1 and the upcoming SHA256
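As a rough illustration of what abstracting over hash kinds can look like, here is a minimal sketch. The names Kind, ObjectId and from_hex are assumptions made for this example and are not taken from the actual crate; the hash kind is simply inferred from the length of the hex string.

```rust
use std::fmt;

/// Hypothetical: the kinds of hashes an object id may be made of.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Sha1,
    Sha256,
}

impl Kind {
    /// Length of the raw digest in bytes.
    pub const fn len_in_bytes(self) -> usize {
        match self {
            Kind::Sha1 => 20,
            Kind::Sha256 => 32,
        }
    }
}

/// Hypothetical: an owned object id that remembers which hash produced it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ObjectId {
    kind: Kind,
    bytes: Vec<u8>,
}

impl ObjectId {
    /// Parse a hex string, inferring the hash kind from its length.
    pub fn from_hex(hex: &str) -> Option<Self> {
        if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        let kind = match hex.len() {
            40 => Kind::Sha1,
            64 => Kind::Sha256,
            _ => return None,
        };
        let mut bytes = Vec::with_capacity(kind.len_in_bytes());
        // The input is pure ASCII at this point, so slicing by byte is safe.
        for i in (0..hex.len()).step_by(2) {
            bytes.push(u8::from_str_radix(hex.get(i..i + 2)?, 16).ok()?);
        }
        Some(ObjectId { kind, bytes })
    }

    pub fn kind(&self) -> Kind {
        self.kind
    }
}

impl fmt::Display for ObjectId {
    /// Render the id as lowercase hex.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for b in &self.bytes {
            write!(f, "{:02x}", b)?;
        }
        Ok(())
    }
}
```

Code that only deals with ObjectId values does not need to know which hash is in use, which is the point of the abstraction.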
- decode (zero-copy) borrowed objects
- commit
- tree
- tag
- encode owned objects
- commit
- tree
- tag
- transform borrowed to owned objects
- API documentation with examples
- loose objects
- traverse
- read
- into memory
- streaming
- verify checksum
- streaming write for blobs
- buffer write for small in-memory objects/non-blobs to bring IO down to open-read-close == 3 syscalls
- packs
- traverse pack index
- 'object' abstraction
- decode (zero copy)
- verify checksum
- simple and fast pack traversal
- decode
- full objects
- deltified objects
- streaming
- decode a pack from Read input
  - Read to Iterator of entries
    - read as is, verify hash, and restore partial packs
  - create index from pack alone (much faster than git)
    - resolve 'thin' packs
- encode
- Add support for zlib-ng for 2.5x compression performance and 20% faster decompression
- create new pack
- create 'thin' pack
- verify pack with statistics
- brute force - less memory
- indexed - faster, but more memory
- advanced
- Multi-Pack index file (MIDX)
- 'bitmap' file
- API documentation
- Some examples
- sink
- write objects and obtain id
- alternates
- a database that acts as a link to other known git ODBs on disk
- safe with cycles and recursive configurations
- multi-line with comments and quotes
- multi-odb
- an ODB for object lookup from multiple lower level ODBs at once
- promisor
- It's vague, but these seem to be like index files that allow fetching objects from a server on demand.
- As documented here: https://www.git-scm.com/docs/git-clone#_git_urls
- parse
- ssh URLs and SCP like syntax
- file, git, and SSH
- paths (OS paths, without need for UTF-8)
- username expansion for ssh and git urls
- convert URL to string
- API documentation with examples
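To make the distinction between URL flavors more tangible, the following is a small sketch of how an input could be classified. It is an assumption-laden illustration, not the parser used here: the names Scheme, Classified and classify are made up, user names, hosts and ports are not split out, and Windows drive letters are ignored. The one rule it relies on comes from the git documentation: SCP like syntax is only recognized if there are no slashes before the first colon.

```rust
/// Hypothetical: the URL flavors this sketch can tell apart.
#[derive(Debug, PartialEq, Eq)]
pub enum Scheme {
    File,
    Git,
    Ssh,
    Http,
    Https,
}

/// Hypothetical: the result of classifying an input.
#[derive(Debug, PartialEq, Eq)]
pub struct Classified<'a> {
    pub scheme: Scheme,
    /// Everything after `scheme://`, or the whole input otherwise.
    pub rest: &'a str,
}

pub fn classify(input: &str) -> Classified<'_> {
    // Explicit `scheme://` URLs are checked first.
    for (prefix, scheme) in [
        ("ssh://", Scheme::Ssh),
        ("git://", Scheme::Git),
        ("file://", Scheme::File),
        ("http://", Scheme::Http),
        ("https://", Scheme::Https),
    ] {
        if let Some(rest) = input.strip_prefix(prefix) {
            return Classified { scheme, rest };
        }
    }
    // SCP like syntax, `[user@]host:path`, only counts if no slash
    // appears before the first colon. Note that this sketch would
    // misread a Windows path such as `C:\repo`.
    if let Some(colon) = input.find(':') {
        if colon > 0 && !input[..colon].contains('/') {
            return Classified { scheme: Scheme::Ssh, rest: input };
        }
    }
    // Anything else is treated as a local path.
    Classified { scheme: Scheme::File, rest: input }
}
```

With this, git@github.com:Byron/gitoxide is treated as ssh, whereas ./foo:bar stays a local path because of the slash before the colon.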
- abstract over protocol versions to allow delegates to deal only with a single way of doing things
- credentials
- via git-credentials
- via pure Rust implementation if no git is installed
- fetch & clone
- detailed progress
- control credentials provider to fill, approve and reject
- command: ls-ref
- parse V1 refs as provided during handshake
- parse V2 refs
- handle empty refs, AKA PKT-LINE(zero-id SP "capabilities^{}" NUL capability-list)
- initialize and validate command arguments and features sanely
- abort early for ls-remote capabilities
- packfile negotiation
- delegate can support all fetch features, including shallow, deepen, etc.
- receive parsed shallow refs
- push
- API documentation with examples
- PKT-Line
- encode
- decode (zero-copy)
- error line
- V2 additions
- side-band mode
- Read from packet line with (optional) progress support via sidebands
- Write with built-in packet line encoding
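For readers unfamiliar with the format, here is a minimal sketch of PKT-Line encoding and zero-copy decoding. A line starts with four hex digits stating its total length including those four bytes, and 0000 is the flush packet. The names encode, decode and Line are assumptions for this example and not the API of the crate; the special V2 packets (0001, 0002), error lines and side-band mode are deliberately left out.

```rust
/// Largest payload per line: 65520 bytes in total minus the 4 length bytes.
pub const MAX_DATA_LEN: usize = 65516;

/// Encode `data` as one packet line, or `None` if it is too long.
pub fn encode(data: &[u8]) -> Option<Vec<u8>> {
    if data.len() > MAX_DATA_LEN {
        return None;
    }
    // The length prefix counts itself, hence the `+ 4`.
    let mut out = format!("{:04x}", data.len() + 4).into_bytes();
    out.extend_from_slice(data);
    Some(out)
}

/// Hypothetical: a decoded line borrowing from the input buffer.
#[derive(Debug, PartialEq, Eq)]
pub enum Line<'a> {
    Flush,
    Data(&'a [u8]),
}

/// Decode one line from the start of `input` and return it along with
/// the remaining bytes. Nothing is copied.
pub fn decode(input: &[u8]) -> Option<(Line<'_>, &[u8])> {
    let prefix = input.get(..4)?;
    if !prefix.iter().all(u8::is_ascii_hexdigit) {
        return None;
    }
    let len = usize::from_str_radix(std::str::from_utf8(prefix).ok()?, 16).ok()?;
    match len {
        0 => Some((Line::Flush, &input[4..])),
        // 0001 and 0002 have a meaning in protocol V2, which this sketch skips.
        1..=3 => None,
        _ => {
            let data = input.get(4..len)?;
            Some((Line::Data(data), &input[len..]))
        }
    }
}
```

Encoding the six bytes of "hello\n" yields 000ahello\n, as ten is the payload length plus the four prefix bytes.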
- No matter what we do here, timeouts must be supported to prevent hanging forever and to make interrupts destructor-safe.
- client
  - general purpose connect(…) for clients
    - file:// launches service application
    - ssh:// launches service application in a remote shell using ssh
    - git:// establishes a tcp connection to a git daemon
    - http(s):// establishes connections to web server
    - pass context for scheme specific configuration, like timeouts
  - git://
    - V1 handshake
      - send values + receive data with sidebands
      - support for receiving 'shallow' refs in case the remote repository is shallow itself (I presume)
        - Since V2 doesn't seem to support that, let's skip this until there is an actual need. No completionist :D
    - V2 handshake
      - send command request, receive response with sideband support
  - http(s)://
    - set identity for basic authentication
    - V1 handshake
      - send values + receive data with sidebands
    - V2 handshake
      - send command request, receive response with sideband support
    - 'dumb'
      - we opt out of using this protocol, as it seems too slow to be useful, unless it downloads entire packs for clones?
  - authentication failures are communicated by io::ErrorKind::PermissionDenied, allowing other layers to retry with authentication
- server
  - general purpose accept(…) for servers
- API documentation with examples
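The note above about authentication failures being communicated by io::ErrorKind::PermissionDenied suggests a simple retry pattern for higher layers. The following sketch shows one way it could look. Identity, connect_with_retry and both closures are hypothetical and only stand in for whatever the transport and credentials layers actually provide.

```rust
use std::io;

/// Hypothetical: credentials as a caller might obtain them from a helper.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Identity {
    pub username: String,
    pub password: String,
}

/// Try `connect` anonymously first. If the transport signals
/// `PermissionDenied`, ask for credentials and retry exactly once.
/// All other errors are passed through untouched.
pub fn connect_with_retry<T>(
    mut connect: impl FnMut(Option<&Identity>) -> io::Result<T>,
    mut ask_credentials: impl FnMut() -> io::Result<Identity>,
) -> io::Result<T> {
    match connect(None) {
        Err(err) if err.kind() == io::ErrorKind::PermissionDenied => {
            let identity = ask_credentials()?;
            connect(Some(&identity))
        }
        other => other,
    }
}
```

Keeping the retry outside of the transport means the transport itself never has to know where credentials come from.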
- handle git index files for primary use by the git-repository while crafting new commits
- API documentation with examples
- read-only access
- Graph lookup of commit information to obtain timestamps, generation and parents, and extra edges
- Bloom filter index
- Bloom filter data
- create and update graphs and graph files
- API documentation with examples
- read
- line-wise parsing with decent error messages
- decode value
- boolean
- integer
- color
- path (incl. resolution)
- include
- includeIf
- write
- keep comments and whitespace, and only change lines that are affected by actual changes, to allow truly non-destructive editing
- API documentation with examples
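To illustrate what decoding values involves, here is a small sketch for booleans and integers. It follows the literals and the k, m and g suffixes described in the git-config documentation, but the function names are assumptions and this is not the implementation of the crate. Note that git itself is more lenient in places, so treat this as a simplified model.

```rust
/// Case-insensitive membership test for the literals below.
fn is_one_of(value: &str, words: &[&str]) -> bool {
    words.iter().any(|w| value.eq_ignore_ascii_case(w))
}

/// Decode a boolean. `None` stands for a key that was given without
/// `= value`, which counts as true.
pub fn decode_bool(value: Option<&str>) -> Option<bool> {
    let value = match value {
        None => return Some(true),
        Some(v) => v.trim(),
    };
    if is_one_of(value, &["true", "yes", "on", "1"]) {
        Some(true)
    } else if value.is_empty() || is_one_of(value, &["false", "no", "off", "0"]) {
        Some(false)
    } else {
        None
    }
}

/// Decode an integer with an optional `k`, `m` or `g` suffix, each
/// scaling by a further factor of 1024. Overflow yields `None`.
pub fn decode_integer(value: &str) -> Option<i64> {
    let value = value.trim();
    let (digits, factor) = match value.chars().last()? {
        'k' | 'K' => (&value[..value.len() - 1], 1024),
        'm' | 'M' => (&value[..value.len() - 1], 1024 * 1024),
        'g' | 'G' => (&value[..value.len() - 1], 1024 * 1024 * 1024),
        _ => (value, 1),
    };
    digits.parse::<i64>().ok()?.checked_mul(factor)
}
```

Both functions return None for values they do not understand, leaving it to the caller to produce an error message that names the offending key.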
- initialize
- Proper configuration depending on platform (e.g. ignorecase, filemode, …)
- Signed commits and tags
- clone
- shallow
- namespaces support
- sparse checkout support
- execute hooks
- .gitignore handling
- checkout/stage conversions clean + smudge as in .gitattributes
- rev-parsing and ref history
- worktree
- remotes with push and pull
- configuration
- merging
- stashing
- Use Commit Graph to speed up certain queries
- API documentation with examples
- create a bundle from an archive
- extract a branch from a bundle into a repository
- Handle symbolic references and packed references
- discover them in typical folder structures
- name validation
- API documentation with examples
- read and write a git-index file
- add and remove entries
- API documentation with examples
- diffing of git-object::Tree structures
- diffing, merging, working with hunks of data
- find differences between various states, i.e. index, working tree, commit-tree
- API documentation with examples
- interrupt-handler feature toggle
- Interruption for computations when receiving SIGTERM and SIGINT
- can be entirely disabled with the disable-interrupts feature toggle
- io-pipe feature toggle
- a unix like pipeline for bytes
- parallel feature toggle
- When on…
  - in_parallel
  - join
- When off, all functions execute serially
- fast-sha1
- provides a faster SHA1 implementation using CPU intrinsics
- API documentation
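The idea behind the parallel toggle can be sketched with a join function that either runs two closures concurrently or one after the other. This is only an illustration under assumptions: the runtime parallel flag stands in for what is described above as a compile-time feature toggle, and the signature is made up for this example. It uses scoped threads from the standard library, which require Rust 1.63 or newer.

```rust
use std::thread;

/// Run `a` and `b` and return both results. With `parallel` set, `a`
/// runs on a scoped thread while `b` runs on the current one.
/// Otherwise both execute serially, `a` first.
pub fn join<A, B, RA, RB>(parallel: bool, a: A, b: B) -> (RA, RB)
where
    A: FnOnce() -> RA + Send,
    RA: Send,
    B: FnOnce() -> RB,
{
    if parallel {
        thread::scope(|s| {
            let handle = s.spawn(a);
            let rb = b();
            // Propagate a panic of the spawned closure to the caller.
            let ra = handle.join().expect("closure `a` panicked");
            (ra, rb)
        })
    } else {
        (a(), b())
    }
}
```

As the threads are scoped, both closures may borrow from the stack of the caller, which keeps call sites identical no matter whether work happens in parallel or not.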
- a terminal user interface seeking to replace and improve on tig
- Verify huge packs
- Explode a pack to disk
- Generate huge pack from a lot of loose objects
- A simple git-hours clone
- Open up SQL for git using sqlite virtual tables. Check out gitqlite as well. What would an MVP look like? Maybe even something that could ship with gitoxide.
curl -LSfs https://raw.githubusercontent.com/Byron/gitoxide/main/ci/install.sh | \
sh -s -- --git Byron/gitoxide --crate gix-max-termion
See the releases section for manual installation and various alternative builds that are slimmer or smaller, depending on your needs, for Linux, MacOS and Windows.
cargo is the Rust package manager, which can easily be obtained through rustup. With it, you can build your own binary effortlessly and for your particular CPU for additional performance gains.
# The default installation, 'max'
cargo install gitoxide
# On linux, it's a little faster to compile the termion version, which also results in slightly smaller binaries
cargo install gitoxide --no-default-features --features max-termion
# For smaller binaries and even faster build times that are traded for a less fancy CLI implementation, use `lean`
# or `lean-termion` respectively.
cargo install gitoxide --no-default-features --features lean
Once installed, there are two binaries:
- gix
  - high level commands, porcelain, for every-day use, optimized for a pleasant user experience
- gixp
  - low level commands, plumbing, for use in more specialized cases
- a pure-rust implementation of git
- including transport, object database, references, cli and tui
- a simple command-line interface is provided for the most common git operations, optimized for user experience. A simple-git if you so will.
- be the go-to implementation for anyone who wants to solve problems around git, and become the alternative to GitPython in the process.
- become the foundation for a free distributed alternative to GitHub, and maybe even GitHub itself
- learn from the best to write the best possible idiomatic Rust
- libgit2 is a fantastic resource to see what abstractions work, we will use them
- use Rust's type system to make misuse impossible
- be the best performing implementation
- use Rust's type system to optimize for work not done without being hard to use
- make use of parallelism from the get go
- assure on-disk consistency
- assure reads never interfere with concurrent writes
- assure multiple concurrent writes don't cause trouble
- take shortcuts, but not in quality
  - binaries may use anyhow::Error exhaustively, knowing these errors are solely user-facing.
  - libraries use light-weight custom errors implemented using quick-error or thiserror.
  - internationalization is nothing we are concerned with right now.
  - IO errors due to insufficient amount of open file handles don't always lead to operation failure
- Cross platform support, including Windows
- With the tools and experience available here there is no reason not to support Windows.
- Windows is tested on CI and failures do prevent releases.
- replicate git command functionality perfectly
  - git is git, and there is no reason to not use it. Our path is the one of simplicity to make getting started with git easy.
- be incompatible to git
- the on-disk format must remain compatible, and we will never contend with it.
- use async IO everywhere
- for the most part, git operations are heavily relying on memory mapped IO as well as CPU to decompress data, which doesn't lend itself well to async IO out of the box.
- Use blocking as well as git-features::interrupt to bring operations into the async world and to control long running operations.
- When connecting or streaming over TCP connections, especially when receiving on the server, async seems like a must though, but behind a feature flag.
Provide a CLI for the most basic user journey:
- initialize a repository
- clone a repository
- create a commit
- add a remote
- push
- create (thin) pack
Cargo uses feature toggles to control which dependencies are pulled in, allowing users to specialize crates to fit their usage. Ideally, these should be additive. This guide documents which features are available for each of the crates provided here and how they function.
The top-level command-line interface.
- fast
  - Makes the crate execute as fast as possible by supporting parallel computation of otherwise long-running functions as well as fast, hardware accelerated hashing.
  - If disabled, the binary will be visibly smaller.
- http
  - support synchronous 'http' and 'https' transports (e.g. for clone, fetch and push) at the expense of compile times and binary size
- (mutually exclusive)
  - pretty-cli
    - Use clap 3.0 to build the prettiest, best documented and most user-friendly CLI at the expense of binary size.
    - provides a terminal user interface for detailed and exhaustive progress.
    - provides a line renderer for leaner progress
  - lean-cli
    - Use argh to produce a usable binary with decent documentation that is smallest in size, usually 300kb less than pretty-cli.
    - If pretty-cli is enabled as well, lean-cli will take precedence, and you pay for building unnecessary dependencies.
    - provides a line renderer for lean but pretty progress
- prodash-render-line-crossterm or prodash-render-line-termion (mutually exclusive)
  - The --verbose flag will be powered by an interactive progress mechanism that doubles as log as well as interactive progress that appears after a short duration.
There are convenience features, which combine common choices of the above into one name
- max = pretty-cli + fast + prodash-render-tui-crossterm + http
- default, for unix and windows
- max-termion = pretty-cli + fast + prodash-render-tui-termion + http
- for unix only, faster compile times, a little smaller
- lean = lean-cli + fast + prodash-render-line-crossterm
- for unix and windows, significantly smaller than max, but without --progress terminal user interface.
- lean-termion = lean-cli + fast + prodash-render-line-termion
- for unix only, faster compile times, a little smaller
- light = lean-cli + fast
- cross-platform by nature as this comes with simplified log based progress
- small = lean-cli
- As small as it can possibly be, no threading, no fast sha1, log based progress only, no cleanup of temporary files on interrupt
A crate to help control which capabilities are available from the top-level crate that uses gitoxide-core or any other gitoxide crate that uses git-features.
All feature toggles are additive.
- parallel
  - Use scoped threads and channels to parallelize common workloads on multiple objects. If enabled, it is used everywhere where it makes sense.
  - As caches are likely to be used and instantiated per thread, more memory will be used on top of the costs for threads.
- fast-sha1
  - a multi-crate implementation that can use hardware acceleration, thus bearing the potential for up to 2Gb/s throughput on CPUs that support it, like AMD Ryzen or Intel Core i3.
- mutually-exclusive
  - interrupt-handler
    - Listen to interrupts and termination requests and provide long-running operations tooling to allow aborting the input stream.
      - Note that git_features::interrupt::init_handler() must be called at the start of the application.
    - If the application already sets a handler, this handler will have no effect.
    - If unset, these utilities can still be triggered programmatically. However, interrupting with Ctrl+C or SIGTERM may lead to leaking temporary files.
  - disable-interrupts (takes precedence if interrupt-handler is set as well)
    - If set, interrupts cannot be triggered programmatically and it's up to the user to inject means of supporting interrupts.
    - Useful if there are multiple interruptible operations at the same time that should be triggered independently. After all, this facility is a global one.
    - Probably useful for server implementations.
- io-pipe
  - an in-memory unidirectional pipe using bytes as efficient transfer mechanism
- http-client-curl
  - Adds support for the http and https transports using the Rust bindings for libcurl
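The interrupt facility described above is global and can be triggered programmatically. A minimal sketch of that idea, assuming nothing about the real git_features::interrupt module, could use a global atomic flag and a Read wrapper that aborts the input stream once the flag is set. All names below are hypothetical, and no signal handler is installed in this example.

```rust
use std::io::{self, Read};
use std::sync::atomic::{AtomicBool, Ordering};

/// One flag for the whole process, hence "a global facility".
static IS_INTERRUPTED: AtomicBool = AtomicBool::new(false);

/// Request an interruption, as a signal handler would do.
pub fn trigger() {
    IS_INTERRUPTED.store(true, Ordering::SeqCst);
}

/// Clear a previous request.
pub fn reset() {
    IS_INTERRUPTED.store(false, Ordering::SeqCst);
}

pub fn is_triggered() -> bool {
    IS_INTERRUPTED.load(Ordering::SeqCst)
}

/// Hypothetical: a reader that fails as soon as an interrupt is requested.
pub struct InterruptibleRead<R> {
    pub inner: R,
}

impl<R: Read> Read for InterruptibleRead<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if is_triggered() {
            // Deliberately not `ErrorKind::Interrupted`, which helpers
            // like `read_to_end` would silently retry.
            return Err(io::Error::new(io::ErrorKind::Other, "interrupted by user"));
        }
        self.inner.read(buf)
    }
}
```

Because the flag is shared by everything in the process, independent operations cannot be interrupted separately, which matches the caveat about server implementations above.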
What follows are feature toggles to control serialization of all public-facing simple data types.
- serde1
- Data structures implement serde::Serialize and serde::Deserialize
The feature above is provided by the crates:
- git-object
- git-url
- git-odb
- git-protocol
- gitoxide-core
Both terms come from the git implementation itself, even though it won't necessarily point out which commands are plumbing and which are porcelain.
The term plumbing refers to lower-level, more rarely used commands that complement porcelain by being invoked by it or by hand for certain use cases.
The term porcelain refers to commands with a decent user experience that are primarily intended for use by humans.
In any case, both types of programs must self-document their capabilities through the --help flag.
From there, we can derive a few rules to adhere to unless there are good reasons not to:
- does not show any progress or logging output by default
- if supported and logging is enabled, it will show timestamps in UTC
- it does not need a git repository, but instead takes all required information via the command-line
- Provides output to stderr by default to provide progress information. There is no need to allow disabling it, but it shouldn't show up unless the operation takes some time.
- If timestamps are shown, they are in localtime.
- Non-progress information goes to stdout.
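The split between progress and actual output can be expressed by passing both destinations explicitly. The sketch below is an assumption about how such a rule might be encoded and does not mirror the code of either binary; the report function and its parameters are made up. Passing None for progress corresponds to the first set of rules, where no progress is shown by default.

```rust
use std::io::{self, Write};

/// Write each item to `out` and, if a `progress` sink is given,
/// report the position there. The two streams never mix.
pub fn report(
    items: &[&str],
    mut out: impl Write,
    mut progress: Option<&mut dyn Write>,
) -> io::Result<()> {
    for (idx, item) in items.iter().enumerate() {
        if let Some(progress) = progress.as_mut() {
            writeln!(progress, "[{}/{}] processing", idx + 1, items.len())?;
        }
        writeln!(out, "{}", item)?;
    }
    Ok(())
}
```

A binary would call this with io::stdout() for out and Some(&mut io::stderr()) for progress, so that piping the output into another program never picks up progress messages.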
- fetches using protocol V1 and stateful connections, i.e. ssh, git, file, may hang
- This can be fixed by making response parsing.
- Note that this does not affect cloning, which works fine.
- lean and light and small builds don't support non-UTF-8 paths in the CLI
  - This is because they depend on argh, which does not yet support parsing OsStrings. We however believe it eventually will do so and thus don't move on to pico-args.
- Only one level of sub-commands is supported due to a limitation of argh, which forces porcelain to limit itself as well despite using clap. We deem this acceptable for plumbing commands and think that porcelain will be high-level and smart enough to not ever require deeply nested sub-commands.
- Packfiles use memory maps
- Even though they are comfortable to use and fast, they squelch IO errors.
- potential remedy: We could generalize the Pack to make it possible to work on in-memory buffers directly. That way, one would initialize a Pack by reading the whole file into memory, thus not squelching IO errors at the expense of latency as well as memory efficiency.
- Packfiles cannot load files bigger than 2^31 or 2^32 on 32 bit systems
- As these systems cannot address more memory than that.
- potential remedy: implement a sliding window to map and unmap portions of the file as needed.
- However, those who need to access big packs on these systems would rather resort to git itself, allowing our implementation to be simpler and potentially more performant.
- Objects whose size exceeds what 32 bits can address cannot be loaded on 32 bit systems
- in-memory representations of objects cannot handle objects greater than the amount of addressable memory.
- This should not affect git LFS though.
- CRC32 implementation doesn't use SIMD
- Probably at no cost one could upgrade to the crc32fast crate, but it looks unmaintained and builds more slowly.
- git-url might be more restrictive than what git allows as for the most part, it uses a browser grade URL parser.
- Thus far there is no proof for this, and as potential remedy we could certainly re-implement exactly what git does to handle its URLs.
- itertools (MIT Licensed)
  - We use the izip! macro in code
- deflate2 (MIT Licensed)
  - We use various abstractions to implement decompression and compression directly on top of the rather low-level miniz_oxide crate
This project is licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
- Originally I was really fascinated by this problem and believe that with gitoxide it will be possible to provide the fastest solution for it.
- I have been absolutely blown away by git from the first time I experienced git more than 13 years ago, and tried to implement it in various shapes and forms multiple times. Now with Rust I finally feel I have found the right tool for the job!