forked from SearchScale/lucene-cuvs
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This is an initial take on introducing a fully hermetic, reproducible build for lucene-cuvs. Broadly speaking, it does the following:

1. Get all dependencies we need, including CUDA and a base C++ toolchain.
2. Take the base dependencies to build a Clang/LLVM toolchain that is used as host compiler for CUDA.
3. Build the C++ shared object.

This also provides the infrastructure to target AMD GPUs (though the kernels need to be written first :D)

Additionally this commit:

- Changes raft to the new cuvs repo where applicable.
- Bumps all dependencies to more or less the latest version.
- Removes redundant files (pom.xml, CMakeLists.txt, obsolete headers).
- Introduces pre-commit hooks for all files.
1 parent 89afa74, commit ed82e37
Showing 61 changed files with 11,400 additions and 555 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
# Don't inherit PATH and LD_LIBRARY_PATH. | ||
build --incompatible_strict_action_env | ||
|
||
# Use a prebuilt JDK instead of relying on the host's java runtime. | ||
common --java_runtime_version=remotejdk_21 | ||
common --tool_java_runtime_version=remotejdk_21 | ||
|
||
# TODO(aaronmondal): Remove after https://github.com/bazelbuild/bazel/pull/22001 | ||
build --noincompatible_sandbox_hermetic_tmp | ||
|
||
# Enforce C++20 as the default for rules_cc, regardless of toolchain config. | ||
build --cxxopt=-std=c++20 --host_cxxopt=-std=c++20 | ||
|
||
# Since we expect rules_cc targets to be mainly exec_tools, use O3. | ||
build --cxxopt=-O3 --host_cxxopt=-O3 | ||
|
||
# Forbid network access unless explicitly enabled. | ||
build --sandbox_default_allow_network=false | ||
|
||
# Use correct runfile locations. | ||
build --nolegacy_external_runfiles | ||
|
||
# Enable sandboxing for exclusive tests like GPU performance tests. | ||
test --incompatible_exclusive_test_sandboxed | ||
|
||
# Make sure rules_cc uses the correct transition mechanism. | ||
build --incompatible_enable_cc_toolchain_resolution | ||
|
||
# Propagate tags such as no-remote for precompilations to downstream actions. | ||
common --incompatible_allow_tags_propagation | ||
|
||
# Bzlmod configuration. | ||
common --enable_bzlmod | ||
common --registry=https://raw.githubusercontent.com/bazelbuild/bazel-central-registry/main/ | ||
common --registry=https://raw.githubusercontent.com/eomii/bazel-eomii-registry/main/ | ||
|
||
# Remote optimizations. | ||
build --remote_build_event_upload=minimal | ||
build --remote_download_minimal | ||
build --nolegacy_important_outputs | ||
|
||
# Smaller profiling. Careful. Disabling this might explode remote cache usage. | ||
build --slim_profile | ||
build --experimental_profile_include_target_label | ||
build --noexperimental_profile_include_primary_output | ||
|
||
# Nix-generated action env for rules_ll. | ||
try-import %workspace%/.bazelrc.ll | ||
|
||
# Nix-generated flags for LRE. | ||
try-import %workspace%/.bazelrc.lre | ||
|
||
# Allow user-side customization. | ||
try-import %workspace%/.bazelrc.user |
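
The three `try-import` lines that close this file (which appears to be the `.bazelrc`) are optional: Bazel skips a `try-import` whose target file does not exist. As a rough sketch, not part of this commit, the Python helper below reports which of those optional files are present in a workspace. The function name `optional_rc_status` is hypothetical; only the `try-import %workspace%/...` syntax is taken from the file above.

```python
import sys
from pathlib import Path


def optional_rc_status(workspace: Path) -> dict[str, bool]:
    """Map each `try-import` target in <workspace>/.bazelrc to whether it exists."""
    status: dict[str, bool] = {}
    for raw in (workspace / ".bazelrc").read_text().splitlines():
        line = raw.strip()
        if not line.startswith("try-import "):
            continue  # Comments, flags, and plain `import` lines are ignored here.
        target = line.split(None, 1)[1].strip()
        # Bazel expands %workspace% to the workspace root.
        resolved = Path(target.replace("%workspace%", str(workspace)))
        status[target] = resolved.is_file()
    return status


if __name__ == "__main__":
    # Hypothetical usage: pass the workspace root as the first argument.
    root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path.cwd()
    if (root / ".bazelrc").is_file():
        for target, present in optional_rc_status(root).items():
            print(f"{target}: {'found' if present else 'missing (skipped by Bazel)'}")
```

A missing `.bazelrc.ll` or `.bazelrc.lre` most likely means the Nix environment that generates them has not been entered yet, while a missing `.bazelrc.user` is the normal state for a fresh checkout.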
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
8.0.0-pre.20240516.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
use flake --impure |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,26 @@ | ||
build | ||
target | ||
target | ||
|
||
# Generated by Bazel | ||
/bazel-* | ||
|
||
# Generated by rules_ll | ||
/.bazelrc.ll | ||
|
||
# Generated by LRE | ||
/.bazelrc.lre | ||
|
||
# Custom user-side configuration | ||
/.bazelrc.user | ||
|
||
# Generated by direnv | ||
/.direnv | ||
|
||
# Generated by the pre-commit nix flake module | ||
/.pre-commit-config.yaml | ||
|
||
# Generated by `ll up` | ||
/kustomize.yaml | ||
|
||
# NativeLink's local Pulumi stack. | ||
/Pulumi.dev.yaml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# See `lucene` and `cuda` subdirectories. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
module( | ||
name = "lucene-cuvs", | ||
version = "0.0.0", | ||
compatibility_level = 0, | ||
) | ||
|
||
# Platform support. A base requirement for everything else. | ||
# | ||
# See: https://github.com/bazelbuild/platforms | ||
bazel_dep(name = "platforms", version = "0.0.10") | ||
|
||
# C++ rules. Don't use Bazel's legacy builtin rules. | ||
# | ||
# See: https://github.com/bazelbuild/rules_cc | ||
bazel_dep(name = "rules_cc", version = "0.0.9") | ||
|
||
# Basic starlark extensions. Always good to have available. | ||
# | ||
# See: https://github.com/bazelbuild/bazel-skylib | ||
bazel_dep(name = "bazel_skylib", version = "1.6.1") | ||
|
||
# Java rules. Don't use Bazel's legacy builtin rules. | ||
# | ||
# See: https://bazel.build/reference/be/java for the rules, | ||
# https://github.com/bazelbuild/rules_java for the repository. | ||
bazel_dep(name = "rules_java", version = "7.5.0") | ||
|
||
# Bug-fix to prevent an annoying debug message because of duplicate maven repos. | ||
# | ||
# TODO(aaronmondal): Remove this after: | ||
# https://github.com/bazelbuild/rules_jvm_external/issues/916 | ||
bazel_dep(name = "protobuf", version = "26.0.bcr.1") | ||
|
||
# Java dependencies. | ||
# | ||
# Run `bazel query "@maven//..."` to print available targets. | ||
# | ||
# See: https://github.com/bazelbuild/rules_jvm_external/blob/master/docs/bzlmod.md | ||
bazel_dep(name = "rules_jvm_external", version = "6.1") | ||
|
||
maven = use_extension("@rules_jvm_external//:extensions.bzl", "maven") | ||
maven.install( | ||
artifacts = [ | ||
"org.apache.lucene:lucene-core:9.9.0", | ||
"org.apache.lucene:lucene-codecs:9.9.0", | ||
"com.opencsv:opencsv:5.3", | ||
"commons-io:commons-io:2.15.1", | ||
"com.github.fommil:jniloader:1.1", | ||
], | ||
lock_file = "//:maven_install.json", | ||
) | ||
use_repo(maven, "maven", "unpinned_maven") | ||
|
||
# JNI rules for C++/Java FFIs. | ||
# | ||
# We apply some visibility patching to directly access the jni headers. | ||
# | ||
# See: https://github.com/fmeum/rules_jni | ||
bazel_dep(name = "rules_jni", version = "0.9.1") | ||
git_override( | ||
module_name = "rules_jni", | ||
commit = "7cb9c69d4d1f9ca2fae93d21d9c3498a9d0657a0", | ||
patch_strip = 1, | ||
patches = ["//patches:rules_jni_public_headers.diff"], | ||
remote = "https://github.com/fmeum/rules_jni", | ||
) | ||
|
||
# The Clang/LLVM toolchain and rules for CUDA compilation. | ||
# | ||
# See: https://github.com/eomii/rules_ll | ||
bazel_dep(name = "rules_ll", version = "0") | ||
git_override( | ||
module_name = "rules_ll", | ||
# Note: Keep this commit in sync with the one in flake.nix. | ||
commit = "3ee809512cfb605a00fe5eb938eab0e4f8705204", | ||
remote = "https://github.com/eomii/rules_ll", | ||
) | ||
|
||
# We need explicit access to the `@llvm-project` workspace for OpenMP. The | ||
# `llvm_project_overlay` extension aggregates patches across all modules. This | ||
# means that rules_ll's patches remain implicitly applied and caches are | ||
# identical with any other project using rules_ll at the same commit. | ||
# | ||
# Don't mix this module up with the `llvm-project` module in the | ||
# `bazel-central-registry`. The module we're using here is from the | ||
# `bazel-eomii-registry`. Upstreaming the patch aggregation logic or finding | ||
# a different solution is still a work in progress at | ||
# https://github.com/llvm/llvm-project/pull/88927. | ||
# | ||
# See: https://github.com/eomii/bazel-eomii-registry/tree/main/modules/llvm-project-overlay | ||
bazel_dep(name = "llvm-project-overlay", version = "17-init-bcr.3") | ||
|
||
llvm_project_overlay = use_extension( | ||
"@llvm-project-overlay//utils/bazel:extensions.bzl", | ||
"llvm_project_overlay", | ||
) | ||
use_repo( | ||
llvm_project_overlay, | ||
"llvm-project", | ||
) | ||
|
||
# The demo dataset. Available via the `@dataset//file` Bazel target. | ||
# | ||
# See: https://bazel.build/rules/lib/repo/http#http_file | ||
http_file = use_repo_rule( | ||
"@bazel_tools//tools/build_defs/repo:http.bzl", | ||
"http_file", | ||
) | ||
|
||
http_file( | ||
name = "dataset", | ||
downloaded_file_path = "dataset.zip", # Must have a `.zip` extension. | ||
integrity = "sha256-gHb64BruF4r2U+dxbdKoz1yJqHIOXzQxyrtb0va32L4=", | ||
url = "https://cdn.openai.com/API/examples/data/vector_database_wikipedia_articles_embedded.zip", | ||
) | ||
|
||
# External dependencies. See the `thirdparty` directory for build files. | ||
# | ||
# See: https://bazel.build/external/extension | ||
lucene_cuvs_deps = use_extension( | ||
"@lucene-cuvs//:extensions.bzl", | ||
"lucene_cuvs_dependencies", | ||
) | ||
use_repo( | ||
lucene_cuvs_deps, | ||
"cccl", # https://github.com/NVIDIA/cccl | ||
"cutlass", # https://github.com/NVIDIA/cutlass | ||
"cuvs", # https://github.com/rapidsai/cuvs | ||
"fmt", # https://github.com/fmtlib/fmt | ||
"local-remote-execution", # https://github.com/TraceMachina/nativelink/tree/main/local-remote-execution | ||
"raft", # https://github.com/rapidsai/raft | ||
"rmm", # https://github.com/rapidsai/rmm | ||
"spdlog", # https://github.com/gabime/spdlog | ||
) |
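
The `integrity` attribute on the `dataset` `http_file` above uses the Subresource Integrity format: the hash algorithm name, a dash, then the base64-encoded digest. Command-line tools such as `sha256sum` print hex instead, so comparing the two requires a conversion. The following is a minimal sketch, not part of this commit; the helper names `sri_to_hex` and `sri_of_file` are hypothetical.

```python
import base64
import hashlib
from pathlib import Path


def sri_to_hex(integrity: str) -> tuple[str, str]:
    """Split an SRI string like 'sha256-<base64>' into (algorithm, hex digest)."""
    algorithm, _, encoded = integrity.partition("-")
    digest = base64.b64decode(encoded, validate=True)
    return algorithm, digest.hex()


def sri_of_file(path: Path, algorithm: str = "sha256") -> str:
    """Compute the SRI string for a local file (reads the whole file into memory)."""
    digest = hashlib.new(algorithm, path.read_bytes()).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # Value copied from the `http_file` declaration above.
    algo, hex_digest = sri_to_hex("sha256-gHb64BruF4r2U+dxbdKoz1yJqHIOXzQxyrtb0va32L4=")
    print(algo, hex_digest)
```

When the dataset URL changes, running `sri_of_file` on a freshly downloaded copy gives a value that can be pasted into the `integrity` attribute, provided the download itself is trusted.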