forked from openxla/xla
Update OpenXLA-pin to Nov24 #3
Closed
wbmc commented on Dec 5, 2023
- Update from PR
…ary. The library can be (and is) tested. PiperOrigin-RevId: 582142162
…ions CustomKernel and CustomFusion are already unique enough, no need to put them into a unique namespace. PiperOrigin-RevId: 582174192
PiperOrigin-RevId: 582177793
PiperOrigin-RevId: 582199198
Serializing them is in line with all other module properties which affect compilation (aliasing, layout, etc.), and not serializing creates an impure compilation environment where IR does not and can not capture semantics of the module. PiperOrigin-RevId: 582290809
We do not need to check the backend config field. PiperOrigin-RevId: 582292178
PiperOrigin-RevId: 582305685
PiperOrigin-RevId: 582343268
…isualization The HTML codepath has bitrotted, is not tested, and isn't currently working. Let's use the same approach as for fusion visualization, as it is working. PiperOrigin-RevId: 582346207
Imported from GitHub PR openxla#6964 Here is a fix for the upcoming build breaks due to recent changes in the GpuDriver API. I have also fixed the issue with headers in xla/service/gpu/ir_emitter_unnested.cc, which would otherwise generate linker errors on the ROCm platform when TF_HIPBLASLT=0. @xla-rotation: would you have a look, please? Copybara import of the project: -- 14c2e30 by Pavel Emeliyanenko <[email protected]>: fixing buildbrakes -- 22b0962 by Pavel Emeliyanenko <[email protected]>: fixing buildifier warnings Merging this change closes openxla#6964 COPYBARA_INTEGRATE_REVIEW=openxla#6964 from ROCmSoftwarePlatform:ci_rocm_build_brakes_231113 22b0962 PiperOrigin-RevId: 582361553
- Make loop detection more accurate by recording the latest instance of an instruction with a matching fingerprint.
- If the loop value allocation type isn't supported by the optimizer, still allow that tensor to get an alternate memory allocation using the usual MSA algorithm.
- Export the minimum number of loop iterations as a field in the proto.
PiperOrigin-RevId: 582363814
PiperOrigin-RevId: 582369654
Updates LLVM usage to match [ed86e740effa](llvm/llvm-project@ed86e740effa) PiperOrigin-RevId: 582371041
PiperOrigin-RevId: 582385702
… shardings when enumerating sharding strategies for those ops. This is as opposed to the previous approach of using sharding propagation to infer operand shardings given the dot/conv. The previous approach does not work when one is looking to shard the contraction dimension, and is therefore less clean than the new approach. PiperOrigin-RevId: 582397850
PiperOrigin-RevId: 582399236
PiperOrigin-RevId: 582419743
Re-arrange structs/classes declarations in kernel.h to avoid forward declaring arguments types. PiperOrigin-RevId: 582428745
PiperOrigin-RevId: 582440759
PiperOrigin-RevId: 582447653
Add a boolean field, no_parallel_gpu_op, to CollectiveBackendConfig. This field asserts that an asynchronous collective operation does not execute in parallel with other operations in GPU. The default value of the attribute is false, which should lead to conservative runtime behavior. Add BackendConfig test for the field. Add gpu-schedule-postprocessing pass, to refine the attribute value. Add test cases for the pass. PiperOrigin-RevId: 582457930
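The idea of refining a conservative default can be illustrated with a toy model. This is only a sketch in Python, not the actual gpu-schedule-postprocessing pass: the `Op` record, the `refine_no_parallel_gpu_op` helper, and the rule that only ops of kind `"compute"` count as "other operations" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Op:
    """Hypothetical stand-in for a scheduled instruction."""
    name: str
    kind: str                         # "async-start", "async-done", or "compute"
    pair: str = ""                    # id shared by a start/done pair
    no_parallel_gpu_op: bool = False  # conservative default, as in the commit


def refine_no_parallel_gpu_op(schedule: List[Op]) -> List[Op]:
    """Set the flag on async starts that nothing overlaps in this schedule."""
    in_flight: Dict[str, Op] = {}
    for op in schedule:
        if op.kind == "async-start":
            op.no_parallel_gpu_op = True  # optimistic until an overlap is seen
            in_flight[op.pair] = op
        elif op.kind == "async-done":
            in_flight.pop(op.pair, None)
        else:
            # Assumption: a compute op overlaps every collective in flight.
            for start in in_flight.values():
                start.no_parallel_gpu_op = False
    return schedule


schedule = refine_no_parallel_gpu_op([
    Op("ar-start.1", "async-start", pair="ar1"),
    Op("ar-done.1", "async-done", pair="ar1"),
    Op("ag-start.2", "async-start", pair="ag2"),
    Op("fusion.3", "compute"),
    Op("ag-done.2", "async-done", pair="ag2"),
])
flags = {op.name: op.no_parallel_gpu_op for op in schedule if op.kind == "async-start"}
```

In this toy schedule, `ar-start.1` has nothing between its start and done, so the flag can be asserted; `ag-start.2` overlaps `fusion.3` and keeps the conservative `False`.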
PiperOrigin-RevId: 582458600
…nc ops to be efficiently scheduled. PiperOrigin-RevId: 582470977
…that are not necessarily in the same computation as the use. PiperOrigin-RevId: 582495253
…ys and values to std::string_view. I plan to add a caller that has a std::vector<char>, and this saves a copy in that case. PiperOrigin-RevId: 582500430
PiperOrigin-RevId: 582526517
Updates LLVM usage to match [5d6304f01742](llvm/llvm-project@5d6304f01742) PiperOrigin-RevId: 582539795
… integrate PiperOrigin-RevId: 584838739
PiperOrigin-RevId: 584840548
PiperOrigin-RevId: 584841378
Imported from GitHub PR openxla#7201

ncclGetLastError returns the last log entry generated at the "WARN/ERROR" level. Here is an example of the new error:

```
NCCL operation ncclCommInitRank(&comm, nranks, id, rank) failed: unhandled cuda error (run with NCCL_DEBUG=INFO for details). Last NCCL warning(error) log entry (may be unrelated) 'Cuda failure 2 'out of memory''.; current tracing scope: all-reduce-start.285; current profiling annotation: XlaModule:#hlo_module=pjit__wrapped_step_fn,program_id=25#.
```

The new part is:

```
Last NCCL warning(error) log entry (may be unrelated) 'Cuda failure 2 'out of memory''.
```

Copybara import of the project:

-- 348df80 by Frederic Bastien <[email protected]>: Add extra error information when NCCL error out.

Merging this change closes openxla#7201

COPYBARA_INTEGRATE_REVIEW=openxla#7201 from nouiz:nccl_warn_log_as_error_upstream 348df80
PiperOrigin-RevId: 584842170
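As a rough illustration of how the extended message is assembled, the sketch below reproduces the first sentence pair of the example above from its three parts. It is plain Python string formatting, not XLA's C++ code; `format_nccl_error` and its parameter names are hypothetical, and the inputs stand in for what NCCL's error string and `ncclGetLastError` would supply.

```python
def format_nccl_error(expr: str, error_string: str, last_error: str) -> str:
    """Compose the error text, appending the last NCCL WARN/ERROR log entry.

    Hypothetical helper: the layout is copied from the example message in
    the commit description, not from XLA's source.
    """
    return (
        f"NCCL operation {expr} failed: {error_string}. "
        f"Last NCCL warning(error) log entry (may be unrelated) '{last_error}'."
    )


message = format_nccl_error(
    expr="ncclCommInitRank(&comm, nranks, id, rank)",
    error_string="unhandled cuda error (run with NCCL_DEBUG=INFO for details)",
    last_error="Cuda failure 2 'out of memory'",
)
```

The "(may be unrelated)" wording matters: the last log entry is whatever NCCL logged most recently at that level, so it is a hint rather than a guaranteed cause.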
…pu docker image Imported from GitHub PR openxla#7237 The "devel" docker image is not updated and does not seem to be maintained anymore. We should probably recommend a more up-to-date image. Copybara import of the project: -- 3412f30 by Mehdi Amini <[email protected]>: Update build_from_source.md doc to point to latest-gpu docker image The "devel" docker image is not updated and does not seem to be maintained anymore. We should probably recommend a more up-to-date image. -- 4b7a36b by Mehdi Amini <[email protected]>: Update build_from_source.md Merging this change closes openxla#7237 COPYBARA_INTEGRATE_REVIEW=openxla#7237 from joker-eph:patch-2 4b7a36b PiperOrigin-RevId: 584848729
PiperOrigin-RevId: 584852376
Obviously, an experienced XLA engineer would know that IsElementwise/IsOpElementwise/IsElementwiseImpl/IsElementwiseOnOperand are very different functions and one should be very careful when using them. PiperOrigin-RevId: 584873063
This is needed once we want to enable it by default. PiperOrigin-RevId: 584873521
PiperOrigin-RevId: 584878714
PiperOrigin-RevId: 584907007
…n --xla_gpu_autotuner_level=0 is set Instead, pick the first tiling available. This is consistent with autotuner_level=0 behavior in non-deviceless mode, and allows for better QOL while developing without a (matching) GPU. PiperOrigin-RevId: 584908154
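The fallback described here can be sketched as follows. This is an illustrative Python toy, not XLA's autotuner: `pick_tiling`, its parameters, and the example tilings are hypothetical names introduced for the sketch.

```python
from typing import Callable, Optional, Sequence, Tuple

Tiling = Tuple[int, int, int]  # hypothetical (block_m, block_n, block_k)


def pick_tiling(
    candidates: Sequence[Tiling],
    autotune_level: int,
    measure_runtime: Optional[Callable[[Tiling], float]] = None,
) -> Tiling:
    """Pick a tiling; with autotuning disabled, take the first one available."""
    if not candidates:
        raise ValueError("no candidate tilings")
    if autotune_level == 0:
        # No measurement is needed, so this also works without a GPU.
        return candidates[0]
    if measure_runtime is None:
        raise RuntimeError("autotuning requested but nothing to measure with")
    return min(candidates, key=measure_runtime)


tilings = [(32, 32, 32), (64, 64, 32), (128, 64, 16)]
first = pick_tiling(tilings, autotune_level=0)
# Fake timings for illustration only; a real autotuner would run each kernel.
fastest = pick_tiling(tilings, autotune_level=4,
                      measure_runtime=lambda t: abs(t[0] - 64) + 1.0)
```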
…itTritonFusion xla/tests:dot_operation_test_autotune_disabled_gpu_a100 was flaky because of this. PiperOrigin-RevId: 584935941
…rough decoding APIs PiperOrigin-RevId: 584940021
PiperOrigin-RevId: 584941044
This adds the necessary changes in XlaBuilder API, verifier, and shape inference following StableHLO rules for unbounded dynamism. Implicit broadcasting support in XlaBuilder API will be addressed in a follow up CL. PiperOrigin-RevId: 584967526
PiperOrigin-RevId: 585006293
Updates LLVM usage to match [af7a1453526a](llvm/llvm-project@af7a1453526a) PiperOrigin-RevId: 585072558
PiperOrigin-RevId: 585083299
Boundary functions seemed like a nice and easy abstraction for fusions, but they turned out to be too difficult to use in practice. The main problem is that everything is still based on HloInstructions, whose users and operands are difficult to traverse in general. The solution here is to introduce an HloFusionAdaptor class with a simple interface, and an HloInstructionAdaptor which always behaves as if the HLO was completely unfused. If I had more time, I would have made a smaller change. PiperOrigin-RevId: 585087631
This is another step towards tile analysis being able to tile all HLOs. Tiling dot requires some care to ensure that output dimensions are mapped to the appropriate dimensions. The [StableHLO specification for dot_general](https://github.com/openxla/stablehlo/blob/main/docs/spec.md#dot_general) describes how output dimensions are constructed from the input dimensions and the operation's attributes. PiperOrigin-RevId: 585095167
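For reference, the output-dimension rule from the StableHLO `dot_general` specification is: batching dimensions first, then the remaining (non-batching, non-contracting) dimensions of the lhs, then those of the rhs. The Python sketch below records, for each output dimension, which input dimension it comes from. The function name and the tuple layout are assumptions for illustration; this is not the tile-analysis code from the commit.

```python
from typing import List, Optional, Sequence, Tuple

# (size, lhs_dim or None, rhs_dim or None) for one output dimension
OutputDim = Tuple[int, Optional[int], Optional[int]]


def dot_general_output_dims(
    lhs_shape: Sequence[int],
    rhs_shape: Sequence[int],
    lhs_batch: Sequence[int],
    rhs_batch: Sequence[int],
    lhs_contract: Sequence[int],
    rhs_contract: Sequence[int],
) -> List[OutputDim]:
    """Map each output dimension of dot_general back to its input dimension."""
    out: List[OutputDim] = []
    # 1. Batching dimensions, in the order given by the attributes.
    for l, r in zip(lhs_batch, rhs_batch):
        if lhs_shape[l] != rhs_shape[r]:
            raise ValueError("batching dimension sizes must match")
        out.append((lhs_shape[l], l, r))
    # 2. Remaining lhs dimensions, in increasing order.
    for d, size in enumerate(lhs_shape):
        if d not in lhs_batch and d not in lhs_contract:
            out.append((size, d, None))
    # 3. Remaining rhs dimensions, in increasing order.
    for d, size in enumerate(rhs_shape):
        if d not in rhs_batch and d not in rhs_contract:
            out.append((size, None, d))
    return out


# Batched matmul: [8, 16, 32] x [8, 32, 64] -> [8, 16, 64]
dims = dot_general_output_dims(
    lhs_shape=[8, 16, 32], rhs_shape=[8, 32, 64],
    lhs_batch=[0], rhs_batch=[0],
    lhs_contract=[2], rhs_contract=[1],
)
output_shape = tuple(size for size, _, _ in dims)
```

Contracting dimensions never appear in the output, which is why tiling them needs separate handling from the output-to-input mapping shown here.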
They don't use gml_st dialect. PiperOrigin-RevId: 585099478
This op is unsupported by tile analysis. Adding a test so that we don't shoot ourselves in the foot by using `isElementwise` method, for example. PiperOrigin-RevId: 585123986
We can revert this when/if we need this. PiperOrigin-RevId: 585124862
wbmc pushed a commit that referenced this pull request on Jan 19, 2024
…art #3 PiperOrigin-RevId: 599039077
ApsarasX pushed a commit that referenced this pull request on Apr 8, 2024
Currently we look for ptxas and nvlink in a few different places on the host machine, then choose the first binary found without taking its version into account. If the chosen binary doesn't fulfill our version requirements, we fail later, even if a suitable ptxas or nvlink was in the search path in the first place.

This change takes the version of each binary into account when going through the search path. Unsuitable binaries are discarded right away and the search continues until we are out of locations to check. This should help with host environments that have multiple CUDA toolkits installed and should make ptxas and nvlink selection more robust.

The concrete changes:

1. `FindCudaExecutable` now also takes a minimum version and a list of forbidden (think buggy) versions that are supposed to be skipped.
2. `WarnIfBadPtxAsVersion` has been removed. It was checking for ptxas < 11.1, which is way older than our minimum supported version of 11.8, and was not doing anything given the check described in #3.
3. There was another version check for `ptxas` in `NVPTXCompiler::ChooseLinkingMethod` which was checking for `version(ptxas)` < 11.8. This has also been removed/replaced by the version check described in #4.
4. Version checking for `ptxas` and `nvlink` has been consolidated into 2 methods, `FindPtxAsExecutable` and `FindNvLinkExecutable`. These methods hard code the current minimum version (and the list of excluded versions) of each tool in one place. It's still not great but at least less spaghetti-like.

PiperOrigin-RevId: 618797392
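The version-aware search described above can be sketched like this. It is a minimal Python illustration, not the `FindCudaExecutable` implementation: `find_suitable_executable`, the `get_version` callback, and the toolkit paths and version numbers are hypothetical.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

Version = Tuple[int, int, int]


def find_suitable_executable(
    candidates: Iterable[str],
    get_version: Callable[[str], Optional[Version]],
    minimum_version: Version,
    forbidden_versions: Sequence[Version] = (),
) -> Optional[str]:
    """Return the first candidate with an acceptable version, else None."""
    for path in candidates:
        version = get_version(path)
        if version is None:
            continue  # binary missing, or its version could not be determined
        if version < minimum_version:
            continue  # too old: keep searching instead of failing later
        if version in forbidden_versions:
            continue  # known-bad version: skip it
        return path
    return None


# Hypothetical host with two CUDA toolkits installed.
installed = {
    "/usr/local/cuda-11.0/bin/ptxas": (11, 0, 0),
    "/usr/local/cuda-12.3/bin/ptxas": (12, 3, 52),
}
search_path = ["/opt/missing/ptxas", *installed]
chosen = find_suitable_executable(
    search_path, installed.get, minimum_version=(11, 8, 0))
none_left = find_suitable_executable(
    search_path, installed.get, minimum_version=(11, 8, 0),
    forbidden_versions=[(12, 3, 52)])
```

The old behavior corresponds to returning the first path whose version lookup succeeds and checking the version afterwards, which would pick the 11.0 binary here and then fail.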