Add gsoc 2024 projects
Chaitanya-Shahare committed Oct 18, 2024
1 parent 0dae9ad commit 8707b65
Showing 4 changed files with 291 additions and 53 deletions.
11 changes: 10 additions & 1 deletion content/gsoc-projects/_index.md
@@ -6,10 +6,19 @@ tags: []
draft: false
---



## GSoC 2024 Projects

Welcome, prospective Google Summer of Code 2024 students! This document is your starting point for finding interesting and important projects for LLVM, Clang, and other related sub-projects. This list was not developed only for Google Summer of Code; it consists of open projects that really need developers and would greatly benefit the LLVM community.

We encourage you to look through this list and see which projects excite you and match your skill set. We also invite proposals that are not on this list. More information and discussion about GSoC can be found on [Discourse](https://discourse.llvm.org/c/community/gsoc). If you have questions about a particular project, please find the relevant entry on Discourse, check the previous discussion, and ask. If there is no such entry, or you would like to propose an idea, please create a new entry. Feedback from the community is a requirement for your proposal to be considered and hopefully accepted.

The LLVM project has participated in Google Summer of Code for several years and has had some very successful projects. We hope that this year is no different and look forward to hearing your proposals. For information on how to submit a proposal, please visit the Google Summer of Code main [website](https://summerofcode.withgoogle.com/).


{{< datamap
- "gsoc_projects"
+ "gsoc_2024_projects"
"title"
"description"
"expected_result"
1 change: 1 addition & 0 deletions data/gsoc_2023_projects.yml
@@ -0,0 +1 @@
gsoc_2023_projects:
280 changes: 280 additions & 0 deletions data/gsoc_2024_projects.yml
@@ -0,0 +1,280 @@
gsoc_2024_projects:
- title: "Remove undefined behavior from tests"
description: |
Many of LLVM's unit tests have been reduced automatically from larger tests. Previous-generation reduction tools used undef and poison as placeholders everywhere and also introduced undefined behavior (UB). Tests with UB are not desirable because 1) they are fragile, since in the future the compiler may start optimizing more aggressively and break the test, and 2) they break translation validation tools such as Alive2 (since it's correct to translate a function that is always UB into anything).
The major steps include:
1. Replace known patterns such as branch on undef/poison, memory accesses with invalid pointers, etc., with non-UB patterns.
2. Use Alive2 to detect further patterns (by searching for tests that are always UB).
3. Report any LLVM bug found by Alive2 that is exposed when removing UB.
expected_result: "The majority of LLVM's unit tests will be free of UB."
skills: "Experience with scripting (Python or PHP) is required. Experience with regular expressions is encouraged."
project_size: "Either medium or large."
difficulty: "Medium"
mentors:
- name: "Nuno Lopes"
url: "https://web.ist.utl.pt/nuno.lopes/"
discourse_url: "https://discourse.llvm.org/t/gsoc-2004-remove-undefined-behavior-from-tests/77236"
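
As a rough illustration of step 1 above, the following sketch scans textual LLVM IR for conditional branches whose condition is the literal `undef` or `poison`. It is not part of the project or of any LLVM tooling; the function name `find_ub_branches`, the regular expression, and the sample IR are assumptions made up for this example, and a real script would need to cover many more patterns.

```python
import re

# Hypothetical pattern: a conditional branch on the literal undef or poison.
UB_BRANCH = re.compile(r"^\s*br\s+i1\s+(undef|poison)\s*,")


def find_ub_branches(ir_text):
    """Return (line_number, condition) pairs for branches on undef/poison."""
    hits = []
    for number, line in enumerate(ir_text.splitlines(), start=1):
        match = UB_BRANCH.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


# Made-up test input: only the first branch uses an undef condition.
SAMPLE_IR = """\
define void @f(i1 %c) {
entry:
  br i1 undef, label %a, label %b
a:
  br i1 %c, label %b, label %a
b:
  ret void
}
"""

if __name__ == "__main__":
    for number, condition in find_ub_branches(SAMPLE_IR):
        print(f"line {number}: branch on {condition}")
```

A detection pass like this only locates candidates; choosing a non-UB replacement (for example, branching on a function argument) still needs care so the test keeps exercising what it was written for.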

- title: "Automatically generate TableGen file for SPIR-V instruction set"
description: |
The existing file that describes the SPIR-V instruction set in LLVM was manually created and is not always complete or up to date. Whenever new instructions need to be added to the SPIR-V backend, the file must be amended. In addition, since it is not created in a systematic way, there are often slight discrepancies between how an instruction is described in the SPIR-V spec and how it is declared in the TableGen file.
This project proposes creating a script capable of generating a complete TableGen file that describes the SPIR-V instruction set given the JSON grammar available in the KhronosGroup/SPIRV-Headers repository, and updating SPIR-V backend code to use the new definitions.
expected_result: |
1. The SPIR-V instruction set's definition in TableGen is replaced with autogenerated content.
2. A script and documentation are provided to regenerate definitions from the JSON grammar.
3. SPIR-V backend is updated to use new autogenerated definitions.
skills: "Experience with scripting and intermediate knowledge of C++. Previous experience with LLVM/TableGen is a bonus but not required."
project_size: "Medium (175 hours)"
mentors:
- name: "Natalie Chouinard"
url: "https://github.com/sudonatalie/"
- name: "Nathan Gauër"
url: "https://github.com/keenuts/"
discourse_url: "https://discourse.llvm.org/t/clang-automatically-generate-tablegen-file-for-spir-v-instruction-set/76369"
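
A minimal sketch of the generation idea, assuming the JSON grammar exposes an `instructions` array whose entries carry `opname`, `opcode`, and an optional `operands` list with a `kind` field. The TableGen class `SketchInst` and the output layout are hypothetical and do not reflect the definitions used by the SPIR-V backend.

```python
import json


def emit_tablegen(grammar_json):
    """Turn a SPIR-V-style JSON grammar into hypothetical TableGen defs."""
    grammar = json.loads(grammar_json)
    lines = []
    for inst in grammar["instructions"]:
        # Operand kinds are emitted as a list of strings for simplicity.
        kinds = ", ".join(f'"{op["kind"]}"' for op in inst.get("operands", []))
        # SketchInst is a made-up class name used only in this example.
        lines.append(f'def {inst["opname"]} : SketchInst<{inst["opcode"]}, [{kinds}]>;')
    return "\n".join(lines)


# Tiny hand-written grammar fragment used as sample input.
SAMPLE_GRAMMAR = json.dumps({
    "instructions": [
        {"opname": "OpNop", "opcode": 0},
        {"opname": "OpUndef", "opcode": 1,
         "operands": [{"kind": "IdResultType"}, {"kind": "IdResult"}]},
    ]
})

if __name__ == "__main__":
    print(emit_tablegen(SAMPLE_GRAMMAR))
```

Generating every definition from one source keeps the backend's view of an instruction in step with the grammar, which is the discrepancy the project description points at.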

- title: "LLVM bitstream integration with CAS (content-addressable storage)"
description: |
The LLVM bitstream file format is used for serialization of intermediate compiler artifacts, such as LLVM IR or Clang modules. This project aims to integrate the LLVM CAS library into the LLVM bitstream file format by factoring out frequently duplicated parts of bitstream files into separate CAS objects, reducing storage requirements.
expected_result: |
There's a way to configure the LLVM bitstream writer/reader to use CAS as the backing storage.
skills: "Intermediate knowledge of C++, familiarity with data serialization, and self-motivation."
project_size: "Medium or large"
mentors:
- name: "Jan Svoboda"
url: "https://github.com/jansvoboda11/"
- name: "Steven Wu"
url: "https://github.com/cachemeifyoucan/"
discourse_url: "https://discourse.llvm.org/t/llvm-bitstream-integration-with-cas-content-addressable-storage/76757"
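
To illustrate why content addressing reduces storage, here is a toy in-memory store keyed by the SHA-256 digest of each blob. It is only a sketch of the general idea; the class `SketchCAS` and its methods are invented for this example and are unrelated to the LLVM CAS library's actual interface.

```python
import hashlib


class SketchCAS:
    """Toy content-addressable store: identical blobs share one entry."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so duplicates collapse.
        key = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def __len__(self):
        return len(self._objects)


if __name__ == "__main__":
    store = SketchCAS()
    first = store.put(b"shared abbreviation table")
    second = store.put(b"shared abbreviation table")
    print(first == second, len(store))
```

Two files that embed the same block would each hold only the key, while the block itself is stored once.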

- title: "Add 3-way comparison intrinsics"
description: |
3-way comparisons return -1, 0, or 1 based on whether values are lower, equal, or greater. The goal of this project is to implement new 3-way comparison intrinsics and improve optimization results by implementing legalization/expansion support in LLVM's backend and integrating them into the clang and rustc frontends.
expected_result: |
Full support for intrinsics in backend and optimization passes, ideally with frontend integration.
skills: "Intermediate knowledge of C++."
project_size: "Medium or large"
difficulty: "Medium"
mentors:
- name: "Nikita Popov"
url: "https://github.com/nikic"
- name: "Dhruv Chawla"
url: "https://github.com/dc03"
discourse_url: "https://discourse.llvm.org/t/llvm-add-3-way-comparison-intrinsics/76807"
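
The semantics described above can be sketched in a few lines. The helper names `three_way_compare` and `to_signed` are assumptions for this example; the second helper shows why signed and unsigned interpretations of the same bit pattern can give different results.

```python
def three_way_compare(a, b):
    """Return -1, 0, or 1 when a is lower than, equal to, or greater than b."""
    return (a > b) - (a < b)


def to_signed(value, bits):
    """Reinterpret an unsigned bit pattern as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


if __name__ == "__main__":
    pattern = 0xFF  # 255 when unsigned, -1 as a signed 8-bit value
    print(three_way_compare(pattern, 1))                              # unsigned view
    print(three_way_compare(to_signed(pattern, 8), to_signed(1, 8)))  # signed view
```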

- title: "Improve the LLVM.org Website Look and Feel"
description: |
The llvm.org website serves as the central hub for information about the LLVM project. Over time, the website has evolved organically, prompting the need for a redesign to enhance its modernity, structure, and ease of maintenance. This project aims to create a modern static website that improves navigation, taxonomy, and usability, reflecting the essence of LLVM.org.
expected_result: |
A modern, coherent website that improves navigation, content discoverability, mobile support, and accessibility. The project will involve community engagement to gather feedback and ensure a successful implementation.
skills: "Knowledge in web development with static site generators, HTML, CSS, Bootstrap, and Markdown."
difficulty: "Hard"
project_size: "Large"
mentors:
- name: "Tanya Lattner"
url: "https://github.com/tlattner"
- name: "Vassil Vassilev"
url: "https://github.com/vgvassilev"
discourse_url: "https://discourse.llvm.org/t/improve-the-llvm-org-website-look-and-feel/76864"

- title: "Out-of-process execution for clang-repl"
description: |
The Clang compiler supports various languages such as C, C++, ObjC, and ObjC++. Clang-Repl is an efficient interpreter that makes the C++ language more user-friendly, using the ORCv2 JIT infrastructure within the same process. However, this design has two significant drawbacks: it can't be used on devices with limited resources, and crashes in user code crash the entire process. This project aims to move Clang-Repl to an out-of-process execution model to address these issues.
expected_result: |
Implement out-of-process execution of statements with Clang-Repl.
Demonstrate support for some ez-clang use cases.
Research restart/continue approaches upon crashes.
Stretch goal: Design versatile reliability approach for crash recovery.
skills: "Intermediate knowledge of C++, understanding of LLVM and the LLVM JIT in particular."
project_size: "Either medium or large."
difficulty: "Medium"
mentors:
- name: "Vassil Vassilev"
url: "https://github.com/vgvassilev"
discourse_url: "https://discourse.llvm.org/t/clang-out-of-process-execution-for-clang-repl/68225"

- title: "Support clang plugins on Windows"
description: |
Clang supports extending the compiler with plugins, enabling extra user-defined actions during compilation. While plugins work on Unix and Darwin, they are not supported on Windows due to platform differences. This project aims to expose the participant to a broad cross-section of the LLVM codebase, classifying APIs and implementing changes to support plugins on Windows.
expected_result: |
Implement clang plugin support on Windows.
Extend the working prototype and the annotation tool.
skills: "Intermediate knowledge of C++, experience with Windows compilation and linking model."
project_size: "Either medium or large."
difficulty: "Medium"
mentors:
- name: "Vassil Vassilev"
url: "https://github.com/vgvassilev"
- name: "Saleem Abdulrasool"
url: "https://github.com/compnerd"
discourse_url: "https://discourse.llvm.org/t/support-clang-plugins-on-windows/76408"

- title: "On Demand Parsing in Clang"
description: |
Clang currently parses a sequence of characters as they appear, linearly. However, most end-user code only uses a small portion of the entire translation unit. This project proposes an on-demand parsing approach where C++ entities that are expensive to compile are processed only when required. This approach aims to reduce peak memory usage and improve compile times for sparse translation units.
expected_result: |
Design and implement on-demand compilation for non-templated functions and classes.
Run performance benchmarks on relevant codebases and prepare a report.
Prepare a community RFC document.
Stretch goal: Support templates.
skills: "Knowledge of C++, deeper understanding of Clang, Clang AST and Preprocessor."
project_size: "Large"
difficulty: "Hard"
mentors:
- name: "Vassil Vassilev"
url: "https://github.com/vgvassilev"
- name: "Matheus Izvekov"
url: "https://github.com/mizvekov"
discourse_url: "https://discourse.llvm.org/t/on-demand-parsing-in-clang/76912"

- title: "Improve Clang-Doc Usability"
description: |
Clang-Doc is a C/C++ documentation generation tool created as an alternative to Doxygen. While it can generate documentation in Markdown and HTML, it has usability issues, lacks support for some constructs, and doesn't scale well for large codebases. This project aims to improve Clang-Doc to the point where it can be used to generate documentation for large projects like LLVM.
expected_result: |
Improve usability of Clang-Doc and resolve existing limitations.
Enable Clang-Doc to generate documentation for large codebases like LLVM.
skills: "Experience with web technologies (HTML, CSS, JS), intermediate knowledge of C++. Experience with Clang/LibTooling is a bonus."
project_size: "Either medium or large."
difficulty: "Medium"
mentors:
- name: "Petr Hosek"
url: "https://github.com/petrhosek"
- name: "Paul Kirth"
url: "https://github.com/ilovepi"
discourse_url: "https://discourse.llvm.org/t/improve-clang-doc-usability/76996"

- title: "Rich Disassembler for LLDB"
description: |
This project aims to annotate LLDB’s disassembler output with the location and lifetime of source variables using variable location information from debug info. This rich disassembler output should be exposed as structured data and made available through LLDB’s scripting API. In a terminal, LLDB should render the annotations as text.
expected_result: |
Produce annotated disassembly showing variable lifetime and location.
Expose rich disassembler output through LLDB’s scripting API.
skills: |
Required:
- Good understanding of C++
- Familiarity with using a debugger on the terminal
- Familiarity with assembler dialects for machine code (x86_64 or AArch64)
Desired:
- Compiler knowledge, including data flow and control flow analysis
- Experience navigating debug information (DWARF)
project_size: "Medium (~175h)"
difficulty: "Hard"
mentors:
- name: "Adrian Prantl"
url: "mailto:[email protected]"
- name: "Jonas Devlieghere"
url: "mailto:[email protected]"
discourse_url: "https://discourse.llvm.org/t/rich-disassembler-for-lldb/76952"

- title: "GPU Delta Debugging"
description: |
llvm-reduce and similar tools perform delta debugging but are less useful if many implicit constraints exist. This project aims to develop a GPU-aware version, especially for execution-time bugs, that can be used in conjunction with LLVM/OpenMP GPU record-and-replay or a GPU loader script to minimize GPU test cases more efficiently and effectively.
expected_result: "A tool to reduce GPU errors without losing the original error. Optionally, other properties could also be the focus of the reduction."
skills: "Good understanding of C++, familiarity with GPUs and LLVM-IR."
project_size: "Medium"
difficulty: "Medium"
mentors:
- name: "Konstantinos Parasyris"
url: "mailto:[email protected]"
- name: "Johannes Doerfert"
url: "mailto:[email protected]"
discourse_url: "https://discourse.llvm.org/t/gsoc-2024-gpu-delta-debugging/77237"
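
For readers unfamiliar with delta debugging, the sketch below shows a simplified, greedy chunk-removal loop in that style. It is not llvm-reduce's algorithm and knows nothing about GPUs; the function `reduce_case` and the predicate `still_fails` are hypothetical names, with the predicate standing in for "the original error still reproduces".

```python
def reduce_case(items, still_fails):
    """Greedily drop chunks of items while the failure still reproduces."""
    chunk = max(len(items) // 2, 1)
    while chunk >= 1:
        index = 0
        while index < len(items):
            # Try the input with one chunk removed.
            candidate = items[:index] + items[index + chunk:]
            if candidate and still_fails(candidate):
                items = candidate  # keep the smaller input, retry same index
            else:
                index += chunk     # this chunk is needed, move on
        chunk //= 2                # retry with finer-grained chunks
    return items


if __name__ == "__main__":
    # Made-up failure condition: the bug needs both element 3 and element 7.
    needs_both = lambda case: 3 in case and 7 in case
    print(reduce_case(list(range(10)), needs_both))
```

The implicit constraints mentioned in the description correspond to predicates that reject many candidates for reasons unrelated to the bug, which is where a domain-aware reducer could save work.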

- title: "Offloading libcxx"
description: |
Modern C++ defines parallel algorithms as part of the standard library. This project aims to extend the implementation of these algorithms using OpenMP, including GPU offload where reasonable. The goal is to explore different algorithms and options for executing them on both the host and accelerator devices, especially GPUs, automatically via OpenMP.
expected_result: "Improvements to the prototype support of offloading in libcxx. Evaluations against other offloading approaches and documentation on missing parts and shortcomings."
skills: "Good understanding of C++ and C++ standard algorithms, familiarity with GPUs and (OpenMP) offloading."
project_size: "Large"
difficulty: "Medium"
mentors:
- name: "Johannes Doerfert"
url: "mailto:[email protected]"
- name: "Tom Scogland"
url: "mailto:[email protected]"
- name: "Tom Deakin"
url: "mailto:[email protected]"
discourse_url: "https://discourse.llvm.org/t/gsoc-2024-offloading-libcxx/77238"

- title: "The 1001 thresholds in LLVM"
description: |
LLVM has many thresholds and flags to avoid costly cases, but it is unclear whether these thresholds are useful or what impact they have. This project aims to explore these thresholds, identify when they are hit, and assess how we should select their values and whether different profiles are needed.
expected_result: "Statistical evidence on the impact of various thresholds inside LLVM's codebase, including compile time changes, impact on transformations, and performance measurements."
skills: "Profiling skills and knowledge of statistical reasoning."
project_size: "Medium"
difficulty: "Easy"
mentors:
- name: "Jan Hueckelheim"
url: "mailto:[email protected]"
- name: "Johannes Doerfert"
url: "mailto:[email protected]"
- name: "William Moses"
url: "mailto:[email protected]"
discourse_url: "https://discourse.llvm.org/t/gsoc-2024-the-1001-thresholds-in-llvm/77235"

- title: "Performance tuning the GPU libc"
description: |
Work has begun on a libc library targeting GPUs, allowing users to call functions like malloc or memcpy while executing on the GPU. The goal is to benchmark the implementations of certain libc functions on the GPU and write more optimal implementations.
expected_result: "In-depth performance analysis of libc functions. Overhead of GPU-to-CPU remote procedure calls. More optimal implementations of 'libc' functions."
skills: "Profiling skills and understanding of GPU architecture."
project_size: "Small"
difficulty: "Easy"
mentors:
- name: "Joseph Huber"
url: "mailto:[email protected]"
- name: "Johannes Doerfert"
url: "mailto:[email protected]"
discourse_url: "https://discourse.llvm.org/t/libc-gsoc-2024-performance-and-testing-in-the-gpu-libc/77042"

- title: "Improve GPU First Framework"
description: |
GPU First is a methodology and framework that enables existing host code to execute on a GPU without user modifications. The project aims to port host code to handle RPC and explore support for MPI among multiple thread blocks on a single GPU or multiple GPUs.
expected_result: "A more efficient GPU First framework that supports both NVIDIA and AMD GPUs. Optionally, upstream the framework."
skills: "Good understanding of C++ and GPU architecture, familiarity with GPUs and LLVM IR."
project_size: "Medium"
difficulty: "Medium"
mentors:
- name: "Shilei Tian"
url: "mailto:[email protected]"
- name: "Johannes Doerfert"
url: "mailto:[email protected]"
- name: "Joseph Huber"
url: "mailto:[email protected]"
discourse_url: "https://discourse.llvm.org/t/openmp-gsoc-2024-improve-gpu-first-framework/77048"

- title: "Compile GPU kernels using ClangIR"
description: |
The ClangIR project aims to establish a new intermediate representation (IR) for Clang built on top of MLIR. This project focuses on identifying and implementing missing features in ClangIR to compile GPU kernels in OpenCL C language to LLVM-IR for the SPIR-V target.
expected_result: "Polybench-GPU's 2DCONV, GEMM, and CORR OpenCL kernels can be compiled with ClangIR to LLVM-IR for SPIR-V."
skills: "Intermediate C++ programming skills and familiarity with basic compiler design concepts are required. Prior experience with LLVM IR, MLIR, Clang, or GPU programming is a plus."
project_size: "Large"
difficulty: "Medium"
mentors:
- name: "Julian Oppermann"
url: "https://github.com/jopperm"
- name: "Victor Lomüller"
url: "https://github.com/Naghasan"
- name: "Bruno Cardoso Lopes"
url: "https://github.com/bcardosolopes"
discourse_url: "https://discourse.llvm.org/t/clangir-compile-gpu-kernels-using-clangir/76984"

- title: "Half precision in LLVM libc"
description: |
Half precision is an IEEE 754 floating-point format standardized as _Float16 in C23. The goal of this project is to implement C23 half precision math functions in the LLVM libc library.
expected_result: |
Set up generated headers for various compilers and architectures.
Implement basic math operations for half precision data types.
Implement optimizations using compiler builtins or hardware instructions.
Investigate higher math functions for half precision if time permits.
skills: "Intermediate C++ programming skills and familiarity with basic compiler design concepts are required. Prior experience with LLVM IR, MLIR, Clang, or GPU programming is a plus."
project_size: "Large"
difficulty: "Easy/Medium"
mentors:
- name: "Tue Ly"
url: "mailto:[email protected]"
- name: "Joseph Huber"
url: "mailto:[email protected]"
discourse_url: "https://discourse.llvm.org/t/libc-gsoc-2024-half-precision-in-llvm-libc/77027"
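
As a small illustration of the half precision format itself (not of LLVM libc code), Python's `struct` module can convert between a float and its 16-bit IEEE 754 binary16 encoding via the `e` format character. The helper names below are made up for this example.

```python
import struct


def to_half_bits(value):
    """Encode a float as IEEE 754 binary16 and return the 16-bit pattern."""
    (bits,) = struct.unpack("<H", struct.pack("<e", value))
    return bits


def from_half_bits(bits):
    """Decode a 16-bit binary16 pattern back into a Python float."""
    (value,) = struct.unpack("<e", struct.pack("<H", bits))
    return value


if __name__ == "__main__":
    print(hex(to_half_bits(1.0)))   # 1 sign bit, 5 exponent bits, 10 fraction bits
    print(from_half_bits(0x7BFF))   # largest finite half precision value
```

With only 10 fraction bits, many decimal values are not representable exactly, which is one reason dedicated half precision math routines need their own accuracy analysis.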