Refactor CI using ecmwf-actions #66

Merged: 4 commits merged on Nov 29, 2024
107 changes: 107 additions & 0 deletions .github/tools/install-nvhpc.sh
@@ -0,0 +1,107 @@
#!/bin/sh

# Install NVHPC
# https://github.com/nemequ/pgi-travis
#
# Originally written for Squash <https://github.com/quixdb/squash> by
# Evan Nemerson. For documentation, bug reports, support requests,
# etc. please use <https://github.com/nemequ/pgi-travis>.
#
# To the extent possible under law, the author(s) of this script have
# waived all copyright and related or neighboring rights to this work.
# See <https://creativecommons.org/publicdomain/zero/1.0/> for
# details.

version=21.9

TEMPORARY_FILES="${TMPDIR:-/tmp}"
export NVHPC_INSTALL_DIR=$(pwd)/nvhpc-install
export NVHPC_SILENT=true
while [ $# != 0 ]; do
    case "$1" in
        "--prefix")
            export NVHPC_INSTALL_DIR="$2"; shift
            ;;
        "--tmpdir")
            TEMPORARY_FILES="$2"; shift
            ;;
        "--verbose")
            export NVHPC_SILENT=false;
            ;;
        "--version")
            version="$2"; shift
            ;;
        *)
            echo "Unrecognized argument '$1'"
            exit 1
            ;;
    esac
    shift
done

case "$(uname -m)" in
    x86_64|ppc64le|aarch64)
        ;;
    *)
        echo "Unknown architecture: $(uname -m)" >&2
        exit 1
        ;;
esac

if [ -d "${NVHPC_INSTALL_DIR}" ]; then
    # POSIX test instead of [[ ]], because the script runs under /bin/sh
    if [ "$(find "${NVHPC_INSTALL_DIR}" -name "nvc" | wc -l)" -eq 1 ]; then
        echo "NVHPC already installed at ${NVHPC_INSTALL_DIR}"
        exit
    fi
fi

# Example download URL for version 21.9
# https://developer.download.nvidia.com/hpc-sdk/21.9/nvhpc_2020_219_Linux_x86_64_cuda_11.0.tar.gz

ver="$(echo $version | tr -d . )"
URL=$(curl -s "https://developer.nvidia.com/nvidia-hpc-sdk-$ver-downloads" | grep -oP "https://developer.download.nvidia.com/hpc-sdk/([0-9]{2}\.[0-9]+)/nvhpc_([0-9]{4})_([0-9]+)_Linux_$(uname -m)_cuda_([0-9\.]+).tar.gz" | sort | tail -1)
FOLDER="$(basename "$(echo "${URL}" | grep -oP '[^/]+$')" .tar.gz)"

if [ ! -d "${TEMPORARY_FILES}/${FOLDER}" ]; then
    echo "Downloading ${TEMPORARY_FILES}/${FOLDER} from URL [${URL}]"
    mkdir -p "${TEMPORARY_FILES}"
    curl --location \
        --user-agent "pgi-travis (https://github.com/nemequ/pgi-travis)" \
        "${URL}" | tar zx -C "${TEMPORARY_FILES}"
else
    echo "Download already present in ${TEMPORARY_FILES}/${FOLDER}"
fi

echo "+ ${TEMPORARY_FILES}/${FOLDER}/install"
"${TEMPORARY_FILES}/${FOLDER}/install"

# Uncomment to clean up the extracted installer
#rm -rf "${TEMPORARY_FILES}/${FOLDER}"

NVHPC_VERSION=$(basename "${NVHPC_INSTALL_DIR}"/Linux_$(uname -m)/*.*/)

# Use gcc which is available in PATH
${NVHPC_INSTALL_DIR}/Linux_$(uname -m)/${NVHPC_VERSION}/compilers/bin/makelocalrc \
    -x ${NVHPC_INSTALL_DIR}/Linux_$(uname -m)/${NVHPC_VERSION}/compilers/bin \
    -gcc $(which gcc) \
    -gpp $(which g++) \
    -g77 $(which gfortran)

cat > ${NVHPC_INSTALL_DIR}/env.sh << EOF
### Variables
export NVHPC_INSTALL_DIR=${NVHPC_INSTALL_DIR}
export NVHPC_VERSION=${NVHPC_VERSION}
export NVHPC_DIR=\${NVHPC_INSTALL_DIR}/Linux_$(uname -m)/\${NVHPC_VERSION}

### Compilers
export PATH=\${NVHPC_DIR}/compilers/bin:\${PATH}
export NVHPC_LIBRARY_PATH=\${NVHPC_DIR}/compilers/lib
export LD_LIBRARY_PATH=\${NVHPC_LIBRARY_PATH}

### MPI
export MPI_HOME=\${NVHPC_DIR}/comm_libs/mpi
export PATH=\${MPI_HOME}/bin:\${PATH}
EOF

cat ${NVHPC_INSTALL_DIR}/env.sh
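
The `env.sh` heredoc in the script mixes two kinds of variable references: an unescaped `${...}` is expanded while the installer runs, whereas `\${...}` is written literally and only resolved when `env.sh` is sourced later. The following is a minimal sketch of that behaviour and is not part of the PR; `demo_dir`, `prefix`, `DEMO_PREFIX`, and `DEMO_BIN` are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Illustration only: every name below is hypothetical, not taken from the PR.
demo_dir="$(mktemp -d)"
prefix="/opt/example"

# Unquoted EOF delimiter: ${prefix} is expanded now, \${DEMO_PREFIX} stays literal.
cat > "${demo_dir}/env.sh" << EOF
export DEMO_PREFIX=${prefix}
export DEMO_BIN=\${DEMO_PREFIX}/bin
EOF

# Show the generated file, then source it to resolve the deferred reference.
cat "${demo_dir}/env.sh"
. "${demo_dir}/env.sh"
echo "DEMO_BIN resolves to ${DEMO_BIN}"
```

Deferring the expansion keeps the generated file relocatable in the sense that only the first line carries an absolute path; the remaining variables are derived from it at source time.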

214 changes: 214 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,214 @@
name: build

# Controls when the action will run
on:

  # Trigger the workflow on all pushes, except on tag creation
  push:
    branches:
    - '**'
Collaborator

Can we restrict the runs on push to the main branch only, please? Pull requests are already enabled below, as well as manual workflow dispatches for early testing.

Collaborator

I disagree with that. Having the tests run by themselves for any commit when experimenting on a branch is very useful and should not be deactivated. It's better to be aware of a problem as soon as possible than at the final step when opening a pull request.
And having to run the tests by hand seems to defeat the purpose of continuous integration.

Collaborator

I guess this is a matter of taste and resource management. I often push non-final things to branches, where I know full well that they won't pass testing and so I would not want to waste resources on default testing for that. For near-completion integration testing, I find Draft Pull Requests useful, which do test on every individual push. But hey, if your workflow varies and you'd like this, we can enable this for the public GH-hosted runners.

However, on the other PR for the ECMWF HPC testing, I would like to keep this restricted, as this is a shared and somewhat limited resource that is used across multiple repositories and by multiple developers. Spamming the GPU-debug-queue with non-final branch pushes can impact others quite easily here.

Collaborator

Yes, I would like to keep it for the public GH-hosted runners, please.
And I understand you don't want to overload your GPU queue.

    tags-ignore:
    - '**'

  # Trigger the workflow on all pull requests
  pull_request: ~

  # Allow workflow to be dispatched on demand
  workflow_dispatch: ~

env:
  FIELD_API_TOOLS: ${{ github.workspace }}/.github/tools
  CTEST_PARALLEL_LEVEL: 1
  CACHE_SUFFIX: v0 # Increase to force new cache to be created
  DEV_ALLOC_SIZE: 1000000

jobs:
  ci:
    name: ci

    strategy:
      fail-fast: false # false: try to complete all jobs

      matrix:
        build_type: [RelWithDebInfo]
        name:
        - linux gnu-10
        - linux gnu-14
        - linux nvhpc-23.5
        - linux intel-classic
        - linux intel-modern
        - macos

        include:

        - name: linux gnu-10
          os: ubuntu-22.04
          compiler: gnu-10
          compiler_cc: gcc-10
          compiler_cxx: g++-10
          compiler_fc: gfortran-10
          python-version: '3.8'
          caching: true

        - name: linux gnu-14
          os: ubuntu-24.04
          compiler: gnu-14
          compiler_cc: gcc-14
          compiler_cxx: g++-14
          compiler_fc: gfortran-14
          python-version: '3.11'
          caching: true

        - name: linux nvhpc-23.5
          os: ubuntu-22.04
          compiler: nvhpc-23.5
          compiler_cc: nvc
          compiler_cxx: nvc++
          compiler_fc: nvfortran
          python-version: '3.8'
          caching: true

        - name: linux intel-classic
          os: ubuntu-22.04
          compiler: intel-classic
          compiler_cc: icc
          compiler_cxx: icpc
          compiler_fc: ifort
          python-version: '3.8'
          caching: true

        - name: linux intel-modern
          os: ubuntu-24.04
          compiler: intel-modern
          compiler_cc: icx
          compiler_cxx: icpx
          compiler_fc: ifx
          python-version: '3.8'
          caching: true

        - name: macos
          # Xcode compiler requires empty environment variables, so we pass null (~) here
          os: macos-13
          compiler: clang-14
          compiler_cc: ~
          compiler_cxx: ~
          compiler_fc: gfortran-13
          python-version: '3.11'
          caching: true

    runs-on: ${{ matrix.os }}
    steps:
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}

    - name: Checkout Repository
      uses: actions/checkout@v2

    - name: Environment
      run: |
        echo "DEPS_DIR=${{ runner.temp }}/deps" >> $GITHUB_ENV
        echo "CC=${{ matrix.compiler_cc }}" >> $GITHUB_ENV
        echo "CXX=${{ matrix.compiler_cxx }}" >> $GITHUB_ENV
        echo "FC=${{ matrix.compiler_fc }}" >> $GITHUB_ENV

        if [[ "${{ matrix.os }}" =~ macos ]]; then
          export HOMEBREW_NO_INSTALLED_DEPENDENTS_CHECK=1
          export HOMEBREW_NO_AUTO_UPDATE=1
          export HOMEBREW_NO_INSTALL_CLEANUP=1
          export SDKROOT=$(xcrun --show-sdk-path)
          echo "HOMEBREW_NO_INSTALLED_DEPENDENTS_CHECK=1" >> $GITHUB_ENV
          echo "HOMEBREW_NO_AUTO_UPDATE=1" >> $GITHUB_ENV
          echo "HOMEBREW_NO_INSTALL_CLEANUP=1" >> $GITHUB_ENV
          echo "SDKROOT=$(xcrun --show-sdk-path)" >> $GITHUB_ENV
          brew install libomp
          brew install coreutils
        else
          sudo apt-get update
        fi

        printenv

    - name: Cache Dependencies
      # There seems to be a problem with cached NVHPC dependencies, leading to SIGILL perhaps due to slightly different architectures
      if: matrix.caching
      id: deps-cache
      uses: pat-s/[email protected]
      with:
        path: ${{ env.DEPS_DIR }}
        key: deps-${{ matrix.os }}-${{ matrix.compiler }}-${{ matrix.build_type }}-${{ env.CACHE_SUFFIX }}

    # Free up disk space for nvhpc
    - name: Free Disk Space (Ubuntu)
      uses: jlumbroso/free-disk-space@main
      if: contains( matrix.compiler, 'nvhpc' )
      continue-on-error: true
      with:
        # this might remove tools that are actually needed,
        # if set to "true" but frees about 6 GB
        tool-cache: false

        # all of these default to true, but feel free to set to
        # "false" if necessary for your workflow
        android: true
        dotnet: true
        haskell: true
        large-packages: true
        docker-images: true
        swap-storage: true

    - name: Install NVHPC compiler
      if: contains( matrix.compiler, 'nvhpc' )
      shell: bash -eux {0}
      run: |
        ${FIELD_API_TOOLS}/install-nvhpc.sh --prefix /opt/nvhpc --version 23.5
        source /opt/nvhpc/env.sh
        echo "${NVHPC_DIR}/compilers/bin" >> $GITHUB_PATH

    - name: Download Intel compiler
      if: contains( matrix.compiler, 'intel' )
      run: |
        wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
        sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
        rm GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
        sudo add-apt-repository "deb https://apt.repos.intel.com/oneapi all main"

    - name: Install Intel classic compiler
      if: contains( matrix.compiler, 'intel-classic' )
      run: |
        sudo apt update
        sudo apt install \
          intel-oneapi-compiler-fortran-2023.2.0 \
          intel-oneapi-compiler-dpcpp-cpp-and-cpp-classic-2023.2.0 \
          intel-oneapi-mpi-devel-2021.10.0 \
          intel-oneapi-mkl-2023.2.0
        source /opt/intel/oneapi/setvars.sh
        printenv >> $GITHUB_ENV
        echo "CACHE_SUFFIX=$CC-$($CC -dumpversion)" >> $GITHUB_ENV

    - name: Install Intel modern compiler
      if: contains( matrix.compiler, 'intel-modern' )
      run: |
        sudo apt update
        sudo apt install intel-hpckit
        source /opt/intel/oneapi/setvars.sh
        printenv >> $GITHUB_ENV
        echo "CACHE_SUFFIX=$CC-$($CC -dumpversion)" >> $GITHUB_ENV

    - name: Build & Test
      id: build-test
      uses: ecmwf-actions/build-package@v2
      with:
        self_coverage: false
        force_build: true
        cache_suffix: "${{ matrix.build_type }}-${{ env.CACHE_SUFFIX }}"
        recreate_cache: ${{ matrix.caching == false }}
        dependencies: |
          ecmwf/ecbuild
          ecmwf-ifs/fiat
        dependency_branch: develop
        dependency_cmake_options: |
          ecmwf-ifs/fiat: "-DCMAKE_BUILD_TYPE=${{ matrix.build_type }} -DENABLE_TESTS=OFF"
        cmake_options: "-DCMAKE_BUILD_TYPE=${{ matrix.build_type }} ${{ matrix.cmake_options }} -DENABLE_ACC=OFF"
        ctest_options: "${{ matrix.ctest_options }}"
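
The Environment step and the Intel install steps in this workflow pass values to later steps by appending `NAME=value` lines to the file named by `$GITHUB_ENV`. Below is a rough local simulation of that mechanism, assuming a temporary file stands in for the one the runner provides; `read_env_value` is a hypothetical helper for illustration, not a GitHub Actions command, and the values are examples.

```shell
#!/bin/sh
# Local stand-in for the file that the GitHub Actions runner exposes as $GITHUB_ENV.
GITHUB_ENV="$(mktemp)"

# What a workflow step does: append NAME=value lines (example values).
echo "CC=gcc-10" >> "$GITHUB_ENV"
echo "FC=gfortran-10" >> "$GITHUB_ENV"

# Hypothetical helper approximating what a later step would see.
read_env_value() {
    # Print the value of the last NAME=value line for the given name.
    sed -n "s/^$1=//p" "$GITHUB_ENV" | tail -n 1
}

echo "CC seen by later steps: $(read_env_value CC)"
```

Since each line is read as a single `NAME=value` pair, dumping the whole environment with `printenv >> $GITHUB_ENV` appears to rely on no variable containing a newline.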
70 changes: 0 additions & 70 deletions .github/workflows/fieldapi_gnu.yml

This file was deleted.
