
Update Corpus Metadata contents

Daniele Guido edited this page Dec 12, 2024 · 1 revision

The Impresso Datalab website retrieves its dataset collection using the loader feature of the AstroJS config (v5). Because the dataset file is stored in a private repository, you need to configure a GitHub fine-grained personal access token (PAT) and set it as a secret in this repository. To allow the GitHub Actions workflow to build the Docker image for the website, you need to:

  1. Create a fine-grained personal access token (PAT) with the following permissions:

     Repository: Read access to the private repository containing the datasets **and** access to repository contents and commits.
    
  2. Add the PAT as a Secret in your repository:

     Go to Settings > Secrets and variables > Actions.
     Click New repository secret.
     Name the secret: PAT_GITHUB_TOKEN.
     Paste the token as the value.
    

Example configuration in our GitHub Actions YAML file

Make sure that the Dockerfile receives the right value for its GITHUB_TOKEN build argument:

- name: Build and push
  uses: docker/build-push-action@v6
  with:
    push: true
    tags: impresso/impresso-datalab:latest
    build-args: |
      PUBLIC_VERSION=latest
      PUBLIC_GIT_COMMIT_SHA=${{ env.PUBLIC_GIT_COMMIT_SHA }}
      PUBLIC_BUILD_DATE=${{ env.PUBLIC_BUILD_DATE }}
      PUBLIC_GIT_BRANCH=${{ env.PUBLIC_GIT_BRANCH }}
      PUBLIC_GIT_REMOTE=${{ env.PUBLIC_GIT_REMOTE }}
      PUBLIC_GIT_TAG=${{ env.PUBLIC_GIT_TAG }}
      PUBLIC_IMPRESSO_DATALAB_SITE=${{ secrets.IMPRESSO_DATALAB_SITE }}
      PUBLIC_IMPRESSO_DATALAB_BASE=${{ secrets.PUBLIC_IMPRESSO_DATALAB_BASE }}
      PUBLIC_IMPRESSO_API_PATH=***
      PUBLIC_IMPRESSO_WS_API_PATH=***
      GITHUB_TOKEN=${{ secrets.PAT_GITHUB_TOKEN }}
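
A missing or expired token usually only surfaces as a failure deep inside the Docker build. As a rough sketch, a step placed before "Build and push" could check that the secret can actually read the private repository. The step name, the `DATASETS_TOKEN` variable and the `OWNER/REPO` placeholder are assumptions for illustration and are not part of the actual workflow; replace `OWNER/REPO` with the private repository that holds the datasets.

```yaml
# Hypothetical pre-flight step (not part of the actual workflow).
- name: Check access to the datasets repository
  env:
    # Same secret that is passed to the Docker build as GITHUB_TOKEN.
    DATASETS_TOKEN: ${{ secrets.PAT_GITHUB_TOKEN }}
  run: |
    # OWNER/REPO is a placeholder for the private datasets repository.
    # --fail makes curl exit non-zero on 401/403/404, which fails the job early.
    curl --fail --silent --show-error \
      -H "Authorization: Bearer ${DATASETS_TOKEN}" \
      -H "Accept: application/vnd.github+json" \
      "https://api.github.com/repos/OWNER/REPO/contents/" > /dev/null
```

If this step fails, the PAT is probably missing, expired, or lacks read access to the repository contents.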