Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get the create-report async task in shape #138

Merged
merged 6 commits into from
Jan 16, 2024
Merged

Conversation

bcspragu
Copy link
Collaborator

@bcspragu bcspragu commented Jan 12, 2024

This PR makes assorted tweaks to the runner's create-report task, like:

  • Adding new pacta.FileTypes for assorted HTML-adjacent types present in the report (see below)
    • Includes an UNKNOWN type in case I miss anything, we can always go back and re-label them based on file extension
  • Fix some issues with paths, use the portfolio UUID as the directory prefix (to avoid collisions)
  • Publish the event when we're done so the main app can record the info

Mostly unrelated changes:

Some resources that get created from this:

select * from blob WHERE blob_uri LIKE 'az://reports/%';

            id             |                                                   blob_uri                                                   | file_type |        file_name         |          created_at
---------------------------+--------------------------------------------------------------------------------------------------------------+-----------+--------------------------+-------------------------------
 blob.9c6a2be6135288dc141a | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/index.html                                                 | html      | index.html               | 2024-01-12 22:24:07.695008+00
 blob.47508626ad7c784151d6 | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/accessible-code-block-0.0.1/empty-anchor.js           | js        | empty-anchor.js          | 2024-01-12 22:24:07.695008+00
 blob.e952045a95b60b0744fd | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/anchor-sections-1.1.0/anchor-sections-hash.css        | css       | anchor-sections-hash.css | 2024-01-12 22:24:07.695008+00
 blob.59b3c81bad6b701bbf83 | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/anchor-sections-1.1.0/anchor-sections.js              | js        | anchor-sections.js       | 2024-01-12 22:24:07.695008+00
 blob.7edd40c453bac1c53754 | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/gitbook-2.6.7/css/fontawesome/fontawesome-webfont.ttf | ttf       | fontawesome-webfont.ttf  | 2024-01-12 22:24:07.695008+00
 blob.7eda957bd02cbf3ab01f | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/gitbook-2.6.7/css/plugin-search.css                   | css       | plugin-search.css        | 2024-01-12 22:24:07.695008+00

Serving these files will be a follow-up PR

This PR makes assorted tweaks to the runner's `create-report` task, like:
- Adding new `pacta.FileType`s for assorted HTML-adjacent types present in the report (see below)
  - Includes an `UNKNOWN` type in case I miss anything, we can always go back and re-label them based on file extension
- Fix some issues with paths, use the portfolio UUID as the directory prefix (to avoid collisions)
- Publish the event when we're done so the main app can record the info

Mostly unrelated changes:
- Fixes #135 by pulling `handleMissingTranslation` from the `$nuxt` object instead of `inject`
- Add a helper script for iterating quickly on the runner, `build_and_load_runner.sh`

Some resources that get created from this:

```
select * from blob WHERE blob_uri LIKE 'az://reports/%';

            id             |                                                   blob_uri                                                   | file_type |        file_name         |          created_at
---------------------------+--------------------------------------------------------------------------------------------------------------+-----------+--------------------------+-------------------------------
 blob.9c6a2be6135288dc141a | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/index.html                                                 | html      | index.html               | 2024-01-12 22:24:07.695008+00
 blob.47508626ad7c784151d6 | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/accessible-code-block-0.0.1/empty-anchor.js           | js        | empty-anchor.js          | 2024-01-12 22:24:07.695008+00
 blob.e952045a95b60b0744fd | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/anchor-sections-1.1.0/anchor-sections-hash.css        | css       | anchor-sections-hash.css | 2024-01-12 22:24:07.695008+00
 blob.59b3c81bad6b701bbf83 | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/anchor-sections-1.1.0/anchor-sections.js              | js        | anchor-sections.js       | 2024-01-12 22:24:07.695008+00
 blob.7edd40c453bac1c53754 | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/gitbook-2.6.7/css/fontawesome/fontawesome-webfont.ttf | ttf       | fontawesome-webfont.ttf  | 2024-01-12 22:24:07.695008+00
 blob.7eda957bd02cbf3ab01f | az://reports/afcc58f1-486c-437c-9fce-0a4774cd4128/libs/gitbook-2.6.7/css/plugin-search.css                   | css       | plugin-search.css        | 2024-01-12 22:24:07.695008+00

```
@bcspragu bcspragu requested a review from gbdubs January 12, 2024 23:52
Copy link
Contributor

@gbdubs gbdubs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woo!

@@ -0,0 +1,20 @@
#!/bin/bash
set -euo pipefail
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I LOVE THIS IT WILL MAKE LIFE EASIER THANK YOU

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HAPPY TO HELP YEAH I WASNT ENJOYING DOING THIS MANUALLY EITHER

FileType_CSS,
FileType_JS,
FileType_TTF,
FileType_UNKNOWN,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I don't think we want unknown in this list, since it's used to test things like convertibility to postgres, and we don't want "" to be persistable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ discussed above

const values = computed(() => {
return missingTranslations.value
})
return {
provide: {
missingTranslations: {
handleMissingTranslation,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this, great, elegant solution.

frontend/i18n.config.ts Outdated Show resolved Hide resolved
frontend/i18n.config.ts Show resolved Hide resolved
'css',
'js',
'ttf',
''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add '' as a valid enum type, we don't want to persist it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in the comment above, I think we do. I'm also open to calling this unknown or whatever you think makes sense here

db/sqldb/blob.go Outdated
if b.FileName == "" {
return fmt.Errorf("blob missing FileName")
}
// A blank FileType is valid, just means we don't recognize the file type yet.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd disagree with that - we require a file type to be set, and I think that's a pretty straightforward invariant to maintain for the course of development. Is there a reason to support this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context here is that the create-report action generates a whole directory of files, including HTML, JSON, CSS, JavaScript, fonts, and text files. I imagine the PACTA folks will want to modify these reports over time, potentially introducing new files and resources that should be served as part of the report.

If one of those new files is of a type we haven't enumerated (videos? various image formats? XML files? etc), we still want to persist the file and make a best-effort attempt to serve it. My solution is to say "well, we didn't recognize the type of the file at the time of report creation", which is UNKNOWN.

Happy to chat more if that doesn't make sense or you have another preferred approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I guess I'm not sold there. In my mind the value of an enum is confidence you'll get back a value you expect. I think if we (at root) don't trust the data we have coming in to be of the type(s) that we expect, and we don't have a need to handle them specifically, we should simply remove the enum constraint generally, and store it as a string-based file extension.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to throw the baby out with the bathwater here. In most cases (currently all), the blob is one of a small list of types. It's just that the alternative when we encounter an unknown extension is to ignore the file entirely, which is definitely not the option we want.

And the file type enum is more generally useful than a string extension. For example, yml and yaml are both YAML files

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added this check back, as UNKNOWN is now 'unknown'

return fmt.Errorf("given blob wasn't a JSON-formatted portfolio, %q", fileNameWithExt)
}
fileNames = append(fileNames, strings.TrimSuffix(fileNameWithExt, ".json"))
destPath := filepath.Join("/", "mnt", "processed_portfolios", fileNameWithExt)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mnt?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://tldp.org/LDP/Linux-Filesystem-Hierarchy/html/mnt.html

Basically, /mnt is a generic place for external things being mounted in. It's a convention Alex used in the stub repo, and it makes as much sense as any other location.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -15,11 +15,12 @@ bazel run //scripts:run_server -- --use_azure_runner

### Creating a new docker image to run locally

When testing locally (e.g. without `--use_azure_runner`), you can build and tag a runner image locally and use that. To do that, run `bazel run //scripts:build_and_load_runner`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leaves ambiguous what happens if you're not using either of these options? Could you make that a bit clearer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC, made this section more explicit

@bcspragu bcspragu merged commit df77dc4 into main Jan 16, 2024
2 checks passed
@bcspragu bcspragu deleted the brandon/create-report branch January 16, 2024 02:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Vue warnings about setup-only functionality called outside setup
2 participants