Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Finish integrating new parser into pipeline
This PR finishes integrating the new parser logic from [1] into our pipeline. It parses the `processed_portfolios.json` file from the output directory (in this case `/home/portfolio-parser/output`) and uses that to both correlate input + output files as well as upload the output CSV files. Since the R code now includes a row count, we no longer need to parse the files manually.

This all mostly works as expected. A few sharp edges (relying on UUIDs from the R code) are noted in the PR, and there's metadata produced by the new code (both at the input file level and the output file level) that we aren't currently recording anywhere.

Adjacent changes:
- In creating the `parser`, I also duplicated the `taskrunner` package. That has been hoisted to the top level and de-duped
- Assorted refactorings and renamings to make sure the `pactaparser` image gets invoked correctly

[1] https://github.com/RMI-PACTA/workflow.portfolio.parsing
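The manifest-driven flow described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the PR: it assumes `processed_portfolios.json` is a top-level JSON array of per-input entries, and the `sourceFile`, `portfolio`, and `output` types, the `correlate` function, and the sample data are all hypothetical names invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
)

// sourceFile and portfolio are trimmed-down, hypothetical mirrors of the
// manifest entries; only the fields used below are declared.
type sourceFile struct {
	InputFilename string      `json:"input_filename"`
	Portfolios    []portfolio `json:"portfolios"`
}

type portfolio struct {
	OutputFilename string `json:"output_filename"`
	OutputRows     int    `json:"output_rows"`
}

// output is a hypothetical record of one CSV that would be uploaded.
type output struct {
	Path string
	Rows int
}

// sampleManifest is made-up data in the ASSUMED shape of
// processed_portfolios.json (a JSON array of input-file entries).
const sampleManifest = `[
  {
    "input_filename": "upload-1.csv",
    "portfolios": [
      {"output_filename": "aaaa.csv", "output_rows": 3},
      {"output_filename": "bbbb.csv", "output_rows": 5}
    ]
  }
]`

// correlate maps each input filename to the output files derived from it,
// taking the row count from the manifest instead of re-reading the CSVs.
func correlate(manifest []byte, outputDir string) (map[string][]output, error) {
	var files []sourceFile
	if err := json.Unmarshal(manifest, &files); err != nil {
		return nil, fmt.Errorf("failed to decode manifest: %w", err)
	}
	byInput := make(map[string][]output)
	for _, f := range files {
		for _, p := range f.Portfolios {
			byInput[f.InputFilename] = append(byInput[f.InputFilename], output{
				Path: path.Join(outputDir, p.OutputFilename),
				Rows: p.OutputRows,
			})
		}
	}
	return byInput, nil
}

func main() {
	byInput, err := correlate([]byte(sampleManifest), "/home/portfolio-parser/output")
	if err != nil {
		panic(err)
	}
	for in, outs := range byInput {
		for _, o := range outs {
			fmt.Printf("%s -> %s (%d rows)\n", in, o.Path, o.Rows)
		}
	}
}
```

Each `output.Path` is what an upload step would read from, and `output.Rows` comes straight from the manifest, which is why no manual CSV parsing is needed in this sketch.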
Showing 14 changed files with 143 additions and 276 deletions.
@@ -0,0 +1,8 @@
load("@io_bazel_rules_go//go:def.bzl", "go_library")

go_library(
    name = "parsed",
    srcs = ["parsed.go"],
    importpath = "github.com/RMI/pacta/async/parsed",
    visibility = ["//visibility:public"],
)
@@ -0,0 +1,35 @@
// Package parsed just holds the domain types for dealing with the output of the
// ParsePortfolio async task.
package parsed

type SourceFile struct {
	InputFilename      string      `json:"input_filename"`
	InputMD5           string      `json:"input_md5"`
	SystemInfo         SystemInfo  `json:"system_info"`
	InputEntries       int         `json:"input_entries"`
	GroupCols          []string    `json:"group_cols"`
	SubportfoliosCount int         `json:"subportfolios_count"`
	Portfolios         []Portfolio `json:"portfolios"`
	Errors             [][]string  `json:"errors"`
}

type SystemInfo struct {
	Timestamp      string       `json:"timestamp"`
	Package        string       `json:"package"`
	PackageVersion string       `json:"packageVersion"`
	RVersion       string       `json:"RVersion"`
	Dependencies   []Dependency `json:"dependencies"`
}

type Dependency struct {
	Package string `json:"package"`
	Version string `json:"version"`
}

type Portfolio struct {
	OutputMD5      string `json:"output_md5"`
	OutputFilename string `json:"output_filename"`
	OutputRows     int    `json:"output_rows"`
	PortfolioName  string `json:"portfolio_name"`
	InvestorName   string `json:"investor_name"`
}
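Unlike the snake_case keys used elsewhere, the `SystemInfo` tags use `packageVersion` and `RVersion`. A minimal sketch of decoding such an object is shown below. The lowercase `systemInfo` and `dependency` types copy the field tags from the diff, while `decodeSystemInfo` and every value in the sample JSON are made up for illustration and do not come from real parser output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// systemInfo and dependency reuse the JSON tags of SystemInfo and Dependency
// from the diff; the lowercase names are local to this sketch.
type systemInfo struct {
	Timestamp      string       `json:"timestamp"`
	Package        string       `json:"package"`
	PackageVersion string       `json:"packageVersion"`
	RVersion       string       `json:"RVersion"`
	Dependencies   []dependency `json:"dependencies"`
}

type dependency struct {
	Package string `json:"package"`
	Version string `json:"version"`
}

// sampleSystemInfo holds made-up values; only the key names come from the
// struct tags.
const sampleSystemInfo = `{
  "timestamp": "2024-01-01T00:00:00Z",
  "package": "example.package",
  "packageVersion": "0.1.0",
  "RVersion": "4.3.1",
  "dependencies": [{"package": "example.dep", "version": "1.2.3"}]
}`

// decodeSystemInfo is a hypothetical helper that unmarshals one system_info
// object and wraps any decoding error with context.
func decodeSystemInfo(data []byte) (systemInfo, error) {
	var si systemInfo
	if err := json.Unmarshal(data, &si); err != nil {
		return systemInfo{}, fmt.Errorf("failed to decode system_info: %w", err)
	}
	return si, nil
}

func main() {
	si, err := decodeSystemInfo([]byte(sampleSystemInfo))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s %s (R %s), %d dependencies\n",
		si.Package, si.PackageVersion, si.RVersion, len(si.Dependencies))
}
```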
This file was deleted.