Skip to content
This repository has been archived by the owner on Dec 14, 2023. It is now read-only.

add more robust validation and fixing of portfolio data input #379

Open
Clare2D opened this issue Jan 13, 2021 · 3 comments
Open

add more robust validation and fixing of portfolio data input #379

Clare2D opened this issue Jan 13, 2021 · 3 comments
Assignees
Labels
bug Something isn't working high-priority

Comments

@Clare2D
Copy link
Contributor

Clare2D commented Jan 13, 2021

One report wasn't generating because of the following errors:

-- web_tool_script_1.R ---------------------------------------------------------
There were 14 warnings (use warnings() to see them)
Error: Problem with mutate() input value_usd.
x non-numeric argument to binary operator
i Input value_usd is if_else(...).
Backtrace:
x

  1. +-global::process_raw_portfolio(...)
  2. | -global::calculate_value_usd_with_fin_data(portfolio)
  3. | -%>%(...)
  4. | +-base::withVisible(eval(quote(_fseq(_lhs)), env, env))
  5. | -base::eval(quote(_fseq(_lhs)), env, env)
  6. | -base::eval(quote(_fseq(_lhs)), env, env)
  7. | -_fseq(_lhs)
  8. | -magrittr::freduce(value, _function_list)
  9. | +-base::withVisible(function_list[k])
  10. | -function_list[k]
  11. | +-dplyr::mutate(...)
  12. | -dplyr:::mutate.data.frame(...)
  13. | -dplyr:::mutate_cols(.data, ...)
  14. | +-base::withCallingHandlers(...)
  15. | -mask$eval_all_mutate(dots[[i]])
  16. +-dplyr::if_else(...)
  17. -base::.handleSimpleError(...)
  18. -dplyr:::h(simpleError(msg, call))
    Execution halted

This seems to me like an issue in the number formatting of the portfolio file (Swiss). Something to test with the new input reader/audit functionality, hopefully meaning this error doesn't appear again. Just wanted to flag. I can provide the portfolio input file that created this.

FYI @cjyetman

@Clare2D Clare2D added the bug Something isn't working label Jan 13, 2021
@cjyetman
Copy link
Member

pretty sure this happens when a portfolio has weird number formatting in either number_of_shares or unit_share_price

it’s triggered here
https://github.com/2DegreesInvesting/PACTA_analysis/blob/e7f8ca4e182282425f2d4bc2a8617682c4ecc1c3/0_portfolio_input_check_functions.R#L542-L548

I'm wondering if maybe we should add in a forced coercion of the columns to numeric when the portfolio is read in here? It won't "fix" the problem (because much of the portfolio's data will be coerced to NA), but at least it won't trigger a difficult to debug error later on. I'm not 100% sure if that's a good idea.
https://github.com/2DegreesInvesting/PACTA_analysis/blob/496f7936b5cec7c29ff505cd0e4948e50d15d159/0_web_functions.R#L194-L230

@cjyetman
Copy link
Member

Alternatively, a fancier pre-processing of the portfolio CSVs needs to be done... either earlier on in our code, or ideally on the server when the portfolio is uploaded.

@cjyetman cjyetman self-assigned this Feb 15, 2021
@cjyetman cjyetman changed the title Input Data processing add more robust validation and fixing of portfolio data input Dec 4, 2021
@cjyetman
Copy link
Member

cjyetman commented Dec 4, 2021

changed issue name and this issue will be used to track a general task of improving validation and fixing of portfolio data when it is first read in

some things that should likely be checked and dealt with:

  • file encoding
  • file type (CSV, XLSX, etc.)
  • compliance with file type specification (e.g. a TSV as a CSV, or a CSV with a ; delimiter)
  • number formatting weirdness (e.g. 1 000.32, 1.000,32)
  • ISIN validation

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working high-priority
Projects
None yet
Development

No branches or pull requests

2 participants