Better handle missing values #136

Open
1 of 2 tasks
teixeirak opened this issue Apr 3, 2020 · 6 comments

@teixeirak
Contributor

teixeirak commented Apr 3, 2020

Currently, the calculator treats missing values essentially as zeros and still includes ecosystems with missing values in plots, with the following message below the plot: "Black dots indicate net values, and are displayed when all components are quantified. Missing values (particularly common for biophysical components) indicate that climate regulating values cannot be calculated because of insufficient data."

There are 3 cases where values may be missing:

1- Missing biophysics data is common because model projection maps from which we draw don’t cover all possible ecosystems for a given location.

2- The biogeochemical data are incomplete. For example, biome S2 (tropical shrubland) has NaN values in biome_defaults.csv, yet the calculator still computes and plots a value (figure below), treating the missing values as zero. Note that the plotting here is incorrect: there is a black dot, which is supposed to indicate complete data.

image

3- The selected ecosystem is not described, for example mountain grasslands (issue #135).

The solution includes the following:

  • More clearly distinguish missing values from zeros (e.g., remove the white space above the message "Black dots..." so that users will see it; maybe print "no data" on the plot for these ecosystems). (Applies to case 1 and maybe case 2* above.)

  • When the ecosystem is not described (case 3) or probably when there is missing biogeochemical data (case 2*), don't allow selection of the biome in the first place, and somehow indicate to the user that it can't be selected because of missing data.

*For case 2, it would also be possible to plot the results that are available, but this would require a very clear indication that data are missing. We could go with this option if it's much easier to implement.
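To illustrate the distinction between a missing value and a zero, here is a minimal sketch in Python (not the calculator's actual code). The function name `net_value` and the component names in the usage note are hypothetical. The idea is that the net value, and so the black dot, is only produced when every component is quantified, while a true zero still counts as data.

```python
import math
from typing import Mapping, Optional


def net_value(components: Mapping[str, Optional[float]]) -> Optional[float]:
    """Sum the components, or return None if any component is missing.

    A component counts as missing when it is None or NaN. A value of 0
    is treated as real data and is included in the sum.
    """
    total = 0.0
    for value in components.values():
        if value is None or math.isnan(value):
            # Do not silently treat the gap as zero: no net value at all.
            return None
        total += value
    return total
```

With this approach, `net_value({"biogeochemical": 2.5, "biophysical": None})` returns `None`, so the plot can show "no data" instead of a black dot.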

@caseyhofford
Contributor

@teixeirak The PRs I have merged should address cases 1 & 2.

In case 3, would these currently be displayed as empty rows on the plot? Is the criterion for not allowing the user to select a biome that the resulting plot would be empty?

@teixeirak
Contributor Author

> @teixeirak The PRs I have merged should address cases 1 & 2.
>
> In case 3, would these currently be displayed as empty rows on the plot? Is the criterion for not allowing the user to select a biome that the resulting plot would be empty?

If the ecosystem is not described (case 3), or if there are missing biogeochemical data (case 2), my ideal would be to not allow the user to select the biome. Could we add a criterion that the check box to select the ecosystem only shows up if all the biogeochemical data are there? This would mean that ecosystems in case 2 and case 3 never show up. I'd also like to include barren lands in this category: the calculator should indicate that they may be present, but not allow them to be selected.

@caseyhofford
Contributor

Sounds good, it is definitely possible to exclude biomes based on missing data! Here is a list of fields that exist on the biomes returned from [R/biome.R](https://github.com/ebimodeling/ghgvcR/blob/master/R/biome.R). For the field names listed below, do you know which ones should lead to a biome being excluded when they are NaN or zero?

Ec_CH4
Ec_CO2
Ec_N2O
Ed_CH4_ag_wood_litter
Ed_CH4_litter
Ed_CH4_peat
Ed_CH4_root
Ed_CO2_ag_wood_litter
Ed_CO2_litter
Ed_CO2_peat
Ed_CO2_root
Ed_N2O_ag_wood_litter
Ed_N2O_litter
Ed_N2O_peat
Ed_N2O_root
FR_CH4
FR_CO2
FR_N2O
F_CH4
F_CO2
F_N2O
F_anth
OM_SOM
OM_ag
OM_litter
OM_peat
OM_root
OM_wood
T_A
T_E
age_transition
biophysical_net
code
dfc_ag_wood_litter
dfc_peat
dfc_root
dk_ag_wood_litter
dk_peat
dk_root
f_vSOC
fc_SOM
fc_ag_wood_litter
fc_peat
fc_root
in_synmap
k_SOM
k_ag_wood_litter
k_peat
k_root
latent
new_F_CH4
new_F_CO2
new_F_N2O
r
rd
sensible
sw_radiative_forcing
tR
termite

@teixeirak
Contributor Author

Sorry for the delayed response.

Required for any calculation

(note that other parameters are, of course, technically required, but these are the ones that vary most among biomes and are essential for complete calculations)

OM_ag
OM_root
OM_wood
OM_litter
OM_SOM (or f_vSOC, used to calculate OM_SOM from maps)
F_CO2
F_CH4
F_N2O

Required for the biophysical calculations only

(I don't think we need to change anything in these cases)

sw_radiative_forcing
latent
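As a rough sketch of how the first list could drive biome selection (an illustration in Python, not the code in ghgvcR or in the PRs mentioned in this thread): the names `REQUIRED_FIELDS`, `SOM_FIELDS`, `has_value`, and `is_selectable` are hypothetical, and a biome is assumed to be a plain dictionary keyed by the field names above. A biome is selectable only if every required field has a value and at least one of OM_SOM or f_vSOC is present.

```python
import math

# Fields required for any calculation, taken from the list above.
REQUIRED_FIELDS = ("OM_ag", "OM_root", "OM_wood", "OM_litter",
                   "F_CO2", "F_CH4", "F_N2O")
# OM_SOM can be calculated from maps via f_vSOC, so either one suffices.
SOM_FIELDS = ("OM_SOM", "f_vSOC")


def has_value(biome: dict, field: str) -> bool:
    """True if the field is present and is neither None nor NaN."""
    value = biome.get(field)
    if value is None:
        return False
    return not (isinstance(value, float) and math.isnan(value))


def is_selectable(biome: dict) -> bool:
    """True if the biome has all the data needed for a complete calculation."""
    required_ok = all(has_value(biome, f) for f in REQUIRED_FIELDS)
    som_ok = any(has_value(biome, f) for f in SOM_FIELDS)
    return required_ok and som_ok
```

The biophysical fields (sw_radiative_forcing, latent) are deliberately left out of the check, since no change is requested for those cases.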

@caseyhofford
Contributor

caseyhofford commented Dec 27, 2020

No worries Krista, I am taking just as long to follow up here...

Two more questions:

  1. You mentioned that barren sites should not be selectable, but should be indicated:

> @teixeirak: This would mean that ecosystems in case 2 and case 3 never show up. I'd also like to include barren lands in this category: the calculator should indicate that they may be present, but not allow them to be selected.

Should this be done with an additional note (like the black dot note)? Or should something only appear when a barren biome is possible for that selection? Conditionally displaying the message will require more time than simply excluding all barren biomes and adding a general note, but it is certainly possible.

I have addressed this in PR #151; let me know if this is not how it should look.

  2. I want to confirm that 0 values for the fields listed above are valid. I am excluding nulls and "NaN" values.

@teixeirak
Contributor Author

> I want to confirm that 0 values for the fields listed above are valid. I am excluding nulls and "NaN" values.

Correct.

Thanks for addressing this, and apologies for the very slow reply. (This came in while I was on vacation and got buried.)
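For reference, the rule confirmed above (0 is valid; nulls and "NaN" are excluded) could be expressed roughly as follows. This is a sketch in Python with a hypothetical helper name, `is_missing`. It assumes raw values may arrive as numbers, as `None`, or as the literal string "NaN"; it does not describe what PR #151 actually does.

```python
import math


def is_missing(value) -> bool:
    """True for null and NaN values; False for everything else, including 0."""
    if value is None:
        return True
    if isinstance(value, str):
        # Assumption: a serialized NaN shows up as the string "NaN".
        return value.strip().lower() == "nan"
    if isinstance(value, float):
        return math.isnan(value)
    return False
```

Checking for missing values explicitly, rather than relying on truthiness, is what keeps a legitimate 0 from being discarded along with the gaps.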
