Note: This figure will become more complete as more models register and provide model descriptions.
+
+
+
+
+
+
+
+
Catalog of forecast submissions and evaluations
+
The catalog of submitted forecasts and the evaluation of the forecasts (“scores”) is available through the SpatioTemporal Asset Catalogs browser (below).
+
The catalog provides the code that you can use to access forecasts and scores.
+
+
+
+
+
+
\ No newline at end of file
diff --git a/catalog_files/figure-html/unnamed-chunk-1-1.png b/catalog_files/figure-html/unnamed-chunk-1-1.png
new file mode 100644
index 0000000000..af37f6bd72
Binary files /dev/null and b/catalog_files/figure-html/unnamed-chunk-1-1.png differ
diff --git a/img/USGS_logo_green.png b/img/USGS_logo_green.png
new file mode 100644
index 0000000000..d7c1d199d4
Binary files /dev/null and b/img/USGS_logo_green.png differ
diff --git a/img/workflow.png b/img/workflow.png
new file mode 100644
index 0000000000..8b1aed01c8
Binary files /dev/null and b/img/workflow.png differ
diff --git a/index.html b/index.html
new file mode 100644
index 0000000000..8c740c02d1
--- /dev/null
+++ b/index.html
@@ -0,0 +1,603 @@
+
+
+
+
+
+
+
+
+
+EFI-USGS River Chlorophyll Forecast Challenge - Forecasting Challenge
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Forecasting Challenge
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
We invite you to submit forecasts!
+
The EFI-USGS River Chlorophyll Forecasting Challenge is an open platform for the ecological and data science communities to forecast data from the U.S. Geological Survey (USGS) before they are collected.
+
The Challenge is hosted by the Ecological Forecasting Initiative Research Coordination Network and sponsored by the U.S. National Science Foundation. This challenge is co-hosted by the USGS Proxies Project, an effort supported by the Water Mission Area Water Quality Processes program to develop estimation methods for PFAS, harmful algal blooms, and metals, at multiple spatial and temporal scales.
+
+
Why a forecasting challenge?
+
Our vision is to use forecasts to advance theory and to support natural resource management. We can begin to realize this vision by creating and analyzing a catalog of forecasts from a range of ecological systems, spatiotemporal scales, and environmental gradients.
+
Our forecasting challenge is a platform for the ecological and data science communities to advance skills in forecasting ecological systems and to generate forecasts that contribute to a synthetic understanding of patterns of predictability in ecology. Rewards for contributing are skill advancement, joy, and potential involvement in manuscripts. We do not currently crown a winner or offer financial awards.
Our platform is designed to empower you to contribute by providing target data, numerical weather forecasts, and tutorials. We automatically score your forecasts using the latest NEON data. All forecasts and scores are publicly available through cloud storage and discoverable through our catalog.
Thomas, R. Q., Boettiger, C., Carey, C. C., Dietze, M. C., Johnson, L. R., Kenney, M. A., et al. (2023). The NEON Ecological Forecasting Challenge. Frontiers in Ecology and the Environment, 21(3), 112–113. https://doi.org/10.1002/fee.2616
We thank NEON for providing the freely available data and the EFI community for feedback on the design of the Challenge. This material is based upon work supported by the National Science Foundation under Grant DEB-1926388.
Page last updated on 2024-02-09
+
+
+
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/instructions.html b/instructions.html
new file mode 100644
index 0000000000..ef623844bb
--- /dev/null
+++ b/instructions.html
@@ -0,0 +1,898 @@
+
+
+
+
+
+
+
+
+
+EFI-USGS River Chlorophyll Forecast Challenge - How to forecast
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
How to forecast
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
1 tl;dr: How to submit a forecast
+
The steps for submitting a forecast are summarized here, with details in the sections below:
+
+
Explore the data (e.g., targets) and build your forecast model.
+
Register and describe your model at https://forms.gle/kg2Vkpho9BoMXSy57. You are not required to register if the model_id of your forecast submission contains the word “example”. Any forecasts with “example” in the model_id will not be used in forecast evaluation analyses. Use neon4cast as the challenge you are registering for.
+
Generate a forecast!
+
Write the forecast output to a file that follows our standardized format (described below).
+
Submit your forecast using an R function (provided below).
+
Watch your forecast be evaluated as new data are collected.
+
+
+
+
2 Generating a forecast
+
+
2.1 All forecasting approaches are welcome
+
We encourage you to use any modeling approach to make a prediction about the future conditions at any of the NEON sites and variables.
+
+
+
2.2 Must include uncertainty
+
Forecasts require you to make an assessment of the confidence in your prediction of the future. You can represent your confidence (i.e., uncertainty in the forecast) using a distribution or numerically using an ensemble (or sample) of predictions.
+
+
+
2.3 Any model drivers/covariates/features are welcome
+
You can use any data as model inputs (including all of the forecast target data available to date). All sensor-based target data are available with a 1- to 7-day delay (latency) from the time of collection. You may want to use the updated target data to re-train a model or for data assimilation.
+
Because this is a genuine forecasting challenge, you will need forecasted drivers if your model uses drivers as inputs. If you are interested in using forecasted meteorology, we are downloading and processing NOAA Global Ensemble Forecast System (GEFS) weather forecasts for each NEON site. The NOAA GEFS forecasts extend 35 days ahead. More information about accessing the weather forecasts can be found here.
+
+
+
2.4 Forecasts can be for a range of horizons
+
Forecasts can be submitted for horizons from 1 day to 1 year ahead, depending on the variable. See the variable tables for the horizon that is associated with each variable.
+
+
+
2.5 Forecasts can be submitted every day
+
Since forecasts can be submitted every day, automation is important. We provide an example GitHub repository that can be used to automate your forecast with GitHub Actions. It also uses a custom Docker container, eco4cast/rocker-neon4cast:latest, that has many of the packages and functions needed to generate and submit forecasts.
+
We only evaluate forecasts for any weekly variables (e.g., beetles and ticks) that were submitted on the Sunday of each week. Therefore we recommend only submitting forecasts of the weekly variables on Sundays.
+
+
+
+
3 You can forecast at any of the NEON sites
+
If you are getting started, we recommend a set of focal sites for each of the five “themes”. You are also welcome to submit forecasts to all or a subset of NEON sites. More information about NEON sites can be found in the site metadata and on NEON’s website.
+
+
+
4 Forecast file format
+
The file is in csv format with the following columns:
+
+
project_id: use neon4cast
+
model_id: the short name of the model defined as the model_id in your registration. The model_id should have no spaces. model_id should reflect a method to forecast one or a set of target variables and must be unique to the neon4cast challenge.
+
datetime: forecast timestamp. Format is %Y-%m-%d %H:%M:%S with UTC as the time zone. Forecasts submitted with a %Y-%m-%d format will be converted to a full datetime assuming UTC midnight.
+
reference_datetime: the start of the forecast; this should be 0 time steps in the future. There should only be one value of reference_datetime in the file. Format is %Y-%m-%d %H:%M:%S with UTC as the time zone. Forecasts submitted with a %Y-%m-%d format will be converted to a full datetime assuming UTC midnight.
+
duration: the time step of the forecast. Use the value of P1D for a daily forecast, P1W for a weekly forecast, and PT30M for a 30-minute forecast. This value should match the duration of the target variable that you are forecasting. Formatted as an ISO 8601 duration.
+
site_id: code for NEON site.
+
family: name of the probability distribution that is described by the parameter values in the parameter column (see the list below for accepted distributions). An ensemble forecast has a family of ensemble. See the note below about family.
+
parameter: the parameters for the distribution (see the note below about the parameter column) or the number of the ensemble member. For example, the parameters for a normal distribution are called mu and sigma.
+
variable: standardized variable name. It must match the variable name in the target file.
+
prediction: forecasted value for the parameter in the parameter column
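As a rough illustration of the column layout above, here is a minimal Python sketch that builds a daily parametric (normal) forecast in long format and serializes it as csv. The site code `SITE1`, the variable name `chla`, the model_id `example_normal`, and the helper names are hypothetical placeholders, not values defined by the Challenge; the Challenge's own tooling is in R.

```python
import csv
import io
from datetime import date, datetime, timedelta, timezone

# Column order follows the list above.
COLUMNS = ["project_id", "model_id", "datetime", "reference_datetime",
           "duration", "site_id", "family", "parameter", "variable", "prediction"]


def to_utc_string(day):
    """Format a calendar date as a full datetime at UTC midnight."""
    stamp = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S")


def normal_forecast_rows(model_id, site_id, variable, reference_day, means, sds):
    """Return two rows (mu and sigma) per forecasted day, starting one day ahead."""
    reference = to_utc_string(reference_day)
    rows = []
    for step, (mu, sigma) in enumerate(zip(means, sds), start=1):
        valid_time = to_utc_string(reference_day + timedelta(days=step))
        for parameter, value in (("mu", mu), ("sigma", sigma)):
            rows.append({
                "project_id": "neon4cast",
                "model_id": model_id,
                "datetime": valid_time,
                "reference_datetime": reference,
                "duration": "P1D",
                "site_id": site_id,
                "family": "normal",
                "parameter": parameter,
                "variable": variable,
                "prediction": value,
            })
    return rows


def rows_to_csv(rows):
    """Serialize forecast rows to csv text with the standard header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical two-day forecast; all values are made up for illustration.
rows = normal_forecast_rows("example_normal", "SITE1", "chla",
                            date(2024, 2, 9), [4.2, 4.5], [0.8, 1.1])
print(rows_to_csv(rows))
```

Each forecasted day contributes one row per distribution parameter, so the file stays in long format regardless of which family is used.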
+
+
+
+
5 Representing uncertainty
+
Uncertainty is represented through the family - parameter columns in the file that you submit.
+
+
5.0.1 Parametric forecast
+
For a parametric forecast with a normal distribution, the family column would have the word normal to designate a normal distribution and the parameter column must have values of mu and sigma for each forecasted variable, site_id, depth, and time combination.
+
Parametric forecasts for binary variables should use bernoulli as the family and prob as the parameter.
+
The following names and parameterization of the distribution are currently supported (family: parameters):
+
+
lognormal: mu, sigma
+
normal: mu, sigma
+
bernoulli: prob
+
beta: shape1, shape2
+
uniform: min, max
+
gamma: shape, rate
+
logistic: location, scale
+
exponential: rate
+
poisson: lambda
+
+
If you are submitting a forecast whose distribution is not in the supported list, we recommend using the ensemble format and sampling from your distribution to generate a set of ensemble members that represents your forecast distribution.
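The sampling approach can be sketched as follows in Python. The Weibull distribution is used only as an example of a family that is not in the supported list; the function name, the parameter values, and the member count are illustrative assumptions, not Challenge requirements.

```python
import random


def ensemble_from_sampler(sampler, n_members=200):
    """Draw n_members samples and pair each with its ensemble index (1, 2, 3, ...).

    The returned (parameter, prediction) pairs correspond to rows with
    family = "ensemble" in the forecast file.
    """
    return [(index, sampler()) for index in range(1, n_members + 1)]


# Hypothetical forecast distribution: Weibull with scale 2.0 and shape 1.5.
rng = random.Random(42)
members = ensemble_from_sampler(lambda: rng.weibullvariate(2.0, 1.5), n_members=5)
for parameter, prediction in members:
    print(parameter, round(prediction, 3))
```

More members give a closer approximation of the underlying distribution at the cost of a larger file.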
+
+
+
5.0.2 Ensemble (or sample) forecast
+
Ensemble (or sample) forecasts use the family value of ensemble and the parameter values are the ensemble index.
+
When forecasts using the ensemble family are scored, we assume that the ensemble is a set of equally likely realizations that are sampled from a distribution that is comparable to that of the observations (i.e., no broad adjustments are required to make the ensemble more consistent with observations). This is referred to as a “perfect ensemble” by Bröcker and Smith (2007). Ensemble (or sample) forecasts are transformed to a probability distribution function (e.g., dressed) using the default methods in the scoringRules R package (empirical version of the quantile decomposition for the crps).
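To give a feel for how an ensemble is scored against a single observation, here is a minimal Python sketch of the standard empirical CRPS, which treats the ensemble members as equally likely draws. It illustrates the idea only and is not a port of the scoringRules implementation; the function name is hypothetical.

```python
def crps_ensemble(members, observation):
    """Empirical CRPS: mean |x_i - y| minus half the mean pairwise |x_i - x_j|.

    Lower is better; a single member equal to the observation scores 0.
    """
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must have at least one member")
    error_term = sum(abs(x - observation) for x in members) / m
    spread_term = sum(abs(a - b) for a in members for b in members) / (2 * m * m)
    return error_term - spread_term


print(crps_ensemble([0.0, 2.0], 1.0))  # 0.5
```

The first term rewards ensembles centered near the observation, and the second term credits ensemble spread, so an overconfident ensemble that misses the observation is penalized more than a wider one.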
+
+
+
5.1 Example forecasts
+
Here is an example of a forecast that uses a normal distribution:
For an ensemble (or sample) forecast, the family column uses the word ensemble to designate that it is an ensemble forecast, and the parameter column is the ensemble member number (1, 2, 3, …).
Save your forecast as a csv file with the following naming convention:
+
theme_name-year-month-day-model_id.csv. Compressed csv files with the csv.gz extension are also accepted.
+
The theme_name options are: terrestrial_daily, terrestrial_30min, aquatics, beetles, ticks, or phenology.
+
The year, month, and day are the year, month, and day of the reference_datetime (horizon = 0). For example, if a forecast starts today and tomorrow is the first forecasted day, horizon = 0 would be today, and today's date is used in the file name. model_id is the id for the model name that you specified in the model metadata Google Form (model_id has no spaces in it).
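The naming convention can be assembled programmatically. The following Python sketch is one way to do it; the function name and the validation choices are assumptions made for illustration.

```python
from datetime import date

# Theme names as listed above.
THEMES = {"terrestrial_daily", "terrestrial_30min", "aquatics",
          "beetles", "ticks", "phenology"}


def forecast_filename(theme_name, reference_day, model_id, compressed=False):
    """Build theme_name-year-month-day-model_id.csv (or .csv.gz)."""
    if theme_name not in THEMES:
        raise ValueError(f"unknown theme_name: {theme_name!r}")
    if not model_id or any(ch.isspace() for ch in model_id):
        raise ValueError("model_id must be non-empty and contain no spaces")
    extension = "csv.gz" if compressed else "csv"
    return f"{theme_name}-{reference_day:%Y-%m-%d}-{model_id}.{extension}"


print(forecast_filename("aquatics", date(2024, 2, 9), "example_model"))
# aquatics-2024-02-09-example_model.csv
```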
+
+
+
6.2 Uploading forecast
+
Individual forecast files can be uploaded at any time.
+
Teams will submit their forecast csv files through an R function. The csv file can only contain one unique model_id and one unique project_id.
+
The function is available using the following code
After submission, our servers will process uploaded files by converting them to a parquet format on our public s3 storage. A pub_datetime column will be added that denotes when a forecast was submitted. Summaries of the forecasts are also generated to provide descriptive statistics.
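Because a file may contain only one model_id and one project_id, it can help to check this locally before uploading. The Python sketch below is a hypothetical pre-submission check and is not part of the Challenge's submission function.

```python
import csv
import io


def check_single_valued(csv_text, columns=("model_id", "project_id")):
    """Return {column: sorted values} for each column that is not single-valued.

    An empty dict means every checked column holds exactly one unique value.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    seen = {c: set() for c in columns}
    for row in reader:
        for c in columns:
            seen[c].add(row[c])
    return {c: sorted(values) for c, values in seen.items() if len(values) != 1}


sample = "project_id,model_id,prediction\nneon4cast,example_a,1.0\nneon4cast,example_b,2.0\n"
print(check_single_valued(sample))  # {'model_id': ['example_a', 'example_b']}
```

The same check could be extended to reference_datetime by passing it in `columns`, since a file should contain only one reference_datetime value.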
+
+
+
7.2 Evaluation
+
All forecasts are scored daily using new data until the full horizon of the forecast has been scored. Forecasts are scored using the crps function in the scoringRules R package. More information about the scoring metric can be found here.
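For a parametric forecast with the normal family, the CRPS has a well-known closed form. The Python sketch below evaluates it for a single observation; it is an illustration of the metric under that assumption, not the scoringRules code, and the function name is hypothetical.

```python
import math


def crps_normal(mu, sigma, observation):
    """Closed-form CRPS of a normal(mu, sigma) forecast for one observation.

    CRPS = sigma * (z * (2 * Phi(z) - 1) + 2 * phi(z) - 1 / sqrt(pi)),
    where z = (observation - mu) / sigma.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (observation - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


print(round(crps_normal(0.0, 1.0, 0.0), 4))  # 0.2337
```

The score is in the units of the forecasted variable, and lower values indicate a better forecast.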
+
+
+
7.3 Comparison
+
Forecast performance can be compared to the performance of baseline models. We are automatically submitting the following baseline models:
+
+
climatology: the normal distribution (mean and standard deviation) of that day-of-year in the historical observations
+
persistenceRW: a random walk model that assumes no change in the mean behavior. The random walk is initialized using the most recent observation.
+
mean: the historical mean of the data is submitted for the beetles theme.
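The first two baselines can be sketched in a few lines of Python. This is only an illustration of the ideas described above, not the code used to generate the submitted baselines; in particular, the random-walk step standard deviation (`step_sd`) is an assumed input, and how it is estimated in practice is not specified here.

```python
import random
import statistics
from collections import defaultdict
from datetime import date


def climatology(observations):
    """Map day-of-year to (mean, standard deviation) of historical values.

    observations is an iterable of (date, value) pairs. Days of year with
    fewer than two observations are skipped because a standard deviation
    cannot be computed for them.
    """
    by_day = defaultdict(list)
    for day, value in observations:
        by_day[day.timetuple().tm_yday].append(value)
    return {doy: (statistics.mean(values), statistics.stdev(values))
            for doy, values in by_day.items() if len(values) >= 2}


def persistence_random_walk(last_observation, step_sd, horizon, n_members, seed=0):
    """Simulate ensemble paths that start at the most recent observation.

    Each step adds zero-mean Gaussian noise, so the ensemble mean stays near
    the last observation while the spread grows with horizon.
    """
    rng = random.Random(seed)
    paths = []
    for _ in range(n_members):
        value, path = last_observation, []
        for _ in range(horizon):
            value += rng.gauss(0.0, step_sd)
            path.append(value)
        paths.append(path)
    return paths


history = [(date(2022, 1, 1), 1.0), (date(2023, 1, 1), 3.0)]
print(climatology(history))
print(len(persistence_random_walk(5.0, 0.5, horizon=30, n_members=10)))  # 10
```

The climatology output maps directly onto a normal-family forecast (mu and sigma per day), while the random walk paths map onto an ensemble-family forecast.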
Information and code for accessing the forecasts and scores can be found on our forecast catalog page.
+
+
+
+
8 Questions?
+
Thanks for reading this document!
+
+
If you still have questions about how to submit your forecast to the NEON Ecological Forecasting Challenge, we encourage you to email Dr. Quinn Thomas (rqthomas{at}vt.edu).
Research from the Ecological Forecasting Initiative Research Coordination Network.
+
+
Ecological Forecasting
+
Lewis, A., W. Woelmer, H. Wander, D. Howard, J. Smith, R. McClure, M. Lofton, N. Hammond, R. Corrigan, R.Q. Thomas, C.C. Carey. 2022. Increased adoption of best practices in ecological forecasting enables comparisons of forecastability across systems. Ecological Applications 32: e02500 https://doi.org/10.1002/eap.2500
+
Lewis, A. S. L., Rollinson, C. R., Allyn, A. J., Ashander, J., Brodie, S., Brookson, C. B., et al. (2023). The power of forecasts to advance ecological theory. Methods in Ecology and Evolution, 14(3), 746–756. https://doi.org/10.1111/2041-210X.13955
+
+
+
Manuscripts about the Challenge
+
Thomas, R. Q., Boettiger, C., Carey, C. C., Dietze, M. C., Johnson, L. R., Kenney, M. A., et al. (2023). The NEON Ecological Forecasting Challenge. Frontiers in Ecology and the Environment, 21(3), 112–113. https://doi.org/10.1002/fee.2616
+
Thomas, R.Q., R.P. McClure, T.N. Moore, W.M. Woelmer, C. Boettiger, R.J. Figueiredo, R.T. Hensley, C.C. Carey. 2023. Near-term forecasts of NEON lakes reveal gradients of environmental predictability across the U.S. Frontiers in Ecology and the Environment 21: 220–226. https://doi.org/10.1002/fee.2623
+
Wheeler, K., M. Dietze, D. LeBauer, J. Peters, A.D. Richardson, R.Q. Thomas, K. Zhu, U. Bhat, S. Munch, R.F. Buzbee, M. Chen, B. Goldstein, J.S. Guo, D. Hao, C. Jones, M. Kelly-Fair, H. Liu, C. Malmborg, N. Neupane, D. Pal, A. Ross, V. Shirey, Y. Song, M. Steen, E.A. Vance, W.M. Woelmer, J. Wynne, and L. Zachmann. Predicting Spring Phenology in Deciduous Broadleaf Forests: An Open Community Forecast Challenge.
+
+
+
Details about the standards used in the challenge
+
Dietze, M., R.Q. Thomas, J. Peters, C. Boettiger, A. Shiklomanov, and J. Ashander. 2023. A community convention for ecological forecasting: output files and metadata v1.0. Ecosphere 14: e4686 https://doi.org/10.1002/ecs2.4686
+
+
+
Educational manuscripts
+
Moore, T.N., R.Q. Thomas, W.M. Woelmer, C.C Carey. 2022. Integrating ecological forecasting into undergraduate ecology curricula with an R Shiny application-based teaching module. Forecasting 4:604-633. https://doi.org/10.3390/forecast4030033
+
Peters, J. and R.Q. Thomas. 2021. Going Virtual: What We Learned from the Ecological Forecasting Initiative Research Coordination Network Virtual Workshop. Bulletin of the Ecological Society of America 102: e01828 https://doi.org/10.1002/bes2.1828
+
Willson, A.M., H. Gallo, J.A. Peters, A. Abeyta, N. Bueno Watts, C.C. Carey, T.N. Moore, G. Smies, R.Q. Thomas, W.M. Woelmer, and J.S. McLachlan. 2023. Assessing opportunities and inequities in undergraduate ecological forecasting education. Ecology and Evolution 13: e10001. https://doi.org/10.1002/ece3.10001
+
Woelmer, W. M., Bradley, L. M., Haber, L. T., Klinges, D. H., Lewis, A. S. L., Mohr, E. J., et al. (2021). Ten simple rules for training yourself in an emerging field. PLOS Computational Biology, 17(10), e1009440. https://doi.org/10.1371/journal.pcbi.1009440
+
Woelmer, W.M., T.N. Moore, M.E. Lofton, R.Q. Thomas, and C.C. Carey. 2023. Embedding communication concepts in forecasting training increases students’ understanding of ecological uncertainty. Ecosphere 14: e4628 https://doi.org/10.1002/ecs2.4628
Below are forecasts submitted 30 days ago, along with the observations used to evaluate them. Mouse over to see the team id; scroll to zoom. Only the top five performing models are shown. Information on how to access the scores can be found in our catalog.