Add in pvanalytics update (#1)
* v0.1.1 Release (pvlib#132)

* change pypi classifier from pre-alpha to beta

* remove unnecessary docs/requirements.txt

* whatsnew v0.1.1

* include 0.1.1 in whatsnew index

* link zenodo in readme

* Added clipping time series example for Sphinx documentation.

* added sphinx documentation + examples for running the clipping mask.

* fixed pep8 formatting errors.

* added a new whatsnew rst file for version 0.1.2

* removed close plot to visualize in sphinx.

* removed trailing whitespace.

* added tight layout for plot sizing.

* Updated the docs based on @kanderso-nrel's recs.

* fixed pep8 warning.

* removed trailing whitespace-pep8 issue.

* Added placeholder scripts for each function for Sphinx documentation.

* added sphinx module for completeness score.

* Update docs/examples/clipping.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/clipping.py

Co-authored-by: Cliff Hansen <[email protected]>

* added each of the python scripts for detecting stale data, interpolated data, and check for daily data completeness.

* updated the interpolated-periods documentation

* cleaning up doc strings.

* updated the naming conventions of the sphinx docs.

* fixed pep8 error on stale data periods docs.

* made updates to the Sphinx docs based on kanderso-nrel's feedback.

* fixed pep8 errors.

* Edited some of the language in the sphinx doc comments.

* Added pv-terms json.

* added the documentation files for hampel, tukey, and zscore outlier detection.

* update the documentation to include separate data files for each of the different issues, to avoid further confusion.

* updated the interpolated data docs to pull the correct csv.

* More docstring cleanup.

* Updated outlier code to use the new outlier csv file.

* updated the outliers routine to handle varying indices.

* Docstring cleaning.

* made updates to the hovertext and the _round edits per @kanderso-nrel's comments

* updated diff to round on docstring per @kanderso-nrel's comment.

* updated the whatsnew doc with the outliers documentation.

* updated the hovertext info.

* updated the routine with the bug fix for whatsnew, and removed the initial graphing.

* added prints to visualize the imported data in example docs.

* Update docs/whatsnew/0.1.2.rst

Co-authored-by: Kevin Anderson <[email protected]>

* added new commenting based on @cwhanse's recommendations.

* fixed improper spelling in comments

* Day night masking sphinx documentation (pvlib#139)

* update the day-night masking example.

* update the day-night masking routine.

* added the SERF east data for running the day-night mask examples.

* added the day-night masking routine.

* Added section for comparing day-night mask to PVlib sunrise-sunset times.

* added separate printouts for sunrise and sunset time comparisons.

* added vertical lines for sunrise + sunset in plots

* update the routine to remove hardcoded file name.

* added update to the whatsnew file.

* removed a newline to see if we could get git actions to work.

* Made updates to documentation based on @kanderso-nrel's recommendations.

* Update docs/examples/day-night-masking.py

Co-authored-by: Cliff Hansen <[email protected]>

* Removed default kwargs for pvlib SPA sunrise-sunset function.

* Updating the commenting.

* fixed pep8 line length

Co-authored-by: Perry <[email protected]>
Co-authored-by: Cliff Hansen <[email protected]>

* Irradiance sphinx documentation (pvlib#140)

* added initial files for all of the irradiance documentation (need to edit).

* added RMIS example data for irradiance Sphinx documentation.

* added the new qcrad function.

* update the examples for both qcrad functions.

* added qcrad-limits documentation.

* ensured outputs for all irradiance functions in examples.

* added plotting functionality for some of the examples.

* added graphics for all of the irradiance documentation.

* added new line at end of file to stop pep8 failure.

* Clean up of doc strings for irradiance documentation.

* Fixed the docstring PEP8 error.

* Update docs/examples/clearsky-limits-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Update docs/examples/clearsky-limits-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Update docs/examples/clearsky-limits-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Update docs/examples/clearsky-limits-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Update docs/examples/qcrad-limits-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Update docs/examples/qcrad-limits-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Update docs/examples/qcrad-consistency-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Update docs/examples/qcrad-consistency-irradiance.py

Co-authored-by: Kevin Anderson <[email protected]>

* Removed 'sampled' reference from docstring when describing data

* changed py:func to py:meth in docstring

* Updated the routine to calculate extraterrestrial radiation as dni_extra for check_irradiance_limits_qcrad() function.

* Renamed the routine Clearsky Limits for Daily Insolation

* removed pep8 issues

* added the documentation to the whatsnew file.

* Update docs/examples/clearsky-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/clearsky-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/qcrad-consistency-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/qcrad-consistency-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/qcrad-consistency-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/qcrad-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/daily-insolation-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/clearsky-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* added day-night mask to clearsky-limits-irradiance documentation

* removed hardcoded path!

* Update docs/examples/daily-insolation-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/daily-insolation-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/daily-insolation-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* switched the ordering of parameters in ) per @cwhanse's request.

* rearranged the order of inputs for irradiance_consistency_qcrad function in unit test.

* Update docs/examples/qcrad-consistency-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* updated clearsky-limits-irradiance example to comment on Ineichen model performance

* Update docs/examples/qcrad-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/qcrad-consistency-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

* Update docs/examples/daily-insolation-limits-irradiance.py

Co-authored-by: Cliff Hansen <[email protected]>

Co-authored-by: Perry <[email protected]>
Co-authored-by: Kevin Anderson <[email protected]>
Co-authored-by: Cliff Hansen <[email protected]>

Co-authored-by: Kevin Anderson <[email protected]>
Co-authored-by: Perry <[email protected]>
Co-authored-by: Cliff Hansen <[email protected]>
4 people authored May 25, 2022
1 parent 334b5d5 commit 10d289f
Showing 29 changed files with 15,033 additions and 14 deletions.
1 change: 0 additions & 1 deletion .readthedocs.yml
@@ -8,7 +8,6 @@ formats: all
python:
  version: 3.7
  install:
    - requirements: docs/requirements.txt
    - method: pip
      path: .
      extra_requirements:
2 changes: 2 additions & 0 deletions README.md
@@ -1,5 +1,7 @@
![lint and test](https://github.com/pvlib/pvanalytics/workflows/lint%20and%20test/badge.svg)
[![Coverage Status](https://coveralls.io/repos/github/pvlib/pvanalytics/badge.svg?branch=master)](https://coveralls.io/github/pvlib/pvanalytics?branch=master)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6110569.svg)](https://doi.org/10.5281/zenodo.6110569)


# PVAnalytics

84 changes: 84 additions & 0 deletions docs/examples/clearsky-limits-irradiance.py
@@ -0,0 +1,84 @@
"""
Clearsky Limits for Irradiance Data
===================================
Checking the clearsky limits of irradiance data.
"""

# %%
# Identifying and filtering out invalid irradiance data is a
# useful way to reduce noise during analysis. In this example,
# we use :py:func:`pvanalytics.quality.irradiance.clearsky_limits`
# to identify irradiance values that do not exceed
# a limit based on a clear-sky model. For this example we will
# use GHI data from the RMIS weather system located on the NREL campus in CO.

import pvanalytics
from pvanalytics.quality.irradiance import clearsky_limits
from pvanalytics.features.daytime import power_or_irradiance
import pvlib
import matplotlib.pyplot as plt
import pandas as pd
import pathlib

# %%
# First, read in data from the RMIS NREL system. This data set contains
# 5-minute right-aligned POA, GHI, DNI, DHI, and GNI measurements,
# but only the GHI is relevant here.

pvanalytics_dir = pathlib.Path(pvanalytics.__file__).parent
rmis_file = pvanalytics_dir / 'data' / 'irradiance_RMIS_NREL.csv'
data = pd.read_csv(rmis_file, index_col=0, parse_dates=True)
freq = '5T'
# Make the datetime index tz-aware.
data.index = data.index.tz_localize("Etc/GMT+7")


# %%
# Now model clear-sky irradiance for the location and times of the
# measured data. You can do this using
# :py:meth:`pvlib.location.Location.get_clearsky`, using the lat-long
# coordinates associated with the RMIS NREL system.

location = pvlib.location.Location(39.7407, -105.1686)
clearsky = location.get_clearsky(data.index)

# %%
# Use :py:func:`pvanalytics.quality.irradiance.clearsky_limits`.
# Here, we check GHI data in field 'irradiance_ghi__7981'.
# :py:func:`pvanalytics.quality.irradiance.clearsky_limits`
# returns a mask that identifies data that falls between
# lower and upper limits. The defaults (used here)
# are upper bound of 110% of clear-sky GHI, and
# no lower bound.

clearsky_limit_mask = clearsky_limits(data['irradiance_ghi__7981'],
                                      clearsky['ghi'])


# %%
# Mask nighttime values in the GHI time series using the
# :py:func:`pvanalytics.features.daytime.power_or_irradiance` function.
# We will then remove nighttime values from the GHI time series.

day_night_mask = power_or_irradiance(series=data['irradiance_ghi__7981'],
                                     freq=freq)

# %%
# Plot the 'irradiance_ghi__7981' data stream and its associated clearsky GHI
# data stream. Mask the GHI time series by its clearsky_limit_mask for daytime
# periods.
# Please note that a simple Ineichen model with static monthly turbidities
# isn't always accurate, as in this case. Other models that may provide better
# clear-sky estimates include McClear or PSM3.
data['irradiance_ghi__7981'].plot()
clearsky['ghi'].plot()
data.loc[clearsky_limit_mask & day_night_mask][
    'irradiance_ghi__7981'].plot(ls='', marker='.')
plt.legend(labels=["RMIS GHI", "Clearsky GHI",
                   "Under Clearsky Limit"],
           loc="upper left")
plt.xlabel("Date")
plt.ylabel("GHI (W/m^2)")
plt.tight_layout()
plt.show()
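
The clear-sky check in this example reduces to a per-sample comparison against a scaled clear-sky value. The following is a minimal, dependency-free sketch of that idea, assuming the defaults described in the comments above (upper bound of 110% of clear-sky GHI, no lower bound). The helper `within_clearsky_limit` and the sample values are hypothetical and are not part of pvanalytics; the library's handling of NaN values, zero clear-sky irradiance, and bound inclusivity may differ.

```python
def within_clearsky_limit(measured, clearsky, ratio=1.1):
    """Return True for each sample at or below ratio * clear-sky value.

    Hypothetical helper for illustration only; not the pvanalytics
    implementation.
    """
    return [m <= ratio * c for m, c in zip(measured, clearsky)]


# Made-up GHI samples in W/m^2 (not taken from the RMIS data set).
measured_ghi = [480.0, 900.0, 760.0]
clearsky_ghi = [500.0, 800.0, 700.0]

mask = within_clearsky_limit(measured_ghi, clearsky_ghi)
print(mask)
```

Only the second sample exceeds 110% of its clear-sky value, so it is the only one flagged as outside the limit.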
71 changes: 71 additions & 0 deletions docs/examples/clipping.py
@@ -0,0 +1,71 @@
"""
Clipping Detection
===================
Identifying clipping periods using the PVAnalytics clipping module.
"""

# %%
# Identifying and removing clipping periods from AC power time series
# data aids in generating more accurate degradation analysis results,
# as using clipped data can lead to under-predicting degradation. In this
# example, we show how to use
# :py:func:`pvanalytics.features.clipping.geometric`
# to mask clipping periods in an AC power time series. We use a
# normalized time series example provided by the PV Fleets Initiative,
# where clipping periods are labeled as True, and non-clipping periods are
# labeled as False. This example is adapted from the DuraMAT DataHub
# clipping data set:
# https://datahub.duramat.org/dataset/inverter-clipping-ml-training-set-real-data

import pvanalytics
from pvanalytics.features.clipping import geometric
import matplotlib.pyplot as plt
import pandas as pd
import pathlib
import numpy as np

# %%
# First, read in the ac_power_inv_7539 example, and visualize a subset of the
# clipping periods via the "label" mask column.

pvanalytics_dir = pathlib.Path(pvanalytics.__file__).parent
ac_power_file_1 = pvanalytics_dir / 'data' / 'ac_power_inv_7539.csv'
data = pd.read_csv(ac_power_file_1, index_col=0, parse_dates=True)
data['label'] = data['label'].astype(bool)
# This is the known frequency of the time series. You may need to infer
# the frequency or set the frequency with your AC power time series.
freq = "15T"

data['value_normalized'].plot()
data.loc[data['label'], 'value_normalized'].plot(ls='', marker='o')
plt.legend(labels=["AC Power", "Labeled Clipping"],
           title="Clipped")
plt.xticks(rotation=20)
plt.xlabel("Date")
plt.ylabel("Normalized AC Power")
plt.tight_layout()
plt.show()

# %%
# Now, use :py:func:`pvanalytics.features.clipping.geometric` to identify
# clipping periods in the time series. Re-plot the data subset with this mask.
predicted_clipping_mask = geometric(ac_power=data['value_normalized'],
                                    freq=freq)
data['value_normalized'].plot()
data.loc[predicted_clipping_mask, 'value_normalized'].plot(ls='', marker='o')
plt.legend(labels=["AC Power", "Detected Clipping"],
           title="Clipped")
plt.xticks(rotation=20)
plt.xlabel("Date")
plt.ylabel("Normalized AC Power")
plt.tight_layout()
plt.show()


# %%
# Compare the filter results to the ground-truth labeled data side-by-side,
# and generate an accuracy metric.
acc = 100 * np.sum(np.equal(data.label,
                            predicted_clipping_mask))/len(data.label)
print("Overall model prediction accuracy: " + str(round(acc, 2)) + "%")
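
The accuracy metric at the end of this example is the share of time steps where the predicted mask agrees with the labeled mask. A small pure-Python sketch of the same calculation, assuming two equal-length boolean sequences; `percent_agreement` and the sample labels are hypothetical and only illustrate the arithmetic:

```python
def percent_agreement(labels, predictions):
    """Percentage of positions where the two boolean sequences match."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot score empty sequences")
    matches = sum(1 for a, b in zip(labels, predictions) if a == b)
    return 100 * matches / len(labels)


# Made-up labels: True marks a clipped time step.
labels = [True, True, False, False, False]
predicted = [True, False, False, False, True]

acc = percent_agreement(labels, predicted)
print("Overall model prediction accuracy: " + str(round(acc, 2)) + "%")
```

Because non-clipped time steps usually dominate a power time series, overall agreement can stay high even when few clipping periods are detected, so it is worth reading alongside the plots.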
69 changes: 69 additions & 0 deletions docs/examples/daily-insolation-limits-irradiance.py
@@ -0,0 +1,69 @@
"""
Clearsky Limits for Daily Insolation
====================================
Checking the clearsky limits for daily insolation data.
"""

# %%
# Identifying and filtering out invalid irradiance data is a
# useful way to reduce noise during analysis. In this example,
# we use :py:func:`pvanalytics.quality.irradiance.daily_insolation_limits`
# to determine when the daily insolation lies between a minimum
# and a maximum value. Irradiance measurements and clear-sky
# irradiance on each day are integrated with the trapezoid rule
# to calculate daily insolation. For this example we will use data
# from the RMIS weather system located on the NREL campus
# in Colorado, USA.

import pvanalytics
from pvanalytics.quality.irradiance import daily_insolation_limits
import pvlib
import matplotlib.pyplot as plt
import pandas as pd
import pathlib

# %%
# First, read in data from the RMIS NREL system. This data set contains
# 5-minute right-aligned data. It includes POA, GHI,
# DNI, DHI, and GNI measurements.

pvanalytics_dir = pathlib.Path(pvanalytics.__file__).parent
rmis_file = pvanalytics_dir / 'data' / 'irradiance_RMIS_NREL.csv'
data = pd.read_csv(rmis_file, index_col=0, parse_dates=True)
# Make the datetime index tz-aware.
data.index = data.index.tz_localize("Etc/GMT+7")

# %%
# Now model clear-sky irradiance for the location and times of the
# measured data:
location = pvlib.location.Location(39.7407, -105.1686)
clearsky = location.get_clearsky(data.index)

# %%
# Use :py:func:`pvanalytics.quality.irradiance.daily_insolation_limits`
# to identify if the daily insolation lies between a minimum
# and a maximum value. Here, we check GHI irradiance field
# 'irradiance_ghi__7981'.
# :py:func:`pvanalytics.quality.irradiance.daily_insolation_limits`
# returns a mask that identifies data that falls between
# lower and upper limits. The defaults (used here)
# are upper bound of 125% of clear-sky daily insolation,
# and lower bound of 40% of clear-sky daily insolation.

daily_insolation_mask = daily_insolation_limits(data['irradiance_ghi__7981'],
                                                clearsky['ghi'])

# %%
# Plot the 'irradiance_ghi__7981' data stream and its associated clearsky GHI
# data stream. Mask the GHI time series by its daily_insolation_mask.
data['irradiance_ghi__7981'].plot()
clearsky['ghi'].plot()
data.loc[daily_insolation_mask, 'irradiance_ghi__7981'].plot(ls='', marker='.')
plt.legend(labels=["RMIS GHI", "Clearsky GHI",
                   "Within Daily Insolation Limit"],
           loc="upper left")
plt.xlabel("Date")
plt.ylabel("GHI (W/m^2)")
plt.tight_layout()
plt.show()
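
The daily insolation check described in this example integrates measured and clear-sky irradiance over a day with the trapezoid rule and compares their ratio to fixed limits. The sketch below works through that arithmetic in plain Python, assuming evenly spaced samples and the default limits quoted above (40% and 125%). The helpers `trapezoid_insolation` and `within_daily_limits`, the hourly sample values, and the inclusive bounds are assumptions made for illustration, not the pvanalytics implementation.

```python
def trapezoid_insolation(irradiance, step_hours):
    """Integrate evenly spaced irradiance samples (W/m^2) into Wh/m^2."""
    total = 0.0
    for left, right in zip(irradiance, irradiance[1:]):
        total += 0.5 * (left + right) * step_hours
    return total


def within_daily_limits(measured, clearsky, step_hours,
                        lower=0.4, upper=1.25):
    """Check that measured insolation is within limits of clear-sky."""
    clear = trapezoid_insolation(clearsky, step_hours)
    if clear == 0:
        return False  # no clear-sky energy to compare against
    ratio = trapezoid_insolation(measured, step_hours) / clear
    return lower <= ratio <= upper


# Made-up hourly samples for one short day, in W/m^2.
clearsky_day = [0.0, 200.0, 400.0, 200.0, 0.0]
partly_cloudy = [0.0, 100.0, 300.0, 150.0, 0.0]
heavily_overcast = [0.0, 50.0, 100.0, 50.0, 0.0]

print(trapezoid_insolation(clearsky_day, 1.0))
print(within_daily_limits(partly_cloudy, clearsky_day, 1.0))
print(within_daily_limits(heavily_overcast, clearsky_day, 1.0))
```

Here the partly cloudy day integrates to 550 Wh/m^2 against 800 Wh/m^2 of clear-sky insolation (a ratio of about 0.69) and passes, while the overcast day reaches only 200 Wh/m^2 (0.25) and falls below the lower limit.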
89 changes: 89 additions & 0 deletions docs/examples/data-completeness.py
@@ -0,0 +1,89 @@
"""
Missing Data Periods
====================
Identifying days with missing data using a "completeness" score metric.
"""

# %%
# Identifying days with missing data and filtering these days out reduces noise
# when performing data analysis. This example shows how to use a
# daily data "completeness" score to identify and filter out days with missing
# data. This includes using
# :py:func:`pvanalytics.quality.gaps.completeness_score`,
# :py:func:`pvanalytics.quality.gaps.complete`, and
# :py:func:`pvanalytics.quality.gaps.trim_incomplete`.

import pvanalytics
from pvanalytics.quality import gaps
import matplotlib.pyplot as plt
import pandas as pd
import pathlib

# %%
# First, we import the AC power data stream that we are going to check for
# completeness. The time series is a normalized AC power time series from
# the PV Fleets Initiative, and is available via the DuraMAT DataHub:
# https://datahub.duramat.org/dataset/inverter-clipping-ml-training-set-real-data.
# This data set has a pandas DatetimeIndex, with the min-max normalized
# AC power time series represented in the 'value_normalized' column. The data
# is sampled at 15-minute intervals and contains NaN values.

pvanalytics_dir = pathlib.Path(pvanalytics.__file__).parent
file = pvanalytics_dir / 'data' / 'ac_power_inv_2173.csv'
data = pd.read_csv(file, index_col=0, parse_dates=True)
data = data.asfreq("15T")

# %%
# Now, we use :py:func:`pvanalytics.quality.gaps.completeness_score` to get the
# fraction of each day's data that isn't NaN. This score is calculated over
# the full 24-hour period, meaning that nighttime values are expected.
data_completeness_score = gaps.completeness_score(data['value_normalized'])

# Visualize data completeness score as a time series.
data_completeness_score.plot()
plt.xlabel("Date")
plt.ylabel("Daily Completeness Score (Fractional)")
plt.tight_layout()
plt.show()

# %%
# We mask complete days, based on daily completeness score, using
# :py:func:`pvanalytics.quality.gaps.complete`.
min_completeness = 0.333
daily_completeness_mask = gaps.complete(data['value_normalized'],
                                        minimum_completeness=min_completeness)

# Plot the completeness score, marking days that meet the threshold
# and days that do not.
data_completeness_score.plot()
data_completeness_score.loc[daily_completeness_mask].plot(ls='', marker='.')
data_completeness_score.loc[~daily_completeness_mask].plot(ls='', marker='.')
plt.axhline(y=min_completeness, color='r', linestyle='--')
plt.legend(labels=["Completeness Score", "Threshold met",
                   "Threshold not met", "Completeness Threshold (.33)"],
           loc="upper left")
plt.xlabel("Date")
plt.ylabel("Daily Completeness Score (Fractional)")
plt.tight_layout()
plt.show()

# %%
# We trim the time series based on the completeness score, where the time
# series must have at least 10 consecutive days of data that meet the
# completeness threshold. This is done using
# :py:func:`pvanalytics.quality.gaps.trim_incomplete`.
number_consecutive_days = 10
completeness_trim_mask = gaps.trim_incomplete(data['value_normalized'],
                                              days=number_consecutive_days)
# Re-visualize the time series with the data masked by the trim mask
data[completeness_trim_mask]['value_normalized'].plot()
data[~completeness_trim_mask]['value_normalized'].plot()
plt.legend(labels=[True, False],
           title="Daily Data Passing")
plt.xlabel("Date")
plt.ylabel("Normalized AC Power")
plt.tight_layout()
plt.show()
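
The scoring and trimming steps in this example can be pictured with a short plain-Python sketch. It assumes the daily score is the fraction of expected samples that are not NaN, and that trimming keeps everything from the start of the first run of consecutive complete days to the end of the last such run. Both `completeness_score` and `trim_mask` here are hypothetical stand-ins written for illustration, with made-up data; they are not the pvanalytics functions, whose edge-case behavior may differ.

```python
import math


def completeness_score(samples, expected_per_day):
    """Fraction of the expected samples in one day that are not NaN."""
    valid = sum(1 for value in samples if not math.isnan(value))
    return valid / expected_per_day


def trim_mask(daily_complete, days):
    """Keep the span from the first to the last run of `days` good days."""
    first_start, last_end = None, None
    run_length = 0
    for i, ok in enumerate(daily_complete):
        run_length = run_length + 1 if ok else 0
        if run_length >= days:
            if first_start is None:
                first_start = i - run_length + 1
            last_end = i
    if first_start is None:
        return [False] * len(daily_complete)
    return [first_start <= i <= last_end
            for i in range(len(daily_complete))]


nan = float("nan")
# Hypothetical day of 15-minute data: 96 samples expected, 24 missing.
one_day = [0.5] * 72 + [nan] * 24
score = completeness_score(one_day, expected_per_day=96)
print(score, score >= 0.333)

# Hypothetical per-day flags; require runs of 3 complete days.
daily_complete = [False, True, True, True, False,
                  True, True, True, True, False, True]
trimmed = trim_mask(daily_complete, days=3)
print(trimmed)
```

In this sketch an incomplete day that sits between two qualifying runs stays inside the kept span; only the leading and trailing days outside those runs are masked out.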