Recall similar days from the history #174

JackKelly · 2022-06-25T14:26:05Z

To help forecast individual PV power, and GSP power, find a set of "similar" periods from the history, and feed those "similar periods" into the ML model at inference time, along with the recent history.

For training to run as quickly as necessary, this probably requires that we can fit a representation of the history into RAM. (In production, we probably have time to load the history from disk.)

Could maybe use contrastive learning (#155) to get an encoder to map from, say, NWPs and satellite to a vector, such that vectors are similar when the resulting PV power is similar (and similarity of PV power could be judged with NMAE).
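As a rough illustration of the "similarity judged with NMAE" idea above, here is a minimal sketch of how pairs of periods might be labelled as similar or dissimilar for contrastive training. The function names, the normalisation by installed capacity, and the 0.05 threshold are all assumptions for illustration; nothing in this thread fixes them.

```python
import numpy as np


def nmae(pv_a, pv_b, capacity):
    """Mean absolute error between two PV power series, divided by capacity.

    Normalising by installed capacity is an assumption; NMAE is sometimes
    normalised by mean power instead.
    """
    pv_a = np.asarray(pv_a, dtype=float)
    pv_b = np.asarray(pv_b, dtype=float)
    return float(np.mean(np.abs(pv_a - pv_b)) / capacity)


def is_positive_pair(pv_a, pv_b, capacity, threshold=0.05):
    """Treat two periods as "similar" if their NMAE is below a threshold.

    The threshold is a hypothetical hyperparameter.
    """
    return nmae(pv_a, pv_b, capacity) < threshold
```

Labels like these could then drive a contrastive loss, so that the encoder pulls the NWP and satellite embeddings of positive pairs together and pushes the others apart.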

jacobbieker · 2022-06-28T07:46:53Z

We could use FAISS? It's what HuggingFace uses for similarity search over their vectors, as do a lot of other places. It's quick, can run on the GPU, and scales well to millions of vectors. We chatted about it a little here: openclimatefix/satflow#65
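To make the lookup concrete, here is a minimal sketch of the nearest-neighbour search itself, written as a brute-force NumPy version rather than with FAISS so that it runs without extra dependencies. As far as I know, FAISS's `IndexFlatL2` does the same exact squared-L2 search, just much faster and at larger scale. The function name, the 16-dimensional random vectors, and `k=3` are made up for illustration; the real vectors would come from the encoder.

```python
import numpy as np


def find_similar_periods(history, query, k=3):
    """Return indices and squared L2 distances of the k nearest history vectors."""
    history = np.asarray(history, dtype=np.float32)
    query = np.asarray(query, dtype=np.float32)
    # Squared Euclidean distance from the query to every stored vector.
    sq_dists = np.sum((history - query) ** 2, axis=1)
    # Indices of the k smallest distances, nearest first.
    nearest = np.argsort(sq_dists)[:k]
    return nearest, sq_dists[nearest]


# Hypothetical data: 1000 encoded historical periods, 16 dimensions each.
rng = np.random.default_rng(0)
history_vectors = rng.normal(size=(1000, 16)).astype(np.float32)

# A query that is a slightly perturbed copy of period 42.
query_vector = history_vectors[42] + 0.01

indices, distances = find_similar_periods(history_vectors, query_vector, k=3)
```

The returned indices would then be used to pull the matching periods out of the in-RAM history and feed them to the model alongside the recent history.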

JackKelly · 2022-06-28T18:43:26Z

SGTM! Thanks for digging out that discussion! I had a memory that we'd talked about this before but I couldn't find the discussion! So thanks for the link!

I'm in two minds about how important it'll be to recall "similar" days from the history. Right now I'm definitely thinking this is a fairly low priority, not least because I suspect it'll be a fair chunk of work. But I am super-curious to see how well it works!

Please do shout if this is something you might be interested in exploring? And/or something we could ask a student to look into?

jacobbieker · 2022-06-29T07:48:35Z

Yeah, I'd be interested in tackling it! I think it could be interesting to see how well it works. Even on its own, the similarity search could be helpful for finding examples similar to the ones where the models fail more often, too.

JackKelly · 2022-06-29T08:54:50Z

Fab! That'd be awesome! I've put it on the agenda for next week's meeting(s)!

jacobbieker · 2022-06-29T10:48:33Z

For the contrastive encoding, there is a PyTorch implementation here: https://github.com/rschwarz15/CPCV2-PyTorch

JackKelly · 2022-07-19T15:14:26Z

This paper might be relevant (I've only read the abstract):

Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

JackKelly · 2022-07-19T15:20:27Z

Also worth noting that Alex Carter at NG-ESO says he'd quite like our web UI to display days from the last few years, and to automatically suggest "similar" days.

JackKelly added the "ML" label (ML model tweak or big idea) Jun 28, 2022

JackKelly added this to Nowcasting Jun 28, 2022

JackKelly moved this to Todo in Nowcasting Jun 28, 2022
