This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Recall similar days from the history #174

Open
JackKelly opened this issue Jun 25, 2022 · 7 comments
Labels
ML ML model tweak or big idea

Comments

@JackKelly
Member

JackKelly commented Jun 25, 2022

To help forecast individual PV power, and GSP power, find a set of "similar" periods from the history, and feed those "similar periods" into the ML model at inference time, along with the recent history.

For training to run as quickly as necessary, this probably requires that we can fit a representation of the history into RAM. (In production, we probably have time to load history from disk).

Could maybe use contrastive learning (#155) to get an encoder to map from, say, NWPs and satellite to a vector, such that vectors are similar when the resulting PV power is similar (and similarity of PV power could be judged with NMAE).
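A rough sketch of the NMAE-based similarity that could supply the training signal for such an encoder. This is illustrative only: the function names, the normalisation by installed capacity, and the toy numbers are all assumptions rather than anything settled in this issue.

```python
import numpy as np


def nmae(pv_a, pv_b, capacity):
    """Mean absolute error between two PV power series, divided by capacity.

    Normalising by capacity is an assumption; other normalisers are possible.
    """
    pv_a = np.asarray(pv_a, dtype=float)
    pv_b = np.asarray(pv_b, dtype=float)
    return float(np.mean(np.abs(pv_a - pv_b)) / capacity)


def most_similar_periods(history, query, capacity, k=3):
    """Return indices and NMAEs of the k historical periods closest to `query`."""
    errors = np.array([nmae(period, query, capacity) for period in history])
    order = np.argsort(errors)[:k]  # smallest NMAE first
    return order, errors[order]


# Hypothetical toy data: three historical periods and one query period.
history = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.5], [3.0, 3.0, 3.0]]
query = [0.0, 1.0, 2.0]
indices, errors = most_similar_periods(history, query, capacity=4.0, k=2)
```

Periods with a low NMAE against the anchor could serve as positives for the contrastive loss, and high-NMAE periods as negatives. At inference time the future PV power is unknown, so the encoder would have to reproduce this ranking from NWPs and satellite alone.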

@jacobbieker
Member

We could use FAISS? It's what Hugging Face uses for similarity search over their vectors, as do a lot of other places. It's quick, can run on the GPU, and scales well to millions of vectors. We chatted about it a little here: openclimatefix/satflow#65
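A minimal sketch of what the lookup might look like, assuming the encoder output is a matrix of float vectors, one row per historical period. The helper names and toy vectors are hypothetical. It uses an exact inner-product FAISS index (`IndexFlatIP`) on L2-normalised vectors, which amounts to cosine similarity, and falls back to a NumPy brute-force search when FAISS is not installed.

```python
import numpy as np

try:
    import faiss  # optional dependency; the NumPy path below is used without it
except ImportError:
    faiss = None


def normalise(x):
    """L2-normalise each row and return a contiguous float32 array."""
    x = np.asarray(x, dtype=np.float32)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return np.ascontiguousarray(x / np.maximum(norms, 1e-12), dtype=np.float32)


def top_k_similar(history_vecs, query_vecs, k):
    """Return (scores, ids) of the k most cosine-similar history rows per query."""
    history = normalise(history_vecs)
    queries = normalise(query_vecs)
    if faiss is not None:
        index = faiss.IndexFlatIP(history.shape[1])  # exact inner-product search
        index.add(history)
        return index.search(queries, k)
    sims = queries @ history.T  # brute-force cosine similarity
    ids = np.argsort(-sims, axis=1)[:, :k]
    return np.take_along_axis(sims, ids, axis=1), ids


# Hypothetical embeddings: three historical periods, one query.
history_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vecs = [[2.0, 0.1]]
scores, ids = top_k_similar(history_vecs, query_vecs, k=2)
```

A flat index is exact and keeps every vector in RAM, which seems like a reasonable starting point. FAISS's approximate indexes would only be worth trying if the history grows large enough for exact search to become a bottleneck.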

@JackKelly
Member Author

SGTM! Thanks for digging out that discussion! I had a memory that we'd talked about this before but I couldn't find the discussion! So thanks for the link!

I'm in two minds about how important it'll be to recall "similar" days from the history. Right now I'm definitely thinking this is a fairly low priority, not least because I suspect it'll be a fair chunk of work. But I am super-curious to see how well it works!

Please do shout if this is something you might be interested in exploring? And/or something we could ask a student to look into?

@JackKelly JackKelly added the ML ML model tweak or big idea label Jun 28, 2022
@JackKelly JackKelly moved this to Todo in Nowcasting Jun 28, 2022
@jacobbieker
Member

Yeah, I'd be interested in tackling it! I think it could be interesting to see how well it works, and even just having the similarity search could be helpful for finding examples similar to the ones where the models fail more often too.

@JackKelly
Member Author

Fab! That'd be awesome! I've put it on the agenda for next week's meeting(s)!

@jacobbieker
Member

For the contrastive encoding, there is a PyTorch implementation here: https://github.com/rschwarz15/CPCV2-PyTorch

@JackKelly
Member Author

This paper might be relevant (I've only read the abstract):

Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

@JackKelly
Member Author

Also worth noting that Alex Carter at NG-ESO says he'd quite like our web UI to display days from the last few years, and to automatically suggest "similar" days.
