Add schema for Generation data #46
Comments
Going to put some thoughts here on this. First off, the bits for using generation data (PV or wind) are not in this repo yet. There is GSP data loading, but I think that is different enough that it can be outside the scope of this (we can always revisit that). So this schema can start to be used when we add the PV Site Dataset, and the info for it can be added then as well.
Reason for having site/generation data schemas:
Proposed Schemas:
Assumed MW would be the best power unit for this, since that is usually the size of the sites we deal with. I also removed tilt and orientation for now because I don't think these are actually used anywhere currently. In terms of enforcing this schema, I think it could be done with a lighter touch: documentation in a README, and comments in the config pointing to it, rather than automated schema validation (which could be achieved if we used something like this). I think a lighter-touch approach would be adequate for now, make it clearer what is supported, and lead to less ad hoc code. Keen to hear people's thoughts on this though. @peterdudfield @dfulu @AUdaltsova, thanks |
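For comparison, here is a rough sketch of what a very small automated check could look like if it were ever wanted on top of the README. It is plain Python with no dependencies. The variable and dimension names (`generation`, `capacity`, `time_utc`, `site_id`) and the `check_schema` helper are hypothetical placeholders, not the schema proposed in this issue.

```python
# Hypothetical required variables and their dimensions (placeholders only).
REQUIRED_FIELDS = {
    "generation": ("time_utc", "site_id"),
    "capacity": ("site_id",),
}


def check_schema(described):
    """Return a list of problems found; an empty list means the layout matches.

    `described` maps each variable name to a tuple of its dimension names,
    e.g. something built from the variables of an opened dataset.
    """
    problems = []
    for name, expected_dims in REQUIRED_FIELDS.items():
        if name not in described:
            problems.append(f"missing variable: {name}")
        elif tuple(described[name]) != expected_dims:
            problems.append(
                f"{name}: expected dims {expected_dims}, got {tuple(described[name])}"
            )
    return problems


good = {"generation": ("time_utc", "site_id"), "capacity": ("site_id",)}
bad = {"generation": ("site_id", "time_utc")}

good_problems = check_schema(good)
bad_problems = check_schema(bad)
```

Returning a list of problems rather than raising keeps the check usable as a warning, which fits the lighter-touch approach.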
re: ml_id, I am still a bit hazy on how this is different from site id (when we're talking pvnet-site), and I think all the data I used recently didn't have it. What am I missing? |
I think we don't require this anymore, but could be wrong. I really like the netcdf organisation! And thanks a lot for doing this :) |
I also agree that being super-strict on this is not good right now. We will probably want stuff like tilt and orientation stored in the same files anyway, and there might be unforeseen differences we can't reconcile depending on whose data we're using |
I think that currently |
So I think it's good for ml_id to start very small, so our embedding dim can be small. Sometimes it's useful to have a site id which maps to the system_id that the client has given us, so we know which site is which. I think that was the motivation. I'd be tempted to
|
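To make the ml_id versus site id distinction concrete, here is a minimal sketch of how I read it: ml_id is a small contiguous index used by the model, while the site id keeps the link back to the client's system_id. The `build_ml_id_map` helper and the example ids are made up for illustration.

```python
def build_ml_id_map(system_ids):
    """Assign contiguous integer ml_ids (0..N-1) to the distinct system_ids."""
    # Sort on the string form so mixed int/str ids still get a stable order.
    unique_ids = sorted(set(system_ids), key=str)
    return {system_id: ml_id for ml_id, system_id in enumerate(unique_ids)}


# Made-up client identifiers: large, sparse, and with a duplicate.
client_ids = [982341, 15, 440027, 15]

ml_id_of = build_ml_id_map(client_ids)

# Reverse lookup so a prediction can be traced back to the client's site.
system_id_of = {ml_id: system_id for system_id, ml_id in ml_id_of.items()}

# An embedding table indexed by ml_id needs only this many rows,
# rather than max(system_id) + 1.
num_embeddings = len(ml_id_of)
```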
Thanks for the comments all, that is really helpful! I agree with those points and suggestions and have updated the proposed schemas above. The next step will be to add this info and use these schemas for renewable generation data files when we add those datasets for PV and Wind here. Hopefully it will be useful in the long term! |
By the way, do we store individual sites in separate netcdf files, or are all sites from, say, one project in one netcdf file? |
I would lean towards one netcdf for all sites; I think it's easiest to get the input datasets together that way. Also, on the units, perhaps MW should be kW, as I think this is the most used power unit across our code base. Finally, I think we should stick to either "site" or "system". I have gone with "site" since I think that is what we use most now, and have updated the schema outline above to reflect this |
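If source data arrives in mixed units, normalising to kW while collecting everything under one site-keyed structure could look roughly like the sketch below. The `to_kw` helper, the metadata layout, and the site names are all assumptions for illustration only.

```python
KW_PER_MW = 1000.0


def to_kw(value, units):
    """Convert a power value to kW; only kW and MW are handled here."""
    if units == "kW":
        return float(value)
    if units == "MW":
        return float(value) * KW_PER_MW
    raise ValueError(f"unsupported power units: {units!r}")


# Made-up per-site metadata as it might arrive from different providers.
per_site = {
    "site_a": {"units": "MW", "capacity": 1.5},
    "site_b": {"units": "kW", "capacity": 250},
}

# One mapping for all sites, keyed by site id, with a single agreed unit.
capacity_kw = {
    site_id: to_kw(meta["capacity"], meta["units"])
    for site_id, meta in per_site.items()
}
```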
Ok thanks for clarifying! |
yea i like
|
Detailed Description
It would be great to add a schema for the generation data
Context
Possible Implementation