Add Cloud Masking/Detection Algorithm #227

Open
jacobbieker opened this issue Feb 27, 2024 · 14 comments · May be fixed by #251
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@jacobbieker
Member

The paper at https://www.meteoswiss.admin.ch/services-and-publications/publications/scientific-publications/2013/the-heliomont-surface-solar-radiation-processing.html describes how MeteoSwiss determines surface solar radiation, and also covers detecting different types of clouds and creating a cloud mask from SEVIRI imagery.

Detailed Description

Context

This could be quite useful for deriving our own cloud masks, or cloud types, from the raw imagery. The paper also includes an interesting way of correcting for orbital maneuvers of the satellite to realign the imagery, which might be very helpful.

Possible Implementation

The paper is quite detailed, so the algorithm could probably be implemented in Satip directly from it.

@jacobbieker jacobbieker added enhancement New feature or request good first issue Good for newcomers labels Feb 27, 2024
@vikasgrewal16

Can you assign this issue to me and give me a brief on what to do and how, so that I can work on it?
Regards

@jacobbieker
Member Author

Hi, the details are in the paper linked from that website; their approach to cloud masking should work here. To add it to Satip, you could add a cloud_mask file containing the implementation of the cloud masking algorithm, plus some tests that run on the public Zarrs to see how well it works.

@vikasgrewal16

I have read those details, but while building the project and downloading the data with the EUMETSAT API I ran into this error. Can you please provide some information or a solution for it?

File "/home/grewal/Satip/venv/lib/python3.10/site-packages/botocore/auth.py", line 418, in add_auth
raise NoCredentialsError()
botocore.exceptions.NoCredentialsError: Unable to locate credentials

@vikasgrewal16

Can I get your input on this issue?

@jacobbieker
Member Author

Hi, sorry for the delay. It seems that you need to log in to AWS for those credentials. It looks like you are using app.py, which currently uploads to S3 by default. For this issue, you should be able to use the public Google Cloud dataset here to get the raw data to use with the cloud masking algorithm.

@jacobbieker
Member Author

Also, I would recommend focusing on a single issue @vikasgrewal16 if possible? There are quite a few different potential GSoC contributors who are wanting different good first issues. I've seen you also commented on #231, are you more interested in this one or that one? Or a different one?

@vikasgrewal16

vikasgrewal16 commented Mar 12, 2024

Thank you @jacobbieker for reaching out and bringing up the importance of focusing on a single issue. I appreciate your guidance in streamlining the efforts.

Regarding your question on my preferences, I have a keen interest in both GIS and ML, which is why I am actively contributing to this project. My involvement aims to not only learn about open source but also to become a valuable part of the community. GSoC is a means to this end, and I see it as an excellent opportunity to contribute substantively.

As for the specific issues, for now I will be focusing mostly on #231.

Looking forward to your advice and direction.

Best regards,
@vikasgrewal16

@Surya-29

Hi @jacobbieker !

I've read the details of the SPARC cloud masking algorithm as mentioned in the reference provided by you. Currently, I'm looking through the properties of the raw data from the shared data bucket and would like to work on the implementation part of the algorithm. I would appreciate it if you could assign this issue to me. Thank you!

@jacobbieker
Member Author

Hi @Surya-29, that sounds great!

@Surya-29

I'm having some trouble finding the attributes needed to calculate the SPARC score (used for the cloud mask). Specifically, the clear-sky (cloud-free) brightness temperature $T_{cf}$ and the background reflectance $\rho_{cf}$ aren't available in the SEVIRI dataset provided. They can either be calculated (Section 6, Clear Sky Compositing) or retrieved from another dataset provided by EUMETSAT (All Sky Radiances). Can I go with the latter option, since calculating these attributes might involve fitting a model over the diurnal course? However, the ASR dataset is only available from the EUMETSAT Data Center (which requires us to order it) and not from the Data Store, so downloading via the API is not possible, right? @jacobbieker How should I approach this now?

@jacobbieker
Member Author

Ah okay, I would have thought that info would be in the attributes of the native files. Yeah, for a first pass at getting this in, I think getting some data from the Data Center and using that is probably the right way to go for now. We can always try to add calculating the values ourselves later, as the Data Center can be quite slow to deliver data. You are right that there is no API access to the Data Center, unfortunately. Another, less ideal option would be to see if we can find an average value, either for the year or per month, that we could use instead. But I'm not sure whether that is published anywhere.
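
For illustration, a minimal sketch of the "average value per month" fallback mentioned above. The function names (`monthly_means`, `background_reflectance`), the sample values, and the fallback are all hypothetical, and the sketch assumes the input samples are already known to be cloud free; it is not the clear-sky compositing described in the paper.

```python
from datetime import datetime

import numpy as np


def monthly_means(times, values):
    """Average one value per timestamp by calendar month (1-12)."""
    by_month = {}
    for t, v in zip(times, values):
        by_month.setdefault(t.month, []).append(v)
    return {month: float(np.mean(vs)) for month, vs in by_month.items()}


def background_reflectance(when, lookup, fallback):
    """Pick the monthly value for `when`, or `fallback` if that month is missing."""
    return lookup.get(when.month, fallback)


# Synthetic numbers for illustration only, not measured reflectances.
times = [datetime(2024, 1, 5, 12), datetime(2024, 1, 20, 12), datetime(2024, 2, 3, 12)]
values = [0.10, 0.14, 0.20]

lookup = monthly_means(times, values)
# March has no samples here, so the (hypothetical) yearly fallback is used.
rho_cf_march = background_reflectance(datetime(2024, 3, 1, 12), lookup, fallback=0.15)
```

A per-pixel version would store one array per month instead of one scalar, but the lookup logic would stay the same.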

@Surya-29

Yes, I'll probably go with averaging for the background reflectance $\rho_{cf}$. As for the brightness temperature $T_{cf}$, I would prefer to implement the model mentioned in the paper, if possible, since the final $sparc_{score}$ requires at least $T_{score}$ to be calculated. That said, because the algorithm uses an aggregate score, it could compensate for other attributes missing from the $sparc_{score}$ calculation.
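
To make the "aggregate score" point concrete, here is a rough sketch of how a mandatory temperature score could be combined with whichever optional scores happen to be available. This is an assumption-laden illustration, not the paper's formulation: the function name `aggregate_score`, the plain summation, the component names, the sample values, and the zero threshold are all hypothetical.

```python
from typing import Mapping, Optional

import numpy as np


def aggregate_score(
    t_score: np.ndarray,
    optional_scores: Mapping[str, Optional[np.ndarray]],
) -> np.ndarray:
    """Sum the required temperature score with any optional scores that exist."""
    total = np.array(t_score, dtype=float)  # np.array copies, input is untouched
    for score in optional_scores.values():
        if score is None:
            continue  # component unavailable, so it simply contributes nothing
        total = total + np.asarray(score, dtype=float)
    return total


# Synthetic per-pixel scores for illustration only.
t_score = np.array([[-2.0, 3.0], [0.5, 6.0]])
r_score = np.array([[-1.0, 2.0], [0.0, 1.0]])

total = aggregate_score(t_score, {"reflectance": r_score, "other": None})

# Hypothetical threshold: pixels with a positive total are flagged as cloudy.
HYPOTHETICAL_THRESHOLD = 0.0
cloud_mask = total > HYPOTHETICAL_THRESHOLD
```

The actual scaling of each component and the decision threshold would have to come from the paper.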

@Surya-29

I've made progress on implementing the cloud masking algorithm and have committed the changes to my remote repository (changes). Should I raise a PR even though the code is only partially functional?

  • Where should I add the cloud_mask.py file? Would it be appropriate to create a subpackage in Satip, or should I add it directly under Satip? (A subpackage seems better, since you've mentioned the possibility of extending the architecture to include other algorithms as well.)
  • How do I handle the data? I've only been using the data values (a numpy array) for convenience, but the final result should be an xarray.DataArray that includes the attribute information, etc., just like what you get from the EUMETSAT Cloud Mask dataset, right?
  • Lastly, is there a specific Area of Interest? Are we focusing only on the European region?

@jacobbieker
Member Author

Awesome! I would open a PR as a draft PR even if it's incomplete, and just keep adding to it that way.

Yeah, a subpackage would be really good to have.

Yes, the output should be in an Xarray data format, primarily to keep the coordinates and satellite attribute information. You could probably just swap out the data values in the xarray satellite image with the cloud mask data and it would be good to go.
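
A minimal sketch of that "swap out the data values" idea, using `xarray.DataArray.copy(data=...)` so that dimensions, coordinates, and attributes carry over. The helper name `to_cloud_mask`, the toy image, its attribute, and the threshold are hypothetical stand-ins; real Satip data has its own dimension names and metadata, and the threshold is not the SPARC algorithm.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for a satellite image.
image = xr.DataArray(
    np.array([[0.12, 0.55], [0.80, 0.07]]),
    dims=("y", "x"),
    coords={"y": [0, 1], "x": [10, 20]},
    attrs={"platform": "example-satellite"},
    name="reflectance",
)


def to_cloud_mask(image: xr.DataArray, mask_values: np.ndarray) -> xr.DataArray:
    """Return a copy of `image` whose values are replaced by `mask_values`."""
    if mask_values.shape != image.shape:
        raise ValueError("mask shape must match the image shape")
    # copy(data=...) keeps dims, coords, and attrs but substitutes the values.
    mask = image.copy(data=mask_values.astype(np.uint8))
    return mask.rename("cloud_mask")


# Placeholder threshold purely for illustration.
cloud_mask = to_cloud_mask(image, image.values > 0.5)
```

The original `image` is left unchanged, so the mask can be stored alongside the imagery it was derived from.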

If it is easier, focusing on the European area of interest is fine for now, but we would want to extend it to work over Africa and with the Indian Ocean imagery as well.

@Surya-29 Surya-29 linked a pull request Mar 31, 2024 that will close this issue