GTC-2570 Filter out rows not requiring analysis #192
@@ -48,6 +48,8 @@ object AFiAnalysis extends SummaryAnalysis {
     val summaryDF = AFiAnalysis.aggregateResults(
       AFiDF
         .getFeatureDataFrame(summaryRDD, spark)
+        .filter($"location_id" =!= -2)
I think we'd actually want to filter out -2 before we run the analysis, since the first time a user runs this, -2 will be the same as -1: a huge dissolved polygon. It's kind of a waste to compute all that and then toss it away. You could run this same filter, but on the input feature DF.
Codecov Report: Patch has no changes to coverable lines.
@@ -40,5 +40,7 @@ object AFiDF extends SummaryDF {
       }
       .toDF("id", "error", "dataGroup", "data")
       .select($"id.*", $"error.*", $"dataGroup.*", $"data.*")
+      .filter($"location_id" =!= -2)
Sorry, should've been clearer: we should filter these out at the very beginning, like in the AFiCommand class, before we even run the analysis.
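For illustration, filtering the input before the analysis might look roughly like the sketch below. This is not the code from this PR: the object name, the `dropRowsNotRequiringAnalysis` helper, the `geom` column, and the sample rows are all hypothetical, and it assumes the input features are available as a Spark DataFrame with a `location_id` column.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object FilterBeforeAnalysis {

  // Hypothetical helper: drop rows with location_id = -2 from the *input*
  // feature DataFrame, so the expensive analysis never sees them.
  def dropRowsNotRequiringAnalysis(featureDF: DataFrame): DataFrame =
    featureDF.filter(col("location_id") =!= -2)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder()
      .master("local[1]")
      .appName("filter-before-analysis-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample input; "geom" stands in for the real geometry column.
    val features = Seq(
      (-2, "dissolved polygon"),
      (-1, "dissolved polygon"),
      (1, "feature a"),
      (2, "feature b")
    ).toDF("location_id", "geom")

    // Filter first, then hand the reduced DataFrame to the analysis step.
    val toAnalyze = dropRowsNotRequiringAnalysis(features)
    toAnalyze.show()

    spark.stop()
  }
}
```

Note that in Spark a comparison against a null `location_id` evaluates to null, so `filter` would drop those rows as well; whether that is acceptable depends on the input data.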
Looks good, thanks for figuring this out!
What is the current behavior?
Performs analysis for difference geometries and areas with null GADM values
Issue Number: GTC-2570
What is the new behavior?
Filters out rows that do not require analysis:
location_id = -2
gadm_id = "null"
Does this introduce a breaking change?
Other information