-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes to include a bounding box in computing fault composition. #264
Changes to include a bounding box in computing fault composition. #264
Conversation
|
Hey Arushi, Thanks for making this pull request! The code looks very clean and documented which is great. We may need to look at more detail at the test results later, but before we do that I have some general comments:
Let me know if you have any further questions! |
Thank you, Menno, for your suggestions. :) I incorporated them in the latest commit, however, I am yet to do more tests to evaluate how this branch is performing. I will keep you updated! |
I am curious what the difference is going to be. Could you rebase the pull request to the newest branch? I have added a new benchmark which should be able to measure the difference. |
Just did! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, and the benchmarks look very promising with just 2 faults which are far away!
I don't see why some of the the test would be failing, but I can take a closer look at that later.
One more question, could you also add this for the slab? If you think that would be too much work that I can do that later in a follow up pull request, but I wanted to give you the opportunity to do it yourself :)
P.s. I made a mistake in extending one of the benchmark, which I fixed in #268, could you rebase again?
source/utilities.cc
Outdated
std::vector<Point<2> > bounding_box (4, point); | ||
|
||
// The bounding box currently is hard-coded to include buffer zone based on the spherical geometry. | ||
const double bounding_buffer_zone = 2.* (buffer_around_fault) / (const_pi * 2. * 6371000.) ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no reason to compute this in this function since you already have a parameter for buffer_around_fault
. I also think this can be more easily generalized for all spherical geometries since the calling function has access to both the depth
and the radius from the center
. Adding both will give you the radius of the sphere/earth.
Could you also add a bit of an explanation of what this is calculating?
e170c64
to
9dc726d
Compare
Hi Menno, I rebased and added the bounding box in subducting plate model. I was also getting a small warning in |
Out of curiosity, what was the warning? Was it related to this pull request or something separate? |
I got this: |
hmm, strange. I am not getting that warning. What compiler are you using? It is correct though. Could you put this change in an other pull request? I am asking this partly because it is a separate issue, but also if we can get that to merge quickly the all tests should start to run automatically for you without me requiring to approve it every time :) You can just remove the variable |
Hi Menno, created a PR #269 for the warning. |
I did actually look into the tests and nearly all of them are very easy to fix when a special case for Cartesian is added. The buffer can't be computed during the parse parameter phase and needs to be computed and added during the compute composition phase, because that is the only place where the radius is known (with the radius from the point and the depth, as explained above). We may need to do a bit more thinking about the spherical cases and what a good boundary distance is. The correct distance could be computed by the intersection line of the sphere of the Earth and the spheres generated by the the points on the corners of the bounding box and the maximum length+thickness of the slab. Doing that exact calculation is probably too costly, but it is a good minimum bound to be aware of. Taking twice the length+thickness and converting that to spherical distance at the equator may be good enough. |
458c8e3
to
b232d2b
Compare
Codecov Report
@@ Coverage Diff @@
## master #264 +/- ##
==========================================
- Coverage 98.98% 98.89% -0.09%
==========================================
Files 74 75 +1
Lines 4144 4271 +127
==========================================
+ Hits 4102 4224 +122
- Misses 42 47 +5
Continue to review full report at Codecov.
|
Hi Menno, In the recent commit, I added the check for coordinate system type in calculating Regarding the buffer zone calculation, I agree that it needs to be more accurate for spherical geometry. |
Thanks for making the changes. I am actually surprised that with a x2 it works for the spherical case, because I still had one spherical test which gave me issues when I tried it. I am not sure if I understand you correctly, but I don't think the cell size of the mesh should play any role in this. Mostly because the idea behind the world builder is to set up a meshless mathematical representation of the model, which can be queried by the program which has a mesh. If the mesh that other program is too coarse to resolve the features defined in the input file that is not something the world builder can do anything about. The only thing the world builder can do something about is make sure that it produces the right output based on the input, which is mesh independent. So, the optimizations need to be always on the safe side. So we need to think up a worst case scenario (I am thinking around the poles) and make sure that there is not difference between the normal and optimized version. Besides that, we also need to find out why these optimizations do not make a difference in your case (or has that changed?). I haven't gotten around to testing your input file myself yet, sorry about that :( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't quite right I think, but you're close to it. You are computing the bounding box as a list of four points, but that's not how this is typically done, and I can't quite convince myself that what you are doing is right. (I do know that bounding_box_contains_point()
is not right, see comment below :-) )
Typically, a bounding box just stores the min/max values for the x and y variables (or lat/long variables, if you want). You can think of this as the bottom left and top right point. So the bounding box should just consists of two points, not four, and checking whether a point is inside then becomes really simple. Take a look at how deal.II does this:
https://github.com/dealii/dealii/blob/master/include/deal.II/base/bounding_box.h#L305
https://github.com/dealii/dealii/blob/master/source/base/bounding_box.cc#L21-L40
In your case, you can probably ignore the tolerance and just set it to zero.
source/features/fault.cc
Outdated
buffer_zone = 2 * (maximum_fault_thickness + maximum_total_fault_length); | ||
|
||
std::vector<Point<2> > bounding_box_coordinates = WorldBuilder::Utilities::get_bounding_box ( | ||
reference_point, coordinates, original_number_of_coordinates, buffer_zone); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something that doesn't depend on the current point this function is checking. It would be easiest if you could just do this computation once after you read in the details of the fault in parse_entries()
, and then store the bounding box in a member variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember moving the calculation of bounding_box_coordinates()
in parse_entries()
stage based on @MFraters 's suggestion. However, the reason I moved it back into the get_composition
stage is that I needed to use the starting_radius
value for computing the buffer_zone, which I couldn't use in parse_entries()
.
Maybe there is a way to resolve this by modifying the approach for buffer_zone
computation, I will think more about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that I can access the starting_radius
value in the parse_entries()
. Please ignore my previous comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The starting_radius
is dependent on both the position and depth of the request. You only have them during the request itself (i.e. get_composition
). I don't see how you could get those values during the parse_entries
phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right! I am sorry, I confused myself by using starting_depth
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that because you are building 2d bounding boxes only for the specific depth you are currently at?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is general for all cases. I tried to explain the function in a comment reply below.
b232d2b
to
f67d770
Compare
Hello @MFraters, @bangerth: I tried to incorporate your comments in the latest commit. I could not move the computation of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple of questions that I currently don't understand. If you have a second, think about these and if we can come to some conclusion, I can test this again!
source/features/fault.cc
Outdated
if (world->parameters.coordinate_system->natural_coordinate_system() == CoordinateSystem::spherical) | ||
buffer_zone = 2 * (maximum_fault_thickness + maximum_total_fault_length) / (const_pi * starting_radius* 2); | ||
else if (world->parameters.coordinate_system->natural_coordinate_system() == CoordinateSystem::cartesian) | ||
buffer_zone = 2 * (maximum_fault_thickness + maximum_total_fault_length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why you add maximum_total_fault_length
to the buffer zone, and why you multiply by two? The buffer zone should just be a distance around points so that the bounding box is around these points, but that buffer zone should be independent of where the points actually lie. (I'm assuming that maximum_total_fault_length
is something that incorporated information about the points of the fault, right?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the way I implemented the bounding_box is a 2D box that contains all the possible surface points for which the composition is calculated. (i.e., the last if condition in ln 499 in faults.cc: WorldBuilder::Utilities::bounding_box_contains_point (...)
The condition for the depth is satisfied in ln 498 in the same if statement.
To calculate the bounding box, I use two inputs: coordinates
and a buffer_zone
. Thecoordinates
is the vector of 2D points that define the fault line. Worldbuilder creates the fault using the coordinates, fault thickness, dip angle, and fault_length
. Using the min and max points in coordinates
, we create the two diagonal points of the box: lower left and upper right. But, since the thickness of fault is not incorporated in the coordinates
, we need to add that to the two extreme points. Additionally, if the faults are dipping, we need to search laterally so that the surface projection of the fault is also covered in the bounding_box. For any value of dip > 0, the surface projection would not extend beyond the maximum_total_fault_length
. Therefore, I add the thickness and the fault to the existing surface coordinates. The factor of 2 is somewhat arbitrary to calculate the bounding lat/long coordinates in a spherical system (I need to remove it for cartesian). If the faults lie on a great circle, we don't need any factor, and as they go towards the pole, the factor should increase. But, I need to discuss more with you and @MFraters on how to correctly use this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alarshi Thanks for the explanation -- although I have to admit that I understand only half of it because I'm not familiar with the way the WorldBuilder describes fault, and so I have no idea what many of the words mean :-)
I very frequently look for bugs by looking for things that appear in one expression but don't go together. In this particular case, it is "thickness" and "length". I got confused why one would want to add these together because I think of one as an "extent across the fault" and the other as an "extend along the fault", and these seem conceptually independent. I can think of a very short fault (very small length) that requires a large buffer zone because it is deep, thick, and with a small dip angle. I can also think of a very long fault that requires a small buffer zone because it is shallow, not very thick, and dips vertically. So it seems wrong that the length of the fault appears in the buffer zone width. But maybe I'm also misunderstanding what WB considers the "length" of a fault :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bangerth: It is confusing because I was trying to make the bounding_box work for my case without thinking through all the details. You are right, the fault_length
and thickness
are perpendicular. My thinking was that if we have a vertical fault, it is sufficient that the buffer zone contains just the fault_thickness because the fault_length will be checked by the depth condition. On the other hand, a fault almost horizontal (dipping along the X-axis) would have surface coordinates depending on both thickness and the fault length, therefore, requiring a conservative estimate of buffer_zone containing both the values.
I might be thinking it wrong, it would help if I could discuss this with you sometime today. I also want to discuss how to implement the BoundingBox class defined PR#306 for our case. :)
source/utilities.cc
Outdated
// Initialize any point to create bounding_box, it will be overwritten later. | ||
Point<2> reference_point(point_list[0]); | ||
// Corner coordinates of the bounding polygon | ||
std::pair<Point<2>, Point<2> > bounding_box (reference_point, reference_point); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to just write it like this, and then don't need reference_point
at all:
std::pair<Point<2>, Point<2> > bounding_box (reference_point, reference_point); | |
std::pair<Point<2>, Point<2> > bounding_box; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that, but it complains no default constructor exists for class "std::pair<WorldBuilder::Point<2>, WorldBuilder::Point<2>>"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is because I would not know what the default constructor would need to be, since a point requires a coordinate system. Although I guess it could be invalid
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In #306, I initialize a pair of points this way:
std::pair<Point<2>, Point<2> > bounding_box =
{{
-std::numeric_limits<double>::max(),
-std::numeric_limits<double>::max(),
-std::numeric_limits<double>::max(),
cartesian
},
{
+std::numeric_limits<double>::max(),
+std::numeric_limits<double>::max(),
+std::numeric_limits<double>::max(),
cartesian
}
}
f67d770
to
3bcd29e
Compare
Hi @MFraters @bangerth: I figured out the problem with the missing faults (I was dividing the length by 2pi instead of multiplying 2pi)! The latest commit fixes that and also uses the #306 patch to calculate the bounding box in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments, one we need to talk about. One other request, could you also add this check to the temperature and grains for both the slab and the fault?
source/features/fault.cc
Outdated
// For the spherical system, the buffer zone along the longitudal direction is calculated using the | ||
// correponding latitude points. | ||
|
||
const double radius_of_earth = 6371000.; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's talk about this. I don't like to have a constant in here. It would have my preference to have this at run time. I am curious how much of a performance difference it would make, since we can precompute everything except the radius of the earth. The other option would be to put it in the input file. There would be two options for that, either put it with the spherical coordinates or put it in a new section which contains optimization values. The downside of putting it in the spherical coordinates section is that it may conflict with the values provided by aspect and may confuse users. With a new section, we could make it clear that this should only be used for optimization and that it is the users responsibility to make sure it is consistent with the input file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the reviews and your help! :) I pushed the changes in the latest commit (described more in the comment below)
@MFraters @gassmoeller @bangerth : I incorporated your suggestions for pre-defining most variables in the parse-entries stage. Compared to the master, I get about 4x speed using this branch time to set up initial conditions. I discussed this with @MFraters and know that this can be reduced further with more accurate computation of the bounding box (including the fault dips ), but this simple calculation still helps a lot with my models :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I found two more performance improvements, otherwise good to go. I am still a bit worried about the poles, but I am going to make a test now.
I convinced my self for now that the this is not going to add new issues for the poles in spherical coordinates, so I think it is good to go once you have addressed my last comments :) |
Just a quick update on the performance after the latest suggestions by @MFraters: the test model on this branch take ~ 40 s to set up initial conditions and the master takes about ~ 350 s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a great speedup! I think we can push it even a bit more if needed, but I think this pull request is ready to be merged. I think the changes and the involved work are large enough for a changelog entry. Could you add an entry? You could either put it under changed or make a new section called something like performance (or somewhere else if you think that would be a better fit).
Thanks for sticking through with this pull request!
@MFraters: I will do that! Thank you for all the help in this :) |
hmm, these are indeed a lot of commits for the amount of lines which are changed (although it was a lot of work on your side). Squashing it into three commits (one for the the fault, one for the slab and one for the changelog) would be appreciated. |
PR comments, not addressed completely. PR comments addressed, fixed the previous bugs. Added a larger buffer zone.
Added comments about bounding box. Include the computation of buffer zone in the composition phase. Changes to use two points in calculating bounding box. Addressed suggestions from the PR. Use the bounding box class to compute the spherical bounding box. Rewritten the bounding box with more accurate computation and bounding box class. Moving bounding box computation around. Pre definition of most variables in parse stage. Addressed recent suggestions on the PR for better performance.
6ee807b
to
7cc4545
Compare
ah, sorry, I wasn't clear. Could you add the changelog entry under the [unreleaded] heading instead of [0.4.0]? |
The space between [] and () is preventing it from becoming a proper link in markdown. @bangerth, do you still want to have a look? |
Fixed the heading for changelog. Fixed the link in the changelog.
f1eea0f
to
54e3d2a
Compare
I talked to Wolfgang, and it is good to be merged now. Thanks again! 🎉 |
This PR aims to address the issue: #219, i.e. to reduce the time taken to assign fault composition using WB, particularly when a large number of faults are used. It introduces a function, bounding_polygon_contains_point () that returns true only when a point lies inside the bounding_polygon such that the function distance_point_from_curved_planes is not called for all the model points but only if the point lies inside the bounding_polygon. The work is still in progress because the performance is not improved as expected in my tests.