Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance degradation following 0.12.1 release #183

Closed
chingiztob opened this issue Nov 5, 2024 · 9 comments · Fixed by #184
Closed

Performance degradation following 0.12.1 release #183

chingiztob opened this issue Nov 5, 2024 · 9 comments · Fixed by #184

Comments

@chingiztob
Copy link

After upgrading the Rstar crate from version 0.12 to 0.12.1, I've experienced a dramatic slowdown in performance when using .nearest_neighbor in my public transit graph application. The operation, which connects all transit stops to the nearest street nodes, has slowed from ~50ms to over 30 seconds. All geometries are f64 geo-types points.

-- Expected Performance: ~2-5μs per .nearest_neighbor call (as observed with version 0.12)
-- Observed Performance: 10-20ms per .nearest_neighbor call with version 0.12.1
-- Change: Only the Rstar crate version was updated (from 0.12 to 0.12.1).

Cargo Flamegraph profiling didn’t pinpoint the issue due to heavy interference from Polars-related calls.
flamegraph_0_12

pub type IndexedPoint = GeomWithData<Point, Option<NodeIndex>>;

fn find_nearest_point_and_calculate_distance(
    point: &Point,
    tree: &RTree<IndexedPoint>,
) -> Result<(NodeIndex, f64), Error> {
    let instant = std::time::Instant::now();
    if let Some(nearest_point) = tree.nearest_neighbor(point) {
        println!("This took {:?}", instant.elapsed());
        let distance = Haversine::distance(*point, *nearest_point.geom()) / WALK_SPEED;
        let node = nearest_point.data.ok_or_else(|| {
            Error::NodeNotFound(format!("Nearest node not found for point {point:?}"))
        })?;
        Ok((node, distance))
    } else {
        Err(Error::NodeNotFound(format!(
            "Nearest node not found for point {point:?}"
        )))
    }
}

pub(crate) fn build_rtree(graph: &DiGraph<GraphNode, GraphEdge>) -> RTree<IndexedPoint> {
    let index_geo_vec: Vec<IndexedPoint> = graph
        .node_indices()
        .map(|node| {
            let node_data = graph.node_weight(node).unwrap();
            let node_point: Point = *node_data.geometry();
            IndexedPoint::new(node_point, Some(node))
        })
        .collect();

    RTree::bulk_load(index_geo_vec)
}

/// Connects Transit nodes (stops) to the nearest walk nodes.
pub(crate) fn connect_stops_to_streets(graph: &mut TransitGraph) -> Result<(), Error> {
    let instant = std::time::Instant::now();
    let rtree: RTree<IndexedPoint> = graph.rtree_ref().unwrap().clone();

    for node in graph.node_indices() {
        // check if there is already a transfer edge
        // This is required to avoid creating duplicate transfer edges
        // when merging multiple transit graphs on top of main street graph
        // (Work in progress)
        if graph
            .edges(node)
            .any(|edge| matches!(edge.weight(), GraphEdge::Transfer(_)))
        {
            continue;
        }

        let weight = graph
            .node_weight(node)
            .ok_or_else(|| Error::MissingValue("Node weight not found".to_string()))?;

        if let GraphNode::Transit(_) = weight {
            if let Ok((nearest_point_index, distance)) =
                find_nearest_point_and_calculate_distance(weight.geometry(), &rtree)
            {
                let edge = GraphEdge::Transfer(WalkEdge {
                    edge_weight: distance,
                });

                graph.add_edge(node, nearest_point_index, edge.clone());
                graph.add_edge(nearest_point_index, node, edge);
            }
        }
    }
    println!("This took {:?}", instant.elapsed());
    Ok(())
}
@adamreichold
Copy link
Member

Just going out on a hunch, but could you test with a Git dependency using the branch from #181 ? Thanks!

@chingiztob
Copy link
Author

Just going out on a hunch, but could you test with a Git dependency using the branch from #181 ? Thanks!

that one?

rstar = { git = "https://github.com/michaelkirk/rstar.git", rev = "222b01a40953c4fae35b38fc7d19416a29136818"}

@michaelkirk
Copy link
Member

michaelkirk commented Nov 5, 2024

no, 222b01a doesn't change any behavior, it just adds a failing unit test.

12836a9 changes behavior.

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

@adamreichold
Copy link
Member

You could also try

rstar = { git = "https://github.com/georust/rstar.git", branch = "new-empty-min-max" }

@adamreichold
Copy link
Member

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

But yeah, this would certainly be the most thorough solution.

@chingiztob
Copy link
Author

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

But yeah, this would certainly be the most thorough solution.

idk why, but i get compiler errors when trying to use git dependency

error[E0277]: the trait bound `rstar::primitives::GeomWithData<geo::Point, std::option::Option<petgraph::prelude::NodeIndex>>: rstar::RTreeObject` is not satisfied
   --> cascade-core/src/connectors.rs:62:12
    |
62  |     tree: &RTree<IndexedPoint>,
    |            ^^^^^^^^^^^^^^^^^^^ the trait `rstar::RTreeObject` is not implemented for `rstar::primitives::GeomWithData<geo::Point, std::option::Option<petgraph::prelude::NodeIndex>>`
    |
    = help: the trait `rstar::RTreeObject` is implemented for `rstar::primitives::GeomWithData<R, T>`
note: required by a bound in `rstar::RTree`
   --> /home/chingiz/.cargo/git/checkouts/rstar-526d45c905f9146e/4c16830/rstar/src/rtree.rs:167:8
    |
164 | pub struct RTree<T, Params = DefaultParams>
    |            ----- required by a bound in this struct
...
167 |     T: RTreeObject,
    |        ^^^^^^^^^^^ required by this bound in `RTree`

@urschrei
Copy link
Member

urschrei commented Nov 5, 2024

Possibly because you have to use the same dependency in the geo and geo-types Cargo.toml?

@michaelkirk
Copy link
Member

You should be overiding using a [patch] section in your Cargo.toml - see https://doc.rust-lang.org/cargo/reference/overriding-dependencies.html

@chingiztob
Copy link
Author

no, 222b01a doesn't change any behavior, it just adds a failing unit test.

12836a9 changes behavior.

Otherwise, the 0.12.0 to 0.12.1 is only a few commits - you could try doing a git bisect to chase down the culprit.

Seems like perf drops after 84d1265. With rstar = { git = "https://github.com/georust/rstar.git", branch = "new-empty-min-max" } things are back to normal

github-merge-queue bot pushed a commit that referenced this issue Nov 5, 2024
- [x] I agree to follow the project's [code of
conduct](https://github.com/georust/geo/blob/master/CODE_OF_CONDUCT.md).
- [x] I added an entry to `rstar/CHANGELOG.md` if knowledge of this
change could be valuable to users.
---

This reverts back to the min/max representation of empty AABB due to
regressions introduced by the numerically more tame but not invariant
for merging representation.

To avoid regressing #161, it also adds a single check to avoid computing
the distance to an empty envelope (of the root node of an empty tree).
Integer coordinates are always prone to overflow but in the empty case
we are forcing this onto the caller whereas every non-empty tree will
have AABB that the caller supplied and where we can reasonably ask them
to control for overflow or use a custom `Point` impl based on saturating
arithmetic.

Fixes #183
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants