Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LineSplit trait #1050

Closed
wants to merge 16 commits into from
Closed
Show file tree
Hide file tree
Changes from 15 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
92 changes: 92 additions & 0 deletions geo/src/algorithm/line_split/line_split_result.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
/// The result of splitting a line using
/// [LineSplit::line_split()](crate::algorithm::LineSplit::line_split) method.
/// It can contain between one and two [Lines](crate::Line) / [LineStrings](crate::LineString).
///
/// Note that it may not be desireable to use a `match` statement directly on this type if you only
/// ever want one part of the split. For this please see the helper functions;
/// [.first()](LineSplitResult#method.first),
/// [.second()](LineSplitResult#method.second),
/// [.into_first()](LineSplitResult#method.into_first), and
/// [.into_second()](LineSplitResult#method.into_second).
///
/// ```
/// # use geo::{LineString, coord};
/// # use geo::LineSplit;
///
/// # let my_line = LineString::from(vec![coord! {x: 0., y: 0.},coord! {x: 1., y: 0.},]);
/// if let Some(result) = my_line.line_split_twice(0.2, 0.5) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this a bit more... is there a reason to not just have line_split_many and return Vec<Option<T>>? The guarantee is that the vec has n+1 entries for n input splits, and any of them might be None. Are the enums easier to use in the offset work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dabreegster :)

Regarding line_split_twice I think you may be right that line_split_many returning Vec<Option<T>> is more broadly useful. I personally have a specific use-case for line_split_twice() where I want to chop out a road segment between two chainages, but I can just move that function into my own project.

I think for a single line_split the enum (LineSplitResult<T>) is helpful as explained above.

Maybe your comment is because it comes wrapped in an option line_split(..) -> Option<LineSplitResult<Self>> which is an annoying extra step to unwrap?

Just thinking out loud; the possible outcomes are
a)

None, // returned when the input line has NAN or INF values, or the input `fraction` is NAN
Some(LineSplitResult::First       (T   )),
Some(LineSplitResult::Second      (   T)),
Some(LineSplitResult::FirstSecond (T, T)),

The tuple version is
b)

None, // returned when the input line has NAN or INF values, or the input `fraction` is NAN
Some((Some(T), None   )),
Some((None,    Some(T))),
Some((Some(T), Some(T))),

One level of wrapping can be removed like this;
c)

(None,     None   ), // returned when the input line has NAN or INF values, or the input `fraction` is NAN
(Some(T), None   ),
(None,    Some(T)),
(Some(T), Some(T)),

So why not just use the last one?


The most satisfying API would be if we are allowed to assume the input geometry and fraction are well-formed, then the result could be like this:

d)

LineSplitResult::First       (T   ),
LineSplitResult::Second      (   T),
LineSplitResult::FirstSecond (T, T),

In this case, the enum is helpful, because it prevents the user from expecting (None, None ) which can never happen if the inputs are valid. Therefore the enum stops rust from complaining about non-exhaustive match arms.


I think it depends on whether line_split should be guarding against NAN/INF. Possibly not? Otherwise if other algorithms rely on this algorithm, and they all check for NAN every time... there will end up being a lot of NAN checks? On the other hand, if it is user facing, then it probably should guard against invalid input?

Apologies for the wall-of-text and thank you for your time :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining further! I think guarding against invalid input is necessary, so some way to represent that sounds helpful. I'm just trying to figure out why the enum is helpful vs returning a bunch of Option<T>s.

I personally have a specific use-case for line_split_twice() where I want to chop out a road segment between two chainages, but I can just move that function into my own project.

I can imagine use cases for different numbers of split points. What about if some of the variations returned an array of fixed size, like line_split -> [Option<T>; 2] and you used https://doc.rust-lang.org/reference/patterns.html#slice-patterns? If the input is invalid, [None, None]. Otherwise the 3 variations can be represented. I just tried out slice patterns and https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=596e6e8d97ae0e673428dff4334c0b3f seems pretty succinct, and would save you from defining the enums.

In this case, the enum is helpful, because it prevents the user from expecting (None, None ) which can never happen if the inputs are valid. Therefore the enum stops rust from complaining about non-exhaustive match arms.

Is (None, None) => unreachable!("input should be valid") in user code something to avoid? Similar to an line_split(...).expect("input should be valid"), right?

Copy link
Member

@michaelkirk michaelkirk Aug 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is (None, None) => unreachable!("input should be valid") in user code something to avoid

It wouldn't be the end of the world, but I think it'd be better to encode it in the type system if we can do so succinctly. Otherwise we're relying on the user to read (and forever remember!) the docs.

The most satisfying API would be if we are allowed to assume the input geometry and fraction are well-formed, then the result could be like this:

LineSplitResult::First       (T   ),
LineSplitResult::Second      (   T),
LineSplitResult::FirstSecond (T, T),

Even though it's not any fewer cases, maybe a better way to communicate to the user the possibility of invalid results is something like:

// Library code

enum LineSplits {
  First(T),
  Second(T),
  FirstSecond((T, T)),
}

fn split_line(line_string: &LineString) -> Result<LineSplits> {  todo!()  }


// User code
let splits = split_line(my_line_string).expect("valid input");

/// if let Some(second) = result.into_second() {
/// // got the 'second' part of the split line
/// // between the points 20% and 50% along its length
/// }
/// }
/// ```
#[rustfmt::skip]
#[derive(PartialEq, Debug)]
pub enum LineSplitResult<T> {
First (T ),
Second ( T),
FirstSecond (T, T),
}

#[rustfmt::skip]
impl<T> LineSplitResult<T>{
/// Return only the first of two split line parts, if it exists.
pub fn first(&self) -> Option<&T> {
match self {
Self::First (x ) => Some(x),
Self::Second ( _) => None,
Self::FirstSecond(x, _) => Some(x),
}
}
/// Return only the first of two split line parts, if it exists, consuming the result.
pub fn into_first(self) -> Option<T> {
match self {
Self::First (x ) => Some(x),
Self::Second ( _) => None,
Self::FirstSecond(x, _) => Some(x),
}
}
/// Return only the second of two split line parts, if it exists.
pub fn second(&self) -> Option<&T> {
match self {
Self::First (_ ) => None,
Self::Second ( x) => Some(x),
Self::FirstSecond(_, x) => Some(x),
}
}
/// Return only the second of two split line parts, if it exists, consuming the result.
pub fn into_second(self) -> Option<T> {
match self {
Self::First (_ ) => None,
Self::Second ( x) => Some(x),
Self::FirstSecond(_, x) => Some(x),
}
}
/// Return all two parts of the split line, if they exist.
///
/// Instead of using this, consider using a match statement directly on the
/// [LineSplitResult] type; the reason is that some combinations of this type
/// (eg `(None, None)`) can never exist, but the compiler will still complain about missing arms
/// in your match statement.
pub fn as_tuple(&self) -> (Option<&T>, Option<&T>) {
match self {
Self::First (a ) => (Some(a), None ),
Self::Second ( b) => (None , Some(b)),
Self::FirstSecond(a, b) => (Some(a), Some(b)),
}
}
/// Return all two parts of the split line, if they exist, consuming the result.
///
/// Instead of using this, consider using a match statement directly on the
/// [LineSplitResult] type; the reason is that some combinations of this type
/// (eg `(None, None)`) can never exist, but the compiler will still complain about missing arms
/// in your match statement.
pub fn into_tuple(self) -> (Option<T>, Option<T>) {
match self {
Self::First (a ) => (Some(a), None ),
Self::Second ( b) => (None , Some(b)),
Self::FirstSecond(a, b) => (Some(a), Some(b)),
}
}
}
220 changes: 220 additions & 0 deletions geo/src/algorithm/line_split/line_split_trait.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,220 @@
use geo_types::CoordFloat;

use super::{LineSplitResult, LineSplitTwiceResult};

/// Defines functions to split a [Line](crate::Line) or [LineString](crate::LineString)
pub trait LineSplit<Scalar>
where
Self: Sized,
Scalar: CoordFloat,
{
/// Split a [Line](crate::Line) or [LineString](crate::LineString) at some `fraction` of its length.
///
/// `fraction` is any real number. Only values between 0.0 and 1.0 will split the line.
/// Values outside of this range (including infinite values) will be clamped to 0.0 or 1.0.
///
/// Returns `None` when
/// - The provided `fraction` is NAN
/// - The the object being sliced includes NAN or infinite coordinates
///
/// Otherwise returns [`Some(LineSplitResult)`](crate::algorithm::LineSplitResult)
///
/// example
///
/// ```
/// use geo::{Line, coord};
/// use geo::algorithm::{LineSplit, LineSplitResult};
/// let line = Line::new(
/// coord! {x: 0.0, y:0.0},
/// coord! {x:10.0, y:0.0},
/// );
/// let result = line.line_split(0.6);
/// assert_eq!(
/// result,
/// Some(LineSplitResult::FirstSecond(
/// Line::new(
/// coord! {x: 0.0, y:0.0},
/// coord! {x: 6.0, y:0.0},
/// ),
/// Line::new(
/// coord! {x: 6.0, y:0.0},
/// coord! {x:10.0, y:0.0},
/// )
/// ))
/// );
///
/// match result {
/// Some(LineSplitResult::First(line1))=>{},
/// Some(LineSplitResult::Second(line2))=>{},
/// Some(LineSplitResult::FirstSecond(line1, line2))=>{},
/// None=>{},
/// }
/// ```
fn line_split(&self, fraction: Scalar) -> Option<LineSplitResult<Self>>;

///
///
/// example
///
/// ```
///
/// ```
/// > Note: Currently the default implementation of this function provided by the trait is
/// > inefficient because it uses repeated application of the
/// > [.line_split()](LineSplit::line_split) function. In future, types implementing this trait
/// > should override this with a more efficient algorithm if possible.
fn line_split_many(&self, fractions: &Vec<Scalar>) -> Option<Vec<Option<Self>>>
where
Self: Clone,
{
match fractions.len() {
0 => None,
1 => self.line_split(fractions[0]).map(|item| {
let (a, b) = item.into_tuple();
vec![a, b]
}),
_ => {
let mut fractions: Vec<Scalar> = fractions
.iter()
.map(|item| item.min(Scalar::one()).max(Scalar::zero()))
.collect();
fractions
.sort_unstable_by(|a, b| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Equal));
fractions.insert(0, Scalar::zero());
fractions.push(Scalar::one());
let fractions = fractions; // remove mutability
let mut output: Vec<Option<Self>> = Vec::new();
let mut remaining_self = Some(self.clone());
for fraction in fractions.windows(2) {
// cannot be irrefutably unwrapped in for loop *sad crab noises*:
let (a, b) = match fraction {
&[a, b] => (a, b),
_ => return None,
};
let fraction_interval = b - a;
let fraction_to_end = Scalar::one() - a;
let next_fraction = fraction_interval / fraction_to_end;
remaining_self = if let Some(remaining_self) = remaining_self {
match remaining_self.line_split(next_fraction) {
Some(LineSplitResult::FirstSecond(line1, line2)) => {
output.push(Some(line1));
Some(line2)
}
Some(LineSplitResult::First(line1)) => {
output.push(Some(line1));
None
}
Some(LineSplitResult::Second(line2)) => {
output.push(None);
Some(line2)
}
None => return None,
}
} else {
output.push(None);
None
}
}

Some(output)
}
}
}

/// Split a [Line](crate::Line) or [LineString](crate::LineString)
/// at `fraction_start` and at `fraction_end`.
///
/// `fraction_start`/`fraction_end` are any real numbers. Only values between 0.0 and 1.0 will
/// split the line. Values outside of this range (including infinite values) will be clamped to
/// 0.0 or 1.0.
///
/// If `fraction_start > fraction_end`, then the values will be swapped prior splitting.
///
/// Returns [None] when
/// - Either`fraction_start` or `fraction_end` are NAN
/// - The the object being sliced includes NAN or infinite coordinates
///
/// Otherwise Returns a [`Some(LineSplitTwiceResult<T>)`](LineSplitTwiceResult)
///
/// A [`LineSplitTwiceResult<T>`](LineSplitTwiceResult) can contain between one and three
/// line parts where `T` is either [Line](crate::Line) or [LineString](crate::LineString).
///
/// Note that [LineSplitTwiceResult] provides various helper methods to get the desired part(s)
/// of the output.
///
/// The following example shows how to always obtain the "middle" part between the two splits
/// using the [`.into_second()`](LineSplitTwiceResult#method.into_second) method:
///
/// ```
/// use geo::{LineString, line_string};
/// use geo::algorithm::{LineSplit, EuclideanLength};
/// use approx::assert_relative_eq;
/// let my_road_line_string:LineString<f32> = line_string![
/// (x: 0.0,y: 0.0),
/// (x:10.0,y: 0.0),
/// (x:10.0,y:10.0),
/// ];
/// let my_road_len = my_road_line_string.euclidean_length();
/// let fraction_from = 5.0 / my_road_len;
/// let fraction_to = 12.0 / my_road_len;
/// // Extract the road section between `fraction_from` and `fraction_to` using `.into_second()`
/// let my_road_section = match my_road_line_string.line_split_twice(fraction_from, fraction_to) {
/// Some(result) => match result.into_second() { // get the second part of the result
/// Some(linestring)=>Some(linestring),
/// _=>None
/// },
/// _=>None
/// };
/// assert_relative_eq!(my_road_section.unwrap(), line_string![
/// (x: 5.0,y: 0.0),
/// (x:10.0,y: 0.0),
/// (x:10.0,y: 2.0),
/// ]);
/// ```
///
#[rustfmt::skip]
fn line_split_twice(
&self,
fraction_start: Scalar,
fraction_end: Scalar,
) -> Option<LineSplitTwiceResult<Self>> {
// import enum variants
use LineSplitTwiceResult::*;
// reject nan fractions
if fraction_start.is_nan() || fraction_end.is_nan() {
return None;
}
// clamp
let fraction_start = fraction_start.min(Scalar::one()).max(Scalar::zero());
let fraction_end = fraction_end.min(Scalar::one()).max(Scalar::zero());

// swap interval if incorrectly ordered
let (start_fraction, end_fraction) = if fraction_start > fraction_end {
(fraction_end, fraction_start)
} else {
(fraction_start, fraction_end)
};

// find the fraction to split the second portion of the line
let second_fraction =
(end_fraction - start_fraction)
/ (Scalar::one() - start_fraction);

match self.line_split(start_fraction) {
Some(LineSplitResult::FirstSecond(line1, line2)) => match line2.line_split(second_fraction) {
Some(LineSplitResult::FirstSecond(line2, line3)) => Some(FirstSecondThird(line1, line2, line3)),
Some(LineSplitResult::First (line2 )) => Some(FirstSecond (line1, line2 )),
Some(LineSplitResult::Second ( line3)) => Some(FirstThird (line1, line3)),
None => None,
},
Some(LineSplitResult::First (line1)) => Some(First(line1)),
Some(LineSplitResult::Second(line2)) => match line2.line_split(second_fraction) {
Some(LineSplitResult::FirstSecond(line2, line3)) => Some(SecondThird ( line2, line3)),
Some(LineSplitResult::First (line2 )) => Some(Second ( line2 )),
Some(LineSplitResult::Second ( line3)) => Some(Third ( line3)),
None => None,
},
None => None,
}
}
}
Loading