Best Practice: ArrayView vs Pointer #1059

JP-Ellis · 2021-09-01T06:22:35Z

I was wondering whether there was a best practive when writing functions that expect arrays, specifically when they are not taking ownership of the array. I couldn't see this being addressed in the documentation or in an existing issue so here's a few things I think could be worth thinking about:

Should functions prefer to take ArrayView over the equivalent Array, and what about using a pointer vs by value? For example, which of the following would be 'best'?
```
fn sum(a: ArrayView1<f64>) -> f64;
fn sum(a: &ArrayView1<f64>) -> f64;
fn sum(a: &Array1<f64>) -> f64;
```

When muteable data is expected, is there a preference to using ArrayViewMut vs &mut Array?

fn increment(mut a: ArrayViewMut1<f64>) -> f64;
fn increment(a: &mut Array1<f64>) -> f64;

I would expect that for most applications, the ArrayView and ArrayViewMut are preferred as they are more general.

Is the fact that they are views into an array mean that they behave more like pointers and thus &ArrayView is redundant?

The text was updated successfully, but these errors were encountered:

multimeric · 2021-09-01T09:17:07Z

I'm just a user, not a dev, but here's my take.

Firstly you should understand that an ArrayView is not just a pointer to an array. It's a pointer to some data (which may or may not be a full Array), but possibly using a subset of that data, and possibly with a different stride length, meaning that it could be a view over say every second element of the array. This information is stored within the ArrayBase struct.

My second general point is that, since all array subtypes are just ArrayBase with different type signatures, you can and should make your functions generic using these and almost never require any particular array type.

Now to your questions:

Should functions prefer to take ArrayView over the equivalent Array

Neither. As mentioned above, if your function doesn't care what type of array it takes, it should be made generic over ArrayBase<S, D>. I can't think of any functions that should require an ArrayView, especially because I believe all array types implement view() to convert them in a recommended way to an ArrayView if you somehow need one (e.g. the split_at() function is only implemented for ArrayView for some reason).

what about using a pointer vs by value

Follow Rust best practices here, and use the lowest level of ownership you can. If your function doesn't mutate the array, just accept a borrowed ArrayBase. In your example this would be:

fn sum<A, S, D>(a: &ArrayBase<S, D>) -> A
where
    S: Data<Elem = A>,
    D: Dimension;

And indeed this is basically how it's implemented in ndarray, albeit inside an impl block:

ndarray/src/numeric/impl_numeric.rs

Lines 19 to 49 in 87506b8

    
           impl<A, S, D> ArrayBase<S, D> 
        
           where 
        
               S: Data<Elem = A>, 
        
               D: Dimension, 
        
           { 
        
               /// Return the sum of all elements in the array. 
        
               /// 
        
               /// ``` 
        
               /// use ndarray::arr2; 
        
               /// 
        
               /// let a = arr2(&[[1., 2.], 
        
               ///                [3., 4.]]); 
        
               /// assert_eq!(a.sum(), 10.); 
        
               /// ``` 
        
               pub fn sum(&self) -> A 
        
               where 
        
                   A: Clone + Add<Output = A> + num_traits::Zero, 
        
               { 
        
                   if let Some(slc) = self.as_slice_memory_order() { 
        
                       return numeric_util::unrolled_fold(slc, A::zero, A::add); 
        
                   } 
        
                   let mut sum = A::zero(); 
        
                   for row in self.rows() { 
        
                       if let Some(slc) = row.as_slice() { 
        
                           sum = sum + numeric_util::unrolled_fold(slc, A::zero, A::add); 
        
                       } else { 
        
                           sum = sum + row.iter().fold(A::zero(), |acc, elt| acc + elt.clone()); 
        
                       } 
        
                   } 
        
                   sum 
        
               }

When muteable data is expected, is there a preference to using ArrayViewMut vs &mut Array?

Remember, if your function takes an ArrayViewMut, the caller has to transfer ownership of the whole View struct to your function, and can no longer use it. This means throwing away all the subset/stride information. Again, make your functions generic over &ArrayBase. If only the contents of the array need to be mutated, make it generic over ArrayBase<S, D> where S: DataMut. Only use a mutable reference if you need to edit the array itself, which I can't think of many use cases for. Maybe if you want to reshape it?

I would expect that for most applications, the ArrayView and ArrayViewMut are preferred as they are more general.

As explained, no, making it generic over ArrayBase with the appropriate trait is preferred as it's more general again.

Is the fact that they are views into an array mean that they behave more like pointers and thus &ArrayView is redundant?

No. See my first paragraph.

JP-Ellis · 2021-09-01T23:05:08Z

Thanks for your detailed answer 😄 I will refactor some of my code to be a bit more general and use ArrayBase.

Perhaps there should be some part of the documentation that addresses this to make it clearer? Or if that exists somewhere, I didn't see it.

Note that whether I pass in ArrayViewMut into a function or pass a mutable pointer to it, Rust should prevent other parts of the code from accessing it elsewhere (even for reading). So that's why I thought it wouldn't matter too much. I agree that the mutable reference would only ever be needed if I wanted to chnage the underlying array and generally modifying data should only require a mutable view into the array (through DataMut).

multimeric · 2021-09-02T01:16:40Z

Note that whether I pass in ArrayViewMut into a function or pass a mutable pointer to it, Rust should prevent other parts of the code from accessing it elsewhere (even for reading). So that's why I thought it wouldn't matter too much.

This is true, it's not the safety that makes this unadvisable, it's the fact that you're requesting more ownership in your function than you need, and users will be annoyed that they have to re-create the view in their code since your function consumes it.

jturner314 · 2021-09-02T02:08:11Z

#879 has some discussion about this question and a proposal to improve ndarray's API in this respect. (See "Good way to express function parameters" under "Motivation".) With ndarray's current API, I pretty much agree with @multimeric, with the caveat that monomorphization bloat is also a concern, as mentioned in #879. To avoid this issue, it's probably best to use &ArrayBase<S, D> as the parameter type for the public function, but have the body of the public function immediately call a private, inner function which operates on ArrayView<A, D>. That gets pretty tedious to write, though. As a result, for my own private code, I often have functions take ArrayView/Mut parameters just to keep things simple.

JP-Ellis · 2021-09-02T04:37:30Z

@jturner314 Thanks for the link to the RFC! This looks like a very promising change for the next major release of ndarray.

akern40 · 2024-10-26T19:35:27Z

This is a great question and, as jturner314 had pointed out, will be addressed by resolving #879. As a result, this will be closed by #1440 when it lands.

multimeric mentioned this issue Sep 28, 2021

Add maintainers/reviewers? #1075

Closed

grothesque mentioned this issue Jul 18, 2024

Maintainership roundtable and discussion #1272

Open

akern40 linked a pull request Oct 26, 2024 that will close this issue

Adds an array reference type #1440

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Best Practice: ArrayView vs Pointer #1059

Best Practice: ArrayView vs Pointer #1059

JP-Ellis commented Sep 1, 2021

multimeric commented Sep 1, 2021 •

edited

Loading

JP-Ellis commented Sep 1, 2021 •

edited

Loading

multimeric commented Sep 2, 2021

jturner314 commented Sep 2, 2021

JP-Ellis commented Sep 2, 2021

akern40 commented Oct 26, 2024

Best Practice: ArrayView vs Pointer #1059

Best Practice: ArrayView vs Pointer #1059

Comments

JP-Ellis commented Sep 1, 2021

multimeric commented Sep 1, 2021 • edited Loading

JP-Ellis commented Sep 1, 2021 • edited Loading

multimeric commented Sep 2, 2021

jturner314 commented Sep 2, 2021

JP-Ellis commented Sep 2, 2021

akern40 commented Oct 26, 2024

multimeric commented Sep 1, 2021 •

edited

Loading

JP-Ellis commented Sep 1, 2021 •

edited

Loading