How to realise convolution? #518
First, let me explain the error. Accelerate has a very strict separation between array computations (Acc) and scalar-level computations (Exp): you cannot perform array computations within an Exp context. This is also explained under "Avoiding nested parallelism" in the documentation. In your program, you're entering the Exp context by writing [...]. As for how to implement convolution: if the kernel size [...]
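To picture the restructuring that avoids nested parallelism, here is a sketch in plain Haskell with lists standing in for Accelerate arrays (the names `nestedStyle` and `flatStyle` are illustrative, not Accelerate APIs): instead of running an inner reduction per element, expand every element into its whole neighbourhood first, then reduce all neighbourhoods in one flat pass.

```haskell
-- Nested style: an inner `sum` per element. In Accelerate, this shape of
-- program (an array reduction inside a per-element Exp function) is rejected.
nestedStyle :: [Int] -> [Int]
nestedStyle xs = map (\x -> sum [x, x + 1, x + 2]) xs

-- Flat style: first generate the expanded structure (one extra dimension),
-- then fold the inner dimension away. This mirrors the generate-then-fold
-- pattern used for the convolution below.
flatStyle :: [Int] -> [Int]
flatStyle xs =
  let expanded = [ [x, x + 1, x + 2] | x <- xs ]  -- like A.generate to a higher rank
  in map sum expanded                             -- like A.fold over the inner dimension

main :: IO ()
main = print (nestedStyle [1, 2, 3], flatStyle [1, 2, 3])  -- both: [6,9,12]
```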
About how to implement a convolution: I rewrote your code to the following, and this seems to work, but I don't really know much about Gaussian convolutions, so please test whether this indeed does what you intended.

EDIT: as correctly observed by robbert-vdh below, if the kernel array is non-symmetric, this is technically cross-correlation, not convolution. (This is the same behaviour as the poster's original code.) Either reverse the kernel array in both dimensions, or replace [...].

```haskell
module Main where

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Exp, Acc, Array, DIM2, DIM3, DIM4, Z(..), (:.)(..))
import qualified Data.Array.Accelerate.Interpreter as I

-- Note that I've removed the additional Int parameters; you can use `shape` to
-- request the size of an array.
gauss_convolution :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM2 Double)
gauss_convolution gauss_op image = compute_re gauss_op (compute_plat gauss_op image)

-- Add two dimensions on the bottom. This function replaces every cell in the
-- input image with a 2-dimensional array the size of gauss_op, thereby making
-- a 4-dimensional array.
compute_plat :: Acc (Array DIM2 Double) -> Acc (Array DIM2 Double) -> Acc (Array DIM4 Double)
compute_plat gauss_op image = A.generate (A.I4 h w n m) myfun
  where
    -- Size of the gaussian kernel. Use the pattern synonyms instead of `lift`
    -- and `unlift`; they are much nicer. Alternative notation here would be
    -- `Z_ ::. n ::. m`.
    A.I2 n m = A.shape gauss_op
    -- Size of the full image
    A.I2 h w = A.shape image

    -- Now this function is going to be called for the whole 4-dimensional
    -- array, which means that it gets the index in the image as the first two
    -- components and the index in the kernel as the last two components.
    --
    -- About yuan{x,y}: `floor ((fromIntegral x :: Double) / 2)` is precisely
    -- what `div` does. ;)
    --
    -- kery and kerx are what you called y1 and x1.
    myfun :: Exp DIM4 -> Exp Double
    myfun (A.I4 y x kery kerx) =
      image A.! A.I2 (A.min (h - 1) (A.max 0 (y + kery - n `A.div` 2)))
                     (A.min (w - 1) (A.max 0 (x + kerx - m `A.div` 2)))

-- Fold the 2-dimensional subarrays into a scalar again, bringing the
-- 4-dimensional array down to 2 dimensions again. The first argument is the
-- gaussian kernel.
compute_re :: Acc (Array DIM2 Double) -> Acc (Array DIM4 Double) -> Acc (Array DIM2 Double)
compute_re gauss_op pla =
  let A.I4 h w n m = A.shape pla

      -- Replicate the gaussian kernel over all positions in the image
      replicated_gauss_op :: Acc (Array DIM4 Double)
      replicated_gauss_op = A.replicate (A.I4 h w A.All_ A.All_) gauss_op

      -- Multiply the kernel elementwise with each neighbourhood in the image
      multiplied :: Acc (Array DIM4 Double)
      multiplied = A.zipWith (A.*) replicated_gauss_op pla

      -- Contract each neighbourhood (2-dimensional subarray) down to a
      -- 1-dimensional vector, so that we can fold it away in one go
      contracted :: Acc (Array DIM3 Double)
      contracted = A.reshape (A.I3 h w (n * m)) multiplied

  in -- Finally, sum this contracted vector to get the 2-dimensional result,
     -- the size of the original image.
     A.fold (A.+) 0 contracted

main :: IO ()
main = do
  let image = A.fromList (Z :. (7 :: Int) :. (8 :: Int))
                [1, 0, 0, 0, 0, 0, 0, 1
                ,0, 1, 0, 0, 0, 0, 1, 0
                ,0, 0, 1, 0, 0, 1, 0, 0
                ,0, 0, 0, 1, 1, 0, 0, 0
                ,0, 0, 1, 0, 0, 1, 0, 0
                ,0, 1, 0, 0, 0, 0, 1, 0
                ,1, 0, 0, 0, 0, 0, 0, 1]
      gauss_op = A.fromList (Z :. (3 :: Int) :. (5 :: Int))
                   (map (/ 17)
                        [0, 0, 2, 0, 0
                        ,1, 3, 5, 3, 1
                        ,0, 0, 2, 0, 0])
  print $ I.runN gauss_convolution gauss_op image
```
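For spot-checking the Accelerate version on small inputs, here is a plain-Haskell reference implementation (lists only, no accelerate dependency) of the same clamped sliding-window operation. The name `crossCorr2` is illustrative, not an Accelerate API.

```haskell
-- Clamped 2D cross-correlation over lists, mirroring the index arithmetic of
-- `myfun` above: out-of-range neighbours are clamped to the image border.
crossCorr2 :: [[Double]] -> [[Double]] -> [[Double]]
crossCorr2 ker img =
  [ [ sum [ (ker !! ky !! kx) * at (y + ky - n `div` 2) (x + kx - m `div` 2)
          | ky <- [0 .. n - 1], kx <- [0 .. m - 1] ]
    | x <- [0 .. w - 1] ]
  | y <- [0 .. h - 1] ]
  where
    n = length ker
    m = length (head ker)
    h = length img
    w = length (head img)
    clamp lo hi v = max lo (min hi v)  -- same edge handling as the A.min/A.max pair
    at y x = img !! clamp 0 (h - 1) y !! clamp 0 (w - 1) x

main :: IO ()
main = print (crossCorr2 [[1]] [[1, 2], [3, 4]])  -- a 1x1 identity kernel leaves the image unchanged
```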
Thanks, it's effective!

WOW! Beautiful job! I need to retry it.

I read your code; it's a good job. But the process of creating a four-dimensional array consumes a lot of memory, especially when I work with large images.
This is an understandable worry. However, have you tried it? Does it indeed consume much more memory at runtime than expected? The reason I'm asking is that Accelerate will not actually create ("manifest") this intermediate 4-dimensional array in memory (at least, if all is right). The reason is array fusion: see the "Fusion" subheading in the documentation for [...]. See the image (I hope it's readable...) for a diagram showing how this works for the program I wrote above. In black is the original program, in red some of the fusion rules, and in blue the fused program. Note that Accelerate can actually give you the fused program: [...]
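As an analogy only (ordinary list functions, not Accelerate's actual rewrite machinery): fusion turns a pipeline that conceptually builds an intermediate structure into a single pass that never materialises it.

```haskell
-- Two-stage pipeline: conceptually produces an intermediate list from the
-- first map before summing.
pipeline :: [Double] -> Double
pipeline xs = sum (map (* 2) (map (+ 1) xs))

-- Fused form: one traversal, no intermediate list. Accelerate applies the
-- corresponding rewrite rules to array programs like generate/zipWith/fold
-- automatically, which is why the 4-dimensional array above never manifests.
fused :: [Double] -> Double
fused xs = foldr (\x acc -> (x + 1) * 2 + acc) 0 xs

main :: IO ()
main = print (pipeline [1, 2, 3], fused [1, 2, 3])  -- both: 18.0
```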
In case anyone runs into this in the future, keep in mind that [...] does not actually compute convolution. Instead it computes cross-correlation. Since the kernel is symmetrical, those two will yield the same result, but you need to change the [...]
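Cross-correlation with the kernel reversed in both dimensions equals true convolution, so one fix is simply to flip the kernel before using it. With lists this is a one-liner (`flipKernel` is an illustrative helper name); in Accelerate the same flip could be written with `A.backpermute`, reading index `(n-1-ky, m-1-kx)` of the original kernel.

```haskell
-- Reverse a 2D kernel in both dimensions: cross-correlation with the flipped
-- kernel is the same as convolution with the original one.
flipKernel :: [[a]] -> [[a]]
flipKernel = reverse . map reverse

main :: IO ()
main = print (flipKernel [[1, 2, 3], [4, 5, 6]])  -- [[6,5,4],[3,2,1]]
```

For a symmetric kernel like the Gaussian one in this thread, flipping changes nothing, which is why both operations agree here.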
Thank you for your professional help. I solved it.
You can indeed use calls to [...]
The problem is that I need to call it repeatedly; I just want to sample every other row and column.
If you want, for example, the 0th, 2nd, 4th, 6th row etc. of a matrix:

```haskell
let I2 h w = shape m
in backpermute (I2 ((h + 1) `div` 2) w)       -- new shape: height goes 0->0, 1->1, 2->1, 3->2, 4->2, 5->3, etc.
               (\(I2 y x) -> m ! I2 (2 * y) x)  -- given index in new array, produce index in old array
               m                                -- array to read values from
```

Getting the 1st, 3rd, 5th, etc. rows is an exercise to the reader. :)
Thank you!

Hope this was useful :)
Below is my code.
It threw an error.
Maybe you don't need to read my code; I just want to know how to implement convolution.
Thank you!