
MAHOUT-1974 CUDA support #310

Open · wants to merge 14 commits into base: CUDA
Conversation

@nsakharnykh commented Apr 27, 2017

Initial PR for CUDA bindings support through JCuda

@andrewpalumbo (Member)

Tests pass on my system:

Mahout JVM Sparse multiplication time: 1914 ms.
Mahout JCuda Sparse multiplication time: 195 ms.
- sparse mmul at geometry of 1000 x 1000 %*% 1000 x 1000 density = .2.  5 runs
Mahout JVM Sparse multiplication time: 43 ms.
Mahout JCuda Sparse multiplication time: 11 ms.
- sparse mmul at geometry of 1000 x 1000 %*% 1000 x 1000 density = .02.  5 runs
Mahout JVM Sparse multiplication time: 2 ms.
Mahout JCuda Sparse multiplication time: 1 ms.
- sparse mmul at geometry of 1000 x 1000 %*% 1000 x 1000 density = .002.  5 runs
UserSetCUDATestSuite:
Mahout JVM Sparse multiplication time: 45 ms.
Mahout JCuda Sparse multiplication time: 10 ms.
User Defined sparse mmul at geometry of 1000 x 1000 %*% 1000 x 1000 density = 0.02 3 runs : 10 ms
- User Defined sparse mmul at geometry of 1000 x 1000 %*% 1000 x 1000 density = 0.02 3 runs 

@andrewpalumbo (Member)

@nsakharnykh @rawkintrevo I intend to have dense hammered out on Sunday.

@andrewpalumbo (Member)

@nsakharnykh, @rawkintrevo, I ran out of time tonight to finish out dense %*% dense and dense %x% sparse; I went down a rabbit hole with the NVIDIA C API docs for cuSPARSE. I noticed that JCuda supports only a single dense-dense dgemm algorithm, with column-major matrices. Most Mahout matrices are row-major. I then started considering the dense-sparse multiplication and was slightly thrown off by what seems to be required CSR compression; it seems that sparse matrices should instead be compressed as CSC. Anyway, I ended up in the LAPACK Fortran; apologies for not finishing it up tonight, guys. I got off on a long tangent and ran out of time.
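For reference, the CSR layout in question can be shown with a tiny CPU-side sketch; this is plain Java, not the Mahout or JCuda API, and `denseToCsrIndices` is a hypothetical helper name. It builds the index arrays that a device-side dense-to-CSR conversion would produce:

```java
// Plain-Java illustration of CSR (compressed sparse row) storage.
// CSR keeps one pointer per row, so row slices are cheap; CSC is the
// mirror image, compressed per column instead.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsrSketch {
    // colInd[k] is the column of the k-th non-zero, scanned row by row;
    // rowPtr[i]..rowPtr[i+1] delimit row i's entries in that array.
    public static int[][] denseToCsrIndices(double[][] dense) {
        List<Integer> colInd = new ArrayList<>();
        int[] rowPtr = new int[dense.length + 1];
        for (int i = 0; i < dense.length; i++) {
            for (int j = 0; j < dense[i].length; j++)
                if (dense[i][j] != 0.0) colInd.add(j);
            rowPtr[i + 1] = colInd.size();
        }
        int[] cols = colInd.stream().mapToInt(Integer::intValue).toArray();
        return new int[][] { rowPtr, cols };
    }

    public static void main(String[] args) {
        double[][] a = { {1, 0, 2}, {0, 0, 3}, {4, 5, 0} };
        int[][] csr = denseToCsrIndices(a);
        System.out.println(Arrays.toString(csr[0])); // rowPtr: [0, 2, 3, 5]
        System.out.println(Arrays.toString(csr[1])); // colInd: [0, 2, 2, 0, 1]
    }
}
```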

I pushed my beginning work up to my MAHOUT-1974 branch. Nothing really worth looking at right now, but I will make a PR against this when I get the dense work together.

Regardless, I should have at least a quick-and-dirty version ready to go soon while I work out what we'll need for experiments and benchmarking. We can still discuss and consider different Spark configurations tomorrow with our dense cases, but I'd of course like to get this right.

As I mentioned on the last call, we allow a "sparse" DRM's in-core components to be both sparse and dense. Currently the threshold for converting a DRM block from a sparse to a dense matrix is pretty high (25% non-zero estimate). In the future we will need to let the user set the sparsity threshold somehow.

FYI:
https://github.com/apache/mahout/blob/master/math-scala/src/main/scala/org/apache/mahout/math/scalabindings/package.scala#L431
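The conversion policy described above amounts to a density estimate plus a cutoff. A hedged sketch (the names `density` and `shouldDensify` are illustrative, not Mahout's API; the real logic is in the package.scala linked above):

```java
// Illustrative block-densification policy: estimate the fraction of
// non-zeros in an in-core block and densify only above a threshold
// (25% in the current Mahout code, per the comment above).
public class DensityPolicy {
    // fraction of non-zero entries in the block
    public static double density(double[][] block) {
        int nnz = 0, total = 0;
        for (double[] row : block) {
            total += row.length;
            for (double v : row) if (v != 0.0) nnz++;
        }
        return total == 0 ? 0.0 : (double) nnz / total;
    }

    public static boolean shouldDensify(double[][] block, double threshold) {
        return density(block) > threshold;
    }

    public static void main(String[] args) {
        double[][] b = { {1, 0, 0, 0}, {0, 2, 0, 0} };  // 2 of 8 entries non-zero
        System.out.println(density(b));                 // 0.25
        System.out.println(shouldDensify(b, 0.25));     // false: at, not above, the cutoff
    }
}
```

Making the threshold a user-settable parameter would then just mean plumbing `threshold` through instead of hard-coding it.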

@nsakharnykh (Author)

@andrewpalumbo Regarding column-major: yes, this is the default mode for cuBLAS; sorry, I think I didn't mention it in my original email. There are a couple of options we can exercise here:

1. We can use transposed versions of the gemm routines if the input matrices are row-major. I think the output matrix will always be column-major, so we'll have to transpose it using geam if we want to keep it in a different format.
2. We can keep the dense matrices in column-major format on the GPU and move between CSC and CSR formats for sparse matrices using cuSPARSE conversion routines like csr2csc. There are also existing API functions in cuSPARSE to convert sparse to dense (csr2dense) and the other way around (dense2csr).

I think we should use the available conversion APIs from cuSPARSE as much as possible to avoid writing this on our own.
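For the row-major case there is also the standard identity (A*B)^T = B^T * A^T: a row-major buffer handed to a column-major gemm is read as the transpose, so swapping the operands yields the product already in row-major, with no extra geam pass. A plain-Java sketch of the layout trick (no GPU; the naive `gemmColMajor` stands in for a column-major BLAS gemm such as cublasDgemm):

```java
// Demonstrates computing row-major A*B with a column-major gemm by
// swapping operands: gemm reads the row-major buffers as B^T and A^T,
// computes B^T * A^T = (A*B)^T in column-major, which is exactly A*B
// when the result buffer is read back as row-major.
import java.util.Arrays;

public class ColMajorTrick {
    // Naive column-major gemm: C(m x n) = A(m x k) * B(k x n),
    // leading dimension = number of rows, no transpose flags.
    public static double[] gemmColMajor(double[] a, double[] b, int m, int k, int n) {
        double[] c = new double[m * n];
        for (int j = 0; j < n; j++)
            for (int p = 0; p < k; p++)
                for (int i = 0; i < m; i++)
                    c[i + j * m] += a[i + p * m] * b[p + j * k];
        return c;
    }

    public static void main(String[] args) {
        // A (2x3) and B (3x2), both stored row-major.
        double[] a = {1, 2, 3,
                      4, 5, 6};
        double[] b = {7,  8,
                      9,  10,
                      11, 12};
        // Column-major gemm sees b as B^T (2x3) and a as A^T (3x2);
        // the 2x2 result, read row-major, is A*B.
        double[] c = gemmColMajor(b, a, 2, 3, 2);
        System.out.println(Arrays.toString(c)); // [58.0, 64.0, 139.0, 154.0]
    }
}
```

The same operand swap works with the real device call, so only the sparse operands would need explicit CSR/CSC conversion.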

cuda/pom.xml Outdated
<parent>
<groupId>org.apache.mahout</groupId>
<artifactId>mahout</artifactId>
<version>0.13.0-SNAPSHOT</version>
Member

needs to be 0.13.1-SNAPSHOT

Author

Done

@andrewpalumbo (Member)

@nsakharnykh I have my MAHOUT-1974 branch that is almost complete with dense, etc. (less the column-major issues). We'd discussed just making a PR against this, but it may be easiest if you just went ahead and pushed this to mahout/CUDA, and then I'll make a PR against that, which will be public so that others may comment on it.

@andrewpalumbo commented May 7, 2017

@nsakharnykh https://github.com/andrewpalumbo/mahout/tree/MAHOUT-1974/cuda ^^
P.S. This is still WIP, so there's a lot of garbage in it.

@nsakharnykh (Author)

@andrewpalumbo Ok, sounds good. I'll try to push what I have as soon as I have some time in front of my laptop. I'm currently at GTC so my schedule is a bit fragmented.

@andrewpalumbo (Member)

Great, thanks. I figured you were there and very busy. I'll keep working on my end, and there should be no (or few) conflicts. No rush, since my branch is based off of yours.

@rawkintrevo (Contributor)

looking awesome @nsakharnykh @andrewpalumbo

Before merging, don't forget to fill out
https://github.com/apache/mahout/blob/master/website/docs/native-solvers/cuda.md

@andrewpalumbo (Member)

@rawkintrevo I asked @nsakharnykh to just go ahead and push this to the mahout/CUDA branch, since he's already up at GTC and has spotty time to do this, and we're pushing this through as quickly as possible. I will immediately open up a [WIP] PR from my https://github.com/andrewpalumbo/mahout/tree/MAHOUT-1974/cuda branch (on top of his) and will fill out the md from there.

asfgit pushed a commit that referenced this pull request May 8, 2017
@balashashanka (Contributor)

Just checking whether we need to keep this PR open; I'm guessing this has already been merged into the feature branch: https://github.com/apache/mahout/tree/CUDA
