Discussion:
[eigen] Adding support for AMD GPUs in Eigen
Vincent Hui
2018-07-04 08:56:17 UTC
Hi Deven,

Thank you for your contribution. Did you benchmark Eigen with and without
an AMD GPU? Can you give us instructions on how to use Eigen with an AMD
GPU? I have an AMD GPU, so I can try Eigen with it. Furthermore, did you
benchmark TensorFlow with and without an AMD GPU after you added AMD GPU
support to Eigen?

Thanks a lot,
Vincent
PR submitted - https://bitbucket.org/eigen/eigen/pull-requests/402/adding-support-for-using-eigen-in-hip/diff
Jason: Thank you for your response. Hoping you will find the initial
level of HIP support to your liking.
Vincent: yes, AMD GPU support is similar to what exists for CUDA / OpenCL.
Thanks
deven
Hi Deven,
Is the AMD GPU support similar to the OpenCL hardware support in Eigen?
Thanks,
Vincent
Just had to drop in and say cool! It's great to see HIP support
spread through the ecosystem.
I've tried to use Eigen a few times in CUDA and ran into a few issues:
- Solvers that could have executed on the GPU didn't, because of dynamic
allocations happening somewhere, and I couldn't figure out how to make
that not happen. This was for things like a batched QR solve of small
matrices. The allocations may never actually have happened at runtime,
but the problem is they'd be referenced in the device-side compile,
somewhere deep. I think at the time I was looking at either the SVD or
QR solvers.
- It wasn't as flexible as I first hoped. Unfortunately, there are a lot
of strategies you can use to evaluate matrix operations with warp-,
block-, or device-level parallelism, and this is outside of what Eigen
offers. If it is trying to be a device-side library, it should offer the
kind of flexibility that makes sense there, for maximum performance. The
CUTLASS library takes this to the extreme for matrix multiplication:
https://devblogs.nvidia.com/cutlass-linear-algebra-cuda/
https://github.com/NVIDIA/cutlass
To clarify, by flexibility I don't just mean exploiting the hierarchy via
tiling, but choosing between simpler multiplication techniques given
smaller dimensions, layout, and the amount of shared memory desired (or
registers sacrificed), and choosing how to extract the parallelism into
such evaluations.
Which means that, as things stand, each thread ID has to do all its work
individually. That can be somewhat reasonable, depending on the
problem's/kernel's needs (see the sketch below).
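
A minimal sketch of that per-thread pattern, assuming an Eigen version
with device-side support for fixed-size matrices (3.3 or later); the
kernel and buffer names here are hypothetical, not from the thread:

#include <Eigen/Dense>

using Mat4 = Eigen::Matrix<float, 4, 4>;
using Vec4 = Eigen::Matrix<float, 4, 1>;

// Each thread handles exactly one problem from the batch: it maps the
// i-th 4x4 matrix and 4-vector and computes y = A * x on its own.
__global__ void batched_matvec(const float* A_batch, const float* x_batch,
                               float* y_batch, int num_problems) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i >= num_problems) return;

  // Eigen::Map does not allocate; it just wraps the existing buffers.
  Eigen::Map<const Mat4> A(A_batch + 16 * i);
  Eigen::Map<const Vec4> x(x_batch + 4 * i);
  Eigen::Map<Vec4>       y(y_batch + 4 * i);

  y = A * x;  // fixed-size product, evaluated entirely by this thread
}

Because the matrices are fixed-size, everything stays in registers/local
memory and no dynamic allocation can occur in device code.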
As for building it with CUDA support, Eigen autodetects the NVCC compiler
through the common macros that compiler defines (__NVCC__ and the like).
You have to explicitly disable it if you're compiling with NVCC but don't
want the device-side path (I've had errors and occasionally turn it off
when I'm using Eigen with nvcc on the host side only).
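
For that host-side-only case, a sketch of the opt-out, assuming an Eigen
version that provides the EIGEN_NO_CUDA switch (recent releases do; check
your copy's Core/Macros headers if unsure):

// Compiled with nvcc, but Eigen is only used on the host side here, so
// its CUDA device path is switched off before the include.
#define EIGEN_NO_CUDA   // or pass -DEIGEN_NO_CUDA on the command line
#include <Eigen/Dense>
#include <cstdio>

int main() {
  Eigen::Matrix3d m = Eigen::Matrix3d::Random();
  std::printf("trace = %f\n", m.trace());
  return 0;
}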
I don't know anything about the unit tests, sorry. I also haven't
been watching for any recent changes so my experiences may also be a
little out of date.
I am not a core dev, but what I have seen and used in the past for the
project is to submit PRs to https://bitbucket.org/eigen/eigen/ - I of
course leave plenty of room for any stakeholders to clarify the other
questions you asked.
-Jason
Hi All,
I am a software development engineer at AMD, and we are currently working
on enabling support for AMD GPUs in Eigen.
We envision that support for AMD GPUs can be implemented in a fashion
similar to what has already been done for NVidia with CUDA. I have some
questions:
1. What is the purpose of the "EIGEN_USE_GPU" macro in the codebase? I see
a lot of code that is guarded by the EIGEN_CUDACC macro (which guards code
that uses CUDA extensions) and the EIGEN_CUDA_ARCH macro (which guards
code that is expected to execute on the device), and I think I understand
those. What I am not clear about is the need for and use of the
EIGEN_USE_GPU macro. (A schematic of how these guards typically relate is
sketched after these questions.)
2. How do I configure cmake to
- build Eigen with GPU / CUDA support?
- enable all the unit tests that target the GPU/CUDA?
I want to make sure that our implementation is consistent with what is
already in place for CUDA, and hence the need to understand the CUDA
implementation.
Any information regarding this will be very helpful.
3. What is the correct protocol to use for upstreaming our code (once
done) to the Eigen codebase? Will a simple pull request suffice, or do we
need to do something more? Is there some acceptance criteria/checklist we
need to complete before we can issue the PR?
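
For reference on question 1, a schematic (not copied from Eigen's
sources) of how the three guards typically relate; exact spellings and
responsibilities vary between Eigen versions:

// EIGEN_USE_GPU: unlike the other two, not defined by the compiler; it is
// a user-facing opt-in, typically defined (or passed as -DEIGEN_USE_GPU)
// before including the headers, e.g. to enable the Tensor module's GPU
// device.
#if defined(EIGEN_USE_GPU)
  // ... pull in GPU-device support ...
#endif

// EIGEN_CUDACC: "this translation unit is compiled by a CUDA compiler",
// so CUDA language extensions such as __host__/__device__ are available.
#if defined(EIGEN_CUDACC)
  #define EIGEN_DEVICE_FUNC __host__ __device__
#else
  #define EIGEN_DEVICE_FUNC
#endif

// EIGEN_CUDA_ARCH: defined only during the device-side compilation pass,
// so it guards code that is expected to execute on the device.
#if defined(EIGEN_CUDA_ARCH)
  // ... device-only code path ...
#endif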
Please let me know if this is not the correct forum to address these
questions (and point me to the right one :) ). I expect to have quite a
few more questions in the coming days as we proceed.
Thanks
deven
Deven Desai
2018-07-06 17:31:07 UTC
Hi Vincent,

We have not done any benchmarking of Eigen with AMD GPUs yet. I am
currently focusing on getting the functionality in place and implementing
all the updates requested in the PR feedback so that we can get the PR
merged. I expect to be able to do some benchmarking once that is done.

Running with an AMD GPU should only require passing "-DEIGEN_USE_HIP" to
the compiler (for code that pulls in Eigen header files). Everything else
should be similar to what you would do for Nvidia GPUs.
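
A hypothetical smoke test of that workflow, assuming the HIP port mirrors
the CUDA path for fixed-size matrices (the file, kernel, and path names
below are made up for illustration):

// Build with something like:
//   hipcc -DEIGEN_USE_HIP -I/path/to/eigen hip_eigen_smoke.cpp -o smoke
#include <hip/hip_runtime.h>
#include <Eigen/Dense>
#include <cstdio>

using Vec3 = Eigen::Matrix<float, 3, 1>;

// One thread computes a dot product of two mapped fixed-size vectors.
__global__ void dot_kernel(const float* a, const float* b, float* out) {
  Eigen::Map<const Vec3> va(a);
  Eigen::Map<const Vec3> vb(b);
  *out = va.dot(vb);
}

int main() {
  float ha[3] = {1.f, 2.f, 3.f}, hb[3] = {4.f, 5.f, 6.f}, hout = 0.f;
  float *da, *db, *dout;
  hipMalloc((void**)&da, sizeof(ha));
  hipMalloc((void**)&db, sizeof(hb));
  hipMalloc((void**)&dout, sizeof(float));
  hipMemcpy(da, ha, sizeof(ha), hipMemcpyHostToDevice);
  hipMemcpy(db, hb, sizeof(hb), hipMemcpyHostToDevice);

  hipLaunchKernelGGL(dot_kernel, dim3(1), dim3(1), 0, 0, da, db, dout);

  hipMemcpy(&hout, dout, sizeof(float), hipMemcpyDeviceToHost);
  std::printf("dot = %f (expected 32)\n", hout);  // error checks omitted
  hipFree(da); hipFree(db); hipFree(dout);
  return 0;
}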

Getting TensorFlow to work with AMD GPUs requires a lot of other changes
in addition to this change in Eigen. There is a separate ongoing project
tasked with getting TensorFlow to work on AMD GPUs. Let me know if you
need more information.

Thanks

deven
Vincent Hui
2018-08-08 06:21:33 UTC
Hi Deven,

How can I try using an AMD GPU with Eigen? Is your code merged into the
default branch? I just need to clone Eigen and build it from the default
branch, passing "-DEIGEN_USE_HIP" to the compiler. Am I right?

Thanks,
Vincent