[eigen] Blitz++ vs. Eigen::Tensor?

Discussion:

Elizabeth Fischer

2016-01-13 21:58:33 UTC

Hello,

This comes out of a discussion we've been having over on the Blitz++ list.
Blitz++ has served a number of us well with its blitz::Array, which is similar
to Fortran 90 arrays or Numpy's ndarray. We all like Blitz++ and have had
no problems with it --- it's mature, stable software. However, the
original authors have all moved on and it is currently unmaintained. We
are considering whether it is worthwhile to build momentum around
maintaining it and making new releases --- or whether we'd be better off
using something else.

Eigen::Tensor came up today as being similar to blitz::Array in concept;
however, it does not seem to be as mature and might not do all the things
we're used to doing with blitz::Array. I'm trying to better understand
Eigen::Tensor, with respect to the things I use Blitz++ for.

1. I frequently use blitz::Array to point into pre-existing memory blocks,
memory it does not "own." That way, I can write algorithms that take a
blitz::Array as an argument and know I'll be able to use them in a wide
variety of input formats (including data that originated in Fortran 90 or
Numpy arrays). Is Eigen::Tensor able to do this, or does it insist on
owning its memory?

2. Blitz::Array is quite flexible with its dope vector. You can make an
array with any set of strides. Column major, row major. Any base on each
dimension (0, 1, 17, etc). Even non-unit strides. Even non-contiguous
array slices. I use this, for example, to provide compatibility with
Fortran. Any Fortran 90 array can be passed into C++ code and turned into
a blitz::Array with exactly the same shape, base, etc. For example,
see:

http://jerseybiker.blogspot.com/2013/10/passing-assumed-shape-arrays-between.html
Passing stuff between C++ and Fortran would be a lot more problematic if
the C++ array package does not support all the possibilities supported by
Fortran 90.

3. Does Eigen::Tensor have a shared memory or shared pointer kind of
model? Blitz++ does --- and I'm not sure whether it's a good idea or not:
multiple blitz::Array objects can share the same underlying memory. The
blitz::Array class is essentially a shared_ptr<> plus dope vector. C++11
has cleaner semantics, and it should be possible to have a core "dumb"
array and appropriate smart pointer stuff around it. I've not figured out
how this would work, or whether it would be better in the end than what
Blitz++ does.

4. Has someone considered simply using Blitz++ as the basis for
Eigen::Tensor? It looks like the licenses are basically compatible:
https://www.gnu.org/licenses/license-list.html
The authors of Blitz++ could probably be convinced to re-license it if that
made things easier for the Eigen project. They're really done with the
code and have lost interest in maintaining it.

Thank you,
-- Elizabeth
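
For readers less familiar with the term, the "dope vector" in item 2 can be pictured with a small sketch. This is neither Blitz++ nor Eigen code: `View2D`, `fortran_view`, and `every_other_row` are hypothetical names invented here, only two dimensions are handled, and the view never owns its memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D dope vector: a pointer plus per-dimension base, extent, stride.
struct View2D {
    double* data;                          // not owned by the view
    std::array<std::ptrdiff_t, 2> base;    // first valid index in each dimension
    std::array<std::ptrdiff_t, 2> extent;  // number of elements in each dimension
    std::array<std::ptrdiff_t, 2> stride;  // step between neighbours, in elements

    double& operator()(std::ptrdiff_t i, std::ptrdiff_t j) const {
        assert(i >= base[0] && i < base[0] + extent[0]);
        assert(j >= base[1] && j < base[1] + extent[1]);
        return data[(i - base[0]) * stride[0] + (j - base[1]) * stride[1]];
    }
};

// Fortran-style layout: column-major storage with 1-based indices.
View2D fortran_view(double* p, std::ptrdiff_t rows, std::ptrdiff_t cols) {
    return View2D{p, {1, 1}, {rows, cols}, {1, rows}};
}

// Non-unit stride: rows 1, 3, 5, ... of v, renumbered from v's own base.
View2D every_other_row(const View2D& v) {
    return View2D{v.data, v.base,
                  {(v.extent[0] + 1) / 2, v.extent[1]},
                  {2 * v.stride[0], v.stride[1]}};
}

double dope_vector_demo() {
    std::vector<double> buf(12);
    for (std::size_t k = 0; k < buf.size(); ++k) buf[k] = static_cast<double>(k);
    View2D a = fortran_view(buf.data(), 3, 4);   // like a(3,4) in Fortran
    View2D odd_rows = every_other_row(a);        // rows 1 and 3 of a
    // a(2,3) is buf[1 + 2*3]; odd_rows(2,1) aliases a(3,1), i.e. buf[2].
    return a(2, 3) + odd_rows(2, 1);
}
```

Both views alias the same buffer, so the question of who owns and frees it (item 3) is deliberately left outside the struct.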

Chris Dyer

2016-01-14 01:55:31 UTC

affiliated) with Eigen. Yes, it's quite easy to wrap preexisting memory
blocks using Eigen::TensorMap (nb. you can define EIGEN_NO_MALLOC to make
sure you don't accidentally trigger memory allocation during evaluation).
If you're running computations on a GPU, you have to manage your own
memory. The types are a bit cumbersome, but judicious use of auto and/or
typedefs mostly hides that.

Christoph Hertzberg

2016-01-18 17:48:32 UTC

Hi,

I was hoping Benoit Steiner would say a few words about that (Eigen::Tensor
is essentially 95% his work [I did not actually measure]). But I can
give some general remarks:
As already pointed out, there is Map<Matrix [or Array]> for mapping onto
existing memory as 2d Matrices/Arrays, and the analogous TensorMap for
multidimensional Tensors. There is not much interchange between the
"classic 2d" Eigen and Tensors at the moment. And I can't really say a
lot about Eigen::Tensor.

Regarding shared memory: This has been briefly discussed a few times.
IIRC it was mostly dismissed as it makes things such as aliasing or
memory leaks complicated to manage. E.g., if you have a big matrix A and
then have a shared-memory-view into a block of A and A gets destructed,
what shall happen to the view of the block?
Also, if a shared-memory object is changed, shall it first clone itself,
or shall the shared objects change as well? (I had my share of trouble
with OpenCV's shared-memory matrices ...)
If you want anything like that at the moment, you need to manage your
memory yourself and manually work with Eigen::Maps.

But if someone has a proposal for a clean interface/implementation for
shared memory matrices, I would not generally oppose integrating that
into Eigen -- as long as the user has a choice, and the behavior is more
or less clear.

Cheers,
Christoph

--
Dipl. Inf., Dipl. Math. Christoph Hertzberg

Universität Bremen
FB 3 - Mathematik und Informatik
AG Robotik
Robert-Hooke-Straße 1
28359 Bremen, Germany

Zentrale: +49 421 178 45-6611

Besuchsadresse der Nebengeschäftsstelle:
Robert-Hooke-Straße 5
28359 Bremen, Germany

Tel.: +49 421 178 45-4021
Empfang: +49 421 178 45-6600
Fax: +49 421 178 45-4150
E-Mail: ***@informatik.uni-bremen.de

Weitere Informationen: http://www.informatik.uni-bremen.de/robotik

Benoit Steiner

2016-01-28 23:27:05 UTC

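Post by Elizabeth Fischer
1. I frequently use blitz::Array to point into pre-existing memory blocks,
memory it does not "own." That way, I can write algorithms that take a
blitz::Array as an argument and know I'll be able to use them in a wide
variety of input formats (including data that originated in Fortran 90 or
Numpy arrays). Is Eigen::Tensor able to do this, or does it insist on
owning its memory?

You have 2 choices: create a Tensor, which owns its memory, or create a
TensorMap, which wraps an existing block of memory that the caller is
responsible for managing.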

Post by Elizabeth Fischer
2. Blitz::Array is quite flexible with its dope vector. You can make an
array with any set of strides. Column major, row major. Any base on each
dimension (0, 1, 17, etc). Even non-unit strides. Even non-contiguous
array slices. I use this, for example, to provide compatibility with
Fortran. Any Fortran 90 array can be passed into C++ code and turned into
a blitz::Array with exactly the same shape, base, etc. For example,
http://jerseybiker.blogspot.com/2013/10/passing-assumed-shape-arrays-between.html
Passing stuff between C++ and Fortran would be a lot more problematic if
the C++ array package does not support all the possibilities supported by
Fortran 90.

Once you've created a TensorMap to wrap a contiguous memory buffer, you can
call the slice operation to have strided access. It is probably not as
powerful as what Blitz++ provides, but it's a start.

Post by Elizabeth Fischer
3. Does Eigen::Tensor have a shared memory or shared pointer kind of
model? Blitz++ does --- and I'm not sure whether it's a good idea or not:
multiple blitz::Array objects can share the same underlying memory. The
blitz::Array class is essentially a shared_ptr<> plus dope vector. C++11
has cleaner semantics, and it should be possible to have a core "dumb"
array and appropriate smart pointer stuff around it. I've not figured out
how this would work, or whether it would be better in the end than what
Blitz++ does.

Multiple TensorMap objects can share the same memory buffer. You could also
create a memory buffer using a regular tensor, and access its buffer using
one or more TensorMaps. For example, you could write something like:
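#include <unsupported/Eigen/CXX11/Tensor>

// A regular Tensor allocates and owns its 7x5 buffer.
Eigen::Tensor<float, 2> tensor_that_owns_its_memory(7, 5);
// A TensorMap over the same pointer reuses that storage without copying.
Eigen::TensorMap<Eigen::Tensor<float, 2> > tensor_that_reuses_existing_storage(
    tensor_that_owns_its_memory.data(), 7, 5);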

Post by Elizabeth Fischer
4. Has someone considered simply using Blitz++ as the basis for
Eigen::Tensor? It looks like the licenses are basically compatible:
https://www.gnu.org/licenses/license-list.html
The authors of Blitz++ could probably be convinced to re-license it if
that made things easier for the Eigen project. They're really done with
the code and have lost interest in maintaining it.

I looked at Blitz++ before starting to work on the Eigen tensor module, but
I couldn't see a way to bring blitz++ to the level of performance we need.
It seemed easier to design a new tensor library from scratch to take
advantage of multiple cores and/or gpus and extend its API over time rather
than trying to improve the performance of blitz++.

--
Benoit <http://bsteiner.info>