[eigen] assembling a sparse matrix from multiple threads

Discussion:

Lorenzo Botti

2018-07-05 08:12:10 UTC

Dear all,
I'm having troubles assembling a sparse matrix in parallel, from multiple
threads.
In particular if I use matrix.insert(i,j) to set the nonzero entries in
several concurrent threads the code fails randomly in optimized mode while,
in debug mode, an Eigen's error message is triggered saying that one of the
matrix entries already exists. The code behavior is pretty unpredictable
when repeatedly assembling a sparse matrix in a loop.

I'm confident that my code is thread safe and matrix entries are unique as
confirmed by the following: if I collect matrix entries in vectors of
triplets (one vector for each thread) and then I assemble the matrix
serially (using matrix.insert(triplet.row(),triplet.col()) =
triplet.value()) everything works fine.

I guess that concurrent insertion is not allowed/supported. Am I right?

Thanks for help
Lorenzo

Dan Čermák

2018-07-05 08:42:41 UTC

Permalink

Hi Lorenzo,

I haven't used Eigen in a while, but afaik it uses the compressed column
storage scheme which (at least I imagine it) makes it complicated to
insert new values efficiently in parallel.

Also the documentation of insert states that it is optimized for
sequential insertion
(https://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#title22),
which you probably aren't performing.

I guess you are probably better of assembling the matrix entries in the
vector of triplets and then creating the matrix afterwards. But someone
more familiar with Eigen should probably confirm that.

Cheers,

Dan

Post by Lorenzo Botti
Dear all,
I'm having troubles assembling a sparse matrix in parallel, from multiple
threads.
In particular if I use matrix.insert(i,j) to set the nonzero entries in
several concurrent threads the code fails randomly in optimized mode while,
in debug mode, an Eigen's error message is triggered saying that one of the
matrix entries already exists. The code behavior is pretty unpredictable
when repeatedly assembling a sparse matrix in a loop.
I'm confident that my code is thread safe and matrix entries are unique as
confirmed by the following: if I collect matrix entries in vectors of
triplets (one vector for each thread) and then I assemble the matrix
serially (using matrix.insert(triplet.row(),triplet.col()) =
triplet.value()) everything works fine.
I guess that concurrent insertion is not allowed/supported. Am I right?
Thanks for help
Lorenzo

Christoph Hertzberg

2018-07-05 11:07:33 UTC

Permalink

Hi!

Post by Dan ÄermÃ¡k
Hi Lorenzo,
I haven't used Eigen in a while, but afaik it uses the compressed column
storage scheme which (at least I imagine it) makes it complicated to
insert new values efficiently in parallel.

Yes, it uses CCS (or CRS in RowMajor mode), but as soon as you use the
`.insert(r,c)` method (or some related methods) it will get into a
non-compressed state -- making it slightly less costly (but still
sub-optimal) to insert in random order.
However, there is no synchronization happening. If you insert from
multiple threads, you need to lock the access to the matrix, which most
certainly is less efficient than inserting from a single thread.
You have the same problem when inserting values into a std::vector from
multiple threads.

Post by Dan ÄermÃ¡k
Also the documentation of insert states that it is optimized for
sequential insertion
(https://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#title22),
which you probably aren't performing.
I guess you are probably better of assembling the matrix entries in the
vector of triplets and then creating the matrix afterwards. But someone
more familiar with Eigen should probably confirm that.

Inserting into a single std::vector has similar synchronization
problems, so maybe collecting a vector for each thread is an option
(which then need to be concatenated, ideally without actually copying
the data).

The ideal solution will depend a lot on whether you know something about
the structure of the matrix, or the order in which values will be
inserted (e.g., if different threads never insert into the same column).

Christoph

Post by Dan ÄermÃ¡k
Cheers,
Dan

--
Dr.-Ing. Christoph Hertzberg

Besuchsadresse der Nebengeschäftsstelle:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 5
28359 Bremen, Germany

Postadresse der Hauptgeschäftsstelle Standort Bremen:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 1
28359 Bremen, Germany

Tel.: +49 421 178 45-4021
Zentrale: +49 421 178 45-0
E-Mail: ***@dfki.de

Weitere Informationen: http://www.dfki.de/robotik
-----------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Straße 122, D-67663 Kaiserslautern
Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Vorsitzender) Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973
Steuernummer: 19/672/50006
-----------------------------------------------------------------------

Lorenzo Botti

2018-07-05 12:37:54 UTC

Permalink

Post by Christoph Hertzberg
However, there is no synchronization happening. If you insert from
multiple threads, you need to lock the access to the matrix, which most
certainly is less efficient than inserting from a single thread.
You have the same problem when inserting values into a std::vector from
multiple threads.

The point here is that I know that the code is thread safe, that is no more
than one thread will try write in position row i col j.
Accondingly performing the assembly in parallel should not required to lock
the insertion operation, am I wrong?

Post by Christoph Hertzberg
Inserting into a single std::vector has similar synchronization
problems, so maybe collecting a vector for each thread is an option
(which then need to be concatenated, ideally without actually copying
the data).

Yes, exactly. If I do that the code works just fine.

Christoph Hertzberg

2018-07-05 12:45:33 UTC

Permalink

Post by Lorenzo Botti
The point here is that I know that the code is thread safe, that is no more
than one thread will try write in position row i col j.
Accondingly performing the assembly in parallel should not required to lock
the insertion operation, am I wrong?

That would be true for a dense matrix, but for a sparse matrix that
can't work (in general) since sometimes values are moved or memory gets
re-allocated.

To repeat it more clearly: An Eigen::SparseMatrix is _not_ inherently
thread-safe (when values are inserted).

Christoph
--
Dr.-Ing. Christoph Hertzberg

Besuchsadresse der Nebengeschäftsstelle:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 5
28359 Bremen, Germany

Postadresse der Hauptgeschäftsstelle Standort Bremen:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 1
28359 Bremen, Germany

Tel.: +49 421 178 45-4021
Zentrale: +49 421 178 45-0
E-Mail: ***@dfki.de

Weitere Informationen: http://www.dfki.de/robotik
-----------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Straße 122, D-67663 Kaiserslautern
Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Vorsitzender) Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973
Steuernummer: 19/672/50006
-----------------------------------------------------------------------

Gael Guennebaud

2018-07-05 16:10:37 UTC

Permalink

Hi,

if you can guaranty that each column* is written by a single thread AND
that you can predict a reasonable upper-bound for the number of non-zeros
in each column*, then you can preallocate using reserve() and the
subsequent writes within different columns will be thread safe.

* you can replace "column" by row if you use a row-majpr storage.

gael

On Thu, Jul 5, 2018 at 3:20 PM Christoph Hertzberg <

Post by Christoph Hertzberg

Post by Lorenzo Botti
The point here is that I know that the code is thread safe, that is no

Post by Lorenzo Botti
than one thread will try write in position row i col j.
Accondingly performing the assembly in parallel should not required to

lock

Post by Lorenzo Botti
the insertion operation, am I wrong?

That would be true for a dense matrix, but for a sparse matrix that
can't work (in general) since sometimes values are moved or memory gets
re-allocated.
To repeat it more clearly: An Eigen::SparseMatrix is _not_ inherently
thread-safe (when values are inserted).
Christoph
--
Dr.-Ing. Christoph Hertzberg
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-StraÃe 5
28359 Bremen, Germany
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-StraÃe 1
28359 Bremen, Germany
Tel.: +49 421 178 45-4021
Zentrale: +49 421 178 45-0
Weitere Informationen: http://www.dfki.de/robotik
-----------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter StraÃe 122, D-67663 Kaiserslautern
Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Vorsitzender) Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973
Steuernummer: 19/672/50006
-----------------------------------------------------------------------

Brad Bell

2018-07-05 16:22:33 UTC

Permalink

I think that one would have to be careful to avoid false sharing
https://en.wikipedia.org/wiki/False_sharing
to make this efficient; i.e., the memory for each thread would need to be in separate cache.

Hi,
if you can guaranty that each column* is written by a single thread AND that you can predict a
reasonable upper-bound for the number of non-zeros in each column*, then you can preallocate using
reserve() and the subsequent writes within different columns will be thread safe.
* you can replace "column" by row if you use a row-majpr storage.
gael

That would be true for a dense matrix, but for a sparse matrix that
can't work (in general) since sometimes values are moved or memory gets
re-allocated.
To repeat it more clearly: An Eigen::SparseMatrix is _not_ inherently
thread-safe (when values are inserted).
Christoph
--
Dr.-Ing. Christoph Hertzberg
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 5
28359 Bremen, Germany
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 1
28359 Bremen, Germany
Tel.: +49 421 178 45-4021
Zentrale: +49 421 178 45-0
Weitere Informationen: http://www.dfki.de/robotik
-----------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Straße 122, D-67663 Kaiserslautern
Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Vorsitzender) Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973
Steuernummer: 19/672/50006
-----------------------------------------------------------------------

Lorenzo Botti

2018-07-05 20:23:42 UTC

Permalink

Ok, thanks! It works.
The trick is to make the matrix storage consistent to the thread's filling
behavior (by row in my case).
One caveat: I need to recreate the matrix at each assembly, using setZero()
does not work.

Il giorno gio 5 lug 2018 alle ore 19:28 Gael Guennebaud <

Post by Gael Guennebaud
Hi,
if you can guaranty that each column* is written by a single thread AND
that you can predict a reasonable upper-bound for the number of non-zeros
in each column*, then you can preallocate using reserve() and the
subsequent writes within different columns will be thread safe.
* you can replace "column" by row if you use a row-majpr storage.
gael
On Thu, Jul 5, 2018 at 3:20 PM Christoph Hertzberg <

Post by Christoph Hertzberg

Post by Lorenzo Botti
The point here is that I know that the code is thread safe, that is no

Post by Lorenzo Botti
than one thread will try write in position row i col j.
Accondingly performing the assembly in parallel should not required to

lock

Post by Lorenzo Botti
the insertion operation, am I wrong?

That would be true for a dense matrix, but for a sparse matrix that
can't work (in general) since sometimes values are moved or memory gets
re-allocated.
To repeat it more clearly: An Eigen::SparseMatrix is _not_ inherently
thread-safe (when values are inserted).
Christoph
--
Dr.-Ing. Christoph Hertzberg
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-StraÃe 5
28359 Bremen, Germany
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-StraÃe 1
28359 Bremen, Germany
Tel.: +49 421 178 45-4021
Zentrale: +49 421 178 45-0
Weitere Informationen: http://www.dfki.de/robotik
-----------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter StraÃe 122, D-67663 Kaiserslautern
Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Vorsitzender) Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973
Steuernummer: 19/672/50006
-----------------------------------------------------------------------

Andrea Bocci

2018-07-05 12:47:57 UTC

Permalink

Post by Christoph Hertzberg
However, there is no synchronization happening. If you insert from

Post by Christoph Hertzberg
multiple threads, you need to lock the access to the matrix, which most
certainly is less efficient than inserting from a single thread.
You have the same problem when inserting values into a std::vector from
multiple threads.

The point here is that I know that the code is thread safe, that is no
more than one thread will try write in position row i col j.
Accondingly performing the assembly in parallel should not required to
lock the insertion operation, am I wrong?

Yes, unfortunately: if I understand correctly how it works, a sparse matrix
does not preallocate all its elements; they are allocated as the matrix is
filled, and this part that is not safe for concurrent operations.

.A

Christoph Hertzberg

2018-07-05 12:57:28 UTC

Permalink

Post by Andrea Bocci
Yes, unfortunately: if I understand correctly how it works, a sparse matrix
does not preallocate all its elements; they are allocated as the matrix is
filled, and this part that is not safe for concurrent operations.

You can actually pre-allocate a sparse matrix (using
SparseMatrix::reserve():
http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#ac2b219eb36fbab0bfae535cfbfc74a76

Even then (i.e., if you can guarantee that sufficiently many elements
are reserved per inner vector), insertion is still not completely
thread-safe, because elements can get moved, and the `m_data.m_size`
member variable will be modified by different threads -- you may get
lucky here, but you should really not rely on that.

Christoph
--
Dr.-Ing. Christoph Hertzberg

Besuchsadresse der Nebengeschäftsstelle:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 5
28359 Bremen, Germany

Postadresse der Hauptgeschäftsstelle Standort Bremen:
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-Straße 1
28359 Bremen, Germany

Tel.: +49 421 178 45-4021
Zentrale: +49 421 178 45-0
E-Mail: ***@dfki.de

Weitere Informationen: http://www.dfki.de/robotik
-----------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter Straße 122, D-67663 Kaiserslautern
Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Vorsitzender) Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973
Steuernummer: 19/672/50006
-----------------------------------------------------------------------

Lorenzo Botti

2018-07-05 15:48:27 UTC

Permalink

Ok, got it.
Thank you very much for pointing this out.

Bests
Lorenzo

Il giorno gio 5 lug 2018 alle ore 16:50 Christoph Hertzberg <

Post by Lorenzo Botti

Post by Andrea Bocci
Yes, unfortunately: if I understand correctly how it works, a sparse

matrix

Post by Andrea Bocci
does not preallocate all its elements; they are allocated as the matrix

Post by Andrea Bocci
filled, and this part that is not safe for concurrent operations.

You can actually pre-allocate a sparse matrix (using
http://eigen.tuxfamily.org/dox/classEigen_1_1SparseMatrix.html#ac2b219eb36fbab0bfae535cfbfc74a76
Even then (i.e., if you can guarantee that sufficiently many elements
are reserved per inner vector), insertion is still not completely
thread-safe, because elements can get moved, and the `m_data.m_size`
member variable will be modified by different threads -- you may get
lucky here, but you should really not rely on that.
Christoph
--
Dr.-Ing. Christoph Hertzberg
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-StraÃe 5
28359 Bremen, Germany
DFKI GmbH
Robotics Innovation Center
Robert-Hooke-StraÃe 1
28359 Bremen, Germany
Tel.: +49 421 178 45-4021
Zentrale: +49 421 178 45-0
Weitere Informationen: http://www.dfki.de/robotik
-----------------------------------------------------------------------
Deutsches Forschungszentrum fuer Kuenstliche Intelligenz GmbH
Firmensitz: Trippstadter StraÃe 122, D-67663 Kaiserslautern
Geschaeftsfuehrung: Prof. Dr. Dr. h.c. mult. Wolfgang Wahlster
(Vorsitzender) Dr. Walter Olthoff
Vorsitzender des Aufsichtsrats: Prof. Dr. h.c. Hans A. Aukes
Amtsgericht Kaiserslautern, HRB 2313
Sitz der Gesellschaft: Kaiserslautern (HRB 2313)
USt-Id.Nr.: DE 148646973
Steuernummer: 19/672/50006
-----------------------------------------------------------------------