Home Page for Marc Baboulin

    Marc Baboulin, PhD, HDR

Professor at Université Paris-Saclay (France)

Senior Research Scientist at Inria

Laboratoire Méthodes Formelles (LMF), Inria team QuaCS

E-mail: marc.baboulin [at] lmf.cnrs.fr

Research Interests

Quantum computing and simulation
Quantum algorithms
High-performance computing

Current projects

AQEDP: Quantum algorithms for PDEs.
HQI: France Hybrid Quantum Initiative, PI for UPSaclay.
NUMPEX: French Exascale project, PI for UPSaclay.
Quantum algorithms for High Performance Computing (with ATOS/Eviden).
MAGMA: Linear algebra software library for heterogeneous architectures (with University of Tennessee, USA).

Papers

O. Koska, M. Baboulin, A. Gazda
A mixed-precision quantum-classical algorithm for solving linear systems.
IPDPS 2025 - 39th IEEE International Parallel and Distributed Processing Symposium Workshops, Milan, Italy, pp. 501-508 (2025), PDF File.

M. Baboulin, O. Kaya, T. Mary, M. Robeyns
Mixed precision iterative refinement for low-rank matrix and tensor approximations.
SIAM Journal on Scientific Computing (SISC), Vol. 47, No 5, pp. A2906-A2935 (2025), PDF File.

G. He, S. Vialle, M. Baboulin
Generating Sparse Matrices for Large-Scale Spectral Clustering on a Single GPU.
International Journal of Parallel Programming, Vol. 53, No 4 (2025), PDF File.

M. Baboulin, S. Donfack, O. Kaya, T. Mary, M. Robeyns
Mixed precision randomized low-rank approximation with GPU tensor cores.
Proceedings of the Euro-Par 2024 Conference, Lecture Notes in Computer Science, Springer, Vol. 14803, pp. 31-44 (2024), PDF File.

O. Koska, M. Baboulin, A. Gazda
A tree-approach Pauli decomposition algorithm with application to quantum computing.
ISC High Performance 2024 Research Paper Proceedings (39th International Conference), Hamburg, Germany, pp. 1-11 (2024), PDF File.

T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
Decoding techniques applied to the compilation of CNOT circuits for NISQ architectures.
Science of Computer Programming, Vol. 214:102726 (2022).

G. He, S. Vialle, M. Baboulin
Parallel and accurate k-means algorithm on CPU-GPU architectures for spectral clustering.
Concurrency and Computation: Practice and Experience, Vol. 34, No 14 (2022), PDF File.

T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
Gaussian elimination versus greedy methods for the synthesis of linear reversible circuits.
ACM Transactions on Quantum Computing, Vol. 2, No 3, pp. 1-26 (2021).

T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
Reducing the depth of linear reversible quantum circuits.
IEEE Transactions on Quantum Engineering, Vol. 2, pp. 1-22 (2021).

G. He, S. Vialle, N. Sylvestre, M. Baboulin
Scalable Algorithms Using Sparse Storage for Parallel Spectral Clustering on GPU.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 13152, pp. 40-52 (2021).

G. He, S. Vialle, M. Baboulin
Parallelization of the k-means Algorithm in a Spectral Clustering Chain on CPU-GPU Platforms.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 12480, pp. 135-147 (2020), PDF File.

T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
Quantum CNOT circuits synthesis for NISQ architectures using the syndrome decoding problem.
Proceedings of the 12th Conference on Reversible Computation (RC 2020).
Lecture Notes in Computer Science, Springer, Vol. 12227, pp. 189-205 (2020), PDF File.

T. Goubault de Brugière, M. Baboulin, B. Valiron, C. Allouche
Quantum circuits synthesis using Householder transformations.
Computer Physics Communications, Vol. 248, p. 107001 (2020), PDF File.

T. Goubault de Brugière, M. Baboulin, B. Valiron, C. Allouche
Synthesizing quantum circuits via numerical optimization.
Proceedings of the International Conference on Computational Science (ICCS 2019).
Lecture Notes in Computer Science, Springer, Vol. 11537, pp. 3-16 (2019), PDF File.

I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J. Dongarra
Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices.
Parallel Computing, Vol. 81, pp. 1-21 (2019), PDF File.

G. W. Howell, M. Baboulin
Iterative Solution of Sparse Linear Least Squares using LU Factorization.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2018).
ACM digital library, pp. 47-53 (2018), PDF File.

C. Allouche, M. Baboulin, T. Goubault de Brugière, B. Valiron
Reuse method for quantum circuit synthesis.
Recent Advances in Mathematical and Statistical Methods, Springer-Verlag, Vol. 259, pp. 3-12 (2018), PDF File.

E. Coleman, A. Jamal, M. Baboulin, A. Khabou, M. Sosonkina
A Comparison of Soft-Fault Error Models in the Parallel Preconditioned Flexible GMRES.
Proceedings of the 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 10777, pp. 36-46 (2017), PDF File.

M. Baboulin, J. Dongarra, A. Rémy, S. Tomov, I. Yamazaki
Solving Dense Symmetric Indefinite Systems using GPUs.
Concurrency and Computation: Practice and Experience, Vol. 29, No 9 (2017), PDF File.

H. Anzt, M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, Y. Wang
Accelerating the conjugate gradient algorithm with GPU in CFD Simulations.
Proceedings of the International Conference on Vector and Parallel Processing (VecPar 2016).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 10150, pp. 35-43 (2016), PDF File.

I. Masliah, M. Baboulin, J. Falcou
Meta-programming and multi-stage programming for GPGPUs.
Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC 2016).
IEEE Xplore Digital Library, pp. 369-376 (2016), PDF File.

A. Jamal, M. Baboulin, A. Khabou, M. Sosonkina
A hybrid CPU/GPU approach for the parallel algebraic recursive multilevel solver pARMS.
Proceedings of the 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2016).
IEEE Xplore Digital Library, pp. 411-416 (2016), PDF File.

I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J. Dongarra
High-Performance Matrix-Matrix Multiplications of Very Small Matrices.
Proceedings of Euro-Par 2016.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 9833, pp. 659-671 (08/2016), PDF File.

A. Abdelfattah, M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, S. Tomov
High-Performance Tensor Contractions for GPUs.
Proceedings of the International Conference on Computational Science, ICCS 2016.
Procedia Computer Science, Elsevier, Vol. 80, pp. 108-118 (06/2016), PDF File.

M. Baboulin, J. Dongarra, A. Rémy, S. Tomov, I. Yamazaki
Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures.
Proceedings of the 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 9573, pp. 86-95 (2016), PDF File.

G. W. Howell, M. Baboulin
LU Preconditioning for Overdetermined Sparse Least Squares Problems.
Proceedings of the 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 9573, pp. 128-137 (2016), PDF File.

M. Baboulin, A. Jamal, M. Sosonkina
Using Random Butterfly Transformations in Parallel Schur Complement-Based Preconditioning.
Proceedings of the 2015 Federated Conference on Computer Science and Information Systems (FedCSIS 2015).
Vol. 5, pp. 649-654 (2015), PDF File.

M. Baboulin, A. Khabou, A. Rémy
A randomized LU-based solver using GPU and Intel Xeon Phi accelerators.
Proceedings of the Euro-Par 2015 workshop "HeteroPar - Algorithms, Models, and Tools for Parallel Computing on Heterogeneous Platforms".
Lecture Notes in Computer Science, Springer-Verlag, Vol. 9523, pp. 175-184 (2015), PDF File.

I. Masliah, M. Baboulin, J. Falcou
Metaprogramming dense linear algebra solvers. Applications to multi and many-core architectures.
Proceedings of the 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA-15).
IEEE Xplore Digital Library, Vol. 3, pp. 69-76 (2015), PDF File.

M. Baboulin, J. Dongarra, R. Lacroix
Computing least squares condition numbers on hybrid multicore/GPU systems.
Interdisciplinary Topics in Applied Mathematics, Modeling and Computational Science, Vol. 117, pp. 35-41 (2015), PDF File.

M. Baboulin, X. S. Li, F-H. Rouet
Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods.
Proceedings of the International Conference on Vector and Parallel Processing (VecPar 2014).
Lecture Notes in Computer Science, Springer-Verlag, Vol. 8969, pp. 135-144 (2014), PDF File.

G. Fursin, R. Miceli, A. Lokhmotov, M. Gerndt, M. Baboulin, A. Malony, Z. Chamski, D. Novillo, D. Del Vento
Collective mind: Towards practical and collaborative auto-tuning.
Scientific Programming, IOS Press, Vol. 22, No 4, pp. 309-329 (2014), PDF File.

M. Baboulin, D. Becker, G. Bosilca, A. Danalis, J. Dongarra
An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems.
Parallel Computing, Vol. 40, No 7, pp. 213-223 (2014), PDF File.

Y. Wang, M. Baboulin, K. Rupp, O. Le Maître, Y. Fraigneau
Solving 3D incompressible Navier-Stokes equations on hybrid CPU/GPU systems.
Proceedings of the 22nd High Performance Computing Symposium (HPC'14).
ACM digital library, article 12 (2014), PDF File.

A. Rémy, M. Baboulin, M. Sosonkina, B. Rozoy
Locality optimization on a NUMA architecture for hybrid LU factorization.
Proceedings of the International Conference on Parallel Computing, PARCO 2013.
Advances in Parallel Computing, IOS Press, Vol. 25, pp. 153-162 (2014), PDF File.

M. Baboulin, S. Gratton, R. Lacroix, A. J. Laub
Statistical estimates for the conditioning of linear least squares problems.
Proceedings of the 10th International Conference on Parallel Processing and Applied Mathematics, PPAM 2013.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 8384, pp. 124-133 (2014), PDF File.

Y. Wang, M. Baboulin, J. Dongarra, J. Falcou, Y. Fraigneau, O. Le Maître
A parallel solver for incompressible fluid flows.
Proceedings of the International Conference on Computational Science, ICCS 2013.
Procedia Computer Science, Elsevier, Vol. 18, pp. 439-448 (06/2013), PDF File.

M. Baboulin, J. Dongarra, J. Herrmann, S. Tomov
Accelerating linear system solutions using randomization techniques.
ACM Transactions on Mathematical Software (TOMS), Vol. 39, No 2 (2013), PDF File.

M. Baboulin, S. Donfack, J. Dongarra, L. Grigori, A. Rémy, S. Tomov
A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines.
Inria Research Report 7854 (02/2012).
Proceedings of the International Conference on Computational Science, ICCS 2012.
Procedia Computer Science, Elsevier, Vol. 9, pp. 17-26 (2012), PDF File.

M. Baboulin, D. Becker, J. Dongarra
A parallel tiled solver for dense symmetric indefinite systems on multicore architectures.
Inria Research Report 7762 (12/2011), also appeared as LAPACK Working Note 261.
Proceedings of IEEE International Parallel & Distributed Processing Symposium, IPDPS 2012, PDF File.

D. Becker, M. Baboulin, J. Dongarra
Reducing the amount of pivoting in symmetric indefinite systems.
Inria Research Report 7621 (05/2011), University of Tennessee Technical Report ICL-UT-11-06.
Proceedings of the 9th International Conference on Parallel Processing and Applied Mathematics, PPAM 2011.
Lecture Notes in Computer Science, Springer-Verlag, Vol. 7203, pp. 133-142 (2012), PDF File.

M. Baboulin, S. Gratton
A contribution to the conditioning of the total least squares problem.
Inria Research Report 7488 (12/2010), also appeared as LAPACK Working Note 236.
SIAM Journal on Matrix Analysis and Applications, Vol. 32, No 3, pp. 685-699 (2011), PDF File.

S. Tomov, J. Dongarra, M. Baboulin
Towards dense linear algebra for hybrid GPU accelerated manycore systems.
Parallel Computing, Vol. 36, No 5&6, pp. 232-240 (2010), PDF File.

M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, S. Tomov
Accelerating scientific computations with mixed precision algorithms.
Computer Physics Communications, Vol. 180, No 12, pp. 2526-2533 (2009), PDF File.

M. Baboulin, J. Dongarra, S. Gratton, J. Langou
Computing the conditioning of the components of a linear least squares solution.
Numerical Linear Algebra with Applications, Vol. 16, No 7, pp. 517-533 (2009), PDF File.

M. Baboulin, S. Gratton
Using dual techniques to derive componentwise and mixed condition numbers for a linear function of a linear least squares solution.
BIT Numerical Mathematics, Vol. 49, No 1, pp. 3-19 (2009), PDF File.

M. Baboulin, J. Dongarra, S. Tomov
Some issues in dense linear algebra for multicore and special purpose architectures.
Proceedings of the 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA'08).
Lecture Notes in Computer Science, Vol. 6126-6127, Springer-Verlag (2008), PDF File.

M. Baboulin, L. Giraud, S. Gratton, J. Langou
Parallel tools for solving incremental dense least squares problems. Application to space geodesy.
Journal of Algorithms and Computational Technology, Vol. 3, No 1, pp. 117-133 (2009), PDF File.

M. Baboulin, L. Giraud, S. Gratton, J. Langou
A distributed packed storage for large dense parallel in-core calculations.
Concurrency and Computation: Practice and Experience, Vol. 19, No 4, pp. 483-502 (2007), PDF File.

M. Arioli, M. Baboulin, S. Gratton
A partial condition number for linear least squares problems.
SIAM Journal on Matrix Analysis and Applications, Vol. 29, No 2, pp. 413-433 (2007), PDF File.

M. Baboulin, L. Giraud, S. Gratton
A parallel distributed solver for large dense symmetric systems: applications to geodesy and electromagnetism problems.
International Journal of High Performance Computing Applications, Vol. 19, No 4, pp. 353-363 (2005), PDF File.

Theses

Title: Fast and reliable solutions for numerical linear algebra solvers in high-performance computing.
Habilitation à Diriger des Recherches (HDR) from University Paris-Sud, defended December 5, 2012.
Committee: J.C. Bajard (Université Paris 6), P. Dague (Université Paris-Sud), F. Desprez (Inria/Ecole Normale Supérieure de Lyon, referee), Jack Dongarra (University of Tennessee, USA), S. Gratton (ENSEEIHT), P. Langlois (Université de Perpignan, referee), J. Roman (Universitat Politècnica de València, Spain, referee), B. Rozoy (Université Paris-Sud).
HDR dissertation

Title: Solving large dense linear least squares problems on parallel distributed computers. Application to the Earth's gravity field computation.
Ph.D. in Computer Science from Institut National Polytechnique de Toulouse, defended March 21, 2006.
Committee: G. Balmino (CNES/CNRS), J. Dongarra (University of Tennessee, USA, referee), I.S. Duff (RAL/CERFACS), L. Giraud (ENSEEIHT), S. Gratton (CERFACS), N.J. Higham (University of Manchester, UK, referee), J. Noailles (ENSEEIHT).
Ph.D. dissertation
This thesis was awarded the Léopold Escande Prize (best PhD thesis) by Institut National Polytechnique de Toulouse.