Marc Baboulin, PhD, HDR

Professor at Université Paris-Saclay (France)

Laboratoire Méthodes Formelles (LMF)

Inria team QuaCS

E-mail: marc.baboulin [at]

Research Interests

  • Quantum computing and simulation
  • Quantum algorithms
  • High-performance computing and software libraries
  • Numerical linear algebra

Current projects

  • HQI: France Hybrid Quantum Initiative
  • NUMPEX: French Exascale project
  • Quantum algorithms for High Performance Computing (with ATOS/Eviden)
  • MAGMA: Linear algebra software library for heterogeneous architectures (with University of Tennessee, USA)

Publications


  • M. Baboulin, S. Donfack, O. Kaya, T. Mary, M. Robeyns
    Mixed precision randomized low-rank approximation with GPU tensor cores.
    To appear in the Proceedings of the Euro-Par 2024 Conference.
  • O. Koska, M. Baboulin, A. Gazda
    A tree-approach Pauli decomposition algorithm with application to quantum computing.
    ISC High Performance 2024 Research Paper Proceedings (39th International Conference), Hamburg, Germany, pp. 1-11 (2024), PDF File.
  • T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
    Decoding techniques applied to the compilation of CNOT circuits for NISQ architectures.
    Science of Computer Programming, Vol. 214:102726 (2022).
  • G. He, S. Vialle, M. Baboulin
    Parallel and accurate k-means algorithm on CPU-GPU architectures for spectral clustering.
    Concurrency and Computation: Practice and Experience, Vol. 34, No 14 (2022), PDF File.
  • T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
    Gaussian elimination versus greedy methods for the synthesis of linear reversible circuits.
    ACM Transactions on Quantum Computing, Vol. 2, No 3, pp. 1-26 (2021).
  • T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
    Reducing the depth of linear reversible quantum circuits.
    IEEE Transactions on Quantum Engineering, Vol. 2, pp. 1-22 (2021).
  • G. He, S. Vialle, N. Sylvestre, M. Baboulin
    Scalable Algorithms Using Sparse Storage for Parallel Spectral Clustering on GPU.
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 13152, pp. 40-52 (2021).
  • G. He, S. Vialle, M. Baboulin
    Parallelization of the k-means Algorithm in a Spectral Clustering Chain on CPU-GPU Platforms.
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 12480, pp. 135-147 (2020), PDF File.
  • T. Goubault de Brugière, M. Baboulin, B. Valiron, S. Martiel, C. Allouche
    Quantum CNOT circuits synthesis for NISQ architectures using the syndrome decoding problem.
    Proceedings of the 12th Conference on Reversible Computation (RC 2020).
    Lecture Notes in Computer Science, Springer, Vol. 12227, pp. 189-205 (2020), PDF File.
  • T. Goubault de Brugière, M. Baboulin, B. Valiron, C. Allouche
    Quantum circuits synthesis using Householder transformations.
    Computer Physics Communications, Vol. 248, p. 107001 (2020), PDF File.
  • T. Goubault de Brugière, M. Baboulin, B. Valiron, C. Allouche
    Synthesizing quantum circuits via numerical optimization.
    Proceedings of the International Conference on Computational Science (ICCS 2019).
    Lecture Notes in Computer Science, Springer, Vol. 11537, pp. 3-16 (2019), PDF File.
  • I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J. Dongarra
    Algorithms and optimization techniques for high-performance matrix-matrix multiplications of very small matrices.
    Parallel Computing, Vol. 81, pp. 1-21 (2019), PDF File.
  • G. W. Howell, M. Baboulin
    Iterative Solution of Sparse Linear Least Squares using LU Factorization.
    Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2018).
    ACM digital library, pp. 47-53 (2018), PDF File.
  • C. Allouche, M. Baboulin, T. Goubault de Brugière, B. Valiron
    Reuse method for quantum circuit synthesis.
    Recent Advances in Mathematical and Statistical Methods, Springer-Verlag, Vol. 259, pp. 3-12 (2018), PDF File.
  • E. Coleman, A. Jamal, M. Baboulin, A. Khabou, M. Sosonkina
    A Comparison of Soft-Fault Error Models in the Parallel Preconditioned Flexible GMRES.
    Proceedings of the 12th International Conference on Parallel Processing and Applied Mathematics (PPAM 2017).
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 10777, pp. 36-46 (2017), PDF File.
  • M. Baboulin, J. Dongarra, A. Rémy, S. Tomov, I. Yamazaki
    Solving Dense Symmetric Indefinite Systems using GPUs.
    Concurrency and Computation: Practice and Experience, Vol. 29, No 9 (2017), PDF File.
  • H. Anzt, M. Baboulin, J. Dongarra, Y. Fournier, F. Hulsemann, A. Khabou, Y. Wang
    Accelerating the conjugate gradient algorithm with GPU in CFD Simulations.
    Proceedings of the International Conference on Vector and Parallel Processing (VecPar 2016).
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 10150, pp. 35-43 (2016), PDF File.
  • I. Masliah, M. Baboulin, J. Falcou
    Meta-programming and multi-stage programming for GPGPUs.
    Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSOC 2016).
    IEEE Xplore Digital Library, pp. 369-376 (2016), PDF File.
  • A. Jamal, M. Baboulin, A. Khabou, M. Sosonkina
    A hybrid CPU/GPU approach for the parallel algebraic recursive multilevel solver pARMS.
    Proceedings of the 18th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2016).
    IEEE Xplore Digital Library, pp. 411-416 (2016), PDF File.
  • I. Masliah, A. Abdelfattah, A. Haidar, S. Tomov, M. Baboulin, J. Falcou, J. Dongarra
    High-Performance Matrix-Matrix Multiplications of Very Small Matrices.
    Proceedings of Euro-Par 2016.
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 9833, pp. 659-671 (08/2016), PDF File.
  • A. Abdelfattah, M. Baboulin, V. Dobrev, J. Dongarra, C. Earl, J. Falcou, A. Haidar, I. Karlin, T. Kolev, I. Masliah, S. Tomov
    High-Performance Tensor Contractions for GPUs.
    Proceedings of the International Conference on Computational Science, ICCS 2016.
    Procedia Computer Science, Elsevier, Vol. 80, pp. 108-118 (06/2016), PDF File.
  • M. Baboulin, J. Dongarra, A. Rémy, S. Tomov, I. Yamazaki
    Dense Symmetric Indefinite Factorization on GPU Accelerated Architectures.
    Proceedings of the 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015).
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 9573, pp. 86-95 (2016), PDF File.
  • G. W. Howell, M. Baboulin
    LU Preconditioning for Overdetermined Sparse Least Squares Problems.
    Proceedings of the 11th International Conference on Parallel Processing and Applied Mathematics (PPAM 2015).
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 9573, pp. 128-137 (2016), PDF File.
  • M. Baboulin, A. Jamal, M. Sosonkina
    Using Random Butterfly Transformations in Parallel Schur Complement-Based Preconditioning.
    Proceedings of the 2015 Federated Conference on Computer Science and Information Systems (FedCSIS 2015).
    Vol. 5, pp. 649-654 (2015), PDF File.
  • M. Baboulin, A. Khabou, A. Rémy
    A randomized LU-based solver using GPU and Intel Xeon Phi accelerators.
    Proceedings of the Euro-Par 2015 workshop "HeteroPar - Algorithms, Models, and Tools for Parallel Computing on Heterogeneous Platforms".
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 9523, pp. 175-184 (2015), PDF File.
  • I. Masliah, M. Baboulin, J. Falcou
    Metaprogramming dense linear algebra solvers. Applications to multi and many-core architectures.
    Proceedings of the 13th IEEE International Symposium on Parallel and Distributed Processing with Applications (IEEE ISPA-15).
    IEEE Xplore Digital Library, Vol. 3, pp. 69-76 (2015), PDF File.
  • M. Baboulin, J. Dongarra, R. Lacroix
    Computing least squares condition numbers on hybrid multicore/GPU systems.
    Interdisciplinary Topics in Applied Mathematics, Modeling and Computational Science, Vol. 117, pp. 35-41 (2015), PDF File.
  • M. Baboulin, X. S. Li, F-H. Rouet
    Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods.
    Proceedings of the International Conference on Vector and Parallel Processing (VecPar 2014).
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 8969, pp. 135-144 (2014), PDF File.
  • G. Fursin, R. Miceli, A. Lokhmotov, M. Gerndt, M. Baboulin, A. Malony, Z. Chamski, D. Novillo, D. Del Vento
    Collective mind: Towards practical and collaborative auto-tuning.
    Scientific Programming, IOS Press, Vol. 22, No 4, pp. 309-329 (2014), PDF File.
  • M. Baboulin, D. Becker, G. Bosilca, A. Danalis, J. Dongarra
    An efficient distributed randomized algorithm for solving large dense symmetric indefinite linear systems.
    Parallel Computing, Vol. 40, No 7, pp. 213-223 (2014), PDF File.
  • Y. Wang, M. Baboulin, K. Rupp, O. Le Maître, Y. Fraigneau
    Solving 3D incompressible Navier-Stokes equations on hybrid CPU/GPU systems.
    Proceedings of the 22nd High Performance Computing Symposium (HPC'14).
    ACM digital library, article 12 (2014), PDF File.
  • A. Rémy, M. Baboulin, M. Sosonkina, B. Rozoy
    Locality optimization on a NUMA architecture for hybrid LU factorization.
    Proceedings of the International Conference on Parallel Computing, PARCO 2013.
    Advances in Parallel Computing, IOS Press, Vol. 25, pp. 153-162 (2014), PDF File.
  • M. Baboulin, S. Gratton, R. Lacroix, A. J. Laub
    Statistical estimates for the conditioning of linear least squares problems.
    Proceedings of the 10th International Conference on Parallel Processing and Applied Mathematics, PPAM 2013.
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 8384, pp. 124-133 (2014), PDF File.
  • Y. Wang, M. Baboulin, J. Dongarra, J. Falcou, Y. Fraigneau, O. Le Maître
    A parallel solver for incompressible fluid flows.
    Proceedings of the International Conference on Computational Science, ICCS 2013.
    Procedia Computer Science, Elsevier, Vol. 18, pp. 439-448 (06/2013), PDF File.
  • M. Baboulin, J. Dongarra, J. Herrmann, S. Tomov
    Accelerating linear system solutions using randomization techniques.
    ACM Transactions on Mathematical Software (TOMS), Vol. 39, No 2 (2013), PDF File.
  • M. Baboulin, S. Donfack, J. Dongarra, L. Grigori, A. Rémy, S. Tomov
    A class of communication-avoiding algorithms for solving general dense linear systems on CPU/GPU parallel machines.
    Inria Research Report 7854 (02/2012).
    Proceedings of the International Conference on Computational Science, ICCS 2012.
    Procedia Computer Science, Elsevier, Vol. 9, pp. 17-26 (2012), PDF File.
  • M. Baboulin, D. Becker, J. Dongarra
    A parallel tiled solver for dense symmetric indefinite systems on multicore architectures.
    Inria Research Report 7762 (12/2011), also appeared as LAPACK Working Note 261.
    Proceedings of IEEE International Parallel & Distributed Processing Symposium, IPDPS 2012, PDF File.
  • D. Becker, M. Baboulin, J. Dongarra
    Reducing the amount of pivoting in symmetric indefinite systems.
    Inria Research Report 7621 (05/2011), University of Tennessee Technical Report ICL-UT-11-06.
    Proceedings of the 9th International Conference on Parallel Processing and Applied Mathematics, PPAM 2011.
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 7203, pp. 133-142 (2012), PDF File.
  • M. Baboulin, S. Gratton
    A contribution to the conditioning of the total least squares problem.
    Inria Research Report 7488 (12/2010), also appeared as LAPACK Working Note 236.
    SIAM Journal on Matrix Analysis and Applications, Vol. 32, No 3, pp. 685-699 (2011), PDF File.
  • S. Tomov, J. Dongarra, M. Baboulin
    Towards dense linear algebra for hybrid GPU accelerated manycore systems.
    Parallel Computing, Vol. 36, No 5&6, pp. 232-240 (2010), PDF File.
  • M. Baboulin, A. Buttari, J. Dongarra, J. Kurzak, J. Langou, J. Langou, P. Luszczek, S. Tomov
    Accelerating scientific computations with mixed precision algorithms.
    Computer Physics Communications, Vol. 180, No 12, pp. 2526-2533 (2009), PDF File.
  • M. Baboulin, J. Dongarra, S. Gratton, J. Langou
    Computing the conditioning of the components of a linear least squares solution.
    Numerical Linear Algebra with Applications, Vol. 16, No 7, pp. 517-533 (2009), PDF File.
  • M. Baboulin, S. Gratton
    Using dual techniques to derive componentwise and mixed condition numbers for a linear function of a linear least squares solution.
    BIT Numerical Mathematics, Vol. 49, No 1, pp. 3-19 (2009), PDF File.
  • M. Baboulin, J. Dongarra, S. Tomov
    Some issues in dense linear algebra for multicore and special purpose architectures.
    Proceedings of the 9th International Workshop on State-of-the-Art in Scientific and Parallel Computing (PARA'08).
    Lecture Notes in Computer Science, Springer-Verlag, Vol. 6126-6127 (2008), PDF File.
  • M. Baboulin, L. Giraud, S. Gratton, J. Langou
    Parallel tools for solving incremental dense least squares problems. Application to space geodesy.
    Journal of Algorithms and Computational Technology, Vol. 3, No 1, pp. 117-133 (2009), PDF File.
  • M. Baboulin, L. Giraud, S. Gratton, J. Langou
    A distributed packed storage for large dense parallel in-core calculations.
    Concurrency and Computation: Practice and Experience, Vol. 19, No 4, pp. 483-502 (2007), PDF File.
  • M. Arioli, M. Baboulin, S. Gratton
    A partial condition number for linear least squares problems.
    SIAM Journal on Matrix Analysis and Applications, Vol. 29, No 2, pp. 413-433 (2007), PDF File.
  • M. Baboulin, L. Giraud, S. Gratton
    A parallel distributed solver for large dense symmetric systems: applications to geodesy and electromagnetism problems.
    International Journal of High Performance Computing Applications, Vol. 19, No 4, pp. 353-363 (2005), PDF File.

Dissertations


    Title: Fast and reliable solutions for numerical linear algebra solvers in high-performance computing.
    Habilitation à Diriger des Recherches (HDR) from University Paris-Sud, defended December 5, 2012.
    Committee: J.C. Bajard (Université Paris 6), P. Dague (Université Paris-Sud), F. Desprez (Inria/Ecole Normale Supérieure de Lyon, referee), Jack Dongarra (University of Tennessee, USA), S. Gratton (ENSEEIHT), P. Langlois (Université de Perpignan, referee), J. Roman (Universitat Politècnica de València, Spain, referee), B. Rozoy (Université Paris-Sud).
    HDR dissertation
    Title: Solving large dense linear least squares problems on parallel distributed computers. Application to the Earth's gravity field computation.
    Ph.D. in Computer Science from Institut National Polytechnique de Toulouse, defended March 21, 2006.
    Committee: G. Balmino (CNES/CNRS), J. Dongarra (University of Tennessee, USA, referee), I.S. Duff (RAL/CERFACS), L. Giraud (ENSEEIHT), S. Gratton (CERFACS), N.J. Higham (University of Manchester, UK, referee), J. Noailles (ENSEEIHT).
    Ph.D. dissertation
    This thesis was awarded the Léopold Escande Prize (best PhD thesis) by Institut National Polytechnique de Toulouse.