Doctoral Dissertation/Thesis (Danish: doktordisputats, German: Habilitation):

Aspects of the efficient implementation of the Message Passing Interface (MPI)
Aspekter vedrørende effektiv implementering af MPI


Submitted to the University of Copenhagen, November 1st, 2007. Accepted for defense, January 29th, 2009. Defended (successfully) for the Dr. Scient. degree, October 2nd, 2009.
Evaluation committee and official opponents: Professor Brian Vinter (University of Copenhagen), Professor Jaswinder Pal Singh (Princeton University), in part Professor Peter Sanders (Universität Karlsruhe).

Publisher (where to buy): Shaker Verlag, Aachen, Germany, 2009, ISBN 978-3-8322-8192-2.

The Papers

that are part of the Dissertation, in chronological order.
  1. Jesper Larsson Träff, Rolf Hempel, Hubert Ritzdorf, Falk Zimmermann. Flattening on the fly: efficient handling of MPI derived datatypes. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 6th European PVM/MPI Users' Group Meeting, volume 1697 of Lecture Notes in Computer Science, pages 109-116, 1999.
  2. Ralf Reussner, Jesper Larsson Träff, Gunnar Hunzelmann. A benchmark for MPI derived datatypes. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 7th European PVM/MPI Users' Group Meeting, volume 1908 of Lecture Notes in Computer Science, pages 10-17, 2000.
  3. Jesper Larsson Träff, Hubert Ritzdorf, and Rolf Hempel. The implementation of MPI-2 one-sided communication for the NEC SX-5. In Supercomputing, 2000.
  4. Maciej Golebiewski, Jesper Larsson Träff. MPI-2 one-sided communications on a Giganet SMP cluster. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 8th European PVM/MPI Users' Group Meeting, volume 2131 of Lecture Notes in Computer Science, pages 16-23, 2001.
  5. Peter Sanders and Jesper Larsson Träff. The hierarchical factor algorithm for all-to-all communication. In Euro-Par 2002 Parallel Processing, volume 2400 of Lecture Notes in Computer Science, pages 799-803, Springer 2002.
  6. Jesper Larsson Träff. Improved MPI all-to-all communication on a Giganet SMP cluster. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 9th European PVM/MPI Users' Group Meeting, volume 2474 of Lecture Notes in Computer Science, pages 392-400, Springer 2002.
  7. Jesper Larsson Träff. Implementing the MPI process topology mechanism. In Supercomputing, 2002.
  8. Jesper Larsson Träff. SMP-aware message passing programming. In Eighth International Workshop on High-level Parallel Programming Models and Supportive Environments (HIPS03), International Parallel and Distributed Processing Symposium (IPDPS 2003), pages 56-65, 2003.
  9. Joachim Worringen, Jesper Larsson Träff, Hubert Ritzdorf. Improving generic non-contiguous file access for MPI-IO. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 10th European PVM/MPI Users' Group Meeting, volume 2840 of Lecture Notes in Computer Science, pages 309-318, 2003.
  10. Maciej Golebiewski, Hubert Ritzdorf, Jesper Larsson Träff, Falk Zimmermann. The MPI/SX implementation of MPI for NEC's SX-6 and other NEC platforms. NEC Research & Development, 44(1):69-74, 2003.
  11. Joachim Worringen, Jesper Larsson Träff, Hubert Ritzdorf. Fast parallel non-contiguous file access. In Supercomputing, 2003.
  12. Jesper Larsson Träff. Hierarchical gather/scatter algorithms with graceful degradation. In International Parallel and Distributed Processing Symposium (IPDPS 2004), page 80, IEEE Press, 2004.
  13. Jesper Larsson Träff, Joachim Worringen. Verifying collective MPI calls. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 11th European PVM/MPI Users' Group Meeting, volume 3241 of Lecture Notes in Computer Science, pages 18-27, 2004.
  14. Rolf Rabenseifner and Jesper Larsson Träff. More efficient reduction algorithms for message-passing parallel systems. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 11th European PVM/MPI Users' Group Meeting, volume 3241 of Lecture Notes in Computer Science, pages 36-46, 2004.
  15. Jesper Larsson Träff. A simple work-optimal broadcast algorithm for message passing parallel systems. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 11th European PVM/MPI Users' Group Meeting, volume 3241 of Lecture Notes in Computer Science, pages 173-180, 2004.
  16. Jesper Larsson Träff. An improved algorithm for (non-commutative) reduce-scatter with an application. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 12th European PVM/MPI Users' Group Meeting, volume 3666 of Lecture Notes in Computer Science, pages 130-138, 2005.
  17. Hubert Ritzdorf, Jesper Larsson Träff. Collective operations in NEC's high-performance MPI libraries. In International Parallel and Distributed Processing Symposium (IPDPS 2006), IEEE Press, 2006.
  18. Jesper Larsson Träff. Efficient allgather for regular SMP-clusters. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 13th European PVM/MPI Users' Group Meeting, volume 4192 of Lecture Notes in Computer Science, pages 58-65. Springer, 2006.
  19. Peter Sanders and Jesper Larsson Träff. Parallel prefix (scan) algorithms for MPI. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 13th European PVM/MPI Users' Group Meeting, volume 4192 of Lecture Notes in Computer Science, pages 49-57. Springer, 2006.
  20. Guntram Berti and Jesper Larsson Träff. What MPI could (and cannot) do for mesh-partitioning on non-homogeneous networks. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 13th European PVM/MPI Users' Group Meeting, volume 4192 of Lecture Notes in Computer Science, pages 293-302. Springer, 2006.
  21. Jesper Larsson Träff, Joachim Worringen. The MPI/SX collectives verification library. In Parallel Computing: Current & Future Issues of High-End Computing (ParCo 2005), volume 33 of NIC Series, pages 909-916. John von Neumann Institute for Computing, Central Institute for Applied Mathematics, Forschungszentrum Jülich, 2006.
  22. Jesper Larsson Träff. Direct graph k-partitioning with a Kernighan-Lin like heuristic. Operations Research Letters, 34(6):621-629, 2006.
  23. Peter Sanders, Jochen Speck, and Jesper Larsson Träff. Full bandwidth broadcast, reduction and scan with only two trees. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 14th European PVM/MPI Users' Group Meeting, volume 4757 of Lecture Notes in Computer Science, pages 17-26. Springer, 2007 (Outstanding paper).
  24. Jesper Larsson Träff, William Gropp, and Rajeev Thakur. Self-consistent MPI performance requirements. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 14th European PVM/MPI Users' Group Meeting, volume 4757 of Lecture Notes in Computer Science, pages 36-45. Springer, 2007 (Outstanding paper).
  25. Jesper Larsson Träff, Andreas Ripke. Optimal Broadcast for fully connected Processor-node Networks. Journal of Parallel and Distributed Computing, 68(7): 887-901, 2008.

Related Papers

that have followed and/or are not part of the Dissertation.
  1. Ralf Reussner, Peter Sanders, Jesper Larsson Träff. SKaMPI: A comprehensive Benchmark for public Benchmarking of MPI. Scientific Programming, 10(1): 55-65, 2002.
  2. Jesper Larsson Träff, Andreas Ripke, Christian Siebert, Pavan Balaji, Rajeev Thakur, William Gropp. A pipelined Algorithm for large, irregular all-gather Problems. International Journal of High Performance Computing Applications 24(1):58-68, 2010.
  3. Faisal Ghias Mir, Jesper Larsson Träff. Constructing MPI input-output Datatypes for efficient Transpacking. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 15th European PVM/MPI Users' Group Meeting, volume 5205 of Lecture Notes in Computer Science, pages 141-150. Springer, 2008.
  4. William D. Gropp, Dries Kimpe, Robert Ross, Rajeev Thakur, Jesper Larsson Träff. Self-consistent MPI-IO performance requirements and expectations. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 15th European PVM/MPI Users' Group Meeting, volume 5205 of Lecture Notes in Computer Science, pages 167-176. Springer, 2008.
  5. Jesper Larsson Träff. Relationships between regular and irregular collective communication operations on clustered multiprocessors. Parallel Processing Letters, 19(1):85-96, 2009.
  6. Torsten Hoefler, Jesper Larsson Träff. Sparse Collective Operations for MPI. In 14th International Workshop on High-level Parallel Programming Models and Supportive Environments (HIPS), International Parallel and Distributed Processing Symposium (IPDPS 2009), 2009, page 6.
  7. Pavan Balaji, Darius Buntinas, David Goodell, William Gropp, Sameer Kumar, Ewing Lusk, Rajeev Thakur, Jesper Larsson Träff. MPI on a Million Processors. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 16th European PVM/MPI Users' Group Meeting, volume 5759 of Lecture Notes in Computer Science, pages 20-30. Springer, 2009 (Outstanding paper).
  8. Faisal Ghias Mir, Jesper Larsson Träff. Exploiting efficient Transpacking for One-sided Communication and MPI-IO. In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 16th European PVM/MPI Users' Group Meeting, volume 5759 of Lecture Notes in Computer Science, pages 154-163. Springer, 2009.
  9. Vinod Tipparaju, William D. Gropp, Hubert Ritzdorf, Rajeev Thakur, Jesper Larsson Träff. Investigating High Performance RMA Interfaces for the MPI-3 Standard. In International Conference on Parallel Processing (ICPP'09), 2009.
  10. Jesper Larsson Träff, William Gropp, and Rajeev Thakur. Self-consistent MPI performance Guidelines. IEEE Transactions on Parallel and Distributed Systems, 21(5):698-709, 2010.
  11. Peter Sanders, Jochen Speck, and Jesper Larsson Träff. Two-Tree Algorithms for Full Bandwidth Broadcast, Reduction and Scan. Parallel Computing, 35(12): 581-594, 2009.
  12. Jesper Larsson Träff. Transparent neutral element elimination in MPI reduction operations. In Recent Advances in Message Passing Interface. 17th European MPI Users' Group Meeting, volume 6305 of Lecture Notes in Computer Science, pages 275-284. Springer, 2010.
  13. Jesper Larsson Träff. Compact and Efficient Implementation of the MPI Group Operations. In Recent Advances in Message Passing Interface. 17th European MPI Users' Group Meeting, volume 6305 of Lecture Notes in Computer Science, pages 170-178. Springer, 2010.
  14. Jesper Larsson Träff. A (radical) proposal addressing the non-scalability of the irregular MPI collective interfaces. In 16th International Workshop on High-level Parallel Programming Models and Supportive Environments (HIPS11) at International Parallel and Distributed Processing Symposium (IPDPS), page 42. IEEE Press 2011.
  15. Enes Bajrovic, Jesper Larsson Träff. Using MPI derived datatypes in numerical libraries. In Recent Advances in Message Passing Interface. 18th European MPI Users' Group Meeting, volume, of Lecture Notes in Computer Science, pages. Springer, 2011.
  16. William D. Gropp, Torsten Hoefler, Rajeev Thakur, Jesper Larsson Träff. Performance expectations and guidelines for MPI derived datatypes: a first analysis. In Recent Advances in Message Passing Interface. 18th European MPI Users' Group Meeting, volume, of Lecture Notes in Computer Science, pages. Springer, 2011.
  17. Pavan Balaji, Darius Buntinas, David Goodell, William Gropp, Torsten Hoefler, Sameer Kumar, Ewing Lusk, Rajeev Thakur, Jesper Larsson Träff. MPI on millions of cores. Parallel Processing Letters, 21(1):45-60, 2011.
  18. Torsten Hoefler, Rolf Rabenseifner, Hubert Ritzdorf, Bronis R. de Supinski, Rajeev Thakur, Jesper Larsson Träff. The Scalable Process Topology Interface of MPI 2.2. Concurrency and Computation: Practice and Experience, 23: 293-310, 2011.