Math 627 - Introduction to Parallel Computing
Spring 2008 - Matthias K. Gobbert
Presentations of the Class Projects

Friday, May 16, 2008, 01:00 p.m., MP 401

  1. 01:05-01:20
    Solving a Model 2D Poisson Equation Using the Parallelized Matrix-Free Conjugate Gradient Method with SSOR Preconditioning
    Amanda Gassman, Department of Mathematics and Statistics
    The Conjugate Gradient method can be parallelized and used to compute the numerical solution of a system of linear equations. In fact, a matrix-free version of this method exists for the well-known Poisson problem, which relies upon knowledge of the specific form of the system matrix derived from a second-order finite difference discretization. However, since it is well established that preconditioning the system matrix allows much faster convergence, the goal of this project is to produce a matrix-free parallelized Conjugate Gradient code with Symmetric Successive Overrelaxation (SSOR) preconditioning for the two-dimensional Poisson problem. This requires the development and implementation of clever algorithms to solve the linear systems involving the preconditioning matrices, which is possible due to the structure of the system matrix. Ultimately, the goal is achieved, and results are presented that confirm the accuracy of the new code and demonstrate good performance in terms of speedup and efficiency when run on multiple processes.
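    The matrix-free idea the abstract relies on can be illustrated with a small serial sketch (the project itself is parallel; this is not the project's code, and all function names are illustrative): the 5-point stencil replaces storage of the system matrix, and applying the SSOR preconditioner amounts to one forward and one backward sweep over the grid.

```python
import numpy as np

def apply_A(u, h):
    # matrix-free 5-point Laplacian on the n x n interior grid,
    # homogeneous Dirichlet boundary (ghost values are zero)
    Au = 4.0 * u
    Au = Au.copy()
    Au[1:, :] -= u[:-1, :]
    Au[:-1, :] -= u[1:, :]
    Au[:, 1:] -= u[:, :-1]
    Au[:, :-1] -= u[:, 1:]
    return Au / h**2

def ssor_solve(r, h, omega=1.5):
    # apply z = M^{-1} r for the SSOR preconditioner
    # M = (D + wL) D^{-1} (D + wU) / (w(2-w)); L, U hold the -1/h^2 couplings
    n = r.shape[0]
    d = 4.0 / h**2          # diagonal entry of A
    off = 1.0 / h**2        # magnitude of the off-diagonal entries
    z = np.zeros_like(r)
    for i in range(n):                 # forward sweep: (D + wL) v = r
        for j in range(n):
            s = r[i, j]
            if i > 0: s += omega * off * z[i - 1, j]
            if j > 0: s += omega * off * z[i, j - 1]
            z[i, j] = s / d
    z *= d                             # w = D v
    for i in range(n - 1, -1, -1):     # backward sweep: (D + wU) z = w
        for j in range(n - 1, -1, -1):
            s = z[i, j]
            if i < n - 1: s += omega * off * z[i + 1, j]
            if j < n - 1: s += omega * off * z[i, j + 1]
            z[i, j] = s / d
    return z * omega * (2.0 - omega)

def pcg(b, h, tol=1e-10, maxit=500):
    # preconditioned conjugate gradient, matrix-free
    x = np.zeros_like(b)
    r = b - apply_A(x, h)
    z = ssor_solve(r, h)
    p = z.copy()
    rz = np.sum(r * z)
    for _ in range(maxit):
        Ap = apply_A(p, h)
        alpha = rz / np.sum(p * Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break
        z = ssor_solve(r, h)
        rz_new = np.sum(r * z)
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

    In the parallel version, the stencil application and the SSOR sweeps are the parts that require communication of neighboring grid rows between processes.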

  2. 01:25-01:40
    A Comparison of Python and C in MPI-Based Scientific Computing
    Will Murnane, Department of Computer Science and Electrical Engineering
    Parallel scientific computing code is often written in MPI using the C or Fortran interface. This produces high-quality, portable code when done carefully. But problems with memory allocation, debugging of parallel programs, and other pitfalls can lead to poorly performing code or incorrect results. A somewhat easier approach to writing MPI programs using the Python scripting language is introduced in this paper, along with color commentary on a novice's impressions of the process.

  3. 01:45-02:00
    Numerical Simulation of Atomic Layer Deposition Using Parallel Programming
    Michael Reid, Department of Mathematics and Statistics
    Many steps of the process of manufacturing computer chips involve the deposition of a uniform layer of material onto trenches etched in a silicon wafer through a process known as Atomic Layer Deposition (ALD). This process is modeled by a system of linear Boltzmann equations, which are solved with the finite element method, discretizing the domain in velocity space using the discontinuous Galerkin method. Depending on the resolution of the velocity mesh, the finite element method may have upwards of several hundred thousand degrees of freedom, necessitating the use of parallel computing. Speedup studies on various mesh sizes show good performance gains, reducing the time taken to compute a solution from several hours to only a few minutes.

  4. 02:05-02:20
    Implementation of Matrix-Free Conjugate Gradient Method on Cell BE with Comparative Study
    Shiming Yang, Department of Mathematics and Statistics
    This report introduces an implementation of a parallel conjugate gradient algorithm on the IBM Cell BE, focusing on task-level parallelism. Using a matrix-free conjugate gradient method to solve the Poisson equation in two dimensions, we demonstrate the performance of the PPE cooperating with two SPEs working in parallel. These results are compared, in terms of speedup, efficiency, and the Karp-Flatt metric, with results obtained in a distributed-memory parallel computing environment. An interesting, though limited, observation is that the parallel code on the Cell BE shows good speedup, at least for the small problem sizes available.

  5. 02:25-02:40
    Implementing a Parallel Matrix-Free Multi-Grid Solver for the 2D Poisson Equation
    Ihab Sraj, Department of Mechanical Engineering
    The numerical solution of the Poisson equation requires linear iterative solvers for the algebraic equations resulting from its discretization on a finite grid. The accuracy of such methods is limited mainly by the grid resolution, which in turn is limited by the available computational resources. For most linear iterative solvers (Jacobi, Gauss-Seidel, or SOR), the time needed to solve the problem on refined meshes does not increase linearly with the mesh size as one might expect, because the number of iterations needed to reach a converged solution also grows with the mesh size. Multi-grid methods and parallel computing are two available techniques to speed up the solution: the former act as accelerators that increase the speed of the linear solvers, while the latter divides the computational load among several processors. In this work, the implementation of a parallel multi-grid solver for the 2D Poisson equation is presented, using a finite difference discretization on a distributed-memory cluster. A scalability study shows the efficiency of the implemented solver through good speedup both from the multi-grid acceleration itself and from parallelization.
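    The multi-grid idea of combining smoothing with coarse-grid correction can be sketched serially in one dimension (the project itself treats the 2D problem in parallel; this sketch and its function names are illustrative only): a few weighted-Jacobi sweeps damp the high-frequency error, and the remaining smooth error is corrected recursively on a coarser grid.

```python
import numpy as np

def apply_1d(u, h):
    # matrix-free 1D Laplacian (-u'') with homogeneous Dirichlet BCs
    Au = 2.0 * u
    Au[1:] -= u[:-1]
    Au[:-1] -= u[1:]
    return Au / h**2

def jacobi(u, f, h, iters, omega=2.0/3.0):
    # weighted-Jacobi smoother; damps the high-frequency error components
    for _ in range(iters):
        u = u + omega * (h**2 / 2.0) * (f - apply_1d(u, h))
    return u

def v_cycle(u, f, h, nu=3):
    # one multigrid V-cycle on a grid with n = 2^k - 1 interior points
    n = u.size
    u = jacobi(u, f, h, nu)                     # pre-smooth
    if n > 3:
        r = f - apply_1d(u, h)
        nc = (n - 1) // 2
        # full-weighting restriction of the residual to the coarse grid
        rc = np.array([0.25 * (r[2*j] + 2.0*r[2*j+1] + r[2*j+2])
                       for j in range(nc)])
        ec = v_cycle(np.zeros(nc), rc, 2.0 * h)  # coarse-grid correction
        e = np.zeros(n)
        for j in range(nc):                      # linear interpolation back
            e[2*j + 1] += ec[j]
            e[2*j]     += 0.5 * ec[j]
            e[2*j + 2] += 0.5 * ec[j]
        u = u + e
    else:
        u = jacobi(u, f, h, 50)                  # coarsest grid: solve by smoothing
    return jacobi(u, f, h, nu)                   # post-smooth
```

    The point of the method is that the residual shrinks by roughly a constant factor per V-cycle regardless of the mesh size, which is exactly the property the plain Jacobi, Gauss-Seidel, and SOR iterations lack.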

  6. 02:45-03:00
    A Parallel Sort-Last Algorithm for Rendering Massive Models
    Mark Bolstad, Department of Computer Science and Electrical Engineering
    As the complexity of scenes in computer graphics increases, methods are needed for handling the large memory requirements of these scenes and for increasing the efficiency of the rendering algorithms. In this paper we examine the effects of a parallel sort-last algorithm in a bucketing, micropolygon-based software renderer. Instead of using the standard method of compositing in a final step after the images are rendered, we composite after the completion of each bucket, thus reducing memory usage at the cost of increased communication. We show that constant-time rendering is achievable for a limited number of processors, but that the cost of communication quickly comes to dominate the total time. We also show that the underlying rendering algorithm is parallelizable, but that a dramatically different implementation will be necessary to increase the efficiency of the parallelism.
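    The compositing step at the heart of sort-last rendering can be sketched as a per-pixel depth test over the partial images produced by the processes (a serial, opaque-geometry simplification; the actual renderer composites per bucket and in parallel, and these function names are illustrative only):

```python
import numpy as np

def depth_composite(color_a, depth_a, color_b, depth_b):
    # per-pixel z-test: keep the fragment closer to the camera
    nearer_b = depth_b < depth_a
    color = np.where(nearer_b[..., None], color_b, color_a)
    depth = np.where(nearer_b, depth_b, depth_a)
    return color, depth

def composite_partials(partials):
    # serial stand-in for the parallel reduction over processes:
    # fold all partial (color, depth) images into one final image
    color, depth = partials[0]
    for c, d in partials[1:]:
        color, depth = depth_composite(color, depth, c, d)
    return color, depth
```

    Compositing per bucket instead of per full image means each reduction involves only a bucket-sized region, which is why memory usage drops while the number of communication steps grows.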


Copyright © 2001-2008 by Matthias K. Gobbert. All Rights Reserved.
This page version 1.0, December 2008.