Math 627 - Introduction to Parallel Computing
Matthias K. Gobbert and Madhu Nayakkankuppam
Fall 2002 - Project Presentations
This page can be reached via my homepage at
http://www.math.umbc.edu/~gobbert.
Program for Day 1
Monday, December 09, 2002, 07:00-09:00 p.m., MP 401
-
07:00-07:15
A Comparison of PETSc Linear Solvers on the Poisson Equation
Toni M. Smith
The Portable, Extensible Toolkit for Scientific Computation (PETSc) is used
in parallel computing to solve large-scale problems. Such problems might, at
some point, require the numerical solution of a large linear system. We
compare two of the methods available in PETSc: Conjugate Gradient and
Generalized Minimal RESidual (GMRES). We test these methods on linear systems
generated by the finite difference discretization of the Poisson equation,
both with and without the available preconditioners. We wish to determine
which combination of method and preconditioner returns the numerical solution
in the least time and with the greatest efficiency for the largest systems.
We find that the conjugate gradient method with block Jacobi
preconditioning is the best option available in PETSc.
Faculty mentor: Matthias K. Gobbert
PostScript file of the report.
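For readers unfamiliar with PETSc, the solver selection compared in this talk
can be expressed in a few calls of PETSc's KSP interface. The following is a
minimal sketch using the current PETSc API (which differs slightly from the
2002-era calls); the matrix A and the vectors b and x are assumed to be
already assembled:

```c
#include <petscksp.h>

/* Solve A x = b with conjugate gradient and block Jacobi preconditioning.
   Assumes A, b, and x were created and assembled from the finite
   difference discretization of the Poisson equation. */
PetscErrorCode solve_poisson(Mat A, Vec b, Vec x)
{
  KSP ksp;  /* Krylov solver context */
  PC  pc;   /* preconditioner context */

  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetType(ksp, KSPCG));    /* conjugate gradient */
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCBJACOBI));  /* block Jacobi preconditioner */
  PetscCall(KSPSetFromOptions(ksp));    /* allow -ksp_type/-pc_type overrides */
  PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPDestroy(&ksp));
  return 0;
}
```

The call to KSPSetFromOptions is what makes comparisons of this kind cheap:
the Krylov method and preconditioner can be swapped at run time from the
command line without recompiling.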
-
07:20-07:35
A Matrix-Free Conjugate Gradient Solution of the Poisson Equation
Kevin P. Allen
The conjugate gradient method is perhaps the most powerful algorithm for
iteratively solving a system of linear equations with a symmetric positive
definite system matrix.
Parallelization of the method saves time without compromising the
accuracy of the solution. By using a matrix-free matrix-vector
product, the system matrix is never stored, greatly reducing the
memory requirements. The combination of these aspects allows us to solve
much larger systems of equations much more quickly.
Faculty mentor: Matthias K. Gobbert
PostScript file of the report.
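The core idea is the matrix-free matrix-vector product. As a minimal
illustration (not the report's actual code), here is a serial sketch of
y = A x for the 5-point finite difference stencil of the 2-D Poisson
equation, assuming an n-by-n interior grid, row-major indexing, and
homogeneous Dirichlet boundary conditions:

```c
/* y = A*x for the 2-D Poisson 5-point stencil, without ever storing A.
   x and y have length n*n; neighbors outside the grid are treated as 0
   (homogeneous Dirichlet boundary conditions). */
void poisson_matvec(int n, const double *x, double *y)
{
  for (int i = 0; i < n; i++) {       /* grid row */
    for (int j = 0; j < n; j++) {     /* grid column */
      int k = i * n + j;              /* row-major index */
      double v = 4.0 * x[k];
      if (i > 0)     v -= x[k - n];   /* south neighbor */
      if (i < n - 1) v -= x[k + n];   /* north neighbor */
      if (j > 0)     v -= x[k - 1];   /* west neighbor */
      if (j < n - 1) v -= x[k + 1];   /* east neighbor */
      y[k] = v;                       /* scale by 1/h^2 if desired */
    }
  }
}
```

Since the conjugate gradient method touches A only through such products,
this routine can replace the stored matrix entirely, and in the parallel
version each process applies the stencil to its own block of the grid.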
-
07:40-07:55
A Reusable C++ Library for Parallel Maximum Likelihood Phylogenetic
Reconstruction
Rajeev Ayyagari
A great deal of software, such as PHYLIP and AxML, is available
for the estimation of phylogenetic trees from DNA nucleotide sequence
data. Much of this software runs in parallel using MPI. However, the source
code of most existing software has a monolithic, hard-wired structure that
greatly increases the effort involved in understanding or modifying these
programs. This project produces a library of components for
maximum-likelihood estimation of phylogenetic trees that can easily be
modified, replaced, or extended through object-oriented techniques such as
inheritance. The library parallelizes the tree estimation process in a way
that makes it easy to add or modify functionality, and it hides as much of
the parallelization from a casual user as possible, providing an interface
that accepts just a communicator and does the apportioning automatically.
Nucleotide-wise parallelization was chosen as the unit of parallelization
because it is coarse-grained enough to avoid excessive communication
overhead and fine-grained enough to be applicable under most models of
evolution. The method is shown to be highly amenable to parallelization.
Faculty mentor: Madhu Nayakkankuppam
PDF file of the report.
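To illustrate the kind of automatic apportioning described above (the
function and variable names here are hypothetical, not the library's actual
C++ API), a block distribution of nucleotide sites over the processes of a
given communicator might look like this in C:

```c
#include <mpi.h>

/* Divide num_sites nucleotide sites into contiguous blocks, one per
   process in comm, as evenly as possible; the first (num_sites % size)
   processes receive one extra site. Illustrative sketch only. */
void apportion_sites(MPI_Comm comm, int num_sites, int *first, int *count)
{
  int rank, size;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &size);
  int base = num_sites / size;   /* minimum sites per process */
  int rem  = num_sites % size;   /* leftover sites to spread out */
  *count = base + (rank < rem ? 1 : 0);
  *first = rank * base + (rank < rem ? rank : rem);
}
```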
-
08:00-08:15
Data Distribution for Parallel Sparse Matrix-Vector Multiplication in the
Conjugate Gradient Algorithm for Systems of Linear Equations
Olena Shevchenko
Sparse matrix-vector multiplication lies at the heart of many iterative
solvers for linear systems (in particular, in the conjugate gradient
method). In these solvers, a multiplication y = A x has to be carried out
repeatedly for the same sparse matrix (in the conjugate gradient method,
A is square symmetric positive definite), but each time for a different
input vector x. On a distributed-memory parallel computer, efficient
multiplication requires a suitable distribution of the data and the
associated work. In particular, this requires distributing the sparse
matrix over the p processors of the parallel computer such that each
processor has about the same number of non-zeros and such that the amount
of communication is minimized. Such a data distribution, along with the
corresponding matrix-vector multiplication, was implemented and used in
the conjugate gradient method.
Faculty mentor: Madhu Nayakkankuppam
PostScript file of the report.
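As a rough sketch of the load-balancing criterion described above
(illustrative only; the report's scheme may be more elaborate, e.g., also
minimizing communication volume), contiguous row blocks of a matrix in
compressed sparse row (CSR) format can be assigned so that each of the
p processors receives about the same number of non-zeros:

```c
/* Assign contiguous row blocks of an n-row CSR matrix to p processors so
   that each block holds roughly nnz/p nonzeros. row_ptr has length n+1;
   on return, processor q owns rows row_start[q] .. row_start[q+1]-1,
   with row_start[p] == n. Illustrative sketch only. */
void partition_rows_by_nnz(int n, const int *row_ptr, int p, int *row_start)
{
  long nnz = row_ptr[n];
  int q = 0;                     /* next boundary to place: row_start[q+1] */
  row_start[0] = 0;
  for (int i = 0; i < n && q + 1 < p; i++) {
    /* cumulative nonzeros through row i have reached processor q's share */
    if ((long)row_ptr[i + 1] * p >= (long)(q + 1) * nnz) {
      row_start[q + 1] = i + 1;
      q++;
    }
  }
  for (int r = q + 1; r <= p; r++)
    row_start[r] = n;            /* any remaining boundaries collapse at n */
}
```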
-
08:20-08:35
A Parallel Solver for Large Systems of ODEs
Bogdan Gavrea
The goal of this project is to solve large systems of ordinary
differential equations (ODEs) using the parallel solver PVODE. The system
of ODEs was derived from a parabolic partial differential equation by
discretizing with respect to the spatial coordinates, a technique also
known as the method of lines. The results obtained show that for
large problems, linear speedup is achieved.
Faculty mentor: Matthias K. Gobbert
PDF file of the report.
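To illustrate the method of lines: discretizing, for example, the 1-D heat
equation u_t = u_xx in space turns the PDE into the ODE system
du_i/dt = (u_{i-1} - 2 u_i + u_{i+1}) / h^2, which a solver such as PVODE
then integrates in time. A minimal sketch of such a right-hand side follows
(the report's actual problem and PVODE's exact callback signature will
differ):

```c
/* Right-hand side f(t, u) of the ODE system obtained from u_t = u_xx
   by the method of lines on n interior grid points with mesh width h,
   assuming homogeneous Dirichlet boundary values. Illustrative only;
   PVODE expects this wrapped in its own callback signature. */
void heat_rhs(int n, double h, const double *u, double *udot)
{
  for (int i = 0; i < n; i++) {
    double left  = (i > 0)     ? u[i - 1] : 0.0;  /* boundary value 0 */
    double right = (i < n - 1) ? u[i + 1] : 0.0;
    udot[i] = (left - 2.0 * u[i] + right) / (h * h);
  }
}
```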
-
08:40-08:55
Parallelizing the 1-D Fast Fourier Transform
Michael Muscedere
Parallelization of the Fast Fourier Transform (FFT) algorithm is explored
to increase execution speed and reduce single-processor storage
requirements. The distribution of the data among the node processors using
a block structure of decimated input points is discussed, as is the use of
a hypercube communication topology to exchange the necessary information
between node processors. The resulting parallel FFT algorithm transforms
an AM signal correctly and achieves significant speedup despite exchanging
large amounts of information between node processors.
Faculty mentor: Matthias K. Gobbert
PostScript file of the report.
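In a hypercube topology on a power-of-two number of processes, each stage
pairs a process with the process whose rank differs in exactly one bit. A
minimal sketch of this exchange pattern in MPI (buffer layout and the FFT
butterfly computation itself are omitted; names illustrative):

```c
#include <mpi.h>

/* Hypercube exchange skeleton: in each stage, every process swaps its
   buffer with the partner whose rank differs in one bit. Assumes the
   number of processes is a power of two. The FFT butterfly combining
   buf and recv after each exchange is omitted here. */
void hypercube_exchange(double *buf, double *recv, int len, MPI_Comm comm)
{
  int rank, size;
  MPI_Comm_rank(comm, &rank);
  MPI_Comm_size(comm, &size);
  for (int d = 1; d < size; d <<= 1) {        /* d = 1, 2, 4, ... */
    int partner = rank ^ d;                   /* flip one bit of the rank */
    MPI_Sendrecv(buf,  len, MPI_DOUBLE, partner, 0,
                 recv, len, MPI_DOUBLE, partner, 0,
                 comm, MPI_STATUS_IGNORE);
    /* ... combine buf and recv according to the FFT butterfly ... */
  }
}
```

Each process thus communicates with only log2(p) distinct partners, which
is what keeps the communication cost manageable despite the large volume
of data exchanged.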
Program for Day 2
Tuesday, December 10, 2002, 07:00-08:45 p.m., MP 401
-
07:00-07:15
A Nonlinear Equations Solver in Parallel Using ScaLAPACK
Ravi Siddani
The Scalable Linear Algebra PACKage (ScaLAPACK) is a library
that implements matrix algorithms in parallel. This project looks at the
ScaLAPACK routines that would be invoked within each iteration of
Broyden's algorithm to solve a system of non-linear equations in parallel.
Limited timing studies show that when the size of the matrix is large and
the block size is small, we achieve a decrease in the time taken for
certain portions of the algorithm. Future work will lead to more concrete
timing results as well as statements about speedup and efficiency.
Faculty mentor: Madhu Nayakkankuppam
PostScript file of the report.
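For reference, Broyden's method avoids recomputing the Jacobian by applying
a rank-one update after each step; dense matrix operations of this kind are
the natural candidates for the ScaLAPACK routines mentioned above. A serial
sketch of the update (illustrative only):

```c
/* Broyden's rank-one update B <- B + ((y - B s) s^T) / (s^T s), where
   s = x_{k+1} - x_k (assumed nonzero) and y = f(x_{k+1}) - f(x_k).
   B is the n-by-n Jacobian approximation, stored row-major. */
void broyden_update(int n, double *B, const double *s, const double *y)
{
  double sts = 0.0;                       /* s^T s */
  for (int j = 0; j < n; j++) sts += s[j] * s[j];
  for (int i = 0; i < n; i++) {
    double Bs_i = 0.0;                    /* i-th entry of B s */
    for (int j = 0; j < n; j++) Bs_i += B[i * n + j] * s[j];
    double coef = (y[i] - Bs_i) / sts;
    for (int j = 0; j < n; j++) B[i * n + j] += coef * s[j];
  }
}
```

Each iteration additionally solves B s = -f(x) for the step s, which is the
other operation a distributed dense library parallelizes well.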
-
07:20-07:35
A Parallelization of the Batch K-Means Clustering Algorithm
Matthew C. Siegel
Clustering large data sets can be time-consuming and processor-intensive.
This paper describes a parallel implementation of a popular
clustering algorithm, the k-means algorithm, to provide faster
clustering solutions. The algorithm was tested by creating
3, 5, and 7 clusters on a 4-node Beowulf cluster
of dual-Pentium III 1000 MHz machines, and while optimal levels of
speedup were not achieved, clear benefits from parallel computing were
observed.
This work is joint with Dr. Jacob Kogan
(Department of Mathematics and Statistics, UMBC)
Faculty mentor: Madhu Nayakkankuppam
PDF file of the report.
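The standard way to parallelize batch k-means, presumably along the lines
used here, is to let each process accumulate per-cluster sums and counts
over its local share of the data and then combine them with a global
reduction. A minimal MPI sketch (names illustrative):

```c
#include <mpi.h>
#include <string.h>

/* One batch k-means step over the local data block: accumulate, for each
   of the k clusters, the sum of assigned points and their count, combine
   across all processes, and recompute the centroids. points holds local_n
   points of dimension dim; assign holds their cluster ids; sums (k*dim
   doubles) and counts (k ints) are work arrays. Sketch only. */
void kmeans_update(int k, int dim, int local_n, const double *points,
                   const int *assign, double *centroids,
                   double *sums, int *counts, MPI_Comm comm)
{
  memset(sums, 0, (size_t)k * dim * sizeof(double));
  memset(counts, 0, (size_t)k * sizeof(int));
  for (int i = 0; i < local_n; i++) {
    counts[assign[i]]++;
    for (int d = 0; d < dim; d++)
      sums[assign[i] * dim + d] += points[i * dim + d];
  }
  /* combine the partial sums and counts from all processes */
  MPI_Allreduce(MPI_IN_PLACE, sums, k * dim, MPI_DOUBLE, MPI_SUM, comm);
  MPI_Allreduce(MPI_IN_PLACE, counts, k, MPI_INT, MPI_SUM, comm);
  for (int c = 0; c < k; c++)
    if (counts[c] > 0)
      for (int d = 0; d < dim; d++)
        centroids[c * dim + d] = sums[c * dim + d] / counts[c];
}
```

Only 2k(dim+1) numbers cross the network per step, regardless of the data
set size, which is why this scheme scales well for large local_n.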
-
07:40-07:55
Improved General Circle Detection Algorithm
Tomasz Macura
This paper considers the problem of detecting circles
in binary images on parallel computers. The author presents improvements to
the General Circle Detection Algorithm (GCDA) invented by
A. Kavianpour, S. Shoari, and N. Bagherzadeh. The GCDA is based
on a transformation that converts circles in an image into families of
straight lines, reducing the problem of circle detection to that of line
detection, which may be solved by any line detection algorithm. Theoretical
analysis of the performance of the Improved General Circle Detection
Algorithm (IGCDA) suggests that the speedup will be linear. The author's
implementation of the IGCDA on a Beowulf cluster of generic personal
computers achieved the predicted speedup.
Faculty mentor: Matthias K. Gobbert
PostScript file of the report.
-
08:00-08:15
A Parallel Implementation of Lanczos Algorithm in SVD Computations
Srikanth V. Kallurkar
This work presents a parallel implementation of the Lanczos
algorithm for SVD computations using MPI. We investigate the execution
time as a function of matrix sparsity, matrix size, and the number of
processes. We observe both speedups and slowdowns for the parallelized
version of the Lanczos algorithm. Reasons for the speedups and the
slowdowns are discussed in the report.
This work is joint with Dr. Jacob Kogan
(Department of Mathematics and Statistics, UMBC)
Faculty mentor: Madhu Nayakkankuppam
PDF file of the report.
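For context, one standard way to compute singular values with the Lanczos
algorithm, plausibly the variant meant here, applies the symmetric Lanczos
recurrence to A^T A, whose eigenvalues are the squared singular values of A;
each step then requires one multiplication by A and one by A^T, which are
the operations that parallelize:

```latex
% Symmetric Lanczos recurrence applied to B = A^T A, whose eigenvalues
% are the squared singular values sigma_i^2 of A (with q_0 = 0, beta_1 = 0):
\alpha_j = q_j^{T} (A^{T}A)\, q_j, \qquad
\beta_{j+1}\, q_{j+1} = (A^{T}A)\, q_j - \alpha_j q_j - \beta_j q_{j-1}.
% After k steps, the eigenvalues lambda_i of the k-by-k tridiagonal matrix
% T_k (diagonal alpha_j, off-diagonal beta_j) give sigma_i ~ sqrt(lambda_i).
```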
-
08:20-08:35
Parallel Jacobi Iteration Method for the Poisson Problem
Using a Two-Dimensional Cartesian Decomposition of the Domain
Zhibin Sun
To make the parallel Jacobi iteration method for the Poisson
problem more efficient, the topology facilities of MPI are used to
implement a two-dimensional decomposition of the domain, accelerating
the original solution based on a one-dimensional decomposition. The
results show that the two-dimensional decomposition achieves good
speedup in some respects.
Faculty mentor: Matthias K. Gobbert
PostScript file of the report.
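The MPI topology facilities referred to in the abstract set up the 2-D
process grid and identify each process's neighbors for the halo exchange in
the Jacobi iteration. A minimal sketch of that setup (the halo exchange and
the Jacobi sweep themselves are omitted):

```c
#include <mpi.h>

/* Set up a 2-D Cartesian process grid and find the four neighbors needed
   for the halo exchange in the parallel Jacobi iteration. Sketch only. */
void setup_cart_2d(MPI_Comm comm, MPI_Comm *cart,
                   int *north, int *south, int *east, int *west)
{
  int size, dims[2] = {0, 0}, periods[2] = {0, 0};  /* non-periodic grid */
  MPI_Comm_size(comm, &size);
  MPI_Dims_create(size, 2, dims);             /* e.g., 8 -> 4 x 2 grid */
  MPI_Cart_create(comm, 2, dims, periods, 1, cart);
  MPI_Cart_shift(*cart, 0, 1, north, south);  /* neighbors along dim 0 */
  MPI_Cart_shift(*cart, 1, 1, west, east);    /* neighbors along dim 1 */
}
```

MPI_Cart_shift returns MPI_PROC_NULL at the domain boundary, so sends and
receives to nonexistent neighbors can be issued unconditionally, which
keeps the halo exchange code free of special cases.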
Copyright © 2001-2002 by Matthias K. Gobbert. All Rights Reserved.
This page version 1.7, December 2002.