Math 447/627 - Introduction to Parallel Computing
Fall 2016 - Matthias K. Gobbert
Presentations of the Class Projects
Friday, December 16, 2016, 01:00 p.m.
-
01:05-01:20
A Parallel Performance Study of the High-Order Compact
Direct Flux Reconstruction Method for Conservation Laws
on the Maya Cluster
Lai Wang, Department of Mechanical Engineering
The compact direct flux reconstruction method (CDFR) for
conservation laws utilizes techniques from compact finite
difference methods to directly approximate spatial derivatives
of fluxes within the standard elements. The CDFR scheme is a
family of compact high-order methods that can be efficiently
parallelized for high performance computing. In the present
study, a parallel performance study of the 3rd-order CDFR
scheme with a 3rd-order explicit Runge-Kutta scheme is
conducted. The inviscid isentropic vortex propagation problem
is adopted as a test case. The convergence rate studies have
demonstrated that the CDFR method works well for conservation
laws. The performance study shows excellent observed speedup
and efficiency.
This work was done in collaboration with my advisor Dr. Meilin Yu.
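For reference, observed speedup and efficiency in performance studies
of this kind are conventionally defined as S_p = T_1 / T_p and
E_p = S_p / p, where T_p is the wall clock time on p processes. The
following is a minimal Python sketch of that calculation. The function
name and the timings are hypothetical and are not taken from this
project.

```python
def speedup_and_efficiency(wall_times):
    """Map {p: T_p} to {p: (S_p, E_p)}.

    wall_times must contain the serial time under key 1.
    S_p = T_1 / T_p and E_p = S_p / p.
    """
    t1 = wall_times[1]
    return {p: (t1 / tp, t1 / (tp * p)) for p, tp in sorted(wall_times.items())}


# Hypothetical wall clock times in seconds (illustration only).
timings = {1: 120.0, 2: 61.0, 4: 32.0, 8: 18.0}
for p, (s, e) in speedup_and_efficiency(timings).items():
    print(f"p = {p:2d}  speedup = {s:5.2f}  efficiency = {e:4.2f}")
```

An efficiency close to 1 means the run time shrinks nearly in
proportion to the number of processes.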
-
01:20-01:35
Training Neural Networks in Parallel With ADMM
Nicholas Haltmeyer,
Department of Computer Science and Electrical Engineering
Neural networks are an increasingly popular model for solving
classification problems, including optical character recognition. As
these networks have become deeper and richer in output space,
alternatives to classical training methods have been proposed to allow
for scalable, distributed learning. Herein I describe an application
of the alternating direction method of multipliers for training neural
networks in a distributed environment. Experiments are then presented
for determining appropriate settings and evaluating performance on the
MNIST database of handwritten digits. This work was done in
collaboration with my advisor Dr. Ting Zhu.
-
01:35-01:50
Parallelizing the Computation of the Expected Value in a
Binomial Tree with Bernoulli Paths
Sai Kumar Popuri, Department of Mathematics and Statistics
Recombinant binomial trees are structures with two child nodes
branching out from every node, including the root node,
in a way that the nodes combine to form a symmetric structure. They
arise in several model computations. For example, in
option pricing in finance,
valuation of a European option can be carried out by evaluating the
expected value of the asset payoffs with respect to the probabilities
of traversing the branches of the tree,
when a closed form solution is not appropriate. When the size of the
tree increases, the cost to compute the expected
value grows exponentially, rendering a serial computation one branch
at a time very time-consuming and not practical.
We propose a parallelization method that transforms the calculation of
the expected value
into a so-called "embarrassingly parallel" problem by mapping the
branches of the binomial tree to the processes on a
parallel computing cluster. We also propose two Monte Carlo estimation
methods that exploit the stratification setup of
our parallelization method and compare the relative variance reductions.
We have implemented our parallelization method
in the statistical environment R and the programming language Julia to
calculate the price of a European option. Our numerical results
indicate that while both the R and Julia implementations are scalable,
the execution times of the
Julia implementation are significantly shorter than those of the corresponding R
implementation. A simulation
study is carried out for an Asian and a Fixed Look-back option to
verify the convergence and variance reduction behaviors
in the stratified Monte Carlo methods in relation to the regular Monte
Carlo estimator. This work was done in collaboration with my advisor
Dr. Nagaraj Neerchal, as well as Dr. Matthias Gobbert and Dr. Andrew Raim.
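The idea of splitting the expected value over branches of the tree can
be sketched as follows. This is an illustration only, not the authors'
R or Julia implementation: it assumes that each independent task
corresponds to one fixed prefix of the first k up/down moves, and it
uses a European call payoff with made-up parameters. All function and
parameter names are hypothetical. The tasks are run in a serial loop
here; since they share no data, each could be assigned to its own
process.

```python
from itertools import product
from math import comb


def stratum_expectation(prefix, n, s0, u, d, p, strike):
    """Contribution to E[payoff] from all n-step paths starting with prefix.

    A path is a tuple of 0/1 entries, where 1 is an up move with probability p.
    """
    total = 0.0
    for tail in product((0, 1), repeat=n - len(prefix)):
        ups = sum(prefix) + sum(tail)
        prob = p ** ups * (1 - p) ** (n - ups)
        price = s0 * u ** ups * d ** (n - ups)
        total += prob * max(price - strike, 0.0)
    return total


def expected_payoff_by_strata(n, k, s0, u, d, p, strike):
    """Sum the 2**k independent stratum contributions (assumed task layout)."""
    prefixes = product((0, 1), repeat=k)
    return sum(stratum_expectation(pre, n, s0, u, d, p, strike) for pre in prefixes)


def expected_payoff_closed_form(n, s0, u, d, p, strike):
    """Reference value: sum over terminal nodes with binomial weights."""
    return sum(
        comb(n, j) * p ** j * (1 - p) ** (n - j)
        * max(s0 * u ** j * d ** (n - j) - strike, 0.0)
        for j in range(n + 1)
    )


# Hypothetical parameters: 10 steps, 2**3 = 8 independent tasks.
by_strata = expected_payoff_by_strata(10, 3, 100.0, 1.1, 0.9, 0.5, 100.0)
reference = expected_payoff_closed_form(10, 100.0, 1.1, 0.9, 0.5, 100.0)
print(by_strata, reference)
```

For a European payoff the terminal-node sum is far cheaper, so it serves
here only as a check. Enumerating full paths matters for path-dependent
payoffs such as the Asian and look-back options mentioned above.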
-
01:50-02:05
Performance Study of an Elastic Plane Problem on Maya Cluster
Yangling Zhou, Department of Mechanical Engineering
The Finite Element Method (FEM) is a widely used numerical analysis
tool in many fields. High-accuracy results can be obtained with a
refined mesh, but this requires long computing times. Parallel computing
can be used to overcome this shortcoming. In this report, an elastic
problem of a square plane that is fixed on the left side and carries a
distributed load on the top is solved as a test problem using
FEM and parallel computing. Two point-to-point communication
methods, blocking and nonblocking, are used and the
results are compared. The results show that parallel computing with
either blocking or nonblocking communication clearly speeds up the
computation. Nonblocking communication is no more efficient
than blocking communication for this problem. This study
was done in collaboration with my supervisor Dr. Panos G.
Charalambides of the Department of Mechanical Engineering.
-
02:05-02:20
Monte Carlo Parallel Processing: Reliability Analysis
Aryana Arsham, Department of Mathematics and Statistics
Monte Carlo (MC) simulation is a widely used tool for analysis of
complex reliability systems. The performance measure of such systems
is expressed by reliability metrics such as mean time to failure.
Such simulation methods for assessing system reliability are often
hindered by their requirement for considerable memory and time.
Implementing MC reliability algorithms using parallel processing
methods greatly enhances their efficiency and effectiveness. This
work presents MC and parallel processing methods applied to realistic
system modeling, together with results from implementing these
algorithms in R. This work was done in collaboration with my advisor
Dr. Nagaraj Neerchal.
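As a small illustration of why MC reliability estimates parallelize
well, the sketch below estimates the mean time to failure of a toy
system. It is not the author's R code: the series system with
exponential component lifetimes, the failure rates, and all names are
assumptions made for this example. Each chunk of replications uses its
own seeded generator and could run on a separate process; only the
partial sums need to be combined.

```python
import random


def chunk_sum_of_lifetimes(rates, n_samples, seed):
    """Sum of simulated system lifetimes for one independent chunk.

    Assumed toy model: a series system fails when its first component
    fails, so the system lifetime is the minimum component lifetime.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += min(rng.expovariate(rate) for rate in rates)
    return total


def estimate_mttf(rates, n_chunks, samples_per_chunk, base_seed=0):
    """Combine per-chunk sums into one mean time to failure estimate."""
    # Distinct seeds per chunk are a simplification for illustration;
    # production code would use a dedicated parallel random number stream.
    partial_sums = [
        chunk_sum_of_lifetimes(rates, samples_per_chunk, base_seed + i)
        for i in range(n_chunks)
    ]
    return sum(partial_sums) / (n_chunks * samples_per_chunk)


# Hypothetical failure rates; the exact MTTF of this toy system is 1 / sum(rates).
rates = (1.0, 2.0, 3.0)
print(estimate_mttf(rates, n_chunks=4, samples_per_chunk=20000), 1.0 / sum(rates))
```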
-
02:20-02:35
Parallel Delaunay Triangulation
Yuping Zhang,
Department of Computer Science and Electrical Engineering
Delaunay triangulation is a well-known topic in computational
geometry. In this report, I discuss several sequential and parallel
Delaunay triangulation methods. I compare the advantages and
shortcomings of these methods, then explain the implementation of
both the sequential and the parallel DeWall algorithm. I ran a few sample
point sets on the Maya cluster, using the serial program and the parallel
program separately. The results are validated by comparison with
Octave results. A performance study analyzes the
speedup and efficiency of the parallel DeWall method from the recorded
wall clock timings. It shows the benefit of using the parallel method.
Copyright © 2001-2016 by Matthias K. Gobbert. All Rights Reserved.
This page version 0.1, December 2016.