Math 447/627 - Introduction to Parallel Computing

Fall 2016 - Matthias K. Gobbert
Presentations of the Class Projects

Friday, December 16, 2016, 01:00 p.m.

01:05-01:20
A Parallel Performance Study of the High-Order Compact Direct Flux Reconstruction Method for Conservation Laws on the Maya Cluster
Lai Wang, Department of Mechanical Engineering
The compact direct flux reconstruction method (CDFR) for conservation laws utilizes techniques from compact finite difference methods to directly approximate spatial derivatives of fluxes within the standard elements. The CDFR scheme is a family of compact high-order methods that can be efficiently parallelized for high performance computing. In the present work, a parallel performance study of the 3rd-order CDFR scheme with a 3rd-order explicit Runge-Kutta scheme is conducted. The inviscid isentropic vortex propagation problem is adopted as a test case. The convergence rate studies have demonstrated that the CDFR method works well for conservation laws. The performance study shows excellent observed speedup and efficiency. This work was done in collaboration with my advisor Dr. Meilin Yu.
01:20-01:35
Training Neural Networks in Parallel With ADMM
Nicholas Haltmeyer, Department of Computer Science and Electrical Engineering
Neural networks are an increasingly popular model for solving classification problems, including optical character recognition. As these networks have become deeper and richer in output space, alternatives to classical training methods have been proposed to allow for scalable, distributed learning. Herein I describe an application of the alternating direction method of multipliers (ADMM) for training neural networks in a distributed environment. Experiments are then presented for determining appropriate settings and evaluating performance on the MNIST database of handwritten digits. This work was done in collaboration with my advisor Dr. Ting Zhu.
01:35-01:50
Parallelizing the Computation of the Expected Value in a Binomial Tree with Bernoulli Paths
Sai Kumar Popuri, Department of Mathematics and Statistics
Recombinant binomial trees are structures with two child nodes branching out from every node, including the root node, in such a way that the nodes combine to form a symmetric structure. They arise in several model computations. For example, in option pricing in finance, valuation of a European option can be carried out by evaluating the expected value of the asset payoffs with respect to the probabilities of traversing the branches of the tree, when a closed form solution is not appropriate. When the size of the tree increases, the cost to compute the expected value grows exponentially, rendering a serial computation one branch at a time very time-consuming and impractical. We propose a parallelization method that transforms the calculation of the expected value into a so-called 'embarrassingly parallel' problem by mapping the branches of the binomial tree to the processes on a parallel computing cluster. We also propose two Monte Carlo estimation methods that exploit the stratification setup of our parallelization method and compare the relative variance reductions. We have implemented our parallelization method in the statistical environment R and the programming language Julia to calculate the price of a European option. Our numerical results indicate that while both the R and Julia implementations are scalable, the execution times of the Julia implementation are significantly shorter than those of the corresponding R implementation. A simulation study is carried out for an Asian and a Fixed Look-back option to verify the convergence and variance reduction behaviors of the stratified Monte Carlo methods in relation to the regular Monte Carlo estimator. This work was done in collaboration with my advisor Dr. Nagaraj Neerchal, together with Dr. Matthias Gobbert and Dr. Andrew Raim.
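As a rough illustration of the idea described in this abstract (not the authors' R or Julia code), the following Python sketch splits the set of Bernoulli paths into strata by their first few moves, so that each stratum can be summed independently. All names and parameter values (`stratum_sum`, `expected_payoff`, `call_payoff`, the up and down factors, the strike) are hypothetical, the payoff is assumed to depend only on the terminal price, and discounting is omitted. The strata are evaluated in a plain loop here; in a parallel setting each one could be handed to a separate process.

```python
from itertools import product


def stratum_sum(prefix, n_steps, s0, u, d, p, payoff):
    """Probability-weighted payoff summed over all paths that begin with `prefix`.

    A path is a tuple of 0/1 moves (1 = up, 0 = down). Each stratum is
    independent of the others, which is what makes the problem
    embarrassingly parallel.
    """
    total = 0.0
    for tail in product((0, 1), repeat=n_steps - len(prefix)):
        ups = sum(prefix) + sum(tail)        # number of up-moves on this path
        downs = n_steps - ups
        prob = p**ups * (1.0 - p)**downs     # probability of this Bernoulli path
        terminal = s0 * u**ups * d**downs    # asset price after the last step
        total += prob * payoff(terminal)
    return total


def expected_payoff(n_steps, prefix_len, s0, u, d, p, payoff):
    """Expected payoff as the sum of 2**prefix_len independent stratum sums."""
    prefixes = product((0, 1), repeat=prefix_len)
    # Each call below could be assigned to its own worker process.
    partials = [stratum_sum(pre, n_steps, s0, u, d, p, payoff) for pre in prefixes]
    return sum(partials)


def call_payoff(terminal, strike=100.0):
    """Hypothetical European call payoff with an assumed strike of 100."""
    return max(terminal - strike, 0.0)


# Hypothetical inputs: 6 steps, 4 strata (prefix length 2).
value = expected_payoff(6, 2, 100.0, 1.1, 0.9, 0.5, call_payoff)
print(value)
```

The sketch enumerates all 2^n paths, so it also shows why the serial cost grows exponentially with the number of steps; the prefix length only controls how that work is divided.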
01:50-02:05
Performance Study of an Elastic Plane Problem on Maya Cluster
Yangling Zhou, Department of Mechanical Engineering
The Finite Element Method (FEM) is a widely used numerical analysis tool in many fields. Highly accurate results can be obtained with a refined mesh, but at the cost of long computing times. Parallel computing can be used to overcome this shortcoming. In this report, an elastic problem of a square plane that is fixed on the left side and has a distributed load on the top is solved as a test problem using FEM and parallel computing. Two point-to-point parallel communication methods, blocking and nonblocking communications, are used and the results are compared. The results show that parallel computing with either blocking or nonblocking communications clearly speeds up the computation. Nonblocking communication shows no higher efficiency than blocking communication for this problem. This study was done in collaboration with my supervisor Dr. Panos G. Charalambides of the Department of Mechanical Engineering.
02:05-02:20
Monte Carlo Parallel Processing: Reliability Analysis
Aryana Arsham, Department of Mathematics and Statistics
Monte Carlo (MC) simulation is a widely used tool for the analysis of complex reliability systems. The performance of such systems is expressed by reliability metrics such as mean time to failure. Simulation methods for assessing system reliability are often hindered by their requirement for considerable memory and time. Implementing MC reliability algorithms using parallel processing methods greatly enhances their efficiency and effectiveness. This work presents MC and parallel processing methods applied to realistic system modeling, as well as results of implementing such algorithms in the R software. This work was done in collaboration with my advisor Dr. Nagaraj Neerchal.
02:20-02:35
Parallel Delaunay Triangulation
Yuping Zhang, Department of Computer Science and Electrical Engineering
Delaunay triangulation is a well-known topic in computational geometry. In this report, I discuss several sequential and parallel Delaunay triangulation methods. I compare the advantages and shortcomings of those methods, then explain the implementation of both the sequential and the parallel DeWall algorithm. I ran a few sample point sets on the Maya cluster, using the serial program and the parallel program separately. The results are validated by comparison with Octave results. A performance study analyzes the speedup and efficiency of the parallel DeWall method from the recorded wall clock timings. It shows the benefit of using the parallel method.
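For readers unfamiliar with the two metrics that recur in these abstracts, the sketch below applies their standard definitions: observed speedup S_p = T_1 / T_p and efficiency E_p = S_p / p, where T_p is the wall clock time on p processes. The function name `speedup_and_efficiency` and the timings are hypothetical and are not taken from any of the projects.

```python
def speedup_and_efficiency(wall_times):
    """Map each process count p to (speedup, efficiency).

    `wall_times` maps a process count p to the wall clock time T_p in
    seconds and must contain the serial time under key 1.
    """
    t1 = wall_times[1]
    table = {}
    for p in sorted(wall_times):
        speedup = t1 / wall_times[p]          # S_p = T_1 / T_p
        table[p] = (speedup, speedup / p)     # E_p = S_p / p
    return table


# Hypothetical timings in seconds, for illustration only.
timings = {1: 120.0, 2: 60.0, 4: 40.0}
for p, (s, e) in speedup_and_efficiency(timings).items():
    print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

An efficiency close to 1 means the added processes are almost fully used; values well below 1 usually point to communication overhead or load imbalance.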
