HIGH PERFORMANCE COMPUTING AND SIMULATIONS (Spring 08)
Course Number: CSCI 653
Class Number: 30196R
Instructor:
Aiichiro Nakano;
office: VHE 610; phone: (213) 821-2657; email: anakano@usc.edu
Lecture: 3:30-4:50pm M W, KAP 165
Office Hours: 3:30-4:50pm F
Prerequisites: (1) CS596 (Scientific Computing and Visualization) or
(2) basic knowledge of numerical methods, parallel computing, and 3D graphics
(CS580 or equivalent).
Textbooks:
D. Frenkel and B. Smit,
Understanding Molecular Simulation: From Algorithms to Applications, 2nd Ed.
(Academic Press, 2001)--recommended
A. Grama, A. Gupta, G. Karypis, and V. Kumar,
Introduction to Parallel Computing, 2nd Ed.
(Addison-Wesley, 2003)--recommended
W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling,
Numerical Recipes, 3rd Ed. (Cambridge Univ. Press, 2007)
--available online
(C,
Fortran 77, and
Fortran 90)
Course Description
Provide students with advanced techniques that are common to high performance computer simulations
in science and engineering. Scalable algorithms for both deterministic and stochastic simulations
of particles and continuum will be implemented on massively parallel and
distributed Grid computing platforms, and the simulation datasets will be visualized and analyzed
in immersive and interactive virtual environment.

Visualization of topological defects in a 209 million-node
chemical bond network.
Syllabus
- Deterministic particle simulation algorithms
- Survey of molecular dynamics (MD) simulation: spatiotemporal data locality in MD
- Fast computation of electrostatic interaction: O(N) fast multipole method
- Multiple time stepping: fuzzy cluster dynamics
- Parallel computing frameworks
- Parallel algorithm design: divide-and-conquer parallelization,
spatial vs. particle vs. force decomposition,data-driven parallelization
- Load balancing: wavelet-based computational space decomposition,
recursive spectral bisection, spacefilling-curve decomposition, load diffusion
- Scalability analysis
- Optimization of parallel scientific applications
- Deterministic continuum simulation algorithms
- Survey of quantum dynamics (QD) simulation
- Fast solutions of partial differential equations (PDE): O(NlogN) fast Fourier transform,
O(N) wavelet transform, O(N) multigrid method
- O(N) Lanczos and Davidson algorithms for the eigenvalue problem
- Multiscale particle-continuum simulation
- Hybridization techniques: minimizing model-boundary artifacts,
modular algorithm design, adaptive hybridization
- O(N) multiscale optimization
- Space-time multiscaling
- Stochastic simulation algorithms
- Survey of Monte Carlo (MC) simulation: estimator, importance sampling, Markov chain,
Metropolis algorithm
- Simulated annealing
- Kinetic MC: master equation, Poisson process
- Grid scientific computing
- Grid programming: Grid-enabled message passing interface (MPI-G2),
Grid remote procedure call (Ninf-G)
- Grid enabling parallel applications: virtualization-aware scientific algorithms
based on data-locality principles
- Grid MC applications: parallel replica and replica exchange MC
- Scientific data analysis and visualization
- Interactive visualization of large datasets in immersive virtual environment:
hierarchical/probabilistic culling algorithms
- Topology analysis: shortest-path circuits, parallel graph algorithms
- Scientific data mining
- Data compression
- Integration of simulation, data analysis and visualization on a Grid
- Advanced scientific computing methods
- Local and global optimization in molecular dynamics: physically-based preconditioning
of iterative solvers, basin-hopping algorithms, disconnectivity-graph analysis of
the energy landscape
- Accelerated long-time dynamics: path-integral sampling, ensemble mean-field method,
hyper dynamics, activation-relaxation method
Announcements
- 1/14/08 (M): Please email your answer to the questionnaire (1. Name, 2. PhD or MS, 3. Major,
4. Your Research Topics, 5. What High Performance Computing Methods in Particular
Would You Like to Learn?) from your USC account to anakano@usc.edu with subject csci653
by the end of Sunday, Jan. 20.
- 1/27/08: Instead of developing your own Web page for assignment 0, write your report
on the Wiki page.
- 3/5/08 (W): A tour to the USC
High Performance Computing Center, starting from KAP 165 at 3:30 pm.
- 4/7/08 (M): How to color a complex wave function is found in
Visualizing quantum scattering on the CM-2 supercomputer,
J. L. Richardson, Comput. Phys. Commun. 63, 84 (1991).
- 4/11/08 (F): Discussion session for HW 4 from 3:30pm in VHE 610.
- 4/18/08 (F): You are welcome to attend the guest lecture by
Prof. Elizabeth Cochran on
The Quake Catcher Network: Adventures in Volunteer Computing
at 10:00-10:50 am in GFS 216.
Lecture Notes and Reading List
- Introduction: slides
- Deterministic particle simulation algorithms
- Survey of molecular dynamics: notes and
slides;
minimal complex analysis
- Fast multipole method:
-
A fast algorithm for particle simulations,
L. Greengard and V. Rokhlin,
J. Comput. Phys. 73, 325 (1987)--Chunlei
-
Parallel multilevel preconditioned conjugate-gradient approach
to variable-charge molecular dynamics,
A. Nakano,
Comput. Phys. Commun. 104, 59 (1997)
-
Scalable and portable implementation of the fast multipole method on parallel computers,
S. Ogata, et al.,
Comput. Phys. Commun. 153, 445 (2003), source code available
at the CPC Program Library
- Multiple time stepping:
-
Reversible multiscale molecular dynamics,
M. Tuckerman, B. J. Berne, and G. J. Martyna,
J. Chem. Phys. 97, 1990 (1992)
-
Multiresolution molecular dynamics algorithm for realistic materials modeling on parallel computers,
A. Nakano, R. K. Kalia, and P. Vashishta,
Comput. Phys. Commun. 83, 197 (1994)
-
Fuzzy clustering approach to hierarchical molecular dynamics simulation of multiscale materials phenomena,
A. Nakano,
Comput. Phys. Commun. 105, 139 (1997)
- Parallel computing frameworks
- Parallel computing basics and parallel molecular dynamics:
message passing interface (MPI);
OpenMP;
hybrid MPI+OpenMP programming;
parallel MD algorithms (notes and
slides);
scalability analysis of parallel
molecular-dynamics and fast-multipole-method algorithms
-
Fast parallel algorithms for short-range molecular dynamics,
S. Plimpton,
J. Comp. Phys. 117, 1 (1995)
-
NAMD2: greater scalability for parallel molecular dynamics,
L. V. Kale, et al.,
J. Comput. Phys. 151, 283 (1999);
NAMD class hierarchy;
NAMD files
-
Anton, a special-purpose machine for molecular dynamics simulation,
D. E. Shaw, et al.,
Proc. of Int'l Symp. on Computer Architecture (IEEE, 2007);
a fast, scalable method for the parallel evaluation of distance-limited
pairwise particle interactions
D. E. Shaw,
J. Comput. Chem. 26, 1318 (2005)--Chih-Ying
- Divide-and-conquer parallelism: notes and
slides
- Load balancing: slides;
Lanczos method for eigensystems (slides and
supplementary notes) used in spectral-bisection
load balancer
-
Performance of dynamic load balancing algorithms for unstructured mesh calculations,
R. D. Williams,
Concurrency: Practice and Experience 3, 457 (1991)
-
A fast and high quality multilevel scheme for partitioning irregular graphs,
G. Karypis and V. Kumar,
SIAM J. Sci. Comput. 20, 359 (1998)
-
Multiresolution load balancing in curved space: the wavelet representation,
A. Nakano,
Concurrency: Practice and Experience 11, 343 (1999)
-
New challenges in dynamic load balancing,
K. D. Devine, et al.,
Applied Numerical Mathematics 52, 133 (2005)--Osman
-
Hypergraph-based dynamic load balancing for adaptive scientific computations,
U. V. Catalyurek, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IEEE, 2007)
- Optimizing parallel MD: slides; lecture note by
Prof. James Demmel (UC Berkeley) on
high performance programming on a single processor: memory hierarchies,
matrix multiplication, and automatic performance tuning
-
Improving memory hierarchy performance for irregular applications,
J. Mellor-Crummey, D. Whalley, and K. Kennedy,
in Proc. of 1999 Int'l Conf. on Supercomputing (ACM, 1999);
Improving memory hierarchy performance for irregular applications
using data and computation reorderings,
Int'l J. Parallel Prog. 29, 217 (2001)--John (Tran)
-
Metrics and models for reordering transformations,
M. M. Strout and P. D. Hovland,
in Proc. of Workshop on Memory System Performance (ACM, 2004)--John (Tran)
-
Analysis of the clustering properties of the Hilbert space-filling curve,
B. Moon, et al.,
IEEE Trans. Knowledge Data Eng. 13, 124 (2001)
-
Recursive blocked algorithms and hybrid data structures for
dense matrix library software,
E. Elmroth, et al.,
SIAM Rev. 46, 3 (2004)
-
Cache-oblivious algorithms,
M. Frigo, et al.,
in Proc. of 40th Symp. on Foundation of Computer Science (FOCS) (IEEE, 1999)
- New architectures:
- Lattice-Boltzmann fluid simulation on a Playstation3 cluster:
slides
- MIT couse on
multicore programming primer: learn and compete in programming the Playstation3
cell processor (lectures and sample codes)
- UIUC course on
GPU computing
-
A programming example: large FFT on the cell broadband engine,
A. C. Chow, et al., IBM Technical Report (2005)
-
A rough guide to scientific computing on the Playstation3,
A. Buttari, et all., Univ. of Tennessee, Knoxville Technical report (2007)
-
Parallel lattice Boltzmann flow simulation on a low-cost PlayStation3 cluster,
K. Nomura, et al.,
Int'l J. Comput. Sci., in press (2008)
-
The promise and perils of the coming multicore revolution and its impact,
J. J. Dongarra, ed.,
CT Watch Quarterly 3(1) (February, 2007)
-
Intel teraflops research chip home page; see also Intel's
overview presentation
-
The landscape of parallel computing research: a view from Berkeley,
K. Asanovic, et al., UC Berkeley Technical Report (2006)
- AstroGPU (Princeton, NJ, Nov. 9-10, 2007)
talks
-
Accelerating molecular modeling applications with graphics processors,
J. E. Stone, et al.,
J. Comput. Chem. 28, 2618 (2007)--Liu
-
Harvesting graphics power for MD simulations,
J. A. van Meel, et al.,
arXiv:cond-mat.other/0709.3225v1--Liu
- Deterministic continuum simulation algorithms
- Survey of quantum dynamics (QD): QD basics;
spectral QD and fast Fourier transform (FFT);
parallelizing QD
- Multiresolution numerical methods: slides;
wavelets (Numerical Recipes, Sec. 13.10);
wavelets for image compression;
multigrid method (Numerical Recipes, Sec. 19.6);
-
Massively parallel algorithms for computational nanoelectronics based on quantum molecular dynamics,
A. Nakano, R. K. Kalia, and P. Vashishta,
Comput. Phys. Commun. 83, 181 (1994)
-
Embedded divide-and-conquer algorithm on hierarchical real-space grids:
parallel molecular dynamics simulation based on linear-scaling density functional theory,
F. Shimojo, et al.,
Comput. Phys. Commun. 167, 151 (2005)
-
Wavelets for computer graphics: a primer,
E. J. Stollnitz, et al.,
IEEE Computer Graphcs Appl. 15(3), 76 (1995)
- Hybrid particle/continuum simulation methods
- Multiscale simulation methods: slides
-
Linear-scaling relaxation of the atomic positions in nanostructures,
S. Goedecker, et al.,
Phys. Rev. B 64, 161102(R) (2001)
-
Hybrid finite-element/molecular-dynamics/electronic-density-functional approach
to materials simulations on parallel computers,
S. Ogata, et al.,
Comput. Phys. Commun. 138, 143 (2001)
-
Equation-free: the computer-aided analysis of complex multiscale systems,
I. G. Kevrekidis, C. W. Gear, and G. Hummer,
AlChE. J. 50, 1346 (2004)
-
Multiscale modeling of the dynamics of solids at finite temperature,
X. Li and W. E,
J. Mech. Phys. Solids 53, 1650 (2005)
-
Generalized mathematical homogenization of atomistic media at finite temperatures
in three dimensions,
J. Fish, W. Chen, and R. Li,
Comput. Meth. Appl. Mech. Eng. 196, 908 (2007)
-
Learning on the fly: a hybrid classical and quantum-mechanical molecular dynamics simulation,
G. Csanyi, et al.,
Phys. Rev. Lett. 93, 175503 (2004)
-
A python approach to multi-code simulations: CHIMPS
J. U. Schlutter, et al.,
Annual Research Briefs of the Center for Turbulence Research
(Stanford Univ., 2005)-- John (Spiegel)
- Scientific data analysis and visualization
- Visualization basics--OpenGL programming and visualizing particles:
notes and
slides
- Massive dataset visualization: slides
-
Immersive and interactive exploration of billion-atom systems,
A. Sharma, et al.,
Presence: Teleoperators and Virtual Environments 12, 85 (2003)
-
From mesh generation to scientific visualization:
an end-to-end approach to parallel supercomputing,
T. Tu, et al.,
in Proc. of IEEE/ACM Supercomputing 2006 (SC06) (IEEE, 2006)
-
ParaViz: a spatially decomposed parallel visualization algorithm
using hierarchical visibility ordering,
C. Zhang, et al.,
Int'l J. Computat. Sci. 1, 407 (2007)
- Scientific data mining: slides;
singular value decomposition and data mining
-
Mining scientific data,
N. Ramakrishnan and A. Grama,
Adv. Comput. 55, 119 (2001)
- Graph-based data mining,
D. J. Cook and L. B. Holder,
IEEE Intelligent Systems 15(2), 32 (2000)
- State of the art of graph-based
data mining,
T. Washio and H. Motoda,
ACM SIGKDD Explorations 5(1), 59 (2003)
-
A scalable distributed parallel breadth-first search algorithm on BlueGene/L,
A. Yoo, et al.,
in Proc. of IEEE/ACM Supercomputing 2005 (SC05) (ACM, 2005)
-
Collision-free spatial hash functions for structural analysis of billion-vertex
chemical bond networks,
C. Zhang, et al.,
Comput. Phys. Commun. 175, 339 (2006)
-
Change detection in time series data using wavelet footprints,
M. Sharifzadeh, F. Azmoodeh, and C. Shahabi,
Lecture Notes in Computer Science, 3633, 127 (2005)--Suzana
- Stochastic simulation methods
- Monte Carlo (MC) simulation basics: MC basics;
parallel MC;
kinetic MC simulation
- Long time dynamics and global optimization:
-
Extending the time scale in atomistic simulation of materials,
A. F. Voter, F. Montalenti, and T. C. Germann,
Annu. Rev. Mater. Res. 32, 321 (2002)
-
Sampling activated mechanisms in proteins with the activation-relaxation technique,
N. Mousseau, et al.,
J. Mol. Graph. Model. 19, 78 (2001);
source codes
-
Self-learning kinetic Monte Carlo method: application to Cu(111),
O. Trushin, et al.,
Phys. Rev. B 72, 115401 (2005)
-
Pathfinder: a parallel search algorithm for concerted atomistic events,
A. Nakano,
Comput. Phys. Commun. 176, 292 (2007)
-
A space-time-ensemble parallel nudged elastic band algorithm
for molecular kinetics simulation,
A. Nakano,
Comput. Phys. Commun. 178, 280 (2008)
-
The topology of multidimensional potential energy surfaces: theory and
application to peptide structure and kinetics,
O. M. Becker and M. Karplus,
J. Chem. Phys. 106, 1495 (1997)
-
Automated model reduction for complex systems exhibiting metastability,
I. Horenko, et al.,
Multiscale Model. Sim. 5, 802 (2006)
-
Protein folding by zipping and assembly,
S. B. Ozkan, et al., Proc. Nat'l Academy Sci. 104, 11987 (2007)
-
Computational linguistics: a new tool for exploring biopolymer structures
and statistical mechanics,
K. A. Dill, et al., Polymer 48, 4289 (2007)
- Stochastic continuum simulations
- Grid scientific computing
- Grid computing basics for application scientists: slides
- Grid-enabling scientific applications
-
Screen savers of the world unite,
M. Shirts and V. S. Pande,
Science 290, 1903 (2000)
-
Supporting efficient execution in heterogeneous distributed computing environments
with Cactus and Globus,
G. Allen, et al.,
in Proc. of IEEE/ACM Supercomputing 2001(SC01) (ACM, 2001)
-
Collaborative simulation Grid: multiscale quantum-mechanical/classical atomistic
simulations on distributed PC clusters in the US and Japan,
H. Kikuchi, et al.,
in Proc. of IEEE/ACM Supercomputing 2002 (SC02) (IEEE, 2002)
-
Sustainable adaptive Grid supercomputing: multiscale simulation of semiconductor
processing across the Pacific,
H. Takemiya, et al.,
in Proc. of IEEE/ACM Supercomputing 2006 (SC06) (IEEE, 2006)
-
Remote runtime steering of integrated terascale simulation and visualization,
H. Yu, et al.,
in Proc. of IEEE/ACM Supercomputing 2006 (SC06) (IEEE, 2006)
-
SCEC CyberShake workflows - automating probabilistic seismic hazard analysis
calculations,
P. Maechling, et al.,
in Workflows for e-Science, eds. D. Gannon, et al. (Springer, 2007)
-
BOINC: A system for public-resource computing and storage,
D. P. Anderson,
Proc. of 5th Int'l Workshop on Grid Computing (IEEE/ACM, 2004)--Gideon
-
A case for Grid computing on virtual machines,
R. J. Figueiredo, P. A. Dinda, and J. A. B. Fortes,
Proc. of Int'l Conf. on Distributed Computing Systems
(IEEE CS Press, 2003)--Gideon
Assignments
Source Codes
- MPI, OpenMP, and hybrid MPI+OpenMP:
mpi_simple.c, omp_pi.c, hpi.c
- Parallel molecular dynamics:
pmd.c, pmd.h, pmd.in
- Parallel hypercube sort:
pbmerge_mpi.c
- Quantum dynamics:
qd.c, qd.h, qd.in; qd1.c, qd1.h, qd1.in, four1.c
- Image I/O:
image-rw.c, Lenna512x512.pgm
- Visualization:
atomv.c, atomv.h, md.conf, ball.c
Useful Links