HIGH PERFORMANCE COMPUTING AND SIMULATIONS
Reading List
Deterministic particle simulation algorithms
 Fast multipole method

A fast algorithm for particle simulations,
L. Greengard and V. Rokhlin,
J. Comput. Phys. 73, 325 (1987)

Parallel multilevel preconditioned conjugate-gradient approach
to variable-charge molecular dynamics,
A. Nakano,
Comput. Phys. Commun. 104, 59 (1997)

Scalable and portable implementation of the fast multipole method on
parallel computers,
S. Ogata, et al.,
Comput. Phys. Commun. 153, 445 (2003), source code available
at the CPC Program Library

A massively parallel adaptive fast multipole method on heterogeneous architectures,
I. Lashuk, et al.,
Commun. ACM 55, 101 (2012)

2HOT: an improved parallel hashed oct-tree N-body algorithm for cosmological simulation,
M. S. Warren,
in Proc. of Supercomputing (SC13) (ACM/IEEE, 2013)

Comparison of scalable fast methods for long-range interactions,
A. Arnold, et al.,
Phys. Rev. E 88, 063308 (2013)

Multilevel summation with B-spline interpolation for pairwise interactions in molecular dynamics simulations,
D. J. Hardy, et al.,
J. Chem. Phys. 144, 114112 (2016)
 Multiple time stepping

Reversible multiple time scale molecular dynamics,
M. Tuckerman, B. J. Berne, and G. J. Martyna,
J. Chem. Phys. 97, 1990 (1992)

Multiresolution molecular dynamics algorithm for realistic materials modeling
on parallel computers,
A. Nakano, R. K. Kalia, and P. Vashishta,
Comput. Phys. Commun. 83, 197 (1994)

Fuzzy clustering approach to hierarchical molecular dynamics simulation of
multiscale materials phenomena,
A. Nakano,
Comput. Phys. Commun. 105, 139 (1997)

A massively space-time parallel N-body solver,
R. Speck, et al.,
in Proc. of Supercomputing (SC12) (IEEE/ACM, 2012)
Parallel computing frameworks
 Big picture

The landscape of parallel computing research: a view from Berkeley,
K. Asanovic, et al., UC Berkeley Tech. Rep. (2006)

The promise and perils of the coming multicore revolution and its impact,
J. J. Dongarra, ed.,
CT Watch Quarterly 3(1) (Feb., 2007)

Exascale computing and big data,
D. A. Reed and J. Dongarra,
Commun. ACM 58(7), 56 (2015)

Design for U.S. exascale computer takes shape,
R. F. Service, Science 359, 617 (2018)

Compute Cambrian explosion,
T. Coughlin, Forbes, Apr. 26 (2019)
 Parallel computing basics and parallel molecular dynamics

Fast parallel algorithms for short-range molecular dynamics,
S. Plimpton,
J. Comput. Phys. 117, 1 (1995)

NAMD2: greater scalability for parallel molecular dynamics,
L. V. Kale, et al.,
J. Comput. Phys. 151, 283 (1999);
NAMD class hierarchy;
NAMD files

Hybrid message-passing and shared-memory programming in a molecular dynamics application
on multicore clusters,
M. J. Chorley, et al.,
Int'l J. High Performance Comput. Appl. 23, 196 (2009)

Efficient parallel implementation of molecular dynamics with embedded atom method
on multicore platforms,
C. Hu, et al.,
in Proc. of Int'l Conf. on Parallel Processing (IEEE, 2009)

Extending the generality of molecular dynamics simulations on a special-purpose machine,
D. P. Scarpazza, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IPDPS 2013) (IEEE, 2013);
Millisecond-scale molecular dynamics simulations on Anton,
D. E. Shaw, et al.,
in Proc. of Supercomputing (SC09) (ACM/IEEE, 2009);
A fast, scalable method for the parallel evaluation of distance-limited
pairwise particle interactions,
D. E. Shaw,
J. Comput. Chem. 26, 1318 (2005)

Analysis of scalable data-privatization threading algorithms for hybrid MPI/OpenMP
parallelization of molecular dynamics,
M. Kunaseth, et al.,
J. Supercomput. 66, 406 (2013)

Performance characteristics of hardware transactional memory for molecular dynamics
application on BlueGene/Q: toward efficient multithreading strategies for
large-scale scientific applications,
M. Kunaseth, et al.,
in Proc. of Int'l Workshop on Parallel and Distributed Scientific and Engineering Computing
(PDSEC13) (IEEE, 2013)

A scalable parallel algorithm for dynamic range-limited n-tuple computation
in many-body molecular dynamics simulation,
M. Kunaseth, et al.,
in Proc. of Supercomputing (SC13) (ACM/IEEE, 2013)

Metascalable quantum molecular dynamics simulations of hydrogen-on-demand,
K. Nomura, et al.,
in Proc. of Supercomputing (SC14) (IEEE/ACM, 2014)

Kokkos: enabling manycore performance portability through polymorphic memory access patterns,
H. C. Edwards, et al.,
J. Par. Distrib. Comput. 74, 3202 (2014)

GROMACS: high performance molecular simulations through
multilevel parallelism from laptops to supercomputers,
M. J. Abraham, et al.,
SoftwareX 1-2, 19 (2015)

Order-invariant real number summation: circumventing accuracy loss for multimillion summands on multiple parallel architectures,
P. E. Small, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IPDPS 2016) (IEEE, 2016)

Redesigning LAMMPS for peta-scale and hundred-billion-atom simulation on Sunway TaihuLight,
X. Duan, et al.,
in Proc. of Supercomputing (SC18) (IEEE/ACM, 2018)

Shift-collapse acceleration of generalized polarizable reactive molecular dynamics for machine learning-assisted computational synthesis of layered materials,
K. Liu, et al.,
in Proc. of ScalA18, p. 41 (IEEE/ACM, 2018)

Shift/collapse on neighbor list (SC-NBL): fast evaluation of dynamic many-body potentials in molecular dynamics simulations,
M. Kunaseth, et al.,
Comput. Phys. Commun. 235, 88 (2019)
 Divide-conquer-"recombine" parallelism

A divide-and-conquer/cellular-decomposition framework for million-to-billion atom simulations of
chemical reactions,
A. Nakano, et al.,
Comput. Mater. Sci. 38, 642 (2007)

De novo ultrascale atomistic simulations on high-end parallel supercomputers,
A. Nakano, et al.,
Int'l J. High Performance Comput. Appl. 22, 113 (2008)

A metascalable computing framework for large spatiotemporal-scale atomistic simulations,
K. Nomura, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IPDPS 2009) (IEEE, 2009)

Nanoscopic mechanisms of singlet fission in amorphous molecular solid,
W. Mou, et al.,
Appl. Phys. Lett. 102, 173301 (2013)

A divide-conquer-recombine algorithmic paradigm for large spatiotemporal quantum molecular dynamics simulations,
F. Shimojo, et al.,
J. Chem. Phys. 140, 18A529 (2014)

Quantum molecular dynamics in the post-petaflop/s era,
N. A. Romero, et al.,
IEEE Computer 48(11), 33 (2015)
 Load balancing

Performance of dynamic load balancing algorithms for unstructured mesh calculations,
R. D. Williams,
Concurrency: Practice and Experience 3, 457 (1991)

Provably good partitioning and load balancing algorithms
for parallel adaptive N-body simulation,
S.-H. Teng,
SIAM J. Sci. Comput. 19, 635 (1998)

A fast and high quality multilevel scheme for partitioning irregular graphs,
G. Karypis and V. Kumar,
SIAM J. Sci. Comput. 20, 359 (1998)

Multiresolution load balancing in curved space: the wavelet representation,
A. Nakano,
Concurrency: Practice and Experience 11, 343 (1999)

New challenges in dynamic load balancing,
K. D. Devine, et al.,
Applied Numerical Mathematics 52, 133 (2005)

Hypergraph-based dynamic load balancing for adaptive scientific computations,
U. V. Catalyurek, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IPDPS 2007) (IEEE, 2007)

A repartitioning hypergraph model for dynamic load balancing,
U. V. Catalyurek, et al.,
J. Parallel Distrib. Comput. 69, 711 (2009)

Load balancing N-body simulations with highly non-uniform density,
O. Pearce, et al.,
in Proc. of Int'l Conf. on Supercomputing (ICS'14) (ACM, 2014)
 Optimizing parallel MD

Improving memory hierarchy performance for irregular applications,
J. Mellor-Crummey, D. Whalley, and K. Kennedy,
in Proc. of Int'l Conf. on Supercomputing (ACM, 1999);
Improving memory hierarchy performance for irregular applications
using data and computation reorderings,
Int'l J. Parallel Prog. 29, 217 (2001)

Cache-oblivious algorithms,
M. Frigo, et al.,
in Proc. of Symp. on Foundations of Computer Science (FOCS) (IEEE, 1999)

Analysis of the clustering properties of the Hilbert space-filling curve,
B. Moon, et al.,
IEEE Trans. Knowledge Data Eng. 13, 124 (2001)

Metrics and models for reordering transformations,
M. M. Strout and P. D. Hovland,
in Proc. of Workshop on Memory System Performance (ACM, 2004)

Recursive blocked algorithms and hybrid data structures for
dense matrix library software,
E. Elmroth, et al.,
SIAM Rev. 46, 3 (2004)

Roofline: an insightful visual performance model for multicore architectures,
S. Williams, et al.,
Commun. ACM 52, 65 (2009)

Performance modeling, analysis, and optimization of cell-list based molecular dynamics,
M. Kunaseth, et al.,
in Proc. of Int'l Conf. on Scientific Comp. (CSC'10) (2010)

Exploiting hierarchical parallelisms for molecular dynamics simulation on multicore clusters,
L. Peng, et al.,
J. Supercomput. 57, 20 (2011)

Hierarchical parallelization and optimization of high-order stencil computations on multicore clusters,
H. Dursun, et al.,
J. Supercomput. 62, 946 (2012)

On using the roofline model with lower bounds on data movement,
V. Elango, et al.,
ACM T. Arch. Code Opt. 11, 67 (2015)

Mixed data layout kernels for vectorized complex arithmetic,
D. T. Popovici, et al.,
in Proc. of HPEC (IEEE, 2017)

Practical implementation of lattice QCD simulation on SIMD machines with Intel AVX-512,
I. Kanamori, et al.,
in Proc. of ICCSA (2018)
 New architectures

A programming example: large FFT on the cell broadband engine,
A. C. Chow, et al., IBM Tech. Rep. (2005)

A rough guide to scientific computing on the Playstation3,
A. Buttari, et al., Univ. of Tennessee, Knoxville Technical report (2007)

Accelerating molecular modeling applications with graphics processors,
J. E. Stone, et al.,
J. Comput. Chem. 28, 2618 (2007)

Parallel lattice Boltzmann flow simulation on a low-cost PlayStation3 cluster,
K. Nomura, et al.,
Int'l J. Comput. Sci. 2, 437 (2008)

Harvesting graphics power for MD simulations,
J. A. van Meel, et al.,
Mol. Sim. 34, 259 (2008)

An MPI performance monitoring interface for Cell based compute nodes,
H. Dursun, et al.,
Parallel Processing Lett. 19, 535 (2009)

A massively parallel adaptive fast-multipole method on heterogeneous architectures,
I. Lashuk, et al.,
in Proc. of Supercomputing (SC09) (ACM/IEEE, 2009)

Dynamic load balancing on single and multi-GPU systems,
L. Chen, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IPDPS 2010) (IEEE, 2010)

Preliminary investigation of optimizing molecular dynamics simulation
on Godson-T many-core processor,
L. Peng, et al.,
in Proc. of Workshop on Unconventional High Performance Comp. (2010)

Enhanced molecular dynamics performance with a programmable graphics processor,
D. C. Rapaport,
Comput. Phys. Commun. 182, 926 (2011)

Exploring SIMD for molecular dynamics, using Intel Xeon processors and
Intel Xeon Phi coprocessors,
S. J. Pennycook, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IPDPS 2013)
(IEEE, 2013)

Scalability study of molecular dynamics simulation on Godson-T many-core architecture,
L. Peng, et al.,
J. Par. Distrib. Comput. 73, 1469 (2013)

PuReMD-GPU: a reactive molecular dynamics simulation package for GPUs,
S. B. Kylasa, et al.,
J. Comput. Phys. 272, 343 (2014)

Knights Landing (KNL): 2nd generation Intel Xeon Phi processor,
A. Sodani, et al.,
Hot Chips (IEEE/ACM, 2015)

Optimizing noncontiguous memory access on Intel Xeon Phi coprocessors,
M. Ma, et al.,
in Proc. of Int'l Conf. High Perform. Comput. Commun. (HPCC)
(IEEE, 2015)

Strong scaling of general-purpose molecular dynamics simulations on GPUs,
J. Glaser, et al.,
Comput. Phys. Commun. 192, 97 (2015)

The Sunway TaihuLight supercomputer: system and applications,
H. Fu, et al.,
Sci. China Inf. Sci. 59, 072001 (2016);
Report on the Sunway TaihuLight system,
J. Dongarra,
Univ. of Tennessee Tech. Rep., UT-EECS-16-742 (2016);
China inches toward the exascale,
R. Courtland,
IEEE Spectrum, 53(8), 14 (2016)

MPI-ACC: accelerator-aware MPI for scientific applications,
A. M. Aji, et al.,
IEEE T. Par. Distrib. Sys. 27, 1401 (2016);
Evolving MPI+X toward exascale,
D. A. Bader,
IEEE Computer 49(8), 10 (2016);
MPI+X,
M. Wolfe, HPC Wire (2014);
MPI+MPI,
T. Hoefler et al., Computing 95, 1121 (2013)

Breadth first search vectorization on the Intel Xeon Phi,
M. Paredes, et al.,
in Proc. of Int'l Conf. Computing Frontiers (CF)
(ACM, 2016)

A GPU-accelerated machine learning framework for molecular simulation: HOOMD-blue with TensorFlow,
R. Barrett, et al.,
ChemRxiv, 8019527 (2019)

Towards artificial general intelligence with hybrid Tianjic chip architecture,
J. Pei, et al.,
Nature 572, 106 (2019)
Deterministic continuum simulation algorithms
 Multiresolution numerical methods

Massively parallel algorithms for computational nanoelectronics based on quantum molecular dynamics,
A. Nakano, R. K. Kalia, and P. Vashishta,
Comput. Phys. Commun. 83, 181 (1994)

Wavelets for computer graphics: a primer,
E. J. Stollnitz, et al.,
IEEE Computer Graphics Appl. 15(3), 76 (1995)

Embedded divide-and-conquer algorithm on hierarchical real-space grids:
parallel molecular dynamics simulation based on linear-scaling density functional theory,
F. Shimojo, et al.,
Comput. Phys. Commun. 167, 151 (2005)

Autotuning multigrid with PetaBricks,
C. Chan, et al.,
in Proc. of Supercomputing (SC09) (ACM/IEEE, 2009)

Parallel geometric-algebraic multigrid on unstructured forests of octrees,
H. Sundar,
in Proc. of Supercomputing (SC12) (IEEE/ACM, 2012)
 Parallel continuum simulations

Graph-based linear scaling electronic structure theory,
A. M. N. Niklasson, et al.,
J. Chem. Phys. 144, 234101 (2016)

QXMD: An open-source program for nonadiabatic quantum molecular dynamics,
F. Shimojo, et al.,
SoftwareX 10, 100307 (2019)

Parallel transport time-dependent density functional theory calculations with hybrid functional on Summit,
W. Jia, et al.,
arXiv, 1905.01348v1 (2019)
Hybrid particle/continuum simulation methods
 Multiscale simulation methods

Linear-scaling relaxation of the atomic positions in nanostructures,
S. Goedecker, et al.,
Phys. Rev. B 64, 161102(R) (2001)

Hybrid finite-element/molecular-dynamics/electronic-density-functional approach
to materials simulations on parallel computers,
S. Ogata, et al.,
Comput. Phys. Commun. 138, 143 (2001)

Equation-free: the computer-aided analysis of complex multiscale systems,
I. G. Kevrekidis, C. W. Gear, and G. Hummer,
AIChE J. 50, 1346 (2004)

Learn on the fly: a hybrid classical and quantum-mechanical molecular dynamics simulation,
G. Csanyi, et al.,
Phys. Rev. Lett. 93, 175503 (2004)

Multiscale modeling of the dynamics of solids at finite temperature,
X. Li and W. E,
J. Mech. Phys. Solids 53, 1650 (2005)

A python approach to multicode simulations: CHIMPS,
J. U. Schlutter, et al.,
Annual Research Briefs of the Center for Turbulence Research
(Stanford Univ., 2005)

Generalized mathematical homogenization of atomistic media at finite temperatures
in three dimensions,
J. Fish, W. Chen, and R. Li,
Comput. Meth. Appl. Mech. Eng. 196, 908 (2007)

Equation of motion for coarse-grained simulation based on microscopic description,
T. Kinjo and S. Hyodo,
Phys. Rev. E 75, 051109 (2007)

A hybrid multi-loop genetic-algorithm/simplex/spatial-grid method for locating
the optimum orientation of an adsorbed protein on a solid surface,
T. Wei, et al.,
Comput. Phys. Commun. 180, 669 (2009)

Hybrid lattice-Boltzmann/level-set method for liquid simulation and visualization,
Y. Kwak, et al.,
Int'l J. Comput. Sci. 3, 579 (2009)

Efficient ab initio modeling of random multicomponent alloys,
C. Jiang and B. P. Uberuaga,
Phys. Rev. Lett. 116, 105501 (2016)

Multiscale time-dependent density functional theory for a unified description of ultrafast dynamics: Pulsed light, electron, and lattice motions in crystalline solids,
A. Yamada and K. Yabana,
Phys. Rev. B 99, 245103 (2019)
Scientific data visualization and analytics
 Massive dataset visualization

Immersive and interactive exploration of billionatom systems,
A. Sharma, et al.,
Presence: Teleoperators and Virtual Environments 12, 85 (2003)

From mesh generation to scientific visualization:
an end-to-end approach to parallel supercomputing,
T. Tu, et al.,
in Proc. of Supercomputing (SC06) (IEEE/ACM, 2006)

ParaViz: a spatially decomposed parallel visualization algorithm
using hierarchical visibility ordering,
C. Zhang, et al.,
Int'l J. Computat. Sci. 1, 407 (2007)

Next-generation visualization technologies: enabling discoveries at extreme scale,
K.-L. Ma, et al.,
SciDAC Review 12, 12 (2009)

Scalable computation of streamlines on very large datasets,
D. Pugmire, et al.,
in Proc. of Supercomputing (SC09) (ACM/IEEE, 2009)

Terascale data organization for discovering multivariate climatic trends,
W. Kendall, et al.,
in Proc. of Supercomputing (SC09) (ACM/IEEE, 2009)

Real-time ray tracing with CUDA,
M. Shih, et al.,
in Proc. of Int'l Conf. on Algorithms and Architectures
for Parallel Processing (ICA3PP '09) (2009)

Multi-GPU volume rendering using MapReduce,
J. A. Stuart, et al.,
in Proc. of Int'l Workshop on MapReduce and its Applications (MAPREDUCE 2010),
Int'l Symp. on High Performance Distributed Comput. (HPDC'2010) (2010)

Parallel I/O, analysis, and visualization of a trillion particle simulation,
S. Byna, et al.,
in Proc. of Supercomputing (SC12) (IEEE/ACM, 2012)

METAGUI. a VMD interface for analyzing metadynamics and molecular dynamics simulations,
X. Biarnes, et al.,
Comput. Phys. Commun. 183, 203 (2012)

Massively parallel inverse rendering using multiobjective particle swarm optimization,
K. Nagano, et al.,
J. Vis. 20, 195 (2017)
 Virtual reality and 3D display

A head-mounted three dimensional display,
I. E. Sutherland,
in Proc. of AFIPS, p. 757 (ACM, 1968)

Surround-screen projection-based virtual reality: the design and implementation of the CAVE,
C. Cruz-Neira, et al.,
in Proc. of SIGGRAPH, p. 135 (ACM, 1993)

Rendering for an interactive 360° light field display,
A. Jones, et al.,
in Proc. of SIGGRAPH (ACM, 2007)

An autostereoscopic projector array optimized for 3d facial display,
K. Nagano, et al.,
in Proc. of SIGGRAPH (ACM, 2013)

Three-dimensional volume containing multiple two-dimensional information patterns,
H. Nakayama, et al.,
Sci. Rep. 3, 1931 (2013)

In-line digital holographic microscopy using a consumer scanner,
T. Shimobaba, et al.,
Sci. Rep. 3, 2664 (2013)

iBET: immersive visualization of biological electron-transfer dynamics,
C. M. Nakano, et al.,
J. Mol. Graph. Model. 65, 94 (2016)

Game-Engine-Assisted Research platform for Scientific computing (GEARS) in virtual reality,
B. Horton, et al.,
SoftwareX 9, 112 (2019)
 Scientific machine learning and big data analytics

Graph-based data mining,
D. J. Cook and L. B. Holder,
IEEE Intelligent Systems 15(2), 32 (2000)

Mining scientific data,
N. Ramakrishnan and A. Grama,
Adv. Comput. 55, 119 (2001)

State of the art of graph-based data mining,
T. Washio and H. Motoda,
ACM SIGKDD Explorations 5(1), 59 (2003)

Change detection in time series data using wavelet footprints,
M. Sharifzadeh, F. Azmoodeh, and C. Shahabi,
Lecture Notes in Computer Science, 3633, 127 (2005)

Towards the computational design of solid catalysts,
J. K. Norskov, et al.,
Nature Chem. 1, 37 (2009)

Dynamic structure learning of factor graphs and parameter estimation
of a constrained nonlinear predictive model for oilfield optimization,
H. Lee, et al.,
in Proc. of Int'l Conf. on Artificial Intelligence (ICAI'10) (Las Vegas, NV, 2010)

DNA sequencing via quantum mechanics and machine learning,
H. Yuen, et al.,
Int'l J. Comput. Sci. 4, 352 (2010)

Nonlinear dimensionality reduction in molecular simulation:
the diffusion map approach,
A. Ferguson, et al.,
Chem. Phys. Lett. 509, 1 (2011)

The Harvard clean energy project: large-scale computational screening and design
of organic photovoltaics on the World Community Grid,
J. Hachmann, et al.,
J. Phys. Chem. Lett. 2, 2241 (2011)

Nonlinear dimensionality reduction for nonadiabatic dynamics: the influence of
conical intersection topography on population transfer rates,
A. M. Virshup, et al.,
J. Chem. Phys. 137, 22A519 (2012)

Using sketch-map coordinates to analyze and bias molecular dynamics simulations,
G. A. Tribello, et al.,
Proc. Nat. Acad. Sci. 109, 5196 (2012)

Python Materials Genomics (pymatgen): a robust, open-source Python library
for materials analysis,
S. P. Ong, et al.,
Comput. Mater. Sci. 68, 314 (2013);
Opportunities and challenges for first-principles materials design and
applications to Li battery materials,
G. Ceder,
MRS Bulletin 35, 693 (2010)

Stochastic voyages into uncharted chemical space produce a representative library
of all possible drug-like compounds,
A. M. Virshup, et al.,
J. Am. Chem. Soc. 135, 7296 (2013)

Amplify scientific discovery with artificial intelligence,
Y. Gil, et al.,
Science 346, 171 (2014)

General multiobjective force field optimization framework,
with application to reactive force fields for silicon carbide,
A. Jaramillo-Botero, et al.,
J. Chem. Theory Comput. 10, 1426 (2014)

Human-level concept learning through probabilistic program induction,
B. M. Lake, et al.,
Science 350, 1332 (2015)

Constructing high-dimensional neural network potentials,
J. Behler,
Int. J. Quant. Chem. 115, 1032 (2015)

Identifying structural flow defects in disordered solids using machine-learning methods,
E. D. Cubuk, et al.,
Phys. Rev. Lett. 114, 108001 (2015)

Maximally informative hierarchical representations of high-dimensional data,
G. Ver Steeg & A. Galstyan,
in Proc. of Artificial Intelligence and Statistics (AISTATS) (2015)

New opportunities for materials informatics:
resources and data mining techniques for uncovering hidden relationships,
A. Jain, et al.,
J. Mater. Res. 31, 977 (2016)

ZNN - a fast and scalable algorithm for training 3D convolutional networks
on multi-core and many-core shared memory machines,
A. Zlateski, et al.,
in Proc. of Int'l Parallel & Distributed Processing Symp. (IPDPS 2016) (IEEE, 2016);
ZNNi: maximizing the inference throughput of 3D convolutional networks on CPUs and GPUs,
A. Zlateski, et al.,
in Proc. of Supercomputing (SC16) (ACM/IEEE, 2016)

Information leverage in interconnected ecosystems: overcoming the curse of dimensionality,
H. Ye and G. Sugihara,
Science 353, 922 (2016)

Probing the metabolic heterogeneity of live Euglena gracilis with stimulated Raman scattering microscopy,
Y. Wakisaka, et al.,
Nature Microbiol. 1, 16124 (2016); see also serendipiter

ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost,
J. S. Smith, et al.,
arXiv, 1610.08935 (2016)

Deep learning at 15PF,
T. Kurth, et al.,
in Proc. of Supercomputing (SC17) (ACM/IEEE, 2017)

Exascale deep learning for climate analytics,
T. Kurth, et al.,
in Proc. of Supercomputing (SC18) (IEEE/ACM, 2018)

Multiobjective genetic training and uncertainty quantification of reactive force fields,
A. Mishra, et al.,
npj Comput. Mater. 4, 42 (2018)

Active learning for accelerated design of layered materials,
L. Bassman, et al.,
npj Comput. Mater. 4, 74 (2018)

Structural phase transitions in a MoWSe2 monolayer: molecular dynamics simulations and variational autoencoder analysis,
P. Rajak, et al.,
Phys. Rev. B 100, 014108 (2019)

A 20-Year Community Roadmap for Artificial Intelligence Research in the US,
Y. Gil & B. Selman (CCC/AAAI, 2019)
 Graph data

Analytic and algorithmic solution of random satisfiability problems,
M. Mezard, et al.,
Science 297, 812 (2002)

A scalable distributed parallel breadth-first search algorithm on BlueGene/L,
A. Yoo, et al.,
in Proc. of Supercomputing (SC05) (ACM/IEEE, 2005)

Collision-free spatial hash functions for structural analysis of billion-vertex
chemical bond networks,
C. Zhang, et al.,
Comput. Phys. Commun. 175, 339 (2006)

Finding long cycles in graphs,
E. Marinari, et al.,
Phys. Rev. E 75, 066708 (2007)

Hypergraphs and cellular networks,
S. Klamt, et al.,
PLoS Comput. Biol. 5(5), e1000385 (2009)

The emerging field of signal processing on graphs,
D. I. Shuman, et al.,
IEEE Signal Proc. Mag. 2013(5), 84 (2013)

Graph curvature for differentiating cancer networks,
R. Sandhu, et al.,
Sci. Rep. 5, 12323 (2015)

Synthesized classifiers for zero-shot learning,
S. Changpinyo, et al.,
in Proc. of Computer Vision & Pattern Recognition (CVPR) (IEEE, 2016)

Neural message passing for quantum chemistry,
J. Gilmer, et al.,
in Proc. Int'l Conf. on Machine Learning, ICML (2017)

Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties,
T. Xie and J. C. Grossman,
Phys. Rev. Lett. 120, 145301 (2018)

Graph neural network analysis of layered material phases,
K. Liu, et al.,
in Proc. SpringSimHPC (SCS, 2019)

Decoding molecular graph embeddings with reinforcement learning,
S. Kearnes, et al.,
in Proc. Int'l Conf. on Machine Learning, ICML (2019)
Stochastic simulation methods
 Monte Carlo (MC) simulation basics

Theoretical foundations of dynamical Monte Carlo simulations,
K. A. Fichthorn and W. H. Weinberg,
J. Chem. Phys. 95, 1090 (1991)

Suppressing roughness of virtual times in parallel discrete-event simulations,
G. Korniss, et al.,
Science 299, 677 (2003)

Optimal allocation of replicas to processors in parallel tempering simulations,
D. J. Earl and M. W. Deem,
J. Phys. Chem. B 108, 6844 (2004)

Parallel tempering: theory, applications, and new perspectives,
D. J. Earl and M. W. Deem,
Phys. Chem. Chem. Phys. 7, 3910 (2005)

A decentralized parallel implementation for parallel tempering algorithm,
Y. Li, et al.,
Parallel Comput. 35, 269 (2009)

Towards optimal scaling of Metropolis-coupled Markov chain Monte Carlo,
Y. F. Atchadé, et al.,
Stat. Comput. 20 (2010)

Billion-atom synchronous parallel kinetic Monte Carlo simulations of
critical 3D Ising systems,
E. Martinez, et al.,
J. Comput. Phys. 230, 1359 (2011)

An Introduction to Monte Carlo Simulations of Surface Reactions,
A. P. J. Jansen (Springer, 2012)

A derivation and scalable implementation of the synchronous parallel kinetic Monte Carlo method for simulating long-time dynamics,
H. S. Byun, et al.,
Comput. Phys. Commun. 219, 246 (2017)
 Long time dynamics and global optimization

The topology of multidimensional potential energy surfaces: theory and
application to peptide structure and kinetics,
O. M. Becker and M. Karplus,
J. Chem. Phys. 106, 1495 (1997)

Sampling activated mechanisms in proteins with the activation-relaxation technique,
N. Mousseau, et al.,
J. Mol. Graph. Model. 19, 78 (2001);
source codes

Self-learning kinetic Monte Carlo method: application to Cu(111),
O. Trushin, et al.,
Phys. Rev. B 72, 115401 (2005)

Automated model reduction for complex systems exhibiting metastability,
I. Horenko, et al.,
Multiscale Model. Sim. 5, 802 (2006)

Pathfinder: a parallel search algorithm for concerted atomistic events,
A. Nakano,
Comput. Phys. Commun. 176, 292 (2007)

Protein folding by zipping and assembly,
S. B. Ozkan, et al., Proc. Nat'l Academy Sci. 104, 11987 (2007)

Computational linguistics: a new tool for exploring biopolymer structures
and statistical mechanics,
K. A. Dill, et al., Polymer 48, 4289 (2007)

A space-time-ensemble parallel nudged elastic band algorithm
for molecular kinetics simulation,
A. Nakano,
Comput. Phys. Commun. 178, 280 (2008)

Accelerated molecular dynamics methods: introduction and recent developments,
D. Perez, et al.,
Annu. Rep. Comput. Chem. 5, 79 (2009);
Extending the time scale in atomistic simulation of materials,
A. F. Voter, F. Montalenti, and T. C. Germann,
Annu. Rev. Mater. Res. 32, 321 (2002)

Tracing conformational changes in proteins,
N. Haspel, et al., BMC Struct. Biol. 10, S1 (2010)

Enhanced modeling via network theory: adaptive sampling of Markov state models,
G. R. Bowman, et al., J. Chem. Theory Comput. 6, 787 (2010)

Optimizing transition states via kernel-based machine learning,
Z. D. Pozun, et al., J. Chem. Phys. 136, 174101 (2012)

Reinforced dynamics for enhanced sampling in large atomic and molecular systems,
L. Zhang, et al., J. Chem. Phys. 148, 124113 (2018)

Boltzmann generators - sampling equilibrium states of many-body systems with deep learning,
F. Noe and H. Wu, arXiv, 1812.01729v1 (2018)
 Stochastic continuum simulations

Stochastic finite elements with multiple random non-Gaussian properties,
R. Ghanem,
J. Eng. Mech. 125, 26 (1999)

Parallel computing for option pricing based on the backward stochastic differential equation,
Y. Peng, et al.,
in Proc. of High Performance Computing and Applications (HPCA 2009) (2009)

Stochastic time-dependent DFT with optimally tuned range-separated hybrids: application to excitonic effects in large phosphorene sheets,
V. Vlcek, et al.,
J. Chem. Phys. 150, 184118 (2019)
Distributed scientific computing
 Grid-enabling scientific applications

Screen savers of the world unite,
M. Shirts and V. S. Pande,
Science 290, 1903 (2000)

Supporting efficient execution in heterogeneous distributed computing environments
with Cactus and Globus,
G. Allen, et al.,
in Proc. of Supercomputing (SC01) (ACM/IEEE, 2001)

Collaborative simulation Grid: multiscale quantum-mechanical/classical atomistic
simulations on distributed PC clusters in the US and Japan,
H. Kikuchi, et al.,
in Proc. of Supercomputing (SC02) (IEEE/ACM, 2002)

A case for Grid computing on virtual machines,
R. J. Figueiredo, P. A. Dinda, and J. A. B. Fortes,
in Proc. of Int'l Conf. on Distributed Computing Systems
(IEEE CS Press, 2003)

BOINC: A system for public-resource computing and storage,
D. P. Anderson,
in Proc. of 5th Int'l Workshop on Grid Computing (IEEE/ACM, 2004)

Sustainable adaptive Grid supercomputing: multiscale simulation of semiconductor
processing across the Pacific,
H. Takemiya, et al.,
in Proc. of Supercomputing (SC06) (IEEE/ACM, 2006)

Remote runtime steering of integrated terascale simulation and visualization,
H. Yu, et al.,
in Proc. of Supercomputing (SC06) (IEEE/ACM, 2006)

SCEC CyberShake workflows - automating probabilistic seismic hazard analysis
calculations,
P. Maechling, et al.,
in Workflows for e-Science, eds. D. Gannon, et al. (Springer, 2007)

Grid enablement and sustainable simulation of multiscale physics applications,
Y. Song, et al.,
in Proc. of CCGrid (2009)

Predicting protein structures with a multiplayer online game,
S. Cooper, et al.,
Nature 466, 756 (2010)

De novo protein design by citizen scientists,
B. Koepnick, et al.,
Nature 570, 390 (2019)
 Cloud computing

MapReduce: simplified data processing on large clusters,
J. Dean and S. Ghemawat,
Comm. ACM 51(1), 107 (2008)

Above the clouds: a Berkeley view of cloud computing,
M. Armbrust, et al.,
Tech. Report, UC Berkeley (2009)

High-performance cloud computing: a view of scientific applications,
C. Vecchiola, et al.,
in Proc. of PSAN (2010)

Distributed GraphLab: a framework for machine learning and data mining in the cloud,
Y. Low, et al.,
Proc. VLDB 5, 716 (2012)