High Performance Computing


September 11-12 


Faster, more efficient computing grows ever more important as we confront the “big data” problem. Generations of supercomputing experts have used innovative designs and parallelism to achieve superior peak computational performance. This track features speakers ranging from the creator of the Condor system to the hardware designer behind the architecture of the world’s fastest computer, the K computer, along with presentations on the smart algorithms and software that enable data-intensive computing in the life sciences.


Tuesday, September 11

7:30 am Registration and Morning Coffee


Faster, Greener Supercomputers

8:15 Chairperson’s Opening Remarks

Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World


» Keynote Presentation 

8:30 High-Throughput Computing with Cloud Resources

Miron Livny, Ph.D., Professor, Computer Sciences Department, University of Wisconsin

Since the mid-1980s, the Condor project has supported the high-throughput computing (HTC) needs of scientific and commercial applications. Universities, research laboratories, and enterprises have adopted the Condor distributed resource management system to harness all available capacity from large, dynamic, and heterogeneous collections of computing resources. The computing capabilities of clouds are a natural fit for the HTC model and are therefore used by a growing number of our users to increase their computational throughput.
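
As a flavor of how work is handed to Condor, the sketch below queues a batch of independent jobs, assuming the HTCondor Python bindings are installed; the executable, input files, and resource requests are illustrative assumptions, not details from the talk.

    import htcondor  # HTCondor Python bindings

    # Describe one job template; $(Process) expands to 0, 1, ..., count-1.
    sub = htcondor.Submit({
        "executable": "/usr/bin/blastp",                    # hypothetical application
        "arguments": "-query seq_$(Process).fasta -db nr",  # hypothetical inputs
        "output": "blast_$(Process).out",
        "error": "blast_$(Process).err",
        "log": "blast.log",
        "request_cpus": "1",
        "request_memory": "2GB",
    })

    # Queue 100 independent jobs with the local schedd; Condor then matches
    # them to whatever capacity, local or cloud-provisioned, is available.
    schedd = htcondor.Schedd()
    with schedd.transaction() as txn:
        sub.queue(txn, count=100)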

9:00 K Computer and Its Application to Life Sciences

Makoto Taiji, Ph.D., Director, Computational Biology Core, RIKEN Quantitative Biology Center; Team Leader, Processor Research Team, RIKEN Advanced Institute for Computational Sciences

The K computer is the fastest high-performance computer in the world, with a nominal peak performance of 10 PFLOPS. We are developing software for computational life science, including molecular simulation, drug development, cellular simulation, brain simulation, and bioinformatics. Application plans in the life science field will be discussed.

9:30 Gordon - A Flash-Based Supercomputer for Data-Intensive Computing

Robert Sinkovits, Ph.D., Gordon Applications Lead, San Diego Supercomputer Center, University of California San Diego

The Gordon system at the San Diego Supercomputer Center was designed from the ground up to solve data- and memory-intensive problems. Each of Gordon’s 1,024 compute nodes contains two eight-core Intel Sandy Bridge processors and 64 GB of memory. The nodes are connected via a dual-rail 3D torus network based on Mellanox QDR InfiniBand hardware and can access a 4 PB Lustre-based parallel file system capable of delivering up to 100 GB/s of sequential bandwidth. Two novel features, though, make Gordon particularly well suited for data-intensive problems. To bridge the large latency gap between remote memory and spinning disk, Gordon contains 300 TB of high-performance Intel 710 series solid-state storage. Gordon also deploys a number of “supernodes,” based on ScaleMP’s vSMP Foundation software, which can provide users with up to 2 TB of virtual shared memory. This talk will cover the Gordon architecture, our motivation for building the system, and a summary of recent success stories on Gordon spanning a number of domains.

10:00 Coffee Break in the Exhibit Hall with Poster Viewing

10:30 An Introduction to Massively Parallel Computing Applications on the TH-1A System

Nan Li, Ph.D., Professor, Vice Dean, School of Computing, National University of Defense Technology, China

TH-1A, China’s first petaflops supercomputer, is installed at the National Supercomputing Center in Tianjin and was ranked No. 1 on the TOP500 list released in November 2010. The system adopts a hybrid architecture that integrates heterogeneous CPU+GPU computing with an independently developed high-speed interconnect, and it has demonstrated strong usability, performance stability, and application scalability across quite a few high-performance computing areas, providing an important platform for scientific research and technological innovation. This presentation introduces several large-scale application tests on TH-1A, including oil seismic data processing, aircraft flow field simulation, biomolecular dynamics simulation, magnetic confinement fusion numerical simulation, turbulent flow simulation, crystal silicon molecular dynamics simulation, fully implicit simulation of the global atmospheric shallow-water equations, and heat flow simulation of the Earth’s outer core. These results show that TH-1A achieves good parallel efficiency and scalability in practical applications.

11:00 Panel Discussion: Applications of Supercomputers in Life Sciences

Moderator: Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World

Panelists:

Miron Livny, Ph.D., Professor, Computer Sciences Department, University of Wisconsin

Makoto Taiji, Ph.D., Director, Computational Biology Core, RIKEN Quantitative Biology Center; Team Leader, Processor Research Team, RIKEN Advanced Institute for Computational Sciences

Nan Li, Ph.D., Professor, Vice Dean, School of Computing, National University of Defense Technology, China

Robert Sinkovits, Ph.D., Gordon Applications Lead, San Diego Supercomputer Center, University of California San Diego

12:00 pm Close of Session

12:15 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own


Scale Up High-Throughput Computing 

2:00 Chairperson’s Remarks

D. Akira Robinson, Ph.D., Consulting Computer Scientist, Neuro-Epigenomics.Com

2:05 RDMA: A Concept for Low-Latency, High-Throughput Data Movement

Robert D. Russell, Ph.D., Associate Professor, University of New Hampshire InterOperability Laboratory

Remote Direct Memory Access (RDMA) transfers data without operating system intervention, directly between the virtual memories of processes on different network nodes. RDMA protocols avoid the extra data copying of traditional TCP/IP and UDP/IP, resulting in low latency for short messages, high bandwidth for large messages, and low CPU utilization for both. This talk gives a brief introduction to the three current RDMA technologies (InfiniBand, RoCE, iWARP), presents some performance measurements, and discusses examples of how and where RDMA is being used in HPC today.
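
Verbs-level RDMA code is too long for a short sketch, but the TCP/IP side of the latency comparison the talk refers to can be reproduced with a minimal echo probe; host, port, and iteration count below are illustrative assumptions.

    import socket, sys, time

    HOST, PORT, N, MSG = "127.0.0.1", 50007, 10000, b"x"

    if len(sys.argv) > 1 and sys.argv[1] == "server":
        srv = socket.socket()
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        data = conn.recv(1)
        while data:                  # echo each byte straight back
            conn.sendall(data)
            data = conn.recv(1)
    else:
        cli = socket.create_connection((HOST, PORT))
        cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # defeat Nagle batching
        t0 = time.time()
        for _ in range(N):
            cli.sendall(MSG)         # one-byte ping...
            cli.recv(1)              # ...wait for the echo
        print("mean TCP round trip: %.1f us" % ((time.time() - t0) / N * 1e6))

Every round trip above crosses the kernel and copies the data on both hosts; RDMA’s latency advantage comes precisely from eliminating those crossings and copies.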

2:35 Understanding Cancer: Trillions of Data Points, Petabytes of Storage

Gary Stiehr, Group Leader, Information Systems, The Genome Institute, Washington University

Recent advances in DNA sequencing technologies have dramatically changed the scale at which we can analyze and understand individuals at a genetic level, and that is making a huge impact on cancer research. To enable these discoveries, however, it has been essential to leverage high-performance computing technologies. These projects involve trillions of data points moving through sophisticated bioinformatics pipelines and require petabytes of high-performance storage and thousands of CPU cores. We will discuss the challenges faced in such an environment, along with a few approaches to handling them.

3:05 Refreshment Break in the Exhibit Hall with Poster Viewing

3:45 Supporting Large-Scale Data Access and Analysis through Shared Cloud Resources

Weijia Xu, Ph.D., Center for Computational Biology & Bioinformatics, The University of Texas at Austin

Data-intensive computing tasks present different challenges and requirements than compute-intensive tasks. In this talk, I will introduce the high-performance computing resources at the Texas Advanced Computing Center and show how data-intensive computations can be supported by provisioning dynamic cloud environments on existing high-performance computing clusters, drawing on a couple of ongoing projects.

4:15 From an Algorithm to the Spreadsheet into the Cloud

Hans-Henning Gabriel, Data Scientist, Datameer

The Apache Hadoop project has become an important tool for data analytics. Utilizing the MapReduce paradigm, it enables scientists to parallelize their computations on a large cluster of inexpensive machines and scale on demand. This talk explains what the MapReduce paradigm is and how it can be applied directly to tackle a biological challenge. We will show a simple way to execute a computationally intensive application in the cloud on demand, through a generic spreadsheet approach that hides the complexity of parallelizing algorithms, allowing researchers to concentrate on their research rather than on the IT and analytics infrastructure traditionally associated with data analysis.
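
To make the paradigm concrete, here is a minimal MapReduce sketch in the Hadoop Streaming style, using k-mer counting as a stand-in biological task; the talk’s actual application, the value of K, and all file names are illustrative assumptions.

    #!/usr/bin/env python
    # mapper.py: emit (k-mer, 1) for every k-mer on each input sequence line
    import sys

    K = 8  # illustrative k-mer length

    for line in sys.stdin:
        seq = line.strip().upper()
        if not seq or seq.startswith(">"):  # skip blanks and FASTA headers
            continue
        for i in range(len(seq) - K + 1):
            print("%s\t1" % seq[i:i + K])

    #!/usr/bin/env python
    # reducer.py: sum the counts for each k-mer (Hadoop sorts by key first)
    import sys

    current, total = None, 0
    for line in sys.stdin:
        kmer, count = line.rstrip("\n").split("\t")
        if kmer != current:
            if current is not None:
                print("%s\t%d" % (current, total))
            current, total = kmer, 0
        total += int(count)
    if current is not None:
        print("%s\t%d" % (current, total))

Hadoop Streaming then fans these two scripts out across the cluster, e.g. hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py -input sequences/ -output kmer_counts/ (paths again illustrative).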

4:45 Rapid False-Discovery Rate Estimation for Bioinformatics Data

Mark Seligman, Principal Investigator, Insilicos LLC

The problem of multiple comparisons poses significant challenges to bioinformatics practitioners. Efforts to compensate can be so conservative as to severely constrain large studies. False-discovery rate estimation is one alternative, but the more powerful versions of this approach impose heavy computational costs. For the case of linear regression, we demonstrate significant acceleration of these methods using GPUs.
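
The GPU-accelerated method is the subject of the talk itself; for orientation, here is a minimal NumPy sketch of the classical Benjamini-Hochberg step-up procedure, shown only as the baseline that more powerful FDR methods refine (not the presenter’s code):

    import numpy as np

    def benjamini_hochberg(pvals, q=0.05):
        """Boolean mask of discoveries, controlling the FDR at level q."""
        p = np.asarray(pvals, dtype=float)
        m = p.size
        order = np.argsort(p)  # indices giving p_(1) <= ... <= p_(m)
        # Step-up rule: find the largest rank k with p_(k) <= (k / m) * q.
        passed = p[order] <= (np.arange(1, m + 1, dtype=float) / m) * q
        mask = np.zeros(m, dtype=bool)
        if passed.any():
            k = np.nonzero(passed)[0].max()  # zero-based index of that rank
            mask[order[:k + 1]] = True       # reject hypotheses up through rank k
        return mask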

5:15 Welcome Reception in the Exhibit Hall with Poster Viewing

6:15 Close of Day







Premier Sponsors

Annai Systems

Aspera

Cycle Computing

DNAnexus

IBM

Official Media Partner

Bio-IT World 


* IBM and the IBM logo are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide.