Breakout Discussion Groups

 

Group 1: Large System Requirements for Large Scale Analytics

Moderator: Michael Schulman, Director of Marketing, ScaleMP

  • What are some of the compute and memory requirements for analyzing very large amounts of data?
  • How do you determine how much memory or compute horsepower is needed to get your job done?
  • When using a cloud for data analysis, what happens if the provider can't supply the needed resources?
  • Can the data and analysis be distributed in all cases? When does this work, and when doesn't it?
  • Do you modify what you want to accomplish based on available resources?

 

Group 2: Automated Workflow of Bioinformatics Tools for Whole-Genome Analysis

Moderator: Clinton Cario, System Programmer, University of Pittsburgh

  • How do you parse data files between different formats to streamline workflows?
  • Which frameworks/languages do you develop in and why?
  • When is it more appropriate to use flat files vs. databases for storage (if ever)?
  • What services are you looking for when transitioning to a cloud-based solution?
  • What is the greatest technological challenge or hurdle standing in the way of future workflow design/development?

 

Group 3: Meeting the Challenge of Moving Big Data from LAN to WAN

Moderator: Michael Sullivan, M.D., Associate Director, Health Sciences, Internet2

Moving Big Data between organizations and across long distances is challenging and raises several questions:

  • How do you identify and fix performance bottlenecks like perimeter firewalls? Is the Science DMZ architecture developed by the DOE the answer?
  • Do you have special security concerns with Big Data and how do you address them?
  • What has been your experience with specialized data transport software like Aspera or Globus Online?

Group 4: Integrating Storage Systems for Data-Intensive Life Sciences Research

Moderator: Ron Hawkins, Director, Industry Relations, San Diego Supercomputer Center, University of California, San Diego

  • How do your research workflows and processes affect storage system choices? (What technology characteristics are key at each stage in your workflow?)
  • Do you envision integrating remote storage clouds and on-premises storage systems?
  • What are the considerations for data-access interfaces (file system, object-based, etc.)?
  • At what stage(s) are data redundancy and backup critical?
  • Is energy efficiency an important consideration?

Group 5: How Would You Make Use of an Exascale Computer if You Had Access to One?

Moderator: Ming Guo, Conference Director, Cambridge Healthtech Institute

 

Group 6: Kicking the Can Down the Road: Current Innovative Methodologies for Storage

Moderator: D. Akira Robinson, Ph.D., Consulting Computer Scientist, Neuro-Epigenomics.com

 
