
Seeing is Believing

Extreme Digital visualization and data analysis resources help researchers derive insights from massive data sets

By Aaron Dubrow, Texas Advanced Computing Center

Isaac Asimov, the American science fiction and popular science writer, famously said, "The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka!' (I found it!) but 'That's funny.'"

In a world swimming in information, how does a scientist have such a revelation? How do they find a needle of insight in a growing digital haystack?

Scientific visualization is one important tool scientists use to make discoveries. The process of visualization converts data—from sensors, DNA sequencers, social networks, and massive high-performance computing simulations and models—into images that can be perceived by the eye and explored and interpreted by the human mind.
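
As a toy illustration of that data-to-image step (not drawn from the article or from any software it names), the sketch below maps a made-up 2D scalar field onto a color image with the matplotlib library; the field is a synthetic stand-in for real simulation output.

```python
# Toy illustration of visualization: turn a simulated 2D scalar field
# into an image a person can inspect. The field is synthetic stand-in data.
import numpy as np
import matplotlib.pyplot as plt

x, y = np.meshgrid(np.linspace(-3, 3, 400), np.linspace(-3, 3, 400))
field = np.exp(-(x**2 + y**2)) + 0.3 * np.sin(4 * x) * np.cos(4 * y)

plt.imshow(field, cmap="inferno", origin="lower", extent=(-3, 3, -3, 3))
plt.colorbar(label="field value (arbitrary units)")
plt.title("Simulated scalar field rendered as an image")
plt.savefig("field.png", dpi=150)
```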

This aspect of discovery has always been valuable, but as our ability to simulate subatomic particles, perform high-resolution 3D scans of the body, or map the universe improves, turning that data into useful information is increasingly critical.

In November 2008, the National Science Foundation (NSF) requested proposals for "TeraGrid Phase III: eXtreme Digital Resources for Science and Engineering (XD)." The grants funded the first of a new class of computing systems: two state-of-the-art resources at the Texas Advanced Computing Center (TACC) and the National Institute for Computational Sciences (NICS) that together increased the visualization and data analysis capabilities of the open science community significantly.

The NSF solicitation was motivated by an awareness that simulations on high-performance computing systems and data from new scientific instruments were producing copious amounts of information that could not be analyzed or visualized by any previous system. 

"We were seeing science at a completely different scale," said Kelly Gaither, principal investigator for the XD Vis award and director of visualization at TACC. "These systems address the data deluge that we saw coming down the pipe as a result of the bigger HPC systems."

TACC's Longhorn was deployed in January 2010 and has been supporting visualization, data analysis, and general computing for a year and a half. A Dell cluster with both NVIDIA GPUs and Intel quad-core CPUs on each node, Longhorn provides unprecedented capabilities, foremost among them the ability to remotely visualize massive data sets in real time.

This means a research group in Topeka, Kansas, can compute and visualize their dataset on the Longhorn system in Austin, Texas, from the quiet of their offices. The researchers can move, spin, zoom, and, in some cases, animate the subject with the touch of a button.
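
Tools such as ParaView support this kind of client-server remote rendering; the article does not say which software Longhorn users run, so the following is only a hedged sketch that assumes a ParaView server (pvserver) is already listening on a remote visualization node, with a placeholder host name, port, and data path.

```python
# Hypothetical remote-visualization sketch using ParaView's Python interface.
# Assumes a pvserver is already running on the remote node; the host name,
# port, and file path below are placeholders, not real endpoints.
from paraview.simple import Connect, OpenDataFile, Show, ResetCamera, Render

Connect("vis-node.example.org", 11111)                  # attach to the remote render server
data = OpenDataFile("/scratch/project/simulation.vtu")  # dataset stays on the remote system
Show(data)                                              # build a displayable representation
ResetCamera()                                           # frame the dataset in the view
Render()                                                # pixels are rendered on the remote GPUs
```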

Gaither thinks this new capability—a hands-on approach to virtual experiments—improves scientists' relationship to their data and has the potential to transform research.

"Oftentimes, researchers don't know what they're looking for. They use visualization to do debugging or to do exploratory analysis of their simulation data. In those cases, visualization is really the only way to see," Gaither said.  "It's generally recognized in the vis community that interactivity is a crucial component of being able to do that analysis."

Longhorn is the largest hardware-accelerated interactive visualization cluster in the world and has supported these real-time interactions for users as remote as Saudi Arabia. Longhorn is also able to manage extremely large data sets, including the highly detailed visualizations created to study the instabilities in a burning helium flame.

Nautilus, an SGI Altix UV1000 system, is likewise a large computing system designed for remote visualization and analysis. It, too, has substantial computational and GPU capacity, but its architecture differs significantly from Longhorn's. Nautilus is a symmetric multiprocessor (SMP) machine, one in which all of the available memory is shared by all of the processors. Scientists see 1,024 CPU cores and 4 TB of memory as a single system. The system also contains eight GPUs for general-purpose processing and hardware-accelerated graphics.

"Graph and societal network analysis. Correlation and document clustering. There are all sorts of analyses that are not amenable to a cluster type of architecture," explained Sean Ahern, director of the Center for Remote Data Analysis and Visualization (RDAV) at NICS (the center that operates Nautilus), and visualization task leader at Oak Ridge National Laboratory. "We've been able to accelerate the science that researchers are already doing, taking it from weeks to hours, and we have other projects where the size of the memory means researchers can pull in entire datasets where they were never able to do so before."

Rather than the pure visualization systems that dominated in the past, these machines were built to be multipurpose, supporting interactive and batch visualization, GPGPU (general-purpose GPU-based) computing, traditional HPC computing, and new kinds of data analysis.

This composite nature allows the systems to provide improved visualization resources to the academic community, while remaining fully used to maximize the public investment.
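
As a small illustration of the GPGPU-style computing mentioned above, here is a minimal, hedged sketch of offloading bulk numerical work to a GPU; the article does not name the GPU programming stack used on these systems, so the CuPy library appears here purely as an example.

```python
# Illustrative GPGPU sketch using CuPy (an assumption; the article does not
# specify which GPU software stack users of Longhorn or Nautilus ran).
import cupy as cp

x = cp.random.random(10_000_000)   # array allocated in GPU memory
y = cp.sqrt(x) * 2.0 + 1.0         # elementwise math executes on the GPU
total = float(cp.sum(y))           # reduction on the GPU; scalar copied back to the host
print(total)
```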

Like all resources in the XSEDE infrastructure, Longhorn and Nautilus run 24 hours a day, 7 days a week, 365 days a year, and are supported by expert staff at the host centers. The resources are available to U.S. researchers through an XSEDE allocation from the National Science Foundation.

Over the course of the past year and a half, 1,560 scientists have used Longhorn and Nautilus, applying the systems' unique speed and capabilities to wide-ranging science problems while also exploring what role GPU processing can play in science generally.

The results emerging from the systems are encouraging.

Some of the notable successes on Longhorn are a collaboration with the National Archives and Record Administration to develop a new visualization framework for digital archivists; visualizations of the Gulf oil spill that helped the National Oceanic and Atmospheric Administration and the Coast Guard locate and contain oil slicks; record-setting molecular dynamics simulations of surfactants, which are used in detergents, manufacturing, and nanotechnology; and visualizations of the earthquake in Japan.

"With our analysis code, I get as much as 16,000 times speedup on Longhorn, which has given much insight into the physics of the protein-water interface, and allows us to understand at a more fundamental level how nature designs proteins to catalyze reactions under non-extreme conditions," said David LeBard, a postdoctoral fellow in the Institute for Computational Molecular Science at Temple University.

Simulations by LeBard and his collaborator Dmitry V. Matyushov appeared in the Journal of Physical Chemistry B and were featured on the cover of Physical Chemistry Chemical Physics in December 2010.

Nautilus has seen similar successes. Researchers on the system have performed unprecedented species modeling in the Great Smoky Mountains National Park, a biodiversity hot spot; gained new insights into the role turbulence plays in fusion; and explored how human society has evolved over the last half-century using historical sources.

"Nautilus has been a critical enabling resource for the GlobalNet project in several ways," said Kalev Leetaru, senior research scientist for content analysis at the Illinois Institute for Computing in Humanities, Art and Social Science (I-CHASS). "Most visibly, the ability to instantly leverage terabytes of memory in a single system image has allowed the project for the first time to move beyond small 1 to 5 percent samples to explore the dataset as a whole, leading to numerous fundamental new discoveries simply not possible without the ability to analyze the entire dataset at once."

Together, the two systems have supported 759 projects, totaling 11.4 million computing hours (the equivalent of 1,250 years on a single desktop system) in the last year and a half.

Visualization and data analysis are clearly moving into the mainstream, and with the Extreme Digital visualization grants, the NSF has given a big boost to the national science community. Gaither and Ahern believe this could be the beginning of a new paradigm.

"Seeing the visualization and interacting with the data is probably one of the great enablers that will propel science for the next generation and beyond," Gaither said. "I think in some respects, you won't even see this intermediate thing called a ‘dataset'. You will interact with the simulations itself, or, if you'd prefer, with the science."

Ahern went further.

"Data without analysis is nothing," Ahern said. "If you've run a giant simulation, you've only done half the work. The real science comes from processing that data into something that people can understand. The job of science is done in the phase of analysis, and that's purely where we live."