This blog will be dedicated to examining and promoting civic data in Chicago, Cook County and Illinois.
WBEZ is partnering with the Smart Chicago Collaborative to promote civic data. This blog will be part of that collaboration.
We'll post original data sets @OpenSocrata
(A visualization by the Argonne Leadership Computing Facility utilizing real patient MRI data displays blood flow of a brain aneurysm.)
Joseph Insley, principal software development specialist at Argonne National Laboratory, gave a presentation about how the lab is using its supercomputer Mira to create visualizations that crunch data at galactic scale and monstrous computation speeds.
Insley works with the Argonne Leadership Computing Facility (ALCF), which allows outside groups to apply for time to use the facilities. (It’s really competitive!)
The ALCF uses Mira, an IBM Blue Gene/Q supercomputer said to be the fourth-fastest computer in the world, with 768 terabytes of memory and a peak speed of 10 petaFLOPS. Insley said this means it can handle 10 quadrillion floating-point operations per second.
It’s that processing power that allows the ALCF to process and store massive amounts of data.
Processing the data is not enough; displaying it can require a bit of horsepower as well. Insley said that’s where “Tukey” comes into the picture. Tukey is the compute cluster that processes the images for visualizations.
To provide some context, a typical MacBook Pro utilizes a single Nvidia GeForce GT 650M graphics chip. Tukey contains 96 dual-Opteron AMD compute nodes, with each node containing 16 CPU cores, 64 gigs of RAM and 2 NVIDIA Tesla M2070 GPUs with 6GB of memory each.
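To put those per-node numbers in perspective, here is a quick back-of-the-envelope tally of what they add up to across the whole cluster. It is only a sketch that takes the figures quoted above at face value; the variable and function names are ours, not Argonne's.

```python
# Per-node figures for Tukey as quoted in the post.
NODES = 96
CPUS_PER_NODE = 16
RAM_GB_PER_NODE = 64
GPUS_PER_NODE = 2
GPU_MEMORY_GB = 6


def tukey_totals():
    """Multiply the per-node figures out to cluster-wide totals."""
    gpus = NODES * GPUS_PER_NODE
    return {
        "cpus": NODES * CPUS_PER_NODE,
        "ram_gb": NODES * RAM_GB_PER_NODE,
        "gpus": gpus,
        "gpu_memory_gb": gpus * GPU_MEMORY_GB,
    }


totals = tukey_totals()
for name, value in totals.items():
    print(f"{name}: {value:,}")
```

By this tally the cluster has 192 GPUs against the laptop's one, along with roughly 6 terabytes of system RAM.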
It’s that visualization tonnage paired with Mira’s computing power that allows ALCF to tackle projects in climate, nuclear energy, medicine and astrophysics.
One of the projects Insley has worked on is the visualization of arterial blood flow, to accurately model physical and biological systems.
Insley said this requires “simulating on multiple scales, [which] results in very large, complex data sets.”
"Problems like this and many others that run on our biggest machines, visualization is often the best and sometimes the only way to really make sense of all this data."
In the animation example below, they used real patient MRI data to study an aneurysm in the brain, analyzing large-scale blood flow.
The visualization is essentially figuring out how a person’s blood flow on a large scale affects one tiny region of the brain at the particle level.
Another example that Insley worked on visualizing was a cosmology simulation, which sought to accomplish a little task like “the evolution of the universe.”
According to Insley, the job is so large that it is an ongoing simulation that occasionally pauses to collect data before resuming.
Insley said the code that powers it tracks “the behavior and interactions of individual particles, actually 1.1 trillion of them.”
He said it’s the largest simulation of its kind to date.
Insley said there’s certainly a limit to the amount of data they can process. Writing the data to disk becomes more difficult as the data grows.
He said the ALCF actively looks at ways to do analysis and visualization before the data gets written to disk.
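A rough estimate shows why writing everything to disk gets hard. The sketch below is ours, not Argonne's: the only figure taken from the talk is the 1.1 trillion particle count. The bytes stored per particle and the 1024-cell density grid standing in for an "analyze first" result are assumptions chosen purely for illustration.

```python
PARTICLES = 1_100_000_000_000  # 1.1 trillion, as quoted by Insley

# Assumptions for illustration only (not from the talk):
VALUES_PER_PARTICLE = 6   # position (x, y, z) plus velocity (vx, vy, vz)
BYTES_PER_VALUE = 4       # single-precision floats
GRID_CELLS_PER_SIDE = 1024  # hypothetical density grid kept instead of raw particles


def snapshot_bytes(particles, values_per_particle, bytes_per_value):
    """Size of one raw snapshot if every particle is written out."""
    return particles * values_per_particle * bytes_per_value


def grid_bytes(cells_per_side, bytes_per_value):
    """Size of a cubic grid holding one value per cell."""
    return cells_per_side ** 3 * bytes_per_value


raw = snapshot_bytes(PARTICLES, VALUES_PER_PARTICLE, BYTES_PER_VALUE)
reduced = grid_bytes(GRID_CELLS_PER_SIDE, BYTES_PER_VALUE)

print(f"raw snapshot: {raw / 1e12:.1f} TB")
print(f"reduced grid: {reduced / 1e9:.2f} GB")
```

Under these assumptions a single raw snapshot would run to tens of terabytes, while the reduced grid would be a few gigabytes. That gap is the appeal of analyzing the data before it reaches the disk.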
When asked about the importance of visualization for scientists, Insley said it can be used for validation. “They expect things to be one way, and the visualization can reaffirm that.”
He said at some point in the future you could take the scans and immediately run a simulation of what’s happening in a patient.
For now, the tricorders and starship scanners may be on hold.