Rice-led team shows it can improve quality of supercomputing answers by 1,000 times
Computer scientists from Rice University, Argonne National Laboratory and the University of Illinois at Urbana-Champaign have used one of Isaac Newton’s numerical methods to demonstrate how “inexact computing” can dramatically improve the quality of simulations run on supercomputers.
The research is summarized in a paper on the preprint server arXiv and is part of an ongoing effort by scientists at Rice University’s Center for Computing at the Margins (RUCCAM) to dramatically improve the resolution of weather and climate models with new ultra-efficient approaches to supercomputing.
The research stems from an idea put forward in 2003 by RUCCAM Director Krishna Palem: Accuracy and energy are exchangeable in computation, and sacrificing minimal accuracy can yield tremendous energy savings.
“In many situations, having an answer that is accurate to seven or eight decimal places is of no greater value than having an answer that is accurate to three or four decimal places, and it is important to realize that there are very real costs, in terms of energy expended, to arrive at the more accurate answer,” Palem said. “The discipline of inexact computing centers on saving energy wherever possible by paying only for the accuracy that is required in a given situation.”
Palem, who won a Guggenheim Fellowship in 2015 to adapt these approaches to climate and weather modeling, collaborated with Oxford University physicist and climate scientist Tim Palmer to show that inexact computing could potentially reduce by a factor of three the amount of energy needed to run weather models without compromising the quality of the forecast.
In the new research, Palem worked with colleagues at Rice, with a team at Argonne National Laboratory headed by Sven Leyffer and Stefan Wild, and with Marc Snir of the University of Illinois at Urbana-Champaign (UIUC). Together they showed it is possible to leapfrog from one part of a computation to the next and reinvest the energy saved from inexact computations at each new leap to increase the quality of the final answer while retaining the same energy budget.
Palem likened the new approach to calculating answers in a relay of sprints rather than in a marathon.
“By cutting precision and handing off the saved energy, we achieve significant quality improvements,” said Palem, Rice’s Kenneth and Audrey Kennedy Professor of Computer Science. “This model allows us to change the way computational energy resources are utilized in supercomputers to dramatically improve solutions within a fixed energy budget.”
The research team took advantage of one of the most commonly used tools of numerical analysis, a method known as Newton-Raphson that was created in the 1600s by Isaac Newton and Joseph Raphson. In supercomputing, the method is used to allow high-performance computers to find successively better approximations to the roots of complex mathematical functions.
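For readers unfamiliar with the method, a minimal sketch of Newton-Raphson follows. It is only an illustration, not code from the paper: the function name `newton_raphson`, the test problem (finding the square root of 2 as a root of x*x - 2) and the starting guess are all assumptions chosen for brevity.

```python
def newton_raphson(f, df, x0, steps):
    """Return the successive approximations to a root of f, starting from x0.

    f  : the function whose root is sought
    df : the derivative of f
    """
    xs = [x0]
    x = x0
    for _ in range(steps):
        # Each step follows the tangent line at x down to where it crosses zero.
        x = x - f(x) / df(x)
        xs.append(x)
    return xs


# Illustrative problem (an assumption, not from the article): solve x*x - 2 = 0.
f = lambda x: x * x - 2.0
df = lambda x: 2.0 * x

approximations = newton_raphson(f, df, 1.0, 5)
errors = [abs(x - 2.0 ** 0.5) for x in approximations]

for step, (x, err) in enumerate(zip(approximations, errors)):
    print(step, x, err)
```

Each pass through the loop produces a better approximation than the last, which is why the early steps do not need to be computed to full precision to be useful.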
The researchers demonstrated that the solution’s quality could be improved by more than three orders of magnitude for a fixed energy cost when an inexact approach to calculation was used rather than a traditional high-precision approach.
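The idea of reinvesting saved energy can be sketched with a toy experiment. Everything below is a hedged illustration and not the researchers' method or energy model: the helper `round_to_bits` merely imitates low-precision hardware in software, the cost of a step is assumed (hypothetically) to grow with the square of the number of bits used, and the bit schedules are arbitrary choices. The sketch compares two full-precision Newton-Raphson steps with a "relay" of four steps that starts cheap and ends precise.

```python
import math


def round_to_bits(x, bits):
    """Round x to `bits` significant bits (a crude stand-in for inexact hardware)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** bits
    return math.ldexp(round(m * scale) / scale, e)


def newton_sqrt(a, x0, schedule):
    """Newton-Raphson for x*x - a = 0, one step per entry in `schedule`.

    Each entry is the number of significant bits kept after that step.
    Returns (final x, total cost) under the HYPOTHETICAL cost model
    cost(step) = bits ** 2, which is an assumption made for illustration.
    """
    x = x0
    cost = 0
    for bits in schedule:
        x = round_to_bits(x - (x * x - a) / (2.0 * x), bits)
        cost += bits * bits
    return x, cost


exact = math.sqrt(2.0)

# "Marathon": every step at high precision.
fixed_x, fixed_cost = newton_sqrt(2.0, 1.0, [52, 52])

# "Relay of sprints": cheap early steps, savings spent on extra later steps.
relay_x, relay_cost = newton_sqrt(2.0, 1.0, [8, 16, 32, 52])

fixed_err = abs(fixed_x - exact)
relay_err = abs(relay_x - exact)

print("fixed precision:", fixed_cost, fixed_err)
print("relay          :", relay_cost, relay_err)
```

Under these assumptions the relay spends no more than the fixed-precision run yet finishes closer to the true answer, because the early low-precision steps are cheap enough to pay for additional iterations. The size of the gain in this toy depends entirely on the assumed cost model and should not be read as a reproduction of the reported result.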
“In simple terms, it is analogous to rebalancing an investment portfolio,” said Snir, the Michael Faiman Professor in the Department of Computer Science at UIUC. “If you have one investment that’s done well but has maxed out its potential, you might want to reinvest some or all of those funds to a new source with more potential for a much better return on investment.”
Palem said, “A specific goal is to encourage the application of this approach as a way to advance the quality of weather and climate modeling by improving model resolution.”
He said RUCCAM is working with Oxford’s Palmer and others to explore possible ways to improve the resolution of the OpenIFS model that was developed by the European Centre for Medium-Range Weather Forecasts.
Additional co-authors include Mike Fagan of Rice, and Kazutomo Yoshii and Hal Finkel of Argonne. The research was supported by the Department of Energy, the Defense Advanced Research Projects Agency and the Guggenheim Foundation.
RUCCAM brings together researchers from Rice and universities around the world to explore solutions to the physical and energy limitations currently restricting the continued expansion of computing capacity needed to solve emerging workload problems. The researchers intend to change the manner in which computational resources are utilized, even at the margins of stability and accuracy, to increase the efficiency with which answers can be calculated.