December 6, 1999
 

        I.B.M. Plans Supercomputer That
        Works at Speed of Life

        By STEVE LOHR

        Two years ago, when an IBM supercomputer known
        as Deep Blue beat the world chess champion, Garry
        Kasparov, it seemed a confirmation of the Computer
        Age, a triumph of machine over man. On Monday, IBM
        is announcing a five-year, $100-million program to
        build a supercomputer whose ambitions dwarf Deep
        Blue's.

        The goal of the new supercomputer, to be called Blue
        Gene, is to simulate one of the most common routines in
        natural biology -- the process by which amino acids
        intricately fold themselves into full-fledged proteins,
        the body's molecular work force whose chores range
        from metabolizing food to fighting disease.

        Protein folding may be routine, but it is a routine of
        enormous complexity. To simulate the process is beyond
        the reach of contemporary computing. To meet the
        challenge, Blue Gene is being built to run 500 times
        faster than the world's fastest supercomputer today, by
        using an innovative design
        that seeks to sharply accelerate the already torrid rate
        at which the speed of computers improves.

        If successful, the IBM project would not only be a
        breakthrough in cutting-edge computing, but would also
        help supply fundamental insights into the basic physics
        and chemistry of biology, opening the door to a new
        understanding of diseases and more effective drugs,
        perhaps ones individually tailored to a person's genetic
        makeup.

        The publicly financed human genome project, an
        international research initiative begun in 1990 with the
        goal of deciphering all of the human genetic code by
        2005, is already generating a huge quantity of
        biological data for computation. IBM's Blue Gene
        program is, in a sense, an effort to take the next step --
        feeding the genetic data into a supercomputer to try to
        understand basic biological processes.

        Computer scientists and biologists who have been
        briefed in advance of Monday's announcement are
        impressed by the ambition and promise of the IBM
        effort.

        "The combination of all the ideas that IBM is putting
        together to make a supercomputer on this scale is really
        exciting," said Ken Kennedy, a computer scientist at
        Rice University who is co-chairman of the President's
        Information Technology Advisory Committee.

        "And there will be a lot more benefit to society from
        this project than there was from having Deep Blue beat
        Kasparov in chess," he added.

        Biologists and medical experts say the broadest impact
        of the IBM research, at least during the next few years,
        will likely come from its contribution to improving the
        field of computer simulations of molecular biology in
        general, rather than the "grand challenge" of protein
        folding itself. It is expected to take five years before
        Blue Gene is ready to begin the marathon simulations of
        protein folding.

        For more than a decade, biological researchers have
        used computer simulations to study the activity of
        proteins in the body -- how drugs bind to proteins, for
        example, or how cell membranes absorb some
        substances while screening out others.

        Being able to run faster, longer and better simulations
        of more modest molecular mysteries than protein
        folding could have big health-care payoffs in
        understanding ailments like heart disease and high
        blood pressure.

        "The promise of what IBM is doing is far beyond the
        one machine," said Bernard Brooks, a principal
        investigator at the National Institutes of Health and a
        leading expert in computer simulations of molecular
        biology. "The really important work that can be done
        with this technology is in smaller-scale simulations
        rather than the demonstration project of protein
        folding."

        For IBM, Blue Gene is a research program of its
        renowned Watson Labs. But it is the expected
        trickle-down of research knowledge into commercial
        uses that justifies the company's $100 million
        investment in Blue Gene.

        Since mid-1998, IBM has jumped from third in
        supercomputer installations worldwide, after the Cray
        division of Silicon Graphics and Sun Microsystems, to
        the top spot.

        In that time, IBM has nearly doubled its share of the
        500 most powerful machines, from 75 to 141 last
        month, according to "The Top 500 Supercomputing
        Sites," a list compiled by three academic
        supercomputer experts. At the same time, the number of
        installations for Cray fell sharply while Sun
        Microsystems held steady.

        "There is no doubt in our mind that a lot of that
        improvement is because of what we learned with Deep
        Blue," said Paul Horn, senior vice president of
        research. "The payoff can be enormous."

        Several IBM supercomputers are already at work on
        the human genome project worldwide, including one
        that is host to one of the project's central databases in
        Toronto.

        The announcement Monday, just as the project is getting
        under way, is also clearly an image-burnishing step by
        IBM, intended to emphasize its commitment to
        supercomputing and to research. Blue Gene, experts
        agree, is a multidisciplinary endeavor requiring not
        only computer hardware, software and manufacturing
        expertise but also mathematicians, biologists, chemists
        and physicists.

        In addition, the Blue Gene project should serve as a
        kind of recruiting tool for IBM research -- and perhaps
        serve as a venture that could lift the stature of
        computer-science research in general. Such a lift,
        according to Kennedy of Rice University, is badly
        needed. Computer talent, to be sure, has perhaps never
        been in such great demand as it is today. Yet the
        excitement of Internet start-ups and the lure of stock
        options, Kennedy notes, have meant that
        computer-science students increasingly shun graduate
        studies and advanced research.

        "A few projects like this could re-establish research
        institutions -- academic or corporate -- as centers of
        excitement in computing," Kennedy said. "It's going to
        bring some of those minds back."

        The frontier of computational biology is certainly a
        field that can stir excitement in the research community
        as well as hold out the promise of being a huge industry
        someday. In the last few years, IBM has built a
        30-person team of researchers in computational
        biology.

        IBM hopes its supercomputer project will stimulate the
        field. "We want to attract significant interest and
        involvement from university researchers and from the
        scientific community in general," Dr. Sharon Nunes, a
        senior research manager, said. "If we can influence this
        fundamental research, it will happen faster."

        The computing innovation behind Blue Gene, in
        essence, is to build a computer that works much as
        nature works -- a triumph, if it succeeds, of marrying
        simplicity and complexity.

        The computer scientists at IBM plan to sharply simplify
        the RISC (reduced instruction-set computing)
        architecture used in the chips that run engineering work
        stations and supercomputers today. The "instruction
        set" -- the total vocabulary of machine-language
        instructions a computer understands -- will number 57
        for Blue Gene, compared with about 200 for most RISC
        machines.

        Then, instead of putting a single microprocessor on a
        chip, Blue Gene will have 32 microprocessors -- the
        calculating engines of computers -- on each chip.
        Sixty-four such chips will be inserted on each
        motherboard, with eight motherboards in each of the 64
        computing towers of Blue Gene.

        When completed, Blue Gene will stand about six feet
        high, occupying a floor space of 40 feet by 40 feet at
        the Watson labs in Yorktown Heights, N.Y. It will have
        a total of about 1 million microprocessors.
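
        The figure of about 1 million follows from multiplying
        out the layout described above. A quick check in Python,
        in which the constant and function names are purely
        illustrative and not IBM's:

```python
# Layout figures as reported in the article; the names are illustrative.
PROCESSORS_PER_CHIP = 32
CHIPS_PER_BOARD = 64
BOARDS_PER_TOWER = 8
TOWERS = 64


def total_processors() -> int:
    """Multiply out the hierarchy: chip -> motherboard -> tower -> machine."""
    return PROCESSORS_PER_CHIP * CHIPS_PER_BOARD * BOARDS_PER_TOWER * TOWERS


if __name__ == "__main__":
    # 32 * 64 * 8 * 64 = 1,048,576, i.e. about 1 million
    print(f"{total_processors():,}")
```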

        Among the innovations computer scientists find most
        impressive about Blue Gene is that IBM will place
        memory for storing data on the same chip as the
        microprocessor. In conventional computer designs, the
        memory for storage is separate from the processor.

        Shuttling data from the memory to the processor is a
        major bottleneck in computers, slowing them down.
        Only within the last year or so, because of advances in
        chip making and miniaturization, has it become
        possible to consider putting memory and processing on
        the same chip in the way that IBM is developing.

        To attain the speeds Blue Gene seeks within five years,
        IBM must try a new architecture of computing. The
        conventional wisdom holds that microprocessor speeds
        can theoretically double every 18 months, a
        phenomenon known as Moore's Law, for Gordon
        Moore, the chip pioneer who first observed it. With
        Moore's Law, it would take about 15 years to achieve
        the speed target for Blue Gene.
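
        The estimate can be roughly reproduced. The sketch below
        assumes the 500-fold speed-up cited for Blue Gene and a
        clean doubling every 18 months; real chips do not improve
        so smoothly, so it is an approximation. It gives
        a little over 13 years, the same order as the "about 15
        years" quoted above.

```python
import math

DOUBLING_PERIOD_YEARS = 1.5  # "every 18 months", per the article
TARGET_SPEEDUP = 500         # Blue Gene versus today's fastest machines


def years_to_speedup(speedup: float, doubling_period: float) -> float:
    """Years needed to reach `speedup` if speed doubles every `doubling_period` years."""
    doublings = math.log2(speedup)  # about 8.97 doublings for 500x
    return doublings * doubling_period


if __name__ == "__main__":
    years = years_to_speedup(TARGET_SPEEDUP, DOUBLING_PERIOD_YEARS)
    print(f"{years:.1f} years")
```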

        "There's no way you get to where IBM is heading
        unless you change today's computing architecture," said
        Arvind, a computer scientist at the Massachusetts
        Institute of Technology, who uses only a single name.
        "It looks as if they have an outstanding engineering
        plan. If they can execute it properly, it will be a real
        breakthrough."

        Blue Gene's speed target is a petaflop -- that is, a
        thousand trillion floating point operations, or
        calculations, each second. Such a speed would make
        the machine 500 times faster than the two fastest
        supercomputers in operation today -- an IBM
        supercomputer at the Lawrence Livermore National
        Laboratory, and an Intel machine at the Los Alamos lab.

        To translate Blue Gene's speed into a personal
        computer scale: If a fast PC were represented as an inch
        tall, the IBM machine would be 20 miles high.
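
        The two comparisons above imply some rough numbers. The
        sketch below simply works backward from the article's
        figures; the implied speeds are derived values, not
        measurements, and the function names are illustrative.

```python
PETAFLOP = 10**15            # a thousand trillion operations per second
SPEEDUP_OVER_FASTEST = 500   # figure quoted in the article
INCHES_PER_MILE = 5280 * 12  # 63,360 inches in a mile


def implied_fastest_flops() -> float:
    """Speed implied for today's fastest machines: about 2 trillion operations a second."""
    return PETAFLOP / SPEEDUP_OVER_FASTEST


def implied_pc_flops(miles_high: float = 20) -> float:
    """Speed implied for the 'fast PC' if one inch stands for the PC."""
    ratio = miles_high * INCHES_PER_MILE  # 20 miles is 1,267,200 inches
    return PETAFLOP / ratio


if __name__ == "__main__":
    print(f"fastest today: {implied_fastest_flops():.2e} operations per second")
    print(f"fast PC:       {implied_pc_flops():.2e} operations per second")
```

        On these figures the inch-tall PC would be running at a
        bit under a billion operations a second.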

        The hardware design of Blue Gene is innovative
        indeed, but the real challenge, as is so often the case in
        computing, will be the software. For in simplifying the
        hardware design for speed, the complexity of protein
        folding is left to the software. And the software, among
        other things, has to be "self-healing" so that the
        simulation does not grind to a halt if a few processors
        break down. The software must recognize the flawed
        processors and re-route the data.
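
        The article does not describe how IBM intends to do
        this. As a generic illustration only, and not IBM's
        method, the sketch below spreads hypothetical work units
        across processors and, when some are marked as failed,
        hands their work to the survivors. The names assign and
        reroute are invented for the example.

```python
def assign(work_units, processors):
    """Spread work units across processors in round-robin order."""
    plan = {p: [] for p in processors}
    for i, unit in enumerate(work_units):
        plan[processors[i % len(processors)]].append(unit)
    return plan


def reroute(plan, failed):
    """Return a new plan with work from failed processors moved to healthy ones."""
    healthy = [p for p in plan if p not in failed]
    if not healthy:
        raise RuntimeError("no healthy processors left")
    new_plan = {p: list(plan[p]) for p in healthy}
    orphaned = [unit for p in plan if p in failed for unit in plan[p]]
    for i, unit in enumerate(orphaned):
        new_plan[healthy[i % len(healthy)]].append(unit)
    return new_plan


if __name__ == "__main__":
    plan = assign(list(range(8)), ["p0", "p1", "p2", "p3"])
    print(reroute(plan, failed={"p1"}))
```

        A real machine would also have to detect the failure and
        recover any calculation in progress, which this sketch
        leaves out.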

        "We have some idea how we're going to do this," said
        Marc Snir, a senior researcher at Watson. "But I would
        be lying if I said we have solved this. We do have
        research to do."

        If all the computer wizardry works as planned, it will
        still take Blue Gene about a year to simulate on the
        computer the folding of a single protein. How long does
        it take the body to fold one? Less than a second.

        "It is absolutely amazing the complexity of the problem
        and the simplicity with which the body does it every
        day," Ajay Royyuru, a researcher in IBM's
        computational biology center, noted.
