Storage Magazine - UK
BEAST THAT ROARS FOR APPLIANCE OF SCIENCE

From STORAGE Magazine Vol 7, Issue 6 - September 2007

Scientific research is now opening up new pathways towards a greater understanding of the genetic basis of many common ailments that have plagued and perplexed humanity for centuries. Editor Brian Wall reports.

Nowhere is the quest for a greater understanding of the common ailments that beset us pursued with greater vigour than at the Wellcome Trust Centre for Human Genetics (WTCHG).

Established in 1994, the WTCHG undertakes vital research into the genetic basis of common diseases, including asthma, diabetes, malaria and cardiovascular disease. A not-for-profit organisation, it places all of its findings in the public domain - a tremendous asset to global biomedical research.

For the past eight years, the WTCHG has been located in the Henry Wellcome Building of Genomic Medicine, University of Oxford. Here it explores all aspects of genetic susceptibility to disease, seeking an understanding of how DNA variants may contribute to the risk of disease in the population and how such genetic factors contribute biologically to a disease process.

The WTCHG houses multi-disciplinary research teams in human genetics, functional genomics, bioinformatics, statistical genetics and structural biology, all of which presents a multitude of challenges when it comes to data storage and availability. With so much research underway, data volumes have expanded at a staggering rate.

What has fuelled this frenzy is the explosion in bioinformatics and statistical genetics work involved in genomic research, says Dr Tim Bardsley, the WTCHG's IT manager, "particularly since the entire human genome has been mapped".

Under founder director John (now Sir John) Sulston, The Wellcome Trust Sanger Institute spearheaded the UK contribution to the Human Genome Project. It sequenced almost one-third of the human genome, greatly enhancing our ability to study the diseases that afflict people and animals, and was instrumental in ensuring that sequence data were made freely available to researchers worldwide for the benefit of all.

As well as the human genome, The Wellcome Trust Sanger Institute has sequenced the genomes of numerous disease-causing microbes, including those that cause tuberculosis, malaria, leprosy and diphtheria (see box-out, 'The Wellcome Trust Sanger Institute: a new focus').

"As a result, scientific, research-focused organisations like ours are faced with extremely complex storage management tasks," states Bardsley. "Our expanding research programmes and increasing data are forcing an exponential growth in our data storage requirements, a phenomenon seen right across the scientific research industry."

All this is a far cry from how things were when he joined the WTCHG eight years ago. "Then bioinformatics had a relatively small role, with far fewer stats being generated. Even six years ago, our SAN [Storage Area Network] was the size of a modern hard disc. To meet our expanding requirements, this was first upgraded to eight terabytes and then, through further recent investment, to fifty terabytes."

With a total of around 500 network-connected permanent staff to cater for, including administrators and scientists (primarily located within the main building itself, but many also collaborating with associates worldwide), IT has to provide the bandwidth - and data security - that helps to sustain the WTCHG's operations.
This means ensuring that the vast amounts of data generated and then relayed to bioinformaticians for detailed analysis can travel at speed, unhampered, across the network.

"The bottom line is that, when genetic material is being analysed, large output files are created," adds Bardsley, "each of about fifty megabytes in size - and all of these have to be kept online in order to facilitate research." As a result of this activity, the WTCHG's data yield has soared from 20GB a day a couple of years ago to the current daily level of 200-300GB. With a 120-node Linux cluster and 25 core servers, IT's task is to ensure user data processing needs are always being met.
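To put those figures in perspective, the short Python sketch below works out roughly how many fifty-megabyte output files a day that yield represents, and how far the daily volume has grown. It uses only the numbers quoted above; the decimal conversion (1GB = 1,000MB) and the assumption that the whole daily yield consists of files of that size are ours, not the WTCHG's.

```python
# Figures quoted in the article.
FILE_SIZE_MB = 50                  # "about fifty megabytes" per output file
EARLIER_DAILY_YIELD_GB = 20        # daily yield "a couple of years ago"
CURRENT_DAILY_YIELD_GB = (200, 300)  # current daily range

MB_PER_GB = 1000  # assumption: decimal units


def files_per_day(daily_gb, file_mb=FILE_SIZE_MB):
    """Approximate number of output files produced per day.

    Assumes (hypothetically) that the whole daily yield is made up
    of files of the quoted size.
    """
    return daily_gb * MB_PER_GB // file_mb


def growth_factor(current_gb, earlier_gb=EARLIER_DAILY_YIELD_GB):
    """How many times larger the current daily yield is than the earlier one."""
    return current_gb / earlier_gb


low_gb, high_gb = CURRENT_DAILY_YIELD_GB
print("Files per day:", files_per_day(low_gb), "to", files_per_day(high_gb))
print("Growth factor:", growth_factor(low_gb), "to", growth_factor(high_gb))
```

On those assumptions, the cluster would be writing several thousand new files every day, all of which have to stay online.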

It is little wonder, then, that the WTCHG was an early adopter of storage network solutions - and one key area of investment has been in Nexsan technology. Last year, the WTCHG bought two 21TB SATABeasts, Nexsan's powerful, high-capacity storage system, and has recently followed that up with the purchase of two more (see box-out, 'The beast unleashed').

Bardsley points specifically to the SATABeast's advanced mechanical design, which not only provides excellent cooling, reduced drive vibration and high levels of energy efficiency, but also makes the SATABeast extremely reliable. For, whilst failure of the WTCHG's IT systems would have a highly detrimental effect on all users - and especially its researchers - protection of the data is paramount.

"We have research programmes that have been running now for three years or more. Raw data is our life's blood and that is why it is safeguarded so well. There is no way to put a price on how valuable that is," he says. "The underlying concern for us always with any equipment is resilience to failure. Nexsan offered the industry-leading storage density to handle our data."

He was also impressed by the level of attention and involvement that Nexsan exhibited after the WTCHG's latest SATABeast purchases from reseller S-Store. "They brought in one of Nexsan's VPs at one point and we were able to quiz him about the technology in some detail. It's unusual for someone of that standing to become involved in that way and we were able to establish the direction Nexsan is taking, which was very valuable to us, in terms of planning our own future needs."

The recent investment was triggered in part by a major collaborative project. "The requirement, capacity-wise, was to give our people on-line DAS [Direct Attached Storage] so storage consolidation was a major driver." The Nexsan arrays are directly attached to a server that feeds into a fibre channel switch for distribution to other servers. "That has increased our overall data capacity to 42TB - and we are using around 30TB of that already," Bardsley concludes.
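A rough sense of the headroom that leaves can be sketched in a few lines of Python. The capacity, usage and daily-yield figures are those quoted in this article; the calculation assumes decimal units (1TB = 1,000GB) and, purely for illustration, that every day's output is written to these arrays and nothing is archived elsewhere, which the WTCHG has not stated.

```python
# Figures quoted in the article.
TOTAL_TB = 42                # overall data capacity
USED_TB = 30                 # capacity already in use
DAILY_YIELD_GB = (200, 300)  # current daily data yield range

GB_PER_TB = 1000  # assumption: decimal units


def days_until_full(total_tb, used_tb, daily_gb):
    """Days before the free space is consumed at a constant daily rate.

    Illustrative only: assumes all new data lands on this storage
    and that none of it is moved off or deleted.
    """
    free_gb = (total_tb - used_tb) * GB_PER_TB
    return free_gb / daily_gb


utilisation = USED_TB / TOTAL_TB
slow_rate, fast_rate = DAILY_YIELD_GB

print(f"Utilisation: {utilisation:.0%}")
print(f"Days of headroom at {fast_rate}GB a day: "
      f"{days_until_full(TOTAL_TB, USED_TB, fast_rate):.0f}")
print(f"Days of headroom at {slow_rate}GB a day: "
      f"{days_until_full(TOTAL_TB, USED_TB, slow_rate):.0f}")
```

Under those assumptions the remaining space would last between one and two months, which gives some idea of why storage density weighs so heavily in the centre's planning.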

The decoding of the human genome has opened up a world of possibilities that have transformed scientific research in a way that could not have been contemplated a few short years ago. SATABeast now forms part of the insurance policy that underpins that work for the Wellcome Trust Centre for Human Genetics. ST

©2006 Business and Technical Communications Ltd. All rights reserved.