Storage Magazine - UK

BUILDING AN ILM FUTURE

From STORAGE Magazine Vol 5 No 02 - March 2005

SOARING STATUTORY OBLIGATIONS RELATING TO DATA RETENTION AND AVAILABILITY ARE PUTTING HUGE PRESSURES ON BUSINESSES EVERYWHERE. NEVER WAS AN ILM STRATEGY MORE NEEDED, SAYS EDITOR BRIAN WALL

Information Lifecycle Management (ILM) has come increasingly into the limelight in the past year, as storage vendors are beset by pleas from customers to provide them with solutions that will help them meet the regulatory data storage compliance requirements that seem to be popping up faster than ducks at a fairground.

Words that elicited only puzzled looks a short time ago - HIPAA and Sarbanes-Oxley (which dictate that documents must be retained by law for a certain period of time) - now roll off the tongue of IT managers with practised ease.

The upshot is that the imperative already driving organisations to seek a storage solution to help them manage data more efficiently has risen even higher up the agenda, because compliance is a boardroom issue and a great concentrator of the corporate mind. What is clear is that businesses, whichever industry sector they are in, have common goals they need to address:

• Reduce the costs of storing large and growing amounts of data and files

• Match more-active data with higher-performance storage, and less-active data with lower-cost storage

• Maintain business continuity through a tiered storage environment that is transparent to applications.

Where ILM comes into the equation is that - as Martin Warren, marketing manager, automated tape solutions, StorageTek, is quick to point out - not all data is created equal.

“Most organisations have already realised that the most appropriate course of action is to prioritise data into a tiered storage hierarchy. The criteria for prioritising data will vary according to circumstances, but the philosophy remains the same: the most relevant data is kept on the fastest, always-available storage systems, less important data on lower cost equipment, and data that is accessed rarely, or never, is archived completely.

“This approach is the essence of an ILM strategy, based on the three key principles of data protection, archival and storage management. ILM has allowed companies to decide where to invest as a priority and how to put some sort of limit on the total investment. Archive and Data Protection solutions look set to receive the highest rates for growth of nine and 12 per cent respectively throughout 2005. Here, companies are looking for solutions that improve the efficiency and productivity of backup and recovery, while also meeting the new demands placed on their businesses by the requirements to achieve compliance.”

Since the number of documents held by any business can pile up in dramatic fashion for the duration of a regulation - and we can be sure that there are more on the way! - an ILM strategy employing both storage and content management is a necessity in our data-driven world.

“ILM allows the IT department and the business to determine what the value of information really is, both today and over time,” says Nigel Williams, director of strategy and solutions, EMEA, EMC Software Group. “This allows the right information services to be provided at the right cost.

“There are several examples of what end users are doing today to make this a reality. Some customers are implementing tiered storage, which ensures that each application is tiered and matched with a storage system that provides the appropriate service levels for performance, availability and recoverability. Some organisations are using content addressed storage solutions (such as EMC's Centera) to set archival, retention and deletion policies, based on the changing value of a piece of data over time, the size of a piece of content, whether it constitutes a business record, whether its user is in a specific department - or any number of other factors.

“Additionally, customers with EMC Documentum Content Management can ensure their content is seamlessly managed across storage tiers using Content Storage Services (CSS), which supports automatically placing content imported into the central repository into different storage devices, depending on the policies defined.”

The scale of the challenge of managing data is nowhere better exemplified than in the handling (or mishandling) of email. Research and analyst company IDC claims the total number of emails sent daily worldwide has been on a growth path from the more ‘modest’ 9.7 billion in 2000 to a massive 35 billion in 2005.

“It is crucial in today’s business environment, with compliance and data retention regulations, that businesses consider their policy of managing information,” says David Smith, ILM programme manager, HP UK. “Email is an excellent example of the need for policies. The Radicati Group, a firm of independent analysts that tracks the email market, estimates that by 2007 the number of corporate email users will grow to 773 million, resulting in a staggering 10.3 petabytes of global email traffic each day.”
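To put the Radicati figures in perspective, the short calculation below works out what they imply per user. It is only a rough sketch: it assumes decimal petabytes (10^15 bytes) and that the whole 10.3 petabytes is attributable to the 773 million corporate users, neither of which the quoted forecast states. The function name is made up for illustration.

```python
# Back-of-envelope check of the quoted 2007 forecast.
# Assumptions (not in the forecast itself): decimal units, and all
# traffic shared evenly among the corporate users.

PETABYTE = 10**15  # bytes, decimal definition (assumption)
MEGABYTE = 10**6   # bytes, decimal definition (assumption)


def daily_megabytes_per_user(total_petabytes: float, users: int) -> float:
    """Average email traffic per user per day, in megabytes."""
    return total_petabytes * PETABYTE / users / MEGABYTE


per_user = daily_megabytes_per_user(10.3, 773_000_000)
print(f"{per_user:.1f} MB per user per day")
```

Under those assumptions the forecast works out at a little over 13 MB of email per corporate user every day, which is the volume an archiving policy has to absorb.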

Smith states that many companies are now reacting to this by introducing architectures to manage information and, in particular, reference information. “The danger is the ‘scalability gap’ that appears in traditionally architected email archiving systems as the amount of data begins to grow. Many customers discovered that storage architectures designed for the active management of a few million information objects do not scale well to address the challenge of storing and tracking billions of objects without either significant performance degradation or escalating cost, as more money is thrown at providing ‘bolted on’ data access tools. Businesses need to make careful long-term decisions on a scalable architecture.”

Increasing enterprise data and storage requirements and new storage solutions have resulted in companies deploying a wide variety of storage technologies, including storage area networks (SANs), network attached storage (NAS), hierarchical storage management (HSM) and direct attached storage. While these solutions help to better address the storage challenges, they also increase the complexity of the environment.

“The most obvious discussions about components included in ILM solutions tend to revolve around the selection of different storage products, of different capabilities, and at differing price points, to match the requirements of information being held at different stages in the life of data,” says Ian Bond, head of business development - storage networking, data centre solutions, Cisco Systems.

“For example, email being held for the first two weeks after receipt may, as part of corporate policy, be held on high-performance disk storage to be readily available at high speed. However, once that same email is two years old, it is unlikely to be accessed for anything other than a regulatory audit, so it can be stored on lower cost, slower access disk or tape, or even a hybrid storage product. The choice of the most appropriate storage devices at the best price point, and the provision of software to manage information through its full lifecycle using various levels of storage, is at the heart of ILM.”
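The age-based placement Bond describes can be sketched as a simple rule. This is a minimal illustration, not any vendor's policy engine: only the two-week and two-year marks come from the example above, while the middle tier, the tier names and the function name are assumptions.

```python
from datetime import date, timedelta

# Only these two thresholds come from the example in the text.
HOT_WINDOW = timedelta(days=14)     # first two weeks after receipt
ARCHIVE_AGE = timedelta(days=730)   # roughly two years old


def choose_tier(received: date, today: date) -> str:
    """Pick a storage tier for an email from its age alone."""
    age = today - received
    if age <= HOT_WINDOW:
        return "high-performance disk"
    if age >= ARCHIVE_AGE:
        return "low-cost disk or tape"
    # Assumption: the text does not say where mail sits in between.
    return "midrange disk"


print(choose_tier(date(2005, 3, 1), date(2005, 3, 10)))
print(choose_tier(date(2003, 3, 1), date(2005, 3, 10)))
```

A real policy would weigh more than age, such as retention obligations and who owns the mailbox, but the principle of mapping a measurable attribute to a cost tier is the same.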

Managing enterprise information
Whatever the need and path taken, a comprehensive strategy is needed to reduce the costs associated with managing data and storage resources. ILM provides an overall stratagem to help organisations manage information in the most cost-effective manner, using a combination of processes, products and people. And fundamental to its effectiveness is a realisation that all information has a lifecycle.

Typically, the lifecycle begins with acquiring information to satisfy a business need and referencing that information on a regular basis during day-to-day business operations. Over time, access rates decline as information gradually loses its business value. However, when historical information is needed, its business value increases immediately. The lifecycle ends when the information is no longer needed and can be deleted.

Ironically, through most of the lifecycle, rarely accessed data and information are retained on high-performance platforms and expensive storage media for easy access. This practice increases costs and wastes valuable business resources. By understanding how the data is used and how long it must be retained, companies can develop a strategy to map usage patterns to the optimal storage media, thereby minimising the total cost of storing information over its lifecycle.

According to Ron Riffe, director, storage software strategy, IBM, the process of identifying and analysing storage and data assets has led the IT industry in general to begin to categorise data, resulting in an interesting discovery. “Most organisations have many categories of data, each with a different value to the business. The challenge is that most organisations had only a limited few tiers of cost in the storage that was being used to house this data. In other words, there were many value tiers in their data, but only a limited number of cost tiers (enterprise-class disk and tape, for example).

“Over the last two years, two significant advancements in disk storage have been delivered and matured that offer IT managers a significant part of the solution to their challenge - tiered storage. The first advancement is an expansion in raw disk technologies to the point that there is almost a continuum of disk storage costs, ranging from enterprise class, through midrange to cost-centric S-ATA.

“The second advancement has been the delivery and maturing of disk virtualisation technologies, so that IT managers can expand the number of cost tiers in their storage infrastructure, while still maintaining single points of management and single points of replication that work across all the cost tiers.”

As enterprises consider their individual approaches to ILM, there are common grounds from which these strategies are being developed. For example, enterprise data can be grouped into categories: unstructured data, such as files and documents; semi-structured data, such as email; and structured data, such as relational databases. Basic ILM data management principles apply to all types of data. Each type of data will have its own unique challenges.

When the data is stored in a relational database, the challenges are compounded because of the complexities inherent in the data relationships. Relational databases consume more storage capacity and are among the most difficult to manage because subsets of data are accessed on a regular basis.

According to the Meta Group, relational databases are growing at 125% per annum. Without the ability to manage relational data cost effectively, relative to its access and storage requirements, runaway database growth will result in increased operational costs, poor performance and limited availability for the applications that rely on these databases. The ideal solution, therefore, is to manage relational data as part of an overall enterprise ILM approach.
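A quick projection shows why that growth rate is described as runaway. The sketch below assumes that "125% per annum" means the database adds 125 per cent of its size each year (so it multiplies by 2.25); if the figure instead means growing to 125 per cent of the previous size, the multiplier would be 1.25. The starting size and function name are illustrative only.

```python
def project_size(initial_tb: float, annual_growth_pct: float, years: int) -> float:
    """Compound a starting size forward by a fixed annual growth rate."""
    return initial_tb * (1 + annual_growth_pct / 100) ** years


# Hypothetical 1 TB database compounding at the quoted rate.
for year in range(4):
    print(year, round(project_size(1.0, 125, year), 2), "TB")
```

On that reading, a 1 TB database passes 11 TB within three years, which is the kind of curve that makes archiving inactive rows attractive.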

The impact of database growth extends well beyond increasing storage costs, including direct impact to business continuity and disaster recovery plans. Larger databases take longer to rebuild and restore, while overloaded relational databases degrade performance and limit the availability of critical applications. Expensive hardware, software and storage upgrades increase operating costs and only offer diminishing returns over time.

Compliance with data retention requirements compounds the problem. Companies retain historical data online for audit and legal reasons, though much of it is rarely accessed. How can an organisation reduce the impact of database growth? Although managing enterprise information lifecycle is critical, few standards exist to assist companies in formulating and implementing long-term data retention strategies.

Today, database archiving is recognised as a proven and cost-effective strategy for managing complex relational databases and controlling excessive database growth for the long-term. Database archiving works within the framework of various storage technologies and is a critical component of an overall ILM line of attack for managing structured data. Combining database archiving with ILM provides organisations with a best practices approach for meeting the challenges of managing increasing data volumes, using storage resources cost effectively and reducing operational costs.

A lot of the interest in ILM is being driven by new requirements for reporting (in the case of financial records of public companies) or privacy (in the case of patient records in health care). In these cases, the enterprise has to adapt its information management practices to the new legal environment, whether the aim is better transparency of information or better protection from unauthorised access.

Users have to decide how long to keep information, and how quickly they will need to access it, which is tricky for storage managers because of the different rates at which information loses value. To add further complications, some information, such as legally required records, maintains high value for its entire life, but is seldom going to be accessed. Other information, such as a lot of accounting information, declines in value slowly, although the uses to which it is put change as it ages. Often it stops being current working information and becomes the basis for reports and analyses, the kind of thing better kept in a data warehouse.

Most data shows a steep decline in value and accesses over the first 60 days after it is recorded, settling into a seldom-used limbo after about 90 days. A successful ILM storage strategy has to reflect these mixes of value and access as the data ages, and this usually means more than spooling the stuff off to tape at the end of 30 days.

Obviously this can lead to a complex data retrieval situation, because data that ages at different rates is ideally kept on different storage cycles. One concomitant of ILM is the notion of 'path management', as EMC refers to it. This is the ability to find a particular piece of data no matter what file, volume or disk or tape it happens to be on. Path-management software, which is rapidly developing, virtualises the process of data retrieval by keeping track of where everything is in the storage cycle.
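The idea behind path management can be illustrated with a small location catalogue: a single index that is updated whenever data moves, so a lookup never depends on knowing which tier holds the object. This is a toy sketch under assumed names (`Location`, `Catalogue`, the tier and volume labels), not a description of how EMC's or anyone else's software works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    tier: str     # e.g. "disk" or "tape" (illustrative labels)
    volume: str
    path: str


class Catalogue:
    """Keeps track of where each object currently lives."""

    def __init__(self) -> None:
        self._index: dict[str, Location] = {}

    def record(self, object_id: str, location: Location) -> None:
        """Register a newly stored object."""
        self._index[object_id] = location

    def move(self, object_id: str, new_location: Location) -> None:
        """Update the index when an object migrates between tiers."""
        if object_id not in self._index:
            raise KeyError(f"unknown object: {object_id}")
        self._index[object_id] = new_location

    def locate(self, object_id: str) -> Location:
        """Return the current location, whatever tier it is on."""
        return self._index[object_id]


catalogue = Catalogue()
catalogue.record("invoice-0001", Location("disk", "vol01", "/finance/invoice-0001"))
catalogue.move("invoice-0001", Location("tape", "T0042", "/archive/invoice-0001"))
print(catalogue.locate("invoice-0001").tier)
```

The application asks the catalogue for "invoice-0001" in exactly the same way before and after the move to tape, which is what makes the tiering transparent.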

Whatever the plan applied to the business, the greater sophistication of approach through ILM hopefully means that mass, indiscriminate storage is becoming a thing of the past. As Nigel Tozer, business technologist, CA, points out, some organisations have suffered from the ‘store everything’ mentality, which is based on fear and can result in data reduction.

“A sensible approach to ILM involves learning about what data you have, how it is used and how it relates to supporting the business (or not, as the case may be),” he states. “At CA, we look at how applying an information lifecycle management approach can help reduce storage costs, manage information growth and enhance data protection.

“An ILM strategy begins with knowledge of the data that your business has, which can help you define policies that lead to good practice. A good policy example is not allowing users to put PST files (personal email files) on to servers. This alone can add literally hundreds of gigabytes to the nightly backup.

“Properly implemented, a successful ILM strategy lowers disk usage and means that new investments are fit for purpose, rather than the costly over-protection of low-value data or the hidden cost of under-protecting high-value data.”
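A first step towards enforcing a rule like the PST policy Tozer mentions is finding out how much space such files already occupy. The sketch below walks a directory tree and totals any files with a .pst extension. It is an illustration only: the function name is hypothetical, and it assumes the share is reachable as an ordinary file-system path.

```python
from pathlib import Path


def pst_usage(root: str) -> tuple[int, int]:
    """Count .pst files under root and total their size in bytes."""
    count = 0
    total_bytes = 0
    for path in Path(root).rglob("*"):
        # Match the extension case-insensitively (.pst, .PST, ...).
        if path.is_file() and path.suffix.lower() == ".pst":
            count += 1
            total_bytes += path.stat().st_size
    return count, total_bytes


if __name__ == "__main__":
    files, size = pst_usage(".")
    print(f"{files} PST files, {size / 10**9:.2f} GB")
```

Running a report like this before the nightly backup gives a concrete figure for what the policy would save.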

The credibility gap
Clearly, economic, operational and legal factors are driving enterprise IT departments to adopt fully-costed and thought-out ILM approaches. Meanwhile, software, systems and storage vendors are all racing to bridge the gap between today’s reality and tomorrow’s vision. Interestingly, it is the enterprise storage sector leading the charge, according to analysis firm 451 Group. And it is in the storage domain, it adds, where end users are applying the most immediate pressure for products and services that will truly meet their long-term needs and ambitions.
 


©2006 Business and Technical Communications Ltd. All rights reserved.