BUILDING AN ILM FUTURE
From STORAGE Magazine
Vol 5 No 02 - March 2005
SOARING STATUTORY OBLIGATIONS RELATING TO DATA RETENTION AND AVAILABILITY
ARE PUTTING HUGE PRESSURES ON BUSINESSES EVERYWHERE. NEVER WAS AN ILM STRATEGY
MORE NEEDED, SAYS EDITOR BRIAN WALL
Information Lifecycle Management (ILM) has come increasingly into the limelight
in the past year, as storage vendors are beset by pleas from customers to
provide them with solutions that will help them meet the regulatory data storage
compliance requirements that seem to be popping up faster than ducks at a
fairground.
Words that elicited only puzzled looks but a short time ago - HIPAA and
Sarbanes-Oxley (which dictate that documents must be retained by law for a
certain period of time) - now roll off the tongue of IT managers with practised
ease.
The upshot is that the imperative already driving organisations to seek a
storage solution to help them manage data more efficiently has risen even higher
up the agenda, because compliance is a boardroom issue and a great concentrator
of the corporate mind. What is clear is that businesses, whichever industry
sector they are in, have common goals they need to address:
• Reduce the costs of storing large and growing amounts of data and files
• Match more-active data with higher-performance storage, and less-active data
with lower-cost storage
• Maintain business continuity through a tiered storage environment that is
transparent to applications.
Where ILM comes into the equation is that - as Martin Warren, marketing
manager, automated tape solutions, StorageTek, is quick to point out - not all
data is created equal.
“Most organisations have already realised that the most appropriate course of
action is to prioritise data into a tiered storage hierarchy. The criteria for
prioritising data will vary according to circumstances, but the philosophy
remains the same: the most relevant data is kept on the fastest,
always-available storage systems, less important data on lower cost equipment
and data that is accessed rarely, or never, is archived completely.
“This approach is the essence of an ILM strategy, based on the three key
principles of data protection, archival and storage management. ILM has allowed
companies to decide where to invest as a priority and how to put some sort of
limit on the total investment. Archive and Data Protection solutions look set to
see the highest rates of growth, of nine and 12 per cent respectively,
throughout 2005. Here, companies are looking for solutions that improve the
efficiency and productivity of backup and recovery, while also meeting the new
demands placed on their businesses by the requirements to achieve compliance.”
Since the number of documents within the possession of any business can pile up
in dramatic fashion for the duration of a regulation - and we can be sure that
there are more on the way! - an ILM strategy employing both storage and content
management is a necessity in our data-driven world.
“ILM allows the IT department and the business to determine what the value of
information really is, both today and over time,” says Nigel Williams, director
of strategy and solutions, EMEA, EMC Software Group. “This allows the right
information services to be provided at the right cost.
“There are several examples of what end users are doing today to make this a
reality. Some customers are implementing tiered storage, which ensures that each
application is tiered and matched with a storage system that provides the
appropriate service levels for performance, availability and recoverability.
Some organisations are using content addressed storage solutions (such as EMC's
Centera) to set archival, retention and deletion policies, based on the changing
value of a piece of data over time, the size of a piece of content, whether it
constitutes a business record, whether its user is in a specific department - or
any number of other factors.
“Additionally, customers with EMC Documentum Content Management can ensure their
content is seamlessly managed across storage tiers using Content Storage
Services (CSS), which supports automatically placing content imported into the
central repository into different storage devices, depending on the policies
defined.”
The scale of the challenge of managing data is nowhere better exemplified than
in the handling (or mishandling) of email. Research and analyst company IDC
claims the total number of emails sent daily worldwide has been on a growth path
from the more ‘modest’ 9.7 billion in 2000 to a massive 35 billion in 2005.
“It is crucial in today’s business environment with compliance and data
retention regulations that businesses consider their policy of managing
information,” says David Smith, ILM programme manager, HP UK. “Email is an
excellent example of the need for policies. The Radicati Group, a firm of
independent analysts that tracks the email market, estimates that by 2007 the
number of corporate email users will grow to 773 million, resulting in a
staggering 10.3 petabytes of global email traffic each day.”
Smith states that many companies are now reacting to this by introducing
architectures to manage information and, in particular, reference information.
“The danger is the ‘scalability gap’ that appears in traditionally architected
email archiving systems as the amount of data begins to grow. Many customers
discovered that storage architectures designed for the active management of a
few million information objects do not scale well to address the challenge of
storing and tracking billions of objects without either significant performance
degradation or escalating cost, as more money is thrown at providing ‘bolted on’
data access tools. Businesses need to make careful long-term decisions on a
scalable architecture.”
Increasing enterprise data and storage requirements and new storage solutions
have resulted in companies deploying a wide variety of storage technologies,
including storage area networks (SANs), network attached storage (NAS),
hierarchical storage management (HSM) and direct attached storage. While these
solutions help to better address the storage challenges, they also increase the
complexity of the environment.
“The most obvious discussions about components included in ILM solutions tend to
revolve around the selection of different storage products, of different
capabilities, and at differing price points, to match the requirements of
information being held at different stages in the life of data,” says Ian Bond,
head of business development - storage networking, data centre solutions, Cisco
Systems.
“For example, email being held for the first two weeks after receipt may, as
part of corporate policy, be held on high-performance disk storage to be readily
available at high speed. However, once that same email is two years old, it is
unlikely to be accessed for anything other than a regulatory audit, so it can be
stored on lower cost, slower access disk or tape, or even a hybrid storage
product. The choice of the most appropriate storage devices at the best price
point, and the provision of software to manage information through its full
lifecycle using various levels of storage, is at the heart of ILM.”
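The age-based placement Bond describes can be sketched in a few lines of Python. This is only an illustration: the two-week and two-year thresholds come from his example, while the function name, the tier labels and the middle tier for everything in between are assumptions made for the sketch, not features of any vendor's product.

```python
from datetime import date, timedelta

# Thresholds from the example in the text. The tier names and the
# existence of a middle tier are illustrative assumptions.
HOT_WINDOW = timedelta(weeks=2)
ARCHIVE_AGE = timedelta(days=2 * 365)


def choose_tier(received: date, today: date) -> str:
    """Pick a storage tier for an email from its age alone."""
    age = today - received
    if age <= HOT_WINDOW:
        return "high-performance disk"
    if age >= ARCHIVE_AGE:
        return "low-cost disk or tape"
    # Assumed: anything between the two thresholds sits on a middle tier.
    return "midrange disk"
```

A real policy engine would weigh more than age, such as regulatory class or department, but the principle of mapping a property of the information to a class of storage is the same.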
Managing enterprise information
Whatever the need and path taken, a comprehensive strategy is needed to reduce
the costs associated with managing data and storage resources. ILM provides an
overall stratagem to help organisations manage information in the most
cost-effective manner, using a combination of processes, products and people.
And fundamental to its effectiveness is a realisation that all information has a
lifecycle.
Typically, the lifecycle begins with acquiring information to satisfy a business
need and referencing that information on a regular basis during day-to-day
business operations. Over time, access rates decline as information gradually
loses its business value. However, when historical information is needed, its
business value increases immediately. The lifecycle ends when the information is
no longer needed and can be deleted.
Ironically, through most of the lifecycle, rarely accessed data and information
are retained on high-performance platforms and expensive storage media for easy
access. This practice increases costs and wastes valuable business resources. By
understanding how the data is used and how long it must be retained, companies
can develop a strategy to map usage patterns to the optimal storage media,
thereby minimising the total cost of storing information over its lifecycle.
According to Ron Riffe, director, storage software strategy, IBM, the process of
identifying and analysing storage and data assets has led the IT industry in
general to begin to categorise data, resulting in an interesting discovery.
“Most organisations have many categories of data, each with a different value to
the business. The challenge is that most organisations had only a limited few
tiers of cost in the storage that was being used to house this data. In other
words, there were many value tiers in their data, but only a limited number of
cost tiers (enterprise-class disk and tape, for example).
“Over the last two years, two significant advancements in disk storage have been
delivered and matured that offer IT managers a significant part of the solution
to their challenge – tiered storage. The first advancement is an expansion in
raw disk technologies to the point that there is almost a continuum of disk
storage costs, ranging from enterprise class, through midrange to cost-centric
S-ATA.
“The second advancement has been the delivery and maturing of disk
virtualisation technologies, so that IT managers can expand the number of cost
tiers in their storage infrastructure, while still maintaining single points of
management and single points of replication that work across all the cost
tiers.”
As enterprises consider their individual approaches to ILM, there are common
grounds from which these strategies are being developed. For example, enterprise
data can be grouped into categories: unstructured data, such as files and
documents; semi-structured data, such as email; and structured data, such as
relational databases. Basic ILM data management principles apply to all types of
data, although each type has its own unique challenges.
When the data is stored in a relational database, the challenges are compounded
because of the complexities inherent in the data relationships. Relational
databases consume more storage capacity and are among the most difficult to
manage because subsets of data are accessed on a regular basis.
According to the Meta Group, relational databases are growing at 125% per annum.
Without the ability to manage relational data cost effectively, relative to its
access and storage requirements, runaway database growth will result in
increased operational costs, poor performance and limited availability for the
applications that rely on these databases. The ideal solution, therefore, is to
manage relational data as part of an overall enterprise ILM approach.
The impact of database growth extends well beyond increasing storage costs, and
includes a direct impact on business continuity and disaster recovery plans.
Larger databases take longer to rebuild and restore, while overloaded relational
databases degrade performance and limit the availability of critical
applications. Expensive hardware, software and storage upgrades increase
operating costs and only offer diminishing returns over time.
Compliance with data retention requirements compounds the problem. Companies
retain historical data online for audit and legal reasons, though much of it is
rarely accessed. How can an organisation reduce the impact of database growth?
Although managing the enterprise information lifecycle is critical, few standards
exist to assist companies in formulating and implementing long-term data
retention strategies.
Today, database archiving is recognised as a proven and cost-effective strategy
for managing complex relational databases and controlling excessive database
growth for the long-term. Database archiving works within the framework of
various storage technologies and is a critical component of an overall ILM line
of attack for managing structured data. Combining database archiving with ILM
provides organisations with a best practices approach for meeting the challenges
of managing increasing data volumes, using storage resources cost effectively
and reducing operational costs.
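In its simplest form, database archiving means moving rows that have passed a cut-off date out of the production table and into an archive store, in a single transaction so that nothing is lost or duplicated along the way. The sketch below shows the idea using Python's built-in sqlite3 module. The table names, columns and sample rows are hypothetical, and it deliberately ignores the related tables that make real relational archiving so much harder.

```python
import sqlite3


def archive_old_orders(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move rows dated before `cutoff` (ISO date) into orders_archive."""
    with conn:  # one transaction: the copy and the delete commit together
        conn.execute(
            "INSERT INTO orders_archive "
            "SELECT * FROM orders WHERE order_date < ?",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM orders WHERE order_date < ?", (cutoff,)
        )
    return cur.rowcount  # number of rows removed from the live table


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, total REAL)"
)
conn.execute(
    "CREATE TABLE orders_archive "
    "(id INTEGER PRIMARY KEY, order_date TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2002-06-30", 120.0), (2, "2004-11-15", 75.5), (3, "2005-02-01", 310.0)],
)
moved = archive_old_orders(conn, "2004-01-01")
```

In practice the archive would usually live on a cheaper storage tier than the production database, which is where archiving and the wider ILM strategy meet.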
A lot of the interest in ILM is being driven by new requirements for reporting
(in the case of financial records of public companies) or privacy (in the case
of patient records in health care). In these cases, the enterprise has to adapt
its information management practices to the new legal environment, whether the
aim is better transparency of information or better protection from unauthorised
access.
Users have to decide how long to keep information, and how quickly they will need
to access it, which is tricky for storage managers because of the different
rates at which information loses value. To add further complications, some
information, such as legally required records, maintains high value for its
entire life, but is seldom going to be accessed. Other information, such as a
lot of accounting information, declines in value slowly, although the uses to
which it is put change as it ages. Often it stops being current working
information and becomes the basis for reports and analyses, the kind of thing
better kept in a data warehouse.
Most data shows a steep decline in value and accesses over the first 60 days
after it is recorded, settling into a seldom-used limbo after about 90 days. A
successful ILM storage strategy has to reflect these mixes of value and access
as the data ages, and this usually means more than spooling the stuff off to
tape at the end of 30 days.
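One way to picture a policy that reflects both value and access is the small Python sketch below. The 60-day and 90-day break points are the figures quoted above; the record fields, the tier names and the rule that legally required records go to a separately protected archive are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Record:
    age_days: int            # days since the data was recorded
    legally_required: bool   # high value for life, even if seldom accessed


def placement(rec: Record) -> str:
    """Combine access pattern (age) and business value in one decision."""
    if rec.age_days <= 60:
        return "primary disk"        # still being accessed regularly
    if rec.age_days <= 90:
        return "secondary disk"      # access tailing off
    # Seldom used from here on; value decides how it is archived.
    if rec.legally_required:
        return "protected archive"
    return "low-cost archive"
```

The point of separating the two inputs is that a seldom-accessed record is not necessarily a low-value one, which is exactly the case a flat 30-day tape rule fails to capture.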
Obviously this can lead to a complex data retrieval situation, because data that
ages at different rates is ideally kept on different storage cycles. One
concomitant of ILM is the notion of 'path management', as EMC refers to it. This
is the ability to find a particular piece of data no matter what file, volume or
disk or tape it happens to be on. Path-management software, which is rapidly
developing, virtualises the process of data retrieval by keeping track of where
everything is in the storage cycle.
Whatever the plan applied to the business, the greater sophistication of
approach through ILM hopefully means that mass, indiscriminate storage is
becoming a thing of the past. As Nigel Tozer, business technologist, CA, points
out, some organisations have suffered from the ‘store everything’ mentality,
which is based on fear and can result in data reduction.
“A sensible approach to ILM involves learning about what data you have, how it
is used and how it relates to supporting the business (or not, as the case may
be),” he states. “At CA, we look at how applying an information lifecycle
management approach can help reduce storage costs, manage information growth and
enhance data protection.
“An ILM strategy begins with knowledge of the data that your business has, which
can help you define policies that lead to good practice. A good policy example
is not allowing users to put PST files (personal email files) on to servers.
This alone can add literally hundreds of gigabytes to the nightly backup.
“Properly implemented, a successful ILM strategy lowers disk
usage and means that new investments are fit for purpose, rather than the costly
over-protection of low-value data or the hidden cost of under-protecting
high-value data.”
The credibility gap
Clearly, economic, operational and legal factors are driving enterprise IT
departments to adopt fully-costed and thought-out ILM approaches. Meanwhile,
software, systems and storage vendors are all racing to bridge the gap between
today’s reality and tomorrow’s vision. Interestingly, it is the enterprise
storage sector leading the charge, according to analysis firm 451 Group. And it
is in the storage domain, it adds, where end users are applying the most
immediate pressure for products and services that will truly meet their long-term
needs and ambitions.