Important things to consider when investing in a good long term storage solution
Data is important, and the only thing more important than that is YOUR data! Whether you are a home user or a business, it is almost certain that your data is valuable, and therefore a long-term storage array to archive and protect this data is paramount.
Creating a solid, long-term data archive is more than just buying an external hard drive to carry in your bag. Unlike a standard backup, an archive and/or cold storage needs to be designed to last for a long time, and it is often required to be accessible at a much faster rate than the likes of USB, FireWire and eSATA can provide. Factors such as physical robustness and integrity are only half the problem. With IT evolving at such an alarming rate, you have to ensure that the data storage array you have will last and still be compatible and accessible in years to come. Any number of reasons exist as to why a private individual or business might want to invest in a large-scale data archive, such as:
- Legal compliance regarding data handling
- Sharing and distributing of work, both finished and WiP
- Enabling business requirements of housing and pooling data
- Facilitating an off-site backup for safety and security
- The housing of multi-company data in one location handled by a 3rd party
1: Which Storage Medium should I buy for my long-term and cold storage?
Possibly your first consideration, and what your entire array will be built around, is the storage media and the form it takes. The majority of business cold storage and archives will be broken down into two types: hard drives and tapes. Tapes are the more hardcore of the two, and the LTO (Linear Tape-Open) format, along with vendors such as Quantum, has really refined and pioneered this kind of storage. Tapes are rated by the total storage capacity of each unit and by the durability of the tape (generally measured in years, as tapes gradually give in to demagnetization), and a high-quality tape should last at least 10 years. It should be highlighted, however, that tape-based cold and archive storage does work out as the more expensive option, which is worth bearing in mind.
What are the alternatives to Tape based Cold Storage?
Hard drives, or large-scale disk-based archives, are the more common sight and range from standalone 10 and 12 bay desktop HDD enclosures up to more business-geared, enterprise rackmounted servers for server cabinets and server rooms, filled with hundreds of drives and thousands of terabytes. Hard drives designed specifically for data centers and cold storage are often used in these archive arrays. They range from SATA hard drives to the faster, more enterprise-oriented SAS drives, and even include the more expensive but faster still solid state drives. Choosing one type, or a combination of all of them, is an important decision and will be largely affected by your budget.
2: Which Enclosure should I buy for my long-term and cold storage?
So you have decided on the media that your long-term and cold storage will live on for years to come. Now you need to decide in what format this data will be accessible. In short, where are the drives going to live? Once again, budget will play a big part in this choice. Unsurprisingly, the larger the unit (desktop or rackmount enclosure), the higher the cost. Likewise, the faster the connection (the means by which your host machine and this archive communicate), the more you will need to pay. Photo and video editors who want a storage array to house large-scale media projects for archiving and on-the-fly retrieval and editing will see the advantages in Thunderbolt 2 and 3, but for that you are looking at four-figure prices. Businesses that want their large-scale data archives to use SAS will likewise have to pay thousands. The reason for this is not just increased speed. It is also often the case that, as technology evolves and your storage fills up, it becomes considerably more expensive to maintain a slowly obsolescing system than it would have been to invest years earlier in a more future-proofed device. Likewise, the saving you make today on a lower-cost archive solution will be lost if HDD media released later is not compatible with your older archive. Even now there are businesses flailing, as I type this, to replace IDE drives, or only able to use storage media under 1.5TB because their archive chipset doesn’t support larger disks and there isn’t a firmware update in sight!
What else can go wrong if I buy the wrong data storage device in the long term?
Worse still, what if the connection type to the device is no longer supported? In the last couple of years we have already seen a decline in the use of external SAS, giving way to the more flexible NAS option, an option only made appealing by the likes of 10GbE and 40GbE connections in offices. If you buy a large-scale SAS archive today, be sure to keep an eye on how your host machine(s) change and check that they can still communicate with the cold storage array and, ultimately, your precious data.
3: What about accessing my Archives as and when needed?
As previously mentioned, the evolution of alternative data storage in the form of NAS and SAN means that, unlike previous long- and short-term data archives that were accessible by a bare few at a given time, you can now create vast RAID-enabled and backed-up storage arrays that can be accessed (with control) by hundreds of users at once. Each user can be given their own permissions and entitlements, and the archives of individual users can be kept separate from all others but still held in the giant data array. Also, unlike large-scale cold storage and data center arrays attached as DAS via SAS, the danger of your connection type becoming obsolete is hugely reduced, as the data storage array is connected via your internal network. Even upscaling your internal network connection will not limit your access over the internet, and most popular enterprise-level NAS storage options have impressively frequent firmware updates that tweak the OS and maintain overall compatibility.
How big will my Long Term and Cold storage get? How much is too much?
It should not come as a vast shock that, as the years pass, the sheer scale of your data will become staggering. Every day most businesses are churning out gigabytes of data at the very least, and selectively accessing this data while newly created data is also being stored can be overwhelming for less enterprise-geared storage arrays. Finding the balance between cold storage of old data that is kept ‘in the event of’ and a fresh archive that builds daily can be difficult. It is often the practice to have two near-identical storage setups working together: one is the current, multi-user accessible archive and the other is a long-term cold storage that is accessible by far fewer people. When data reaches a certain age, the systems re-index it and carry it over to the cold storage array.
This can prove costly to buy and set up (often requiring a hired or company-dedicated IT person to maintain), but it is all too often a fraction of what losing said data will cost you in the event of a poorly constructed archive-to-cold-storage setup.
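To make the age-based hand-over between the active archive and cold storage more concrete, here is a minimal sketch in Python. It assumes that "age" means a file's last-modified time and that both tiers are plain directories; the function names, the 365-day threshold and the demo file names are all made up for illustration, and a real NAS would normally do this through its own tiering or scheduling tools.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86_400


def files_older_than(root: Path, max_age_days: float,
                     now: Optional[float] = None) -> List[Path]:
    """Return files under root whose last-modified time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(p for p in root.rglob("*")
                  if p.is_file() and p.stat().st_mtime < cutoff)


def move_to_cold(archive_root: Path, cold_root: Path, max_age_days: float,
                 now: Optional[float] = None) -> List[Path]:
    """Move aged files to cold storage, keeping the same folder layout.

    Assumption: a plain move is good enough for a sketch. A production job
    should verify the copy (for example with a checksum) before removing
    the original.
    """
    moved = []
    for src in files_older_than(archive_root, max_age_days, now):
        dest = cold_root / src.relative_to(archive_root)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


# Demo on throwaway directories (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    archive, cold = Path(tmp, "archive"), Path(tmp, "cold")
    (archive / "2015").mkdir(parents=True)
    old_file = archive / "2015" / "invoice.txt"
    new_file = archive / "current.txt"
    old_file.write_text("old")
    new_file.write_text("new")

    # Pretend the invoice was last touched two years ago.
    two_years_ago = time.time() - 730 * SECONDS_PER_DAY
    os.utime(old_file, (two_years_ago, two_years_ago))

    moved = move_to_cold(archive, cold, max_age_days=365)
    moved_names = [p.relative_to(cold).as_posix() for p in moved]
    still_active = new_file.exists()
    print("moved to cold storage:", moved_names)
    print("recent file still in active archive:", still_active)
```

Run on a schedule, something along these lines keeps the active archive small while the cold array quietly grows, which is the behaviour described above.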
Can I buy a cost-effective cold storage data solution instead?
Don’t believe me? Why not do the math in your head RIGHT NOW of what would happen if you lost the last 6 months of data your company generated? Nothing as dramatic as it all just disappearing in a puff of smoke. No, just that the archive you created has failed and cannot be recovered, and all you have is data up until 6 months ago and no local backup.
Still thinking about it? The number of hours to retrieve the information from loose e-mails and local documents on machines? The receipts? The customer accounts? Yep… we are talking BIG NUMBERS. Do yourself a favour and spend a little now rather than a lot later.
Alternatively, you can look into far more cost-effective archive drives such as the Seagate Archive ST8000AS0002. These are large-scale storage drives that are by far the lowest cost per GB on the market today.
Of course you should always take the time to examine individual RAID and Connection options like SAS before buying your archive and cold storage media.
4: Which Storage Archive or Cold Storage provides the best Redundancy?
The minute you start storing data and classing it as mission critical (a fancy way of saying very, very important in both the short and long term), you should be factoring redundancy into your storage enclosure (desktop or rackmount of choice). This is often a factor that home and very small storage users skip, as redundancy can prove too costly: it often means you are paying more for less available storage. However, for business and enterprise users, a solid foundation for redundancy is paramount. In cave-man speak, redundancy means that you and your data are protected in the event of hardware failure. Whether you go for a tape-based cold storage option or a disk-based storage array, multiple methods have been created to ensure that if 1 or 2 media devices in your enclosure fail, the whole mass of data is still kept.
What are the downsides of a comprehensive redundancy solution?
However, your concerns should extend to more than just having a viable way to reclaim your data. The act of rebuilding your data is also a worthy concern. With some remarkably cost-effective RAID-enabled storage options out there offering RAID 5 and more, you can be fooled into thinking that you can skate around it easily. You can’t. Higher RAID levels like RAID 5 and RAID 6 require parity calculations in the background as data is written (and as it is read back from a degraded array), and a poor or cheap enclosure for your hard drives, SSDs or tapes will result in very slow data transfers, which can have a real-time damaging effect on your business’s productivity. Additionally, after a drive fails, recovering and rebuilding your data from a RAID 5 takes time, and on a lower-specification or budget storage unit this can take DAYS! Many units will not even give you access to the pool of data until the rebuild is complete!
So when looking into the redundancy abilities of your archive and cold storage data array, look for devices that offer at least RAID 5 or RAID 6 (or even nested RAID levels that effectively double up coverage, such as RAID 5+0), support hot swapping of media, and have a CPU and RAM combination built around at least a quad-core 64-bit processor, or an Intel i5 and above. This level of processing power will amply cover the RAID calculations, both in recovery and in everyday use.
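To get a feel for why a rebuild can stretch into days, here is a rough back-of-the-envelope sketch in Python. It assumes the rebuild has to rewrite one whole replacement drive at a steady rate, and the three rates used are purely illustrative guesses, not measurements from any particular enclosure. Real rebuilds also compete with normal user traffic, so treat the result as a best case.

```python
def rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Best-case hours to rewrite one replacement drive at a steady rate."""
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    drive_mb = drive_tb * 1_000_000  # decimal units, as drive vendors quote them
    return drive_mb / rebuild_mb_per_s / 3600


# Hypothetical sustained rebuild rates for a 4TB drive.
for rate in (200, 50, 10):
    hours = rebuild_hours(4, rate)
    print(f"{rate:>4} MB/s -> {hours:6.1f} hours ({hours / 24:.1f} days)")
```

The takeaway is that the rebuild rate, which depends heavily on the CPU in the enclosure and on how busy the array is, matters as much as the drive size.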
What is the difference between RAID and Backups?
One last point here, and one I often make: RAID and redundancy protection is NOT THE SAME AS BACKING UP. You would be amazed how often this needs to be highlighted. The processes and methods described in this section are there to help you recover from hardware failure. Backing up is a method of creating time-based copies of your data as a whole (or of selected files if you prefer). If you lose several drives at once (3 or 4 or more), suffer full system failure or corruption, theft or fire, or lose a file due to human error, RAID and REDUNDANCY will not help you! For that you need a backup that is either off-site or attached to the host storage array but not accessible to users for changes or editing.
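As a small illustration of what "time-based copies" means in practice, the Python sketch below copies a folder into a new, timestamped backup folder each time it runs. The `snapshot` helper, the folder names and the dates are all hypothetical, and a real backup tool would add things like incremental copies and retention rules on top.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def snapshot(source: Path, backup_root: Path, when: datetime) -> Path:
    """Copy source into a new folder named after the backup time."""
    dest = backup_root / when.strftime("%Y-%m-%d_%H%M%S")
    shutil.copytree(source, dest)  # refuses to overwrite an existing snapshot
    return dest


# Demo on throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    data, backups = Path(tmp, "data"), Path(tmp, "backups")
    data.mkdir()
    report = data / "report.txt"

    report.write_text("version 1")
    first = snapshot(data, backups, datetime(2016, 1, 1, 2, 0, 0))

    # Human error: RAID would faithfully keep the damaged file on every drive.
    report.write_text("oops, overwritten by mistake")
    second = snapshot(data, backups, datetime(2016, 1, 2, 2, 0, 0))

    recovered = (first / "report.txt").read_text()
    latest = (second / "report.txt").read_text()
    print("recovered from", first.name, "->", recovered)
```

Because each snapshot is a separate copy taken at a point in time, the earlier version is still there to restore, which is exactly what redundancy alone cannot give you.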
5: How much does a good Archive or Cold Storage Array cost?
Possibly the most important consideration for some. It is all too easy to say “throw money at the problem”, but it simply isn’t that straightforward. What is ideal for one company can be utterly unsuitable for another. For example, you can invest in a solid 10GbE rackmounted SAS Network Attached Storage solution for your office that communicates with a locally connected expansion as a backup device. This gives every user in your building access (with permissions) at 10Gb per second. You also have 12Gb/s SAS drive connections, a fully competent RAID solution and an identical copy of the data in a local backup that itself also has RAID capability. Say you want this with 4TB drives: in a 16 bay you would have 64TB… well… 60TB really in a RAID 5… 56TB if you want a RAID 6. However, you also need to buy 16 more 4TB SAS hard drives for the backup device (gotta have the same amount of storage for your backup as in the primary storage enclosure).
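The capacity figures above come from simple arithmetic: RAID 5 gives up one drive's worth of space to parity and RAID 6 gives up two. Here is a minimal Python sketch of that sum. It assumes a single RAID group and ignores filesystem overhead and the decimal versus binary terabyte difference, so real-world usable space will be a little lower. The function name is just for illustration.

```python
def usable_capacity_tb(drive_count: int, drive_tb: float, raid_level: int) -> float:
    """Usable capacity of one RAID group, before filesystem overhead."""
    parity_drives = {0: 0, 5: 1, 6: 2}   # drives' worth of space lost to parity
    minimum_drives = {0: 2, 5: 3, 6: 4}  # smallest valid group per level
    if raid_level not in parity_drives:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    if drive_count < minimum_drives[raid_level]:
        raise ValueError(f"RAID {raid_level} needs at least "
                         f"{minimum_drives[raid_level]} drives")
    return (drive_count - parity_drives[raid_level]) * drive_tb


# The 16-bay, 4TB example from the text.
for level in (0, 5, 6):
    print(f"RAID {level}: {usable_capacity_tb(16, 4, level):g} TB usable")
```

Swapping in your own bay count and drive size is a quick way to see how much storage each level of protection will cost you before you buy.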
Are the most expensive data solutions necessarily the best archives for me?
So far, in today’s market, this solution will set you back around £15,000 excluding VAT. However, to use this solution you would also need a 10GbE network connection throughout your building, 10GbE on all machines, a rack cabinet or server room for the devices, and the setup costs on top, pushing the budget of this installation well into £25,000 territory. For some, that is a new sales team member or several years’ budget for marketing. You can, however, achieve all of the above with conservative compromises in some areas for £10,000, or even as low as £2,000-3,000, once you know your priorities.