What is the best way to Back Up 10 Terabytes of data every day?
Data is really, really important. That should not come as a staggering fact. If you found this article thanks to a rather perceptive Google search, then clearly you think data is very important too. It is all too easy to rely on your data living on multiple machines in your office or home, because centralized backup is a very unappealing idea. For a start, it is expensive. You will spend hundreds, if not thousands, of pounds on storage, and worse still, that money isn’t even being spent on making more space, but on duplicating your old data in a place where it will sit unused. You are spending all this money on what can be described as a remarkably large insurance policy. Worse still, if you have all the machines in your home or office backing up to a device in that same location, almost all data safety and storage experts will scream to high heaven that this is still not enough.
Sadly this is true. Not only do you put yourself at risk of complete critical loss in the event of fire or flooding, but in the case of theft you have also created a convenient, all-inclusive pile of data that can be stolen and cannot be replaced. So ultimately you have to stop thinking about this only in terms of how much it is going to cost. If you think like that, you will never move past stage one. The perfect backup operation looks like this:
- Primary Data (Where data is initially created/collected)
- On-site Backup solution (where multiple devices are backed up to internally)
- Off-site/different location Backup (where the copy of the on-site backup lives)
Stages 2 and 3 should be encrypted in case they are stolen or broken into, and should be protected by login credentials and an admin system.
That seems expensive. Can I skip any of them? Does it make a difference if I am a home user or business?
I know it seems frightfully expensive to have both stage #2 and stage #3 when one just duplicates the other, but below are two examples of how pitifully small that cost is in real terms.
Why Business and Enterprise users need an Extensive Data Backup Plan
Your company has 10 employees. Each has their own workstation and they contact clients on a daily basis to drum up new sales, fulfil existing quotes and maintain a customer relationship manager (CRM). You have both #1 AND #2 covered. Plus you have business insurance in case of a fire or flood. One morning you arrive to find your office has been flooded/burnt/burgled/struck by lightning and everything is fried. At first you think, lucky we have insurance. They will pay for whole new office equipment, PCs and your server. However, what about all that customer data? Not only can the insurance company not replace it, but they will not pay its consequential value. So now you have to start from square one. Plus you now have a bundle of rather angry customers from the previous days and weeks whose requirements have gone unfulfilled. This, coupled with rebuilding your business network, employees’ salaries continuing as normal and several IT guys (or 1 guy working for days) setting everything up from scratch again (this WILL be the case after fire, flood or theft), could easily KILL a company. Now, in that context, is a few thousand put towards an off-site #3 Backup really so bad? Thought not.
Why Home and Private users need an Extensive Data Backup Plan
Ok, so now the mission critical data and the life or death nature of your information is less so. Or is it? What about all those important house documents you scanned? Those TV shows and movies you bought on a one time download? What about your wedding pictures or those of your children growing up? Those videos of friends and relatives who are no longer with us? Not to be bleak, but it is often the case that although much of a person’s data is not of huge financial value, it is still utterly and completely irreplaceable in the literal sense.
Likewise, if your many devices (phones, computers, hard drives) get corrupted, infected with malware or broken, don’t you want the peace of mind of knowing that there is always a backup of EVERYTHING? If you are copying the data of all your devices to a large hard drive enclosure in your home, this is NOT a perfect backup. It just protects you from the loss of one or more of your mobile/individual devices. A TRUE backup is one that is made of that central repository of your data. So, as you can see, the need for a reliable true backup is paramount regardless of whether you are a home user or business user. However, accepting that you need a backup is not enough. You need to know what to consider when choosing the right backup.
What are the factors I need to take into account when considering my Backup Solution?
Choosing the best full backup for your data can be a little difficult. With so many variables ranging from cost to size to speed and more, it can be easy to go around in circles and still end up choosing nothing. In almost all cases, the deciding factor is cost. However, this is closely followed by speed. Having a backup is all well and good, but if it takes too long to finish, it can fall behind the rate at which you create data. Likewise, if you choose an unsuitable connection between your primary storage and backup storage, the two may communicate inefficiently. Below are the main factors to consider when choosing your backup.
Distance – How far is the backup going to be from the primary data source? Many forms of connection lose throughput (speed) and gain latency (delay) the longer the cables get or the further away your device is. If you are having an off-site backup or intend to have it located several rooms away, ensure you get a suitable connection between them that won’t result in speed loss over distance.
Power – Both the power of the hardware inside the data storage device and the device’s consumption of mains power can really eliminate some options, particularly with rackmount units or if you are dealing with hundreds of thousands of small files. Some connections require additional power to perform at their best, and backup operations involving thousands of IOPS (input/output operations per second) can really narrow down your choices.
Physical Media – The enclosure of choice is only half of the battle. The media that you are storing the data to can make a huge impact too. Every media type, e.g. HDD, SSD or Tape, has its own maximum speed and capacity. Where that maximum is lower than what the connection can carry, even a super fast enclosure and/or connection will be bottlenecked by the drives themselves. Where it is higher, the connection becomes the limit instead.
The Media connection internally – The SATA port on most commercial HDD/SSD has a maximum of 6 gigabits per second (Gb/s) in SATA III, while SAS reaches 12 Gb/s. Both of these are internal connections. LTO-7 manages 300MB/s in uncompressed/RAW form and 750MB/s in compressed. So ensure that the media you want to use is compatible with the device you want to install it in. You can always get adapters and mux boards to cross over between connections, but it is not advised as you will see speeds drop.
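To make the bottleneck idea concrete, here is a minimal Python sketch using the nominal figures quoted above. The helper names are hypothetical, and the conversion assumes decimal units with no protocol overhead, so real payload rates will be lower.

```python
# Illustrative sketch: the slowest stage sets the pace of the whole backup.
# Rates are the nominal figures quoted in the text; helper names are hypothetical.

def gbit_to_mb_per_s(gbit_per_s: float) -> float:
    """Convert a nominal line rate in Gbit/s to MB/s (decimal, ignoring overhead)."""
    return gbit_per_s * 1000 / 8


def effective_mb_per_s(*stage_rates_mb_per_s: float) -> float:
    """A chain of stages cannot move data faster than its slowest stage."""
    return min(stage_rates_mb_per_s)


sata3 = gbit_to_mb_per_s(6)    # nominal SATA III line rate in MB/s
sas = gbit_to_mb_per_s(12)     # nominal 12 Gb/s SAS line rate in MB/s
lto7_native = 300.0            # MB/s, uncompressed LTO-7 figure from the text

# An LTO-7 drive behind a 12 Gb/s SAS link is limited by the tape, not the link.
print(effective_mb_per_s(sas, lto7_native))
```

The same `min` comparison applies to any pairing of media and connection, which is why a fast enclosure alone does not guarantee a fast backup.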
The external connection – Lastly, and possibly the most overlooked part, is the communication between your primary and backup data itself. Not just the speed, but the resilience and future proofing. You need to consider what connection you are going to use today, tomorrow and years from now. The last thing you want is to saddle yourself with a connection now and then, when you later upgrade your primary hardware, end up with a device you cannot access or use. Then your backup just becomes a fancy paperweight. Popular connections and their speeds between host and client devices are:
- USB 2.0 = 480 Mbit/s
- 1GbE LAN/Ethernet = 1 Gbit/s
- USB 3.0 (3.1 Gen 1) = 5 Gbit/s
- USB 3.1 Gen 2 = 10 Gbit/s
- USB 3.2 = 20 Gbit/s
- Thunderbolt = 10 Gbit/s
- Thunderbolt 2 = 20 Gbit/s
- Thunderbolt 3 = 40 Gbit/s
- Fibre Channel = 1, 2, 4, 8, 16, 32, and eventually 128 Gbit/s
- Cloud storage = dependent on your internet connection. As little as 1 or 2 Mbit/s, or as high as hundreds, though the latter costs a fortune. The current UK average is around 20-30 Mbit/s in homes, with businesses often slightly lower for affordability
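As a rough guide to what these line rates mean for a full 10TB copy, the sketch below divides the data size by each nominal speed. It assumes decimal terabytes (10 x 10^12 bytes), a link running at its full advertised rate and no protocol or drive overhead, so treat the results as best-case figures only. The function and dictionary names are my own.

```python
# Best-case transfer time for a full copy over each nominal link speed.
# Assumptions: decimal terabytes, link runs at its advertised rate, no overhead.

SIZE_TB = 10

LINKS_GBIT_PER_S = {
    "USB 2.0": 0.48,
    "1GbE": 1,
    "USB 3.0": 5,
    "USB 3.1 Gen 2": 10,
    "USB 3.2": 20,
    "Thunderbolt 3": 40,
}


def ideal_hours(size_tb: float, gbit_per_s: float) -> float:
    """Hours needed to move size_tb terabytes at gbit_per_s gigabits per second."""
    bits = size_tb * 10**12 * 8
    seconds = bits / (gbit_per_s * 10**9)
    return seconds / 3600


for name, rate in LINKS_GBIT_PER_S.items():
    print(f"{name:15s} {ideal_hours(SIZE_TB, rate):6.1f} h")
```

On these assumptions 1 Gbit/s works out at a little over 22 hours and 10 Gbit/s at a little over 2 hours, before any real-world losses.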
What is the difference in speed and cost between different Backup Solutions?
And so to the meat of the subject. Different solutions cost different amounts of money and, in the interests of SPEED, below I have detailed numerous solutions that will provide a backup of up to 10TB of storage. All costs and speeds are based on a solution that is an acceptable distance away for maximum efficiency. Perfect speed results were provided with http://www.calctool.org/ , however it is worth noting that these are ‘perfect situation’ figures and it would be tough to actually reach this maximum threshold. You will comfortably see speeds around 10-20% below this, but that is fine.
ALSO IMPORTANT – In all examples where a 4TB SSD is mentioned, you can use a 4TB HDD to save around £3,500 in most cases, but you will effectively quadruple (or more) the time the initial backups will take. Likewise, future incremental backups will be significantly slower too. In examples where the SSD would have been substantially bottlenecked by a connection, I have used HDD as you will not need to spend the extra.
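The 10-20% allowance mentioned above can be applied to any ideal figure with a one-line adjustment. This is only a sketch: `realistic_hours` is a name I have made up, and it assumes the shortfall applies evenly for the whole transfer.

```python
def realistic_hours(ideal_hours: float, shortfall: float) -> float:
    """Time needed when throughput is `shortfall` (0.2 = 20%) below the ideal rate."""
    if not 0 <= shortfall < 1:
        raise ValueError("shortfall must be at least 0 and below 1")
    return ideal_hours / (1 - shortfall)


# 10 TB (decimal) over an ideal 1 Gbit/s link is 80,000 seconds.
ideal = 80_000 / 3600

for shortfall in (0.10, 0.20):
    print(f"{shortfall:.0%} below ideal: {realistic_hours(ideal, shortfall):.1f} h")
```

Note that a 20% drop in speed means 25% more time, not 20%, because time is the inverse of throughput. For the 1 Gbit/s case this gives roughly 24.7 and 27.8 hours.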
The best LAN based Backup Solution for 10TB of data
For a solid LAN based backup (with optional internet access as needed for off-site work) I would recommend the Synology DS216 2 Bay Pentium NAS. Alongside this you will need a moderately smart Switch and 2x 10TB HDD in RAID 1 (HDD rather than SSD, as you will not see any speed difference with SSD over a network connection). This will cost just under £900 without VAT.
What does CalcTool.org have to say about 1GbE?
Over 1 Gigabit per second, in a perfect scenario, 10TB takes just over 20 hours. Realistically it is closer to 25 or 30 hours. So the first few backups should be conducted over the weekend, but all future ‘difference only’ backups should be fine at 12 hour intervals without harming the bandwidth too much.
The Best 10GbE Network Based Solution for 10TB of data
In order to create the perfect cost-effective yet powerful 10GbE Network based backup solution (so 10x faster than normal LAN) I would recommend the QNAP TS-431X with a 10GbE SFP+ connection and 2x 3m SFP+ cables with transceivers attached. Additionally you will need a 10GbE switch and, for MAXIMUM speed, 4x 4TB Samsung 850 EVO SSD in a RAID 5, which slows things a pinch but gives you the safety of 1 drive’s worth of redundancy. Of course you can downgrade to HGST NAS 4TB Hard drives and save over £3500, but you will see a noticeable dip in performance, so the choice is yours. Lastly you will need a 10GbE interface on the machine(s) you are backing up from in order to maintain the 10GbE throughput. In total this will cost around £4700+ for the SSD based solution and just £1,100 for the HDD solution.
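Since the drive layouts in this article (RAID 1, RAID 5 and RAID 10) each give up a different amount of raw space, a quick check that the array actually holds 10TB is worthwhile. The sketch below uses the standard capacity rules for these levels. The function name is my own, and it ignores filesystem overhead and the decimal/binary difference in how drives are labelled.

```python
def usable_tb(level: str, drives: int, drive_tb: float) -> float:
    """Usable capacity in TB for an array of identical drives."""
    if level == "RAID 1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_tb                   # every drive holds the same copy
    if level == "RAID 5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_tb    # one drive's worth goes to parity
    if level == "RAID 10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return drives * drive_tb / 2      # half the drives mirror the other half
    raise ValueError(f"unsupported RAID level: {level}")


print(usable_tb("RAID 1", 2, 10))   # the 2x 10TB LAN example
print(usable_tb("RAID 5", 4, 4))    # the 4x 4TB examples
print(usable_tb("RAID 10", 4, 6))   # a 4x 6TB RAID 10 alternative
```

All three layouts clear the 10TB target, with the 4x 4TB RAID 5 leaving about 2TB of headroom.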
What does CalcTool.org have to say about 10GbE?
Performance will largely be dictated by the distance of the backups, the choice of HDD or SSD and the types of files. However, over 10 Gigabit per second, in a perfect scenario, 10TB takes around 2-3 hours. This is a little optimistic, and in practice it will realistically weigh in closer to 4 hours and above. This of course is for the first few backups of a FULL 10 Terabytes of data. Later, with incremental and ‘difference only’ backups, you will see times slashed heavily for the better.
The Best Thunderbolt 1, 2 or 3 Backup Solution for 10TB of data
Fast becoming a connection of choice for photo and video editors in both the Mac and Windows community, Thunderbolt is the no-fuss connection that promises speed without the technical nonsense. Much like before, you can choose to go with SSD drives for supreme speed (at a hefty price tag) or HDD if you want to economise. Below are the options best suited for a Thunderbolt 1, Thunderbolt 2 and Thunderbolt 3 Backup.
- TB 1 DAS, 4-Bay, Cable, 4x 4TB SSD, RAID 5 enabled = £5000+ (backup time: 2+ hours with SSD / 3.5+ hours with HDD)
- TB 2 DAS, 4-Bay, Cable, 4x 4TB SSD, RAID 5 enabled = £5200+ (backup time: 1+ hour with SSD / 2-3+ hours with HDD)
- TB 3 DAS, 4-Bay, Cable, 4x 4TB SSD, RAID 5 enabled = £6000+ (backup time: 30-45+ minutes with SSD / 1.5 hours with HDD)
What does CalcTool.org have to say about Thunderbolt 1, 2 and 3?
Thunderbolt does not lose speed over distance. However, most conventional cables you can buy max out at around 5 metres, and the ones included with the above enclosures arrive at 1-1.8m. In a real world scenario you can realistically double the times listed above for the initial backups. However, it will MASSIVELY improve with subsequent backups. With the exception of a few, most Thunderbolt enclosures arrive with only Thunderbolt ports, so in order to maintain the speed levels of this backup you need to either ensure that it is connected to your centralised repository via Thunderbolt or, if it’s backing up multiple devices, that they are using a good networking device, as Thunderbolt Direct attached storage only allows a single connected device at any one time.
The best LTO-7 Tape Backup Solution for 10TB of data
In the case of LTO / tape, this kind of storage for 10TB can be incredibly inefficient as an extra layer of storage. You can purchase much smaller 1 and 2 tape frames/storage devices, but for what you are paying, and given the overall accessibility for all machines involved, it isn’t great. If you were regularly backing up 5x or 10x this amount of storage, it would be a different story. Internal operations can be up to 750MB/s with compressed data and 300MB/s for raw uncompressed data. So unless you are synchronizing between two LTO tape loading machines, you will almost certainly use uncompressed. However, these are internal operations and, as we are discussing backing up from existing systems to a storage device, we have to focus on the external connection. Most likely a 10GbE network or 12Gb/s SAS will be the means of backing up to your tape Device. Cost is hard to pin down, but easily £2000-3000 and upwards, spread over the drive and at least two 6/15TB tapes etc. Transfer time is most likely over 3 hours at the compressed rate (and closer to 9 hours at the uncompressed 300MB/s), but tape is hugely impractical at this scale and the time is most likely much higher in practice.
The Best USB 3.1 Gen 2 Backup Solution for 10TB of data
The latest available version of USB, known as USB 3.1 Gen 2, is easily the cheapest way to store a 10TB backup at a very respectable 10 Gb/s (comparable to Thunderbolt 1). You will need to ensure that the connected device(s) that you are backing up to/from use the newer USB 3.1 Gen 2 port so that you do not get bottlenecked at 5 Gb/s. A 4-Bay USB 3.1 Gen 2 DAS enclosure populated with either 4x 4TB SSD or 4x HDD (same price difference of £3,500 as before), RAID 5 enabled = £4200+ for an SSD based solution and just £700+ for a HDD based version.
What does CalcTool.org have to say about USB 3.1 Gen 2?
With the best drives available you will have this 10TB localized backup over USB 3.1 Gen 2 finished in just over 2 hours. However, taking system overheads as well as the RAID 5 into account (something you could counter with a RAID 10 and 4x 6TB HDD perhaps), you are looking at between 2.5 and 3.5 hours to completion. Another note: if you are considering USB 3.1 Gen 2 in a 4-Bay or 5-Bay configuration, you may also wish to consider USB 3.2. This has now been formally announced and scheduled for commercial release by vendors in 2018.
Is the Cloud suitable as a 10TB Backup solution?
You may wonder why I have not suggested the cloud as a regular backup yet. It is certainly appealing. No parts needed, just a healthy internet connection. You already have all the hardware you will need to establish this kind of synchronised backup, so this should be by far the cheapest and easiest backup, right? Well, yes and no. It IS cheap in the short-term. Even if you take into account the cost of your Business internet connection, from as little as £10 to £50 a month and reaching much higher once you consider fiber, it is still pretty attractive. However, you have to consider the time this backup will take and how it will affect the bandwidth throughout your business. Otherwise you will need to run your backups overnight to limit bandwidth consumption.
Also, remember we are talking about Upload, not download. Most internet services advertise incredible download speeds, but backups are almost exclusively upload based, and upload speeds are normally a tenth or less of the advertised download speeds. Lastly, we can talk about costs. Although the initial costs are much less, let’s go for £50 a month for a dedicated high upload speed connection for your off-site backup. That is £600 a year. In 5 years, that is £3,000 (a cost that is the same as or higher than most of the solutions discussed previously). The real kicker is that after those 5 years, you either have to continue paying to maintain this backup OR buy a suitable local storage drive to download it to, something you could have had ALREADY by going for the other solutions, thereby saving you thousands of pounds more.
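The running-cost argument above is simple arithmetic, and a short sketch makes the break-even point easy to test with your own numbers. The function names are hypothetical. The comparison also assumes the hardware has no running costs of its own (power, replacement drives) and that the monthly fee never changes.

```python
import math


def cumulative_cost(monthly_fee: float, months: int) -> float:
    """Total spent on a subscription after the given number of months."""
    return monthly_fee * months


def break_even_months(hardware_cost: float, monthly_fee: float) -> int:
    """First month in which total subscription spend reaches the one-off hardware cost."""
    return math.ceil(hardware_cost / monthly_fee)


print(cumulative_cost(50, 12))       # one year at £50 a month
print(cumulative_cost(50, 60))       # five years at £50 a month
print(break_even_months(900, 50))    # versus the roughly £900 LAN solution
print(break_even_months(4700, 50))   # versus the roughly £4700 10GbE SSD solution
```

At £50 a month the subscription overtakes the £900 LAN solution after 18 months, while the £4700 SSD based solution takes 94 months, which is nearly 8 years.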
I took the trouble of using the awesome tool at http://www.thecloudcalculator.com/ and, if you have a 30Mbps upload speed (fairly respectable), backing up 10TB initially would take 33 Days, 22 Hours, 27 Minutes and 11 Seconds.
That is horrendously long, and you cannot just assume this is a one-off that becomes negligible once incremental and difference-only backups take over. You need a reliable and adaptive backup solution, not one that will do the job only as long as you work within its limits. If you want to entertain the idea of a cloud based backup of 10TB on a regular basis, we have to look into fiber and at least 2Gbps (so 2000 Megabits per second) to get to 12 hours for an overnight full backup (non incremental). This is going to cost a small fortune and, unless you intend to take advantage of this speed during the day-time, is a huge outlay for something that is not hugely accessible or reliable.
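For anyone who wants to check these cloud figures or try a different upload speed, the sketch below repeats the calculation. It assumes 10TB is counted in binary units (10 x 2^40 bytes) and that the line runs at its full rate with no overhead. Those assumptions reproduce the 33 day figure quoted above, but they are my guess at how the calculator works, not something it documents here. The function names are my own.

```python
TEN_TB_BYTES = 10 * 2**40  # assumption: binary terabytes, which matches the quoted figure


def upload_duration(size_bytes: int, mbps: float) -> tuple:
    """Return (days, hours, minutes, seconds) to upload size_bytes at mbps megabits/s."""
    total_seconds = int(size_bytes * 8 / (mbps * 1_000_000))
    days, rem = divmod(total_seconds, 86_400)
    hours, rem = divmod(rem, 3_600)
    minutes, seconds = divmod(rem, 60)
    return days, hours, minutes, seconds


def required_mbps(size_bytes: int, window_hours: float) -> float:
    """Upload speed in megabits/s needed to finish within window_hours."""
    return size_bytes * 8 / (window_hours * 3600) / 1_000_000


print(upload_duration(TEN_TB_BYTES, 30))       # (33, 22, 27, 11)
print(round(required_mbps(TEN_TB_BYTES, 12)))  # 2036
```

The second result, a little over 2000 Mbps, lines up with the 2Gbps figure for a 12 hour overnight window.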