I recently came across a now rather old photo (please excuse the camera-phone quality!) of some drives I was adding to a development system to get baseline performance figures. It got me thinking about what we’re doing with storage today compared with what was typically in use five to ten years ago. Back then, a small business would run some kind of NAS (Network Attached Storage) device or file server with a basic RAID (Redundant Array of Inexpensive Disks) level and some kind of nightly backup solution. As businesses grew, they’d often just add more of these self-contained servers. Larger still, and you’d find virtualised environments with servers and storage separated.
For a number of these businesses, these once familiar setups have been at least partially replaced with cloud-hosted systems, removing the need to run and maintain local server hardware. Alongside this, the user-facing elements of cloud systems focus on real-time collaboration, both inside and outside the organisation, as well as being accessible from multiple device platforms.
In other words, storing and working on data in the cloud can give you more than simply ‘putting a file on the server’ so other users can get at it.
The move towards cloud systems has largely been enabled by vast improvements in the speed of commodity internet connections and a reduction in the price of dedicated fibre connections. As a result, the network connection becomes the most critical piece of infrastructure (more on that in a future post).
However, cloud services are not one-size-fits-all. On-site storage is still essential in many areas of computing, and it’s always been the case that the more bandwidth and storage we have, the more data we expect to move and keep!
For example, a company working in media production may be dealing with hours of raw footage shot in 4K, perhaps using Panasonic cameras encoding AVC-Ultra @ 400Mbps. They could invest in a dedicated 1Gbps internet connection (and also pay for terabytes of online storage). Even then, if they tried to upload their footage and work directly from the cloud, just two 400Mbps streams of video would saturate the connection.
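The arithmetic behind that claim can be sketched in a few lines (link overhead and real-world variation are ignored for simplicity):

```python
# Back-of-the-envelope check: how many AVC-Ultra streams fit down a 1Gbps link?
# Figures taken from the example above.

link_capacity_mbps = 1000    # dedicated 1Gbps internet connection
stream_bitrate_mbps = 400    # AVC-Ultra @ 400Mbps per stream

concurrent_streams = link_capacity_mbps // stream_bitrate_mbps
print(concurrent_streams)    # → 2 streams before the link is saturated
```

In practice the usable number is lower still, since the same link also has to carry everything else the business does online.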
In reality, many use cases require access to terabytes of data, with sub-10ms latency and speeds of 10Gbps and up. Just as I’ve said that cloud services are more than ‘putting a file on a server’ – storage appliances are also a long way from those servers with a pair of hard drives and a tape drive for the nightly backup. These devices are faster, more robust, and able to integrate with other systems directly.
As just one example – when we ‘write’ a bit of data on our own systems here at DNS:
• The data is written to a queue in high-speed flash storage
• The data is then committed to two sets of hard disks on the storage appliance
• Every 15 minutes, the newly written data is mirrored across a network onto another storage appliance (which also commits to two sets of disks)
• All data is copied to a third off-site location on a nightly basis.
• Up to 25% of the stored data (the most commonly used) is held on flash storage.
• The data is scanned for errors regularly and the hardware is health checked.
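The flow of those first few steps can be sketched in miniature. This is purely illustrative: all class and function names below are hypothetical, and a real appliance implements these stages in firmware and the storage OS, not application code.

```python
# Illustrative sketch of the tiered write path described above:
# writes land in a flash queue, are committed to two disk sets,
# and are periodically replicated to a second appliance.

class StorageAppliance:
    def __init__(self):
        self.flash_queue = []        # 1. high-speed flash write queue
        self.disk_sets = [[], []]    # 2. two mirrored sets of hard disks

    def write(self, name, data):
        self.flash_queue.append((name, data))  # land in flash first
        self._commit()                         # then commit to both disk sets

    def _commit(self):
        while self.flash_queue:
            record = self.flash_queue.pop(0)
            for disk_set in self.disk_sets:
                disk_set.append(record)

def replicate(primary, secondary):
    # 3. every 15 minutes, mirror newly written data onto a second
    # appliance, which also holds two sets of disks.
    for disk_set in secondary.disk_sets:
        disk_set[:] = list(primary.disk_sets[0])

primary = StorageAppliance()
secondary = StorageAppliance()
primary.write("project.mov", b"...")
replicate(primary, secondary)
```

The nightly off-site copy, flash caching of hot data, and scrubbing for errors would sit on top of the same pattern: each extra copy or check is another consumer of the committed data.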
I’ve set cloud and local storage apart to some extent – in truth, there are a wealth of approaches to blending the two. In the example of the media production company, their on-site storage system could be configured to access cloud-based storage and synchronise in the background – allowing remote fast access for those working out and about.
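A minimal sketch of that background synchronisation idea, assuming a simple modified-since-last-sync policy (the function and data shapes here are hypothetical, not any particular product’s API):

```python
# Hypothetical background sync: push files changed since the last sync
# from on-site storage up to a cloud tier, so remote workers can fetch
# them without touching the office connection.

def sync_to_cloud(local_files, cloud_files, last_sync_time):
    """Copy locally modified files to the cloud tier.

    local_files: {name: (mtime, data)} on the on-site appliance
    cloud_files: {name: data} already held in cloud storage
    """
    for name, (mtime, data) in local_files.items():
        if mtime > last_sync_time:
            cloud_files[name] = data
    return cloud_files

local = {"cut_v2.mp4": (100, b"new edit"), "archive.mp4": (10, b"old")}
cloud = sync_to_cloud(local, {}, last_sync_time=50)
# Only the recently modified file is pushed up.
```

A real system would also handle deletions, conflicts, and partial transfers, but the core loop is this simple: watch for changes, then copy them in the background.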
Sometimes, it’s easy to forget how much goes on behind saving and opening a file! Hopefully, the result is that nobody notices.