This is our essential new series, written by data specialist Dr Marc M. Batschkus. Part 1: Solid Backup of Productions and Assets.
"Backups are one area where paranoia is prudent."
James Pond
Securing data means copying it. Copiare is the Latin word for copying by hand. Where books were once copied manually, today we have to take care of our files.
This way production data stays available even when something unforeseen happens.
One of the most overlooked factors is the cost of downtime. The potential loss of clients' trust also makes it worthwhile to plan ahead and build a robust data management procedure.
Two terms, backup and archive, turn up in this context. Their apparent similarity sometimes leads to misunderstandings. A quick glance at their differences makes things clearer.
The backup regularly and automatically secures the files of the ongoing production. It overwrites itself when the specified retention time is reached (e.g. after 3 months).
This retention time specifies how long one can go back in time to restore a lost or deleted file.
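The effect of a retention time can be sketched in a few lines of Python. This is only an illustration, not how any particular backup product works internally: the function name `split_by_retention` and the 90-day default (roughly the 3 months mentioned above) are assumptions made for the example.

```python
from datetime import datetime, timedelta


def split_by_retention(backup_times, now, retention_days=90):
    """Split backup run timestamps into (restorable, expired).

    A run is restorable while it is no older than the retention time;
    older runs are the ones a cyclic backup is allowed to overwrite.
    """
    cutoff = now - timedelta(days=retention_days)
    restorable = [t for t in backup_times if t >= cutoff]
    expired = [t for t in backup_times if t < cutoff]
    return restorable, expired


if __name__ == "__main__":
    # Hypothetical backup runs, checked on 30 June 2024.
    runs = [datetime(2024, 1, 15), datetime(2024, 4, 10), datetime(2024, 6, 29)]
    restorable, expired = split_by_retention(runs, now=datetime(2024, 6, 30))
    print("Can still restore from:", restorable)
    print("May be overwritten:", expired)
```

A file deleted before the oldest restorable run can no longer be recovered from the backup, which is one reason to choose the retention time deliberately.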
Only an automatic backup keeps running and protecting data even when the production schedule gets tight. The best time to run it is when the network and workstations are under a lower load.
In contrast, the archive is a data migration that moves finalized productions to long-term storage. The archive grows continuously and is the central reference for all completed projects and their assets.
As a positive side effect, the archive reduces the file count and the load on the production storage. This reduces the need to expand the production storage. It also reduces the runtime of the backup.
Backup | Archive
Duplication of data of ongoing production | Migration of finalized productions
Cyclic: overwriting itself when retention time is reached | Continuously growing
Automatic with schedule | Manual or Watch-Folder
Short to mid-term storage | Long-term storage
With backup there is one important and classic rule, the 3-2-1 backup rule. It means that each important file needs to exist in 3 copies on 2 different storage media, one of which needs to be offsite.
Only offsite storage protects against local incidents and disasters, and thus provides the highest level of security. Keeping the offsite copy offline as well protects against cyber threats. More on that later.
The 3-2-1 Backup Rule
3 copies of each important file, one primary and two secondary.
2 different storage media or technologies to protect against any technology-related incidents, bugs or failures.
1 copy stored offsite (e.g. outside of the office or studio) to protect against local incidents.
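The rule is simple enough to check mechanically. The following is a minimal sketch, assuming you keep an inventory of where each important file is stored; the `FileCopy` class, its fields and the function `check_3_2_1` are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileCopy:
    location: str  # free-text label, e.g. "production SAN" (illustrative)
    medium: str    # storage technology, e.g. "disk", "tape", "cloud"
    offsite: bool  # True if stored outside the office or studio


def check_3_2_1(copies):
    """Return a list of 3-2-1 rule violations; an empty list means compliant."""
    problems = []
    if len(copies) < 3:
        problems.append(f"only {len(copies)} copies, need at least 3")
    media = {c.medium for c in copies}
    if len(media) < 2:
        problems.append("all copies use the same storage medium, need at least 2")
    if not any(c.offsite for c in copies):
        problems.append("no offsite copy")
    return problems


if __name__ == "__main__":
    copies = [
        FileCopy("production SAN", "disk", offsite=False),
        FileCopy("backup server", "disk", offsite=False),
        FileCopy("tape vault", "tape", offsite=True),
    ]
    print(check_3_2_1(copies) or "3-2-1 rule satisfied")
```

Returning the list of violations rather than a plain yes/no makes it easier to see which part of the rule a given setup still misses.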
One special kind of backup is data availability or cloning. With time-critical data and workflows, the best solution is to replicate or clone the complete production storage. This helps to prevent downtime, especially if multiple people depend on the production storage (e.g. a SAN or NAS shared storage).
If something goes wrong and the main storage stops working, the secondary identical file system can take over and replace it within minutes; no restore is necessary.
One solution that offers backup as well as replication/data availability is P5 by Archiware. See the link at the end of the article for more details.
How long can you be without your data and what solution is the best protection?
Minutes | Data availability/replication (no restore necessary)
Hour(s) | Backup to disk, tape or cloud, cyclic with restore (and offsite storage)
Day(s) | Long-term archive with offsite storage
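The table can also be read as a simple decision rule. The sketch below encodes it in Python; the exact thresholds (one hour, one day) are assumptions chosen for illustration, since the table only distinguishes minutes, hours and days, and `protection_for` is a hypothetical helper name.

```python
from datetime import timedelta

REPLICATION = "data availability/replication (no restore necessary)"
BACKUP = "backup to disk, tape or cloud, cyclic with restore"
ARCHIVE = "long-term archive with offsite storage"


def protection_for(max_downtime):
    """Map the tolerable time without the data to a protection method."""
    if max_downtime < timedelta(hours=1):   # assumed boundary for "minutes"
        return REPLICATION
    if max_downtime < timedelta(days=1):    # assumed boundary for "hours"
        return BACKUP
    return ARCHIVE


if __name__ == "__main__":
    for limit in (timedelta(minutes=15), timedelta(hours=4), timedelta(days=3)):
        print(limit, "->", protection_for(limit))
```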
Segmenting data can be advantageous and offers options to protect each area with the most efficient method.
As a real-world example I will refer to the software suite Archiware P5, because it offers replication/cloning, backup and archive in one solution, so that all specifics can be demonstrated and solved. P5's flexibility to address disk, tape and cloud storage, as well as its ability to run on Synology, QNAP and NetGear devices, adds to the versatility. In terms of the table above, P5 Synchronize is the solution for replication/cloning, P5 Backup for the backup to disk, tape or cloud, and P5 Archive for the long-term archive.
A backup needs to be complete. This sounds obvious at first, but it can be harder to achieve in practice.
Beyond the important files, there are other things that need to be included: basically, everything that is needed to rebuild the setup in an emergency. This includes documentation of the setup, servers, workstations and storage as well as licenses, configurations etc.
Editing workstations might be loaded with plugins and tools, specific drivers and components that might be critical for the production.
All this needs to be part of the "backup", even if, in some cases, only in the form of paper.
In most cases, it is practical to have a backup run overnight. To confirm that it actually completed its run, it is important to check its messages and logs. Usually, reporting options allow a mail to be sent to the administrator automatically.
There are different types of backup:
Full backup: All files from the data source will be saved. Usually the very first run of any backup system is a full backup to reference against later.
Incremental backup: All files that have not been saved with the last full or incremental backup will be saved. That means all new or changed files.
Differential backup: All files that have not been saved by the previous full backup will be saved. Across multiple runs, the same files are saved repeatedly, which reduces efficiency.
Incremental (top) and differential (bottom) backup.
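The difference between the three types comes down to which earlier run a file is compared against. Here is a minimal sketch of that selection logic. It assumes changes are detected by comparing modification times, which is a simplification (real backup software may use other change detection), and `files_to_save` is a hypothetical name used only here.

```python
def files_to_save(mtimes, last_full, last_backup, mode):
    """Select the files a backup run would save.

    mtimes      -- dict mapping file path to modification time
    last_full   -- time of the previous full backup
    last_backup -- time of the most recent backup of any type
    mode        -- "full", "incremental" or "differential"
    """
    if mode == "full":
        return sorted(mtimes)              # everything, regardless of age
    if mode == "incremental":
        reference = last_backup            # compare against the latest run
    elif mode == "differential":
        reference = last_full              # compare against the last full run
    else:
        raise ValueError(f"unknown backup mode: {mode!r}")
    return sorted(path for path, mtime in mtimes.items() if mtime > reference)


if __name__ == "__main__":
    # Times are plain day numbers to keep the example readable.
    mtimes = {"a.mov": 1, "b.mov": 5, "c.mov": 9}
    for mode in ("full", "incremental", "differential"):
        print(mode, files_to_save(mtimes, last_full=3, last_backup=7, mode=mode))
```

With a full backup on day 3 and an incremental run on day 7, the next incremental run picks up only `c.mov`, while a differential run saves `b.mov` again as well.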
Checklist for professional backup
√ Automatic backup
√ Completeness of backup
√ 3-2-1 rule
√ Different storage technologies for maximum security (Disk+Tape, Disk+Cloud, SSD+Disk)
√ Reduce backup volume through archiving
√ Document the restore step-by-step
√ Check logs and error messages
√ Regular test of the restore
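The last checklist item, a regular restore test, can be partly automated. One possible approach, sketched below, is to restore into a scratch folder and compare checksums against the originals. This is a generic illustration and not a feature of any specific backup product; `verify_restore` and `sha256_of` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Return relative paths that are missing or differ in the restored tree."""
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        relative = src.relative_to(source_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(relative.as_posix())
    return problems
```

An empty result means every source file was found in the restored folder with identical content. The comparison is only meaningful if the source has not changed since the backup ran, so it fits best with a folder of finished, static test material.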
Links
The Archiware P5 platform offers archive, backup and cloning in one solution. Free 30-day trial available.
How to cover archive, backup and cloning with one solution.