Why not tarballs too large
From SciNet Users Documentation
The recommendation is to not generate tarballs larger than 500GB.
The simplest way to understand why is to watch this domino chain reaction video: the record is only valid if all the pieces fall in sequence, without interruption.
- The HPSS system has many moving parts (computer nodes, databases, network, switches, cables, disks, tape drives, robot arms in the library, etc.). A number of minor hiccups can happen in the pipeline as data is transferred from GPFS to HPSS, and later back from tape to GPFS.
- If one of these hiccups happens to a large tarball, the whole transfer is compromised and you will have to start it again from square one (that is, reassemble the whole domino sequence). The same type of hiccup may happen to a small tarball as well (just a few loops in the domino spiral), but the probability of it affecting one very large tarball is much higher, and so is the time wasted when you have to restart the transfer and run it to completion trouble-free.
- htar, for instance, does not have a built-in retry feature: it is not resilient to external problems, and it will not pick up a failed transfer from where it left off.
- Besides, our LTO7 tapes can only fit 6TB. It is easier to fit several 500GB files onto those tapes without wastage at the end than, for instance, a 3TB file. Although it is possible, we prefer not to split a single file across multiple tapes. By design, we do not stripe files over multiple tapes either.
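
The arguments above can be illustrated with a rough back-of-the-envelope model. This is only a sketch under stated assumptions, not a description of how HPSS behaves internally: it assumes each GB transferred independently hits a hiccup with a small probability `p_per_gb` (a made-up number, not a measured SciNet failure rate), that a failed transfer restarts from zero, and that files are never split across tapes. The function names and the 4000GB of free tape space are hypothetical, chosen for illustration.

```python
import math


def failure_probability(size_gb: float, p_per_gb: float) -> float:
    """Chance that at least one hiccup hits a single transfer of size_gb."""
    # 1 - (1 - p)^size, computed in a numerically stable way for small p.
    return -math.expm1(size_gb * math.log1p(-p_per_gb))


def expected_gb_moved(total_gb: float, tarball_gb: float, p_per_gb: float) -> float:
    """Expected GB transferred, restarts included, to archive total_gb
    as tarballs of tarball_gb each (no resume after a failure)."""
    n_tarballs = total_gb / tarball_gb
    # Expected number of per-GB trials needed to see S successes in a row:
    # ((1 - p)^(-S) - 1) / p, with S = tarball_gb.
    per_tarball = math.expm1(-tarball_gb * math.log1p(-p_per_gb)) / p_per_gb
    return n_tarballs * per_tarball


def unused_tape_gb(free_gb: int, file_gb: int) -> int:
    """Space left at the end of a tape when only whole files are written."""
    return free_gb % file_gb


if __name__ == "__main__":
    p = 1e-4  # ASSUMPTION: illustrative hiccup probability per GB
    for size in (500, 3000):
        print(f"{size}GB tarball: P(fail) = {failure_probability(size, p):.3f}, "
              f"expected GB moved for 3000GB = {expected_gb_moved(3000, size, p):.0f}")
    # Hypothetical tape with 4000GB still free.
    print("unused with 3000GB files:", unused_tape_gb(4000, 3000))
    print("unused with 500GB files:", unused_tape_gb(4000, 500))
```

Under this model a single large tarball is more likely to be hit by a hiccup than any one small tarball, and every failure costs more data to resend, so the expected amount of data moved grows with tarball size. The last two lines show the tape argument: a 3000GB file on a tape with 4000GB free leaves 1000GB unused, while 500GB files fill that space completely.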