Creating perpetual backups

Our backup process copies the FMS databases to a local disk first, as FMS suggests.

But FMS limits you to a maximum of 99 backups per schedule before overwrites occur, and we want to store databases going back years, and forward for many years, with no limit on how many backups we keep.

Our solution is a 12 TB cloud storage system, which we can still grow, and a separate copy task.

I wrote a Python program to do this external copy, but the problem (not just the code) gets complex very quickly once you consider concurrency, size, and timing issues, both on the FMS side (writing the backup files) and on the script-scheduling side. The backed-up databases will grow and multiply as more users are added to the system (more FMP12 backup files).

Currently, the FMS backup is over 18 GB for just the few databases we have. As this grows, or even doubles, the time needed both for FMS to write the backup and for us to 7-Zip it and copy it to cloud storage becomes unpredictable.

If FMS isn't done creating the backup file when the script runs, for example, the copied backup will not be correct. I can program around part of this, but the complexity does not end there.

Just wanted to see if there was a good way to handle this complicated backup situation and hear what others are doing.

Thanks,

FMS limits the max backups to keep per schedule to 99, right? So you could just create two backup schedules and offset them: have one run every M/W/F/S, and the other every T/T/S?

I'm slightly confused (not unusual :slight_smile: ) as to why a limit on local backups (99) affects your overall policy of indefinite backups. I use a shell (bash) script to grab specific backups (in my case it's usually just the midnight version, but sometimes others) and transfer them to cloud storage, using zstd to compress and rsync to transfer.

I don't compress image files (container stuff, because they don't compress much), but I do tarball them. Then, once everything is transferred to cloud storage (AWS in my case), we rotate to cheaper, slower-retrieval options such as Glacier to keep costs low on legacy versions.

You mention "the FMS backup file is over 18 GB": one client of mine has an ~87 GB file; we transfer the 00:00 backup offsite every morning at 1:30 AM (after other schedules are sure to be completed and traffic on the 'wire' is potentially lower).

So... the script grabs the DB files in the FMS_Backups folder from the local RAID storage, compresses them, and transfers them (rsync) to the destination directory. The external container storage is then grabbed from the RAID storage, tarballed, and transferred via rsync. I had started with secure copy (scp), but I wanted to retain certain metadata better and also have the "resume" option if uploads are interrupted.
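
For anyone who wants to adapt the idea, here is a minimal Python sketch of that workflow. It assumes the zstd, tar, and rsync command-line tools are installed; the paths and the remote destination are placeholders, not my actual setup.

import subprocess
from pathlib import Path

BACKUP_DIR = Path("/Volumes/RAID/FMS_Backups/Daily_0000")   # assumed local backup path
STAGING = Path("/Volumes/RAID/offsite_staging")             # assumed staging area
REMOTE = "backupuser@offsite.example.com:/backups/fms/"     # hypothetical destination

STAGING.mkdir(exist_ok=True)

# Compress each database file with zstd (multithreaded, original kept in place).
for db in BACKUP_DIR.glob("*.fmp12"):
    subprocess.run(["zstd", "-T0", "-q", "-f", str(db),
                    "-o", str(STAGING / (db.name + ".zst"))], check=True)

# Container data barely compresses, so just tarball it instead.
containers = BACKUP_DIR / "RC_Data_FMS"
if containers.exists():
    subprocess.run(["tar", "-cf", str(STAGING / "containers.tar"),
                    "-C", str(BACKUP_DIR), "RC_Data_FMS"], check=True)

# rsync preserves metadata and can resume an interrupted transfer (--partial).
subprocess.run(["rsync", "-a", "--partial", str(STAGING) + "/", REMOTE], check=True)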

Of course, there's logging and email notifications as needed. My point in this long, blathering note (as if I was in the lounge) is that maybe a pivot in workflow will ease your pain. Hand off from FMS (as you mentioned) and continue in your python/perl/shell scripts. We keep hourly, daily, weekly, monthly, progressive, plus clones, but rotate locally and have different policies for our offsite copies.

I would configure FileMaker to make only one full backup each night.

Then a shell script copies the backup to the NAS after FileMaker has made it, putting it in a folder with a timestamp. Make sure the script's credentials can only write the backup files and create a folder, with no permission to delete or overwrite files.

The NAS should then have a script that moves the new backup folder into the archive, perhaps deleting the oldest backup when disk space gets full.
The NAS should also have snapshots, so you can recover an old backup if someone deletes it.
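
A rough sketch of that append-only copy step, assuming a NAS mounted at /mnt/nas and the backup path below (both are placeholders); note that it only ever creates folders and copies files, never deletes or overwrites anything:

import shutil
import sys
from datetime import datetime
from pathlib import Path

SOURCE = Path("/opt/FileMaker/FileMaker Server/Data/Backups/Nightly")  # assumed FMS backup folder
NAS_ROOT = Path("/mnt/nas/fms_backups")                                # assumed NAS mount point

# One new folder per run, named by timestamp; refuse to reuse an existing one.
dest = NAS_ROOT / datetime.now().strftime("%Y-%m-%d_%H%M")
if dest.exists():
    sys.exit("Destination already exists - refusing to overwrite.")

shutil.copytree(SOURCE, dest)   # copy only; this script never deletes anything
print(f"Backup copied to {dest}")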

In general, make sure the credentials stored on the server can't do harm if a hacker finds them.

Thanks,

My concern is that it's unknown how long it will take FMS to make the ever-growing backup. I can code the Python script to watch for the FMS backup to stop increasing in size, but the copy to the cloud itself is very slow.

We are using pCloud, not a "NAS", in this case for the backup copies.

The script can wait until the newest file is 15 minutes old, then move all the files into a separate upload directory and take as long as it needs to upload them.
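
A minimal sketch of that idea, with assumed paths (the 15-minute threshold matches the suggestion above):

import shutil
import time
from pathlib import Path

BACKUP_DIR = Path("/opt/FileMaker/FileMaker Server/Data/Backups")  # assumed
UPLOAD_DIR = Path("/data/upload_queue")                            # assumed
MIN_AGE_SECONDS = 15 * 60

files = list(BACKUP_DIR.rglob("*.fmp12"))
newest_mtime = max(f.stat().st_mtime for f in files) if files else 0

# Only stage the files once nothing has been written for 15 minutes.
if files and time.time() - newest_mtime > MIN_AGE_SECONDS:
    UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
    for f in files:
        shutil.move(str(f), str(UPLOAD_DIR / f.name))
    # The slow upload to the cloud can now run against UPLOAD_DIR at its own pace.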

Yeah, the script already checks that the latest FMS backup isn't getting larger before it attempts to copy to pCloud. It still seems a bit slow to get the backup zipped and copied.
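
For what it's worth, the size-stability check can be as simple as polling the newest backup until its size stops changing; this is only a sketch with assumed paths and intervals, not the actual script:

import time
from pathlib import Path

BACKUP_DIR = Path("/opt/FileMaker/FileMaker Server/Data/Backups")  # assumed

def wait_until_stable(path: Path, interval: int = 60, checks: int = 3) -> None:
    """Block until `path` reports the same size `checks` times in a row."""
    last, stable = -1, 0
    while stable < checks:
        time.sleep(interval)
        size = path.stat().st_size
        stable = stable + 1 if size == last else 1
        last = size

# Find the most recently modified backup file and wait for it to settle.
newest = max(BACKUP_DIR.rglob("*.fmp12"), key=lambda p: p.stat().st_mtime)
wait_until_stable(newest)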

I haven't (yet) tested the case where the script runs again while there is also a new FMS backup file.

I currently have the script optionally deleting the FMS backup file too. That's probably not needed, since FMS will just overwrite the oldest backup automatically once you reach the max of 99.

Thanks Christian

I haven't tried this, but how about using an FMS Script Sequence:

  1. The pre-processing script could trigger the backup using the CLI backup command.
  2. The FileMaker script could do nothing.
  3. The post-processing script could copy the backup file to another destination.

I think this would ensure that each step completes before the next runs.
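
I haven't tested this either, but the two system-level scripts might look roughly like the sketch below. It's shown as one Python file for brevity; in a real Script Sequence they would be two separate scripts, and the fmsadmin credentials and paths are placeholders:

import shutil
import subprocess
from pathlib import Path

BACKUP_DEST = Path("/opt/FileMaker/FileMaker Server/Data/Backups/Scripted")  # assumed
CLOUD_STAGING = Path("/data/pcloud_staging")                                 # assumed

def preprocessing():
    """Step 1: ask FMS for a backup via the fmsadmin CLI and wait for it to finish."""
    subprocess.run(["fmsadmin", "backup", "-u", "admin", "-p", "PASSWORD",
                    "-d", str(BACKUP_DEST)], check=True)

def postprocessing():
    """Step 3: copy the newest finished backup somewhere the cloud sync can reach."""
    CLOUD_STAGING.mkdir(parents=True, exist_ok=True)
    newest = max(BACKUP_DEST.iterdir(), key=lambda p: p.stat().st_mtime)
    target = CLOUD_STAGING / newest.name
    if newest.is_dir():
        shutil.copytree(newest, target, dirs_exist_ok=True)
    else:
        shutil.copy2(newest, target)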

Thanks. Yep good points.

The Python script is multi-threaded and handles all of those concerns.

I ran some tests for output size and elapsed time, comparing gzip to zstd, and found that for FM files zstd was faster and the output was about 8% more compact (in my environment). You may like to run similar comparisons if you're concerned about such things.
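
If anyone wants to repeat the test, here is a quick-and-dirty sketch; it assumes the gzip and zstd command-line tools are installed, and SAMPLE is a placeholder path to one of your backup files:

import subprocess
import time
from pathlib import Path

SAMPLE = Path("/path/to/backup/MyDatabase.fmp12")   # placeholder

for tool, args, suffix in [("gzip", ["gzip", "-k", "-f"], ".gz"),
                           ("zstd", ["zstd", "-k", "-f", "-q", "-T0"], ".zst")]:
    start = time.time()
    subprocess.run(args + [str(SAMPLE)], check=True)
    elapsed = time.time() - start
    out = SAMPLE.with_name(SAMPLE.name + suffix)
    ratio = out.stat().st_size / SAMPLE.stat().st_size
    print(f"{tool}: {elapsed:.1f}s, compressed to {ratio:.1%} of original size")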

Costs are always a consideration, and they occur in many ways. The backup process has costs, so we can think about optimising it to reduce the amount of time and effort required to perform the job.

Reducing Time and Energy

FileMaker files containing text compress well; you can expect them to shrink to one third or one quarter of their size. Images, however, are already in a binary format, so (1) they don't reduce much in size, (2) compression routines will struggle with them without much success, and (3) that soaks up time, CPU, and energy. A good strategy is to put images into a separate file from the text (not always possible). You can then apply a strong compression ratio to the files containing text and not waste time, CPU, and energy trying to compress the images.

Reducing Storage Needs

Storage is cheap, but you may want to moderate the amount you need. As the archive of backups ages, how important is each backup? Everybody has different needs, so the answer is always different. When my clients don't have any special needs I suggest daily backups for the last month, monthly backups for the last year, and for everything older, one backup per quarter.

With that strategy, the 99-file limit enforced by FMS is sufficient to retain 14 years of backups:

99 - 31 = 68 // daily for last month
68 - 12 = 56 // one per month for last 12 months
56 / 4  = 14 // one per quarter of each year

What amount of storage is required? Assume the database grows 10 GB per annum. Here's a graph showing total storage requirements for my default backup strategy versus a keep-every-daily-backup strategy. Obviously, only keeping 4 backups per annum for previous years flattens the growth rate enormously.
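
For anyone who wants to reproduce the comparison behind the graph, this is roughly the calculation; the starting backup size and the 15-year horizon are assumptions for illustration:

# Compare cumulative storage for "keep every daily backup" versus the
# month/year/quarter retention policy described above.
GROWTH_PER_YEAR_GB = 10   # the database grows ~10 GB per annum
YEARS = 15                # horizon (assumed)

def backup_size_gb(year):
    """Approximate size of one backup in a given year (starting at 10 GB, assumed)."""
    return GROWTH_PER_YEAR_GB * (year + 1)

keep_all = 0.0    # every daily backup kept forever
retention = 0.0   # 31 dailies + 12 monthlies rolling, plus 4 archived per year

for year in range(YEARS):
    size = backup_size_gb(year)
    keep_all += size * 365   # 365 new backups retained every year
    retention += size * 4    # only 4 quarterly backups survive long term

# The rolling window (31 daily + 12 monthly) only ever holds recent backups.
retention += backup_size_gb(YEARS - 1) * (31 + 12)

print(f"Keep everything:   {keep_all / 1000:.1f} TB after {YEARS} years")
print(f"Quarterly archive: {retention / 1000:.1f} TB after {YEARS} years")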

Our 18 GB FileMaker databases compress to 1.6 GB using 7-Zip with our multi-threaded Python script.

If this database size doesn’t grow, just as a point of reference, then…

12 TiB is the current disk space on pCloud

1.6 GiB is the size of the current backups.

365 is the normalized number of days per year (that is, ignoring leap years).

So,

12 TiB / 1.6 GiB ≈ 7,680 daily backups, and 7,680 / 365 ≈ 21 years of daily backups.

Now, if we assume the backup size increases 20% per year, we should still get 9 years of backups!

Here’s a little Python program to calculate this:

# Constants
S0 = 1.6  # Initial backup size per day in GiB
r = 0.2  # Yearly increase rate (20%)
total_space_GiB = 12000  # Total disk space in GiB (12 TB)
days_in_year = 365  # Assuming 365 days per year

# Initialize variables
cumulative_space_GiB = 0
years = 0

# Loop until cumulative space exceeds total space
while cumulative_space_GiB < total_space_GiB:
    # Calculate the daily backup size for this year
    daily_backup_size = S0 * (1 + r) ** years
    # Calculate the space used this year
    yearly_backup_size = daily_backup_size * days_in_year
    # Add to cumulative space
    cumulative_space_GiB += yearly_backup_size
    years += 1

print(f"12 TB will last for {years} years.")

The good news is we can increase our storage on pCloud to at least 14 TB.

I noticed that after adding encryption at rest (EAR) to the FMP12 files, they are much less compressible. Our backup that compressed to 1.6 GB with 7-Zip now comes out at over 10 GB.

Great. :frowning:

Yes, this is very true. I did compression comparison tests and found that encrypted DBs compress only about 30-35% instead of the 70+% (using zstd) I get with non-encrypted files. It's a known issue that I've read about elsewhere.