Incrementally Copying (Rsyncing) Files From A Kubernetes Pod

The most obvious choice for moving files in and out of containers is kubectl cp, but it does a straight copy of every byte. If you want to back up just a few changed bytes out of several hundred gigabytes, you will want an incremental transfer. This not only saves time but also avoids problems like exhausting EFS burst credits.

Keep It Simple

When approaching any problem, a good place to start is from something that works; in our case that's kubectl cp. A little digging revealed that kubectl cp is simply a tar pipe (roughly kubectl exec pod -- tar cf - /path | tar xf -). After seeing this I recalled that tar can do 'incremental' backups. I won't try to explain how that works here, as there are many posts on the subject. The short version is that tar can keep track of files that have been added, deleted and modified since the last time it ran.

The tar command to create an incremental backup looks like this:

tar -C /precious/files --create --listed-incremental=/path/to/backupIndex -vv --file=backup.tar .
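To see what the snapshot file buys you, here is a small local experiment that needs no cluster. It is only a sketch, assuming GNU tar; the scratch directory and the file names (modified.txt, untouched.txt, level0.tar, level1.tar) are made up for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Scratch area; every path below is a placeholder for illustration.
work=$(mktemp -d)
mkdir "$work/files"
echo "v1" > "$work/files/modified.txt"
echo "v1" > "$work/files/untouched.txt"

# First run: backupIndex does not exist yet, so this is a full dump.
tar -C "$work/files" --create --listed-incremental="$work/backupIndex" \
    --file="$work/level0.tar" .

# Wait so the next change gets a clearly newer time stamp, then edit one file.
sleep 1
echo "v2" > "$work/files/modified.txt"

# Second run: same backupIndex, so only changes since the first run are archived.
tar -C "$work/files" --create --listed-incremental="$work/backupIndex" \
    --file="$work/level1.tar" .

level0_listing=$(tar --list --file="$work/level0.tar")
level1_listing=$(tar --list --file="$work/level1.tar")
echo "level 0:"; echo "$level0_listing"
echo "level 1:"; echo "$level1_listing"
```

The first listing should name both files, while the second should name ./modified.txt but not ./untouched.txt.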

Now if we sprinkle in a little stdout magic and some pipes we can transfer only changed files:

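kubectl exec <pod_id> -- tar -C /precious/files --create --listed-incremental=/path/to/backupIndex -vv --file=- . | tar -C /precious/backups --extract --listed-incremental=/dev/null -v --file=-

Note that there is no -t on kubectl exec, since a TTY can mangle the binary tar stream, and that the receiving tar uses -C to pick the destination directory. Extracting with --listed-incremental=/dev/null is how GNU tar is told to restore an incremental archive, including removing files that were deleted at the source.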

The first time this runs it transfers everything, because there is no previous run to compare against (the backupIndex does not exist yet). On every run thereafter only changed files are copied.
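If you run this regularly it may be worth wrapping the pipe in a function. The following is a rough sketch rather than a finished tool: pull_incremental, POD_EXEC and the my-pod default are hypothetical names, and it assumes GNU tar both in the container and locally.

```shell
#!/usr/bin/env bash
set -eu
set -o pipefail   # fail if the tar inside the pod fails, not only the local one

# Command prefix that runs tar "inside the pod" (my-pod is a placeholder).
# Set POD_EXEC=(env) to rehearse the pipeline against local directories.
POD_EXEC=(kubectl exec "${POD_ID:-my-pod}" --)

# pull_incremental SRC_DIR INDEX_FILE DEST_DIR
#   SRC_DIR and INDEX_FILE are paths inside the pod, DEST_DIR is local.
pull_incremental() {
    local src=$1 index=$2 dest=$3
    mkdir -p "$dest"
    "${POD_EXEC[@]}" tar -C "$src" --create \
        --listed-incremental="$index" --file=- . \
      | tar -C "$dest" --extract --listed-incremental=/dev/null --file=-
}
```

With the paths from above it would be called as pull_incremental /precious/files /path/to/backupIndex /precious/backups.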

Caveats & Warnings

  • The backupIndex file is how tar knows what the last state was, so it needs to live on persistent storage or be copied into the container before each run. If it is missing, everything will be transferred.
  • This strategy also copies deletes, provided the receiving tar extracts in incremental mode (--listed-incremental=/dev/null). It is effectively the same as rsync's --delete argument.
  • From the tar manual: "Incremental dumps depend crucially on time stamps". Checksum or hash comparisons are not possible with tar; if you want those, use rsync.
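The delete behaviour is easy to check locally before trusting it with real data. This is a minimal sketch, assuming GNU tar; sync_once and the file names are invented for the example, and the local pipe stands in for the kubectl exec one.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo keep > "$work/src/keep.txt"
echo gone > "$work/src/doomed.txt"

# One create | extract round trip, standing in for the kubectl exec pipe.
sync_once() {
    tar -C "$work/src" --create --listed-incremental="$work/backupIndex" --file=- . \
      | tar -C "$work/dest" --extract --listed-incremental=/dev/null --file=-
}

sync_once                    # first run copies both files
sleep 1
rm "$work/src/doomed.txt"    # delete at the source
sync_once                    # second run should remove it from dest as well

ls "$work/dest"
```

After the second run, doomed.txt should be gone from the destination. Extracting without --listed-incremental would still unpack the archive but leave the deleted file behind.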
