Backing Up tar Archives over ssh
OpenSSH (www.openssh.org) provides tools for secure remote login, remote execution, and remote file copy over a network. Once two machines are set up for public-key authentication, you can transfer files between them without entering a password for each transmission. That lets you write scripts that back up data from an SSH client to an SSH server without manual intervention.
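One common way to set up that key sharing can be sketched as follows. The helper name make_backup_key, the default key path, and the empty passphrase are assumptions made for illustration. An empty passphrase suits unattended scripts, but anyone who can read the private key file can then log in as you, so some setups prefer a passphrase held by ssh-agent.

```shell
# Sketch: create a dedicated key pair for backup scripts.
# make_backup_key and the default path are hypothetical, not part of OpenSSH.
make_backup_key() {
    keyfile=${1:-$HOME/.ssh/backup_ed25519}
    # Keep an existing key rather than overwrite it.
    [ -f "$keyfile" ] && return 0
    mkdir -p "$(dirname "$keyfile")" || return 1
    # -t ed25519: key type; -N '': empty passphrase; -f: output file; -q: quiet.
    ssh-keygen -q -t ed25519 -N '' -f "$keyfile"
}
```

After running make_backup_key, a command such as ssh-copy-id -i ~/.ssh/backup_ed25519.pub chris@server1 installs the public key on the server (you type the password one last time). From then on, ssh -i ~/.ssh/backup_ed25519 chris@server1 should log in without prompting.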
From a central Linux system, you can gather backups from multiple client machines using OpenSSH commands. The following example runs the tar command on the remote system to archive the files, writes the tar stream to standard output, and pipes that stream over ssh into a local tar command that extracts the files:
$ mkdir mybackup ; cd mybackup
$ ssh chris@server1 'tar cf - myfile*' | tar xvf -
chris@server1's password: ******
myfile1
myfile2
In the example just shown, all files beginning with myfile are copied from the home directory of chris on server1 and placed in the current directory. Note that the left side of the pipe creates the archive and the right side expands the files from the archive to the current directory. (Keep in mind that tar will overwrite local files of the same name if they exist, which is why you created an empty directory in the example.)
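The pull-and-extract pattern can be wrapped in a small function, sketched here. The function name pull_tar and the SSH_CMD variable are hypothetical. SSH_CMD defaults to ssh and exists only so the function can be tried out with a local stand-in before pointing it at a real server.

```shell
# Sketch: pull files matching a pattern from a remote home directory
# into a local directory. pull_tar and SSH_CMD are hypothetical names.
pull_tar() {
    remote=$1      # for example chris@server1
    pattern=$2     # for example 'myfile*' (expanded by the remote shell)
    dest=$3        # local directory to extract into
    mkdir -p "$dest" || return 1
    # Left side: the remote tar writes the archive to standard output.
    # Right side: the local tar reads it from standard input and extracts into $dest.
    ${SSH_CMD:-ssh} "$remote" "tar cf - $pattern" | tar xf - -C "$dest"
}
```

A call such as pull_tar chris@server1 'myfile*' mybackup does roughly what the example above does, without changing your current directory. Quoting the pattern matters: it keeps the local shell from expanding myfile* before ssh sends it.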
To reverse the process and copy files from the local system to the remote system, you can run a local tar command first. This example, however, adds a cd command to put the files in the /home/chris/myfolder directory on the remote machine:
$ tar cf - myfile* | ssh chris@server1 \
    'cd /home/chris/myfolder; tar xvf -'
chris@server1's password: ******
myfile1
myfile2
In this next example, you’re not going to untar the files on the receiving end, but instead write the results to tgz files:
$ ssh chris@server1 'tar czf - myfile*' | cat > myfiles.tgz
$ tar cvzf - myfile* | ssh chris@server1 'cat > myfiles.tgz'
The first example takes all files beginning with myfile from the chris user’s home directory on server1, tars and compresses those files, and directs those compressed files to the myfiles.tgz file on the local system. The second example does the reverse by taking all files beginning with myfile in the local directory and sending them to a myfiles.tgz file on the remote system.
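Because a tgz file written this way is never unpacked during the transfer, it can be worth checking it afterward. The following is a minimal sketch. The helper name check_tgz is hypothetical, and the check only confirms that the gzip stream and the tar listing are readable, not that the contents match the originals.

```shell
# Sketch: sanity-check a compressed tar archive after a transfer.
# check_tgz is a hypothetical helper name.
check_tgz() {
    archive=$1
    # gzip -t tests the integrity of the compressed stream without extracting it.
    gzip -t "$archive" || return 1
    # tar tzf lists the archive members; a failure means the tar stream is damaged.
    tar tzf "$archive" > /dev/null || return 1
    echo "OK: $archive"
}
```

For the first example, you would run check_tgz myfiles.tgz locally. For the second, where the file lands on the remote system, a command such as ssh chris@server1 'gzip -t myfiles.tgz' performs the first half of the check there.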
The examples just shown are good for copying files over the network. In addition to providing compression, they also enable you to use any
Backing Up Files with rsync
A more feature-rich command for doing backups is rsync. What sets rsync apart is the rsync algorithm, which compares the local and remote files one small block at a time using checksums and transfers only the blocks that differ. This algorithm is so efficient that it has been reused in many backup products.
The rsync command can work either on top of a remote shell (ssh) or by running an rsyncd daemon on the server end. The following example uses rsync over ssh to mirror a directory:
$ rsync -avz --delete chris@server1:/home/chris/pics/ chrispics/
The command just shown mirrors the remote directory structure (/home/chris/pics/) on the local system. The -a option runs rsync in archive mode (recursively copying all files from the remote directory while preserving permissions, timestamps, and symbolic links), the -z option compresses the data during the transfer, and -v makes the output verbose. The --delete option tells rsync to delete any files on the local system that no longer exist on the remote system.
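Because --delete removes local files, a dry run before the real mirror can be reassuring. The sketch below uses rsync's -n (--dry-run) and -i (--itemize-changes) options. The function name preview_mirror is hypothetical.

```shell
# Sketch: preview a mirror before letting --delete remove anything.
# preview_mirror is a hypothetical helper; src may be local or user@host:path.
preview_mirror() {
    src=$1
    dest=$2
    # -n (--dry-run) reports what would change without touching $dest.
    # -i (--itemize-changes) prints one line per change; files that
    # --delete would remove are listed with a "*deleting" prefix.
    rsync -ain --delete "$src" "$dest"
}
```

For the example above, preview_mirror chris@server1:/home/chris/pics/ chrispics/ lists the pending changes. The trailing slash on the source matters: with it, rsync copies the contents of pics into chrispics; without it, rsync creates chrispics/pics.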
For ongoing backups, you can have rsync do seven-day incremental backups. Here’s an example:
# mkdir /var/backups
# rsync --delete --backup \
    --backup-dir=/var/backups/backup-`date +%A` \
    -avz chris@server1:/home/chris/Personal/ \
    /var/backups/current-backup/
When the command just shown runs, all the files from /home/chris/Personal on the remote system server1 are mirrored to the local directory /var/backups/current-backup. Because of the --backup and --backup-dir options, the previous version of any file that this run replaces or deletes is moved to a directory named after the current day of the week, such as /var/backups/backup-Monday, instead of being discarded. Over a week, seven directories are created, each holding the old versions displaced by that day's run.
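For a nightly job, the command can be wrapped as sketched below. The variable names BASE and REMOTE and both function names are assumptions. The rm -rf step is an addition that is not part of the command above: it empties the weekday directory before each run so that, in the second week, backup-Monday does not mix this Monday's files with last Monday's. It assumes one run per day.

```shell
# Sketch of a nightly wrapper. BASE, REMOTE, and the function names are hypothetical.
BASE=${BASE:-/var/backups}
REMOTE=${REMOTE:-chris@server1:/home/chris/Personal/}

# Print today's backup directory, for example /var/backups/backup-Monday.
# LC_ALL=C keeps the weekday name in English regardless of the system locale.
weekday_backup_dir() {
    echo "$BASE/backup-$(LC_ALL=C date +%A)"
}

run_incremental() {
    dir=$(weekday_backup_dir)
    # Assumption: one run per day, so anything already here is a week old.
    rm -rf "$dir"
    rsync --delete --backup --backup-dir="$dir" \
        -az "$REMOTE" "$BASE/current-backup/"
}
```

Calling run_incremental from a root cron entry once a day would then keep a rolling seven-day set without manual cleanup.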
Another trick for rotated backups is to use hard links instead of multiple copies of the files. This two-step process consists of rotating the files, and then running rsync:
# rm -rf /var/backups/backup-old/
# mv /var/backups/backup-current/ /var/backups/backup-old/
# rsync --delete --link-dest=/var/backups/backup-old -avz \
    chris@server1:/home/chris/Personal/ /var/backups/backup-current/
In the previous procedure, the existing backup-current directory replaces the backup-old directory, so the two-week-old full backup is replaced by last week's full backup. When the new full backup is run with rsync using the --link-dest option, any file from the remote Personal directory on server1 that is unchanged since the previous backup (now in backup-old) is not transferred again. Instead, a hard link to the copy in backup-old is created in the backup-current directory.
You can save a lot of space by having hard links between files in your backup-old and backup-current directories. For example, if you had a file named file1.txt in both directories, you could check that both were the same physical file by listing the files' inodes as follows:
$ ls -i /var/backups/backup*/file1.txt
260761 /var/backups/backup-current/file1.txt
260761 /var/backups/backup-old/file1.txt
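The same inode comparison can be scripted, as in this minimal sketch. The function name same_file is hypothetical, and the check assumes both paths are on the same filesystem, because inode numbers are only unique within one filesystem.

```shell
# Sketch: succeed if two paths are hard links to the same file.
# same_file is a hypothetical helper; it assumes both paths share a filesystem.
same_file() {
    # ls -di prints "inode path"; awk keeps only the inode number.
    a=$(ls -di "$1" | awk '{print $1}')
    b=$(ls -di "$2" | awk '{print $1}')
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, same_file /var/backups/backup-current/file1.txt /var/backups/backup-old/file1.txt && echo linked prints linked when rsync reused the old copy instead of transferring a new one.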