Rsync
rsync is a computer program that synchronises files and directories from one location to another while minimising data transfer, using delta encoding when appropriate. An important feature of rsync, not found in most similar programs and protocols, is that the mirroring takes place with only one transmission in each direction.
rsync can list or copy directory contents and copy individual files, optionally using compression and recursion.
When run as a daemon, rsync listens on TCP port 873 by default.
Algorithm
The rsync utility uses an algorithm (invented by Australian computer programmer Andrew Tridgell) for efficiently transmitting a structure (such as a file) across a communications link when the receiving computer already has a different version of the same structure.
The recipient splits its copy of the file into fixed-size non-overlapping chunks, say of size S, and computes two checksums for each chunk: the MD4 hash, and a weaker 'rolling checksum'. It sends these checksums to the sender.
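The recipient's step can be sketched in Python as follows. This is a minimal illustration, not rsync's actual code: the function names are hypothetical, MD5 stands in for MD4 (which many current Python builds no longer provide), and the weak checksum is assumed to be the two-part 16-bit sum described in the rsync technical report.

```python
import hashlib

M = 1 << 16  # both halves of the weak checksum are kept modulo 2^16


def weak_checksum(block: bytes) -> int:
    """Weak checksum of one block: a plain byte sum and a
    position-weighted byte sum, packed into a single 32-bit value."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return (b << 16) | a


def block_signatures(data: bytes, size: int):
    """Split data into fixed-size, non-overlapping blocks and return
    one (block_index, weak, strong) tuple per block."""
    signatures = []
    for index, start in enumerate(range(0, len(data), size)):
        block = data[start:start + size]
        # Assumption: MD5 is used here only as a stand-in for MD4.
        strong = hashlib.md5(block).hexdigest()
        signatures.append((index, weak_checksum(block), strong))
    return signatures
```

Calling block_signatures(b"abcdefgh", 4) yields two entries, one for b"abcd" and one for b"efgh"; this list is what the recipient would send to the sender.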
The sender computes the rolling checksum for every chunk of size S in its own version of the file, including overlapping chunks. It can do this efficiently because of a special property of the rolling checksum: if the rolling checksum of bytes n through n+S-1 is R, the rolling checksum of bytes n+1 through n+S can be computed from R, byte n, and byte n+S alone, without examining the intervening bytes. For example, once the rolling checksum of bytes 1-25 is known, the rolling checksum of bytes 2-26 can be calculated solely from that checksum and from bytes 1 and 26.
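The rolling property can be illustrated with a short Python sketch. It assumes the weak checksum is a pair of 16-bit sums (a plain byte sum and a position-weighted byte sum), as in the rsync technical report; the names weak_parts and roll are hypothetical.

```python
M = 1 << 16  # both sums are kept modulo 2^16


def weak_parts(block: bytes):
    """Compute the two sums from scratch for one block."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, size: int):
    """Slide the window one byte to the right using only the old sums,
    the byte leaving the window and the byte entering it."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - size * out_byte + a_new) % M
    return a_new, b_new
```

Mirroring the example above with S = 25: given data of at least 26 bytes, roll(*weak_parts(data[0:25]), data[0], data[25], 25) gives the same pair as weak_parts(data[1:26]), but in constant time rather than time proportional to S.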
The sender then compares its rolling checksums with the set sent by the recipient to determine if any matches exist. If they do, it verifies the match by computing the MD4 checksum for the matching block and by comparing it with the MD4 checksum sent by the recipient.
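A sketch of this two-level comparison in Python is given below. It is an illustration under stated assumptions rather than rsync's implementation: the helper names are hypothetical, MD5 stands in for MD4, and the weak checksum is the two-sum form from the rsync technical report.

```python
import hashlib

M = 1 << 16


def weak(block: bytes):
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def strong(block: bytes) -> bytes:
    return hashlib.md5(block).digest()  # assumption: stand-in for MD4


def signatures(recipient_data: bytes, size: int):
    """Recipient side: map weak checksum -> [(strong checksum, block index)]."""
    table = {}
    for index, start in enumerate(range(0, len(recipient_data), size)):
        block = recipient_data[start:start + size]
        table.setdefault(weak(block), []).append((strong(block), index))
    return table


def find_matches(sender_data: bytes, table, size: int):
    """Sender side: slide a window over every offset, look up the weak
    checksum, and confirm candidates with the strong checksum."""
    matches = []
    if len(sender_data) < size:
        return matches
    a, b = weak(sender_data[:size])
    offset = 0
    while True:
        for strong_sum, index in table.get((a, b), []):
            # The weak checksum can collide, so verify before accepting.
            if strong(sender_data[offset:offset + size]) == strong_sum:
                matches.append((offset, index))
                break
        if offset + size >= len(sender_data):
            break
        out_byte, in_byte = sender_data[offset], sender_data[offset + size]
        a = (a - out_byte + in_byte) % M       # roll the window by one byte
        b = (b - size * out_byte + a) % M
        offset += 1
    return matches
```

For instance, with recipient data b"abcdefghijkl", sender data b"XXefghYabcd" and a block size of 4, find_matches reports that the sender's bytes at offset 2 match recipient block 1 and those at offset 7 match recipient block 0. The strong checksum is computed only when the cheap weak checksum already agrees.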
The sender then sends the recipient those parts of its file that did not match any of the recipient's blocks, along with assembly instructions on how to merge these parts into the recipient's version to create a file identical to the sender's copy.
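The final exchange might look like the following Python sketch. The instruction format ("copy" a block index or insert literal "data") and all function names are assumptions made for illustration, MD5 again stands in for MD4, and for brevity the weak checksum is recomputed at each offset instead of being rolled.

```python
import hashlib

M = 1 << 16


def weak(block: bytes):
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b


def strong(block: bytes) -> bytes:
    return hashlib.md5(block).digest()  # assumption: stand-in for MD4


def signatures(old: bytes, size: int):
    """Recipient side: checksums of each full-size block of its copy."""
    table = {}
    for index, start in enumerate(range(0, len(old) - size + 1, size)):
        block = old[start:start + size]
        table.setdefault(weak(block), []).append((strong(block), index))
    return table


def make_delta(new: bytes, table, size: int):
    """Sender side: emit ('copy', block_index) for matched blocks and
    ('data', literal_bytes) for everything that matched nothing."""
    delta, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + size]
        hit = None
        if len(window) == size:
            for strong_sum, index in table.get(weak(window), []):
                if strong(window) == strong_sum:
                    hit = index
                    break
        if hit is None:
            literal.append(new[pos])  # unmatched byte is sent literally
            pos += 1
        else:
            if literal:
                delta.append(("data", bytes(literal)))
                literal = bytearray()
            delta.append(("copy", hit))
            pos += size
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta, size: int) -> bytes:
    """Recipient side: rebuild the sender's file from its own blocks
    plus the literal data it received."""
    out = bytearray()
    for kind, value in delta:
        out += old[value * size:(value + 1) * size] if kind == "copy" else value
    return bytes(out)
```

Given the recipient's copy old and the sender's copy new, apply_delta(old, make_delta(new, signatures(old, size), size), size) reproduces new, while only the "data" entries carry file content over the link.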
If the sender's and recipient's versions of the file have many sections in common, the utility needs to transfer relatively little data to synchronise the files.
External links
- rsync homepage (http://rsync.samba.org)
- A useful tutorial (http://everythinglinux.org/rsync/)
- rsync algorithm (http://rsync.samba.org/tech_report/node2.html)
- Xdelta (http://xdelta.org/) - alternative implementation of file differencing and delta encoding
- A backup utility utilising rsync (http://erlang.no/rsyncbackup)
- rsnapshot: A snapshot backup utility utilising rsync (http://www.rsnapshot.org/)