File transfer speed over ssh

I’ve long known that ssh encryption affects the speed of file transfers. Running rsync over ssh, or even plain scp, can be pretty darn slow, especially with large files and on systems with an old or slow CPU.

I also know the common recommendation to pick a different cipher when transferring files. Some people recommend blowfish, others arcfour. So I thought I’d do a little testing in a controlled environment.

I have two recent vintage HP servers with the following specs.

HP ProLiant DL360p Gen8
Dual quad core Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (8 core, 16 threads total)
64G RAM
4 x 3TB, mdadm RAID10, formatted as XFS, mounted noatime,logbufs=8
Tigon ethernet NIC, connected as GigE, full duplex to HP ProCurve 2848 switch
(both servers connected to same switch)

The test file is:
3921247501 Mar 4 08:22 bigdata.tar.bz2 (about 3.9 GB, or 3.65 GiB)

I am using OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010
Kernel is 3.8.1-1.el6.elrepo.x86_64 #1 SMP Thu Feb 28 19:15:22 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

I am going to copy this file from hp1 to hp2 using scp, rsync and FTP. With scp, I’ll try different ciphers with compression turned off, to see how the choice of cipher affects the transfer. For comparison, I also timed a plain old FTP transfer, which means no encryption and very little system processing, and the timings bear that out. I also tested the plain rsync protocol (direct to rsyncd).

I ran each test 3 times. When no cipher is specified, ssh/scp uses its default, which depends on the version of OpenSSH (for this version, the default is aes128-ctr). NOTE: the file is rm’ed at the destination before each copy.

run  Xfer type                                                            real       user       system
1    scp -o Compression=no                                                0m52.175s  0m12.709s  0m6.504s
2    scp -o Compression=no                                                0m47.872s  0m12.603s  0m6.806s
3    scp -o Compression=no                                                0m49.317s  0m12.748s  0m6.710s
1    scp -c arcfour -o Compression=no                                     0m49.536s  0m14.161s  0m6.903s
2    scp -c arcfour -o Compression=no                                     0m49.088s  0m14.045s  0m6.921s
3    scp -c arcfour -o Compression=no                                     0m50.698s  0m14.162s  0m6.728s
1    scp -c blowfish-cbc -o Compression=no                                0m58.673s  0m44.295s  0m13.495s
2    scp -c blowfish-cbc -o Compression=no                                0m56.399s  0m43.860s  0m9.036s
3    scp -c blowfish-cbc -o Compression=no                                0m54.869s  0m43.949s  0m10.673s
1    scp -c aes128-cbc -o Compression=no                                  0m49.776s  0m14.641s  0m7.083s
2    scp -c aes128-cbc -o Compression=no                                  0m48.527s  0m15.154s  0m7.068s
3    scp -c aes128-cbc -o Compression=no                                  0m50.554s  0m15.334s  0m6.983s
1    ncftpput -m -u ftptest -p 'XXXXXX' hp2 /data/ /data/bigdata.tar.bz2  0m34.306s  0m0.141s   0m4.062s
2    ncftpput -m -u ftptest -p 'XXXXXX' hp2 /data/ /data/bigdata.tar.bz2  0m33.351s  0m0.160s   0m3.863s
3    ncftpput -m -u ftptest -p 'XXXXXX' hp2 /data/ /data/bigdata.tar.bz2  0m33.839s  0m0.154s   0m3.732s
1    rsync --stats -a /data/bigdata.tar.bz2 hp2::data/bigdata.tar.bz2.1  0m33.485s  0m10.221s  0m6.692s
2    rsync --stats -a /data/bigdata.tar.bz2 hp2::data/bigdata.tar.bz2.2  0m33.490s  0m10.234s  0m6.703s
3    rsync --stats -a /data/bigdata.tar.bz2 hp2::data/bigdata.tar.bz2.3  0m33.497s  0m10.163s  0m6.545s
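
For anyone who wants to repeat the scp runs, a small harness along these lines might do. This is a minimal Python sketch, not the script used for the numbers above. The function names are made up for illustration, the cleanup step assumes key-based ssh login to the destination, and newer OpenSSH releases have dropped arcfour and blowfish-cbc, so the cipher list may need trimming.

```python
import os
import subprocess
import time

# Cipher names as passed to `scp -c`; None means "let scp use its default".
CIPHERS = [None, "arcfour", "blowfish-cbc", "aes128-cbc"]


def scp_command(cipher, src, dest):
    """Build the scp argument list for one run, with compression turned off."""
    cmd = ["scp"]
    if cipher is not None:
        cmd += ["-c", cipher]
    cmd += ["-o", "Compression=no", src, dest]
    return cmd


def time_command(argv):
    """Run argv and return wall-clock, user and system seconds, like `time`."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = os.times()
    return {
        "real": real,
        "user": after.children_user - before.children_user,
        "sys": after.children_system - before.children_system,
    }


def benchmark(src, dest_host, dest_path, runs=3):
    """Time every cipher `runs` times, removing the remote copy before each run."""
    results = {}
    for cipher in CIPHERS:
        label = cipher or "default"
        results[label] = []
        for _ in range(runs):
            # Hypothetical cleanup step: assumes passwordless ssh to dest_host.
            subprocess.run(["ssh", dest_host, "rm", "-f", dest_path], check=True)
            cmd = scp_command(cipher, src, f"{dest_host}:{dest_path}")
            results[label].append(time_command(cmd))
    return results
```

Calling `benchmark("/data/bigdata.tar.bz2", "hp2", "/data/bigdata.tar.bz2")` would return a list of real/user/sys timings per cipher. Note that the user and system figures cover only the local scp process and its children, the same as `time` does.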

In terms of speed, we have:

Average over 3 runs (seconds)

RSYNC:         real=33.491  user=10.206  sys=6.647
FTP:           real=33.832  user=0.152   sys=3.886
AES128-CBC:    real=49.619  user=15.043  sys=7.045
ARCFOUR:       real=49.774  user=14.123  sys=6.851
AES128-CTR:    real=49.788  user=12.687  sys=6.673
BLOWFISH-CBC:  real=56.647  user=44.035  sys=11.068
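
The averages are easy to re-derive, and dividing the file size by the elapsed time turns them into throughput. The sketch below is only an illustration in Python: the helper names are made up, and it uses the file size and the "real" times from the runs listed earlier.

```python
import re

FILE_SIZE_BYTES = 3921247501  # size of bigdata.tar.bz2 from the listing above


def parse_time(text):
    """Convert a `time` value such as '0m52.175s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def mean_seconds(times):
    """Average a list of `time` strings, in seconds."""
    seconds = [parse_time(t) for t in times]
    return sum(seconds) / len(seconds)


def throughput_mb_per_s(seconds, size_bytes=FILE_SIZE_BYTES):
    """Throughput in decimal megabytes (10**6 bytes) per second."""
    return size_bytes / seconds / 1e6


# "real" times of the three default-cipher (aes128-ctr) scp runs
default_scp = ["0m52.175s", "0m47.872s", "0m49.317s"]
avg = mean_seconds(default_scp)
print(f"aes128-ctr: {avg:.3f} s, {throughput_mb_per_s(avg):.1f} MB/s")

# "real" times of the three rsync daemon runs
rsyncd = ["0m33.485s", "0m33.490s", "0m33.497s"]
avg = mean_seconds(rsyncd)
print(f"rsyncd:     {avg:.3f} s, {throughput_mb_per_s(avg):.1f} MB/s")
```

By this arithmetic the default scp run moves roughly 79 MB/s, while rsyncd moves roughly 117 MB/s, which is close to what a gigabit link can carry.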

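So it looks like with a modern OpenSSH on CPUs like these, it’s a wash between AES (CTR or CBC) and arcfour: all three finish within about 0.2 seconds of each other. Blowfish is the only clear loser, about 7 seconds slower and with roughly three times the user CPU time.

Note that the rsync protocol itself is pretty darn efficient: slightly faster than FTP (33.5 vs 33.8 seconds), and both are about 16 seconds faster than the best scp run.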

3/6/13 Update

AES in SSH.  I’ve tested again from an old Dell with a Pentium 4, which has no hardware AES support, to the fast HP, and the default AES128-CTR is much slower there.  The good news is that AES128-CBC is still faster than BLOWFISH, though slightly slower than ARCFOUR.  As for FTP and RSYNC, they are neck-and-neck in speed, with no clear winner.

So my conclusion is that whether AES runs with hardware support (in new Intel CPUs) or in software, the CBC (cipher block chaining) variant of AES is usually good enough.
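
Whether a given box has hardware AES can be checked before choosing a cipher. On x86 Linux the kernel lists an `aes` flag in /proc/cpuinfo when the CPU has the AES instructions. A minimal Python sketch, assuming that file layout (the function names are made up for illustration):

```python
def cpu_has_aes(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole tokens so that flags such as 'pae' do not match.
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Running `cpu_has_aes(read_cpuinfo())` on a machine like the Pentium 4 above should return False, in which case it is worth timing the ciphers rather than trusting the default.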

5 thoughts on “File transfer speed over ssh”

  1. For a “from scratch” transfer with rsync, I recommend using the ‘-W’ option (copy files whole, without the delta algorithm). I forgot to benchmark that when I had the chance, but in my experience it gets going faster when it’s a fresh rsync, or ‘from scratch’ as you put it.

  2. Argh. Kill it…. Sorry dude – didn’t realize the blog formatting was going to eat the code that badly.

    >”Note that rsync protocol itself is pretty darn efficient, slightly faster than FTP.”

    Only if you are doing an update in my experience. Then it absolutely rocks.

    But to make a ‘from scratch’ transfer of many files over ssh scream I haven’t found anything better than tar piped through ssh with arcfour encryption. It can pretty much saturate the network (as long as your hard drives are fast enough) on even pretty old CPUs. About 2x compared with rsync over ssh in my experience on the same hardware.

    I found it so convenient that I actually wrote a Perl wrapper script some years ago for it:

    http://snowhare.com/utilities/script_tricks/pull-sync-via-tar.pl

    http://snowhare.com/utilities/script_tricks/push-sync-via-tar.pl

  3. >”Note that rsync protocol itself is pretty darn efficient, slightly faster than FTP.”

    Only if you are doing an update in my experience. Then it absolutely rocks.

    But to make a ‘from scratch’ transfer of many files over ssh scream I haven’t found anything better than tar piped through ssh with arcfour encryption. It can pretty much saturate the network (as long as your hard drives are fast enough) on even pretty old CPUs. About 2x compared with rsync over ssh in my experience on the same hardware.

    I found it so convenient that I actually wrote a Perl wrapper script some years ago for it:

    #!/usr/bin/perl

    use strict;
    use warnings;

    use Pod::Usage qw(pod2usage);
    use Getopt::Long qw(GetOptions);

    $|++;

    #our ($tar, $ssh) = ('/bin/tar', '/usr/bin/ssh -2 -c arcfour -T -x');
    our ($tar, $ssh) = ('tar', 'ssh -2 -c arcfour -T -x');
    our $tar_options = '--exclude=sys --exclude=proc --exclude=mnt --exclude=*/lost+found --exclude=cache';
    my ($help, $man, $debug, $dry_run) = (0,0,0,0);
    my $source_dir = '/';
    my ($target_host, $target_dir);

    # Options that take a value need the '=s' (string) suffix
    GetOptions( 'target_host=s' => \$target_host,
                'source_dir=s'  => \$source_dir,
                'target_dir=s'  => \$target_dir,
                'debug!'        => \$debug,
                'dry-run!'      => \$dry_run,
                'help|?'        => \$help,
                'man!'          => \$man,
                'tar=s'         => \$tar,
                'tar-options=s' => \$tar_options,
                'ssh=s'         => \$ssh,
    ) or pod2usage(2);

    pod2usage(1) if $help;
    # Handle --man before validation so it works without the other parameters
    if ($man) {
        pod2usage( -exitstatus => 0, -verbose => 2);
    }
    my $errors = '';
    if ((! defined ($target_dir)) || ($target_dir eq '')) {
        $errors .= "Missing required --target_dir parameter\n";
    }
    if ((! defined ($target_host)) || ($target_host eq '')) {
        $errors .= "Missing required --target_host parameter\n";
    }
    if ($errors ne '') {
        print STDERR "\n$errors\n";
        pod2usage( -exitstatus => 1, -verbose => 1);
    }
    if ($dry_run) { print "Dry run\n"; }

    #######################################################################

    # Pack the source tree locally and unpack it on the target host over ssh
    my $cmd = "$tar --directory=$source_dir $tar_options -Scpf - . | $ssh $target_host '$tar --directory=$target_dir -Spxf -'";
    if ($debug || $dry_run) { print "$cmd\n"; }
    if (! $dry_run) { system($cmd); }

    exit;

    1;

    #######################################################################
    #######################################################################
    #######################################################################

    __END__

    =head1 NAME

    push-sync-via-tar.pl - Performs remote backups using tar over ssh

    =head1 SYNOPSIS

    push-sync-via-tar.pl --target_host=example.com --source_dir=/ --target_dir=/backups/data/production-servers/daily/box1.example.com

    # Show full man page on program
    push-sync-via-tar.pl --man

    =cut

    =head1 DESCRIPTION

    Uses tar over an ssh connection to make a full image backup of a specified directory to a remote server.

    =head2 Notes

    By design this is a ‘as fast as we can possibly go’ implementation. It is I to make a full copy
    to the specified remote directory B. This means that it can and B saturate
    a network link between the machines while running if the machines and their disk drives
    are at all reasonably fast. I have saturated a 100 megabit ethernet network when running this script between
    fast machines.

    It was written to run on Redhat Linux machines. It will probably work on any *nix type machine, although you will
    probably need to adjust the --tar-options to exclude system specific directories on non-Linux machines
    as well as the --ssh, --tar settings.

    This is essentially a convenience automation of a simple shell script.

    =head1 OPTIONS

    The command line options are as follows:

    =over 4

    =item B<--help>

    Prints a brief help message and exits

    =back

    =over 4

    =item B<--man>

    Prints the manual page and exits

    =back

    =over 4

    =item B<--target_host>

    Specifies the remote target host’s name (ie. ‘example.com’)

    Ex.
    --target_host=example.com

    This is a B<required> option.

    =back

    =over 4

    =item B<--source_dir>

    Specifies the source directory on the source host (ie. ‘/’). Defaults to ‘/’ if not specified.

    Ex.
    --source_dir=/

    =back

    =over 4

    =item B<--target_dir>

    Specifies the target directory where the copy of the source dir will be put.

    Ex.
    --target_dir=/backups/data/napa-servers/daily/example.com

    This is a B<required> option.

    =back

    =over 4

    =item B<--ssh>

    Override for the ssh command and parameters. The script assumes that
    the openssh version of ssh is in your path. The default is ‘ssh -2 -c arcfour -T -x’

    =back

    =over 4

    =item B<--tar>

    Override for tar command. The default is ‘tar’. The script
    assumes the GNU version of tar is in your path.

    =back

    =over 4

    =item B<--tar-options>

    Override for the options passed to the source-side tar command. The default is ‘--exclude=sys --exclude=proc --exclude=mnt --exclude=*/lost+found --exclude=cache’

    =back

    =over 4

    =item B<--dry-run>

    Flag for turning on the ‘dry run’ mode. In the dry run mode, the program
    prints its actions, but does not actually do them.

    =back

    =over 4

    =item B<--debug>

    Flag for turning on debugging information

    =back

    =head1 AUTHOR

    Benjamin Franz,

    =head1 TODO

    Nothing.

    =head1 LICENSE

    This program is free software; you can redistribute it and/or modify it
    under the same terms and conditions as Perl itself.

    This means that you can, at your option, redistribute it and/or modify
    it under either the terms the GNU Public License (GPL) version 1 or
    later, or under the Perl Artistic License.

    See http://dev.perl.org/licenses/

    =head1 DISCLAIMER

    THIS SOFTWARE IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS
    OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE
    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
    PARTICULAR PURPOSE.

    Use of this software in any way or in any form, source or binary,
    is not allowed in any country which prohibits disclaimers of any
    implied warranties of merchantability or fitness for a particular
    purpose or any disclaimers of a similar nature.

    IN NO EVENT SHALL I BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT,
    SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
    USE OF THIS SOFTWARE AND ITS DOCUMENTATION (INCLUDING, BUT NOT
    LIMITED TO, LOST PROFITS) EVEN IF I HAVE BEEN ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGE

    =head1 SEE ALSO

    L L

    =cut

  4. I don’t have access to the equipment anymore to do testing. It would be interesting to compare against hpn-ssh.

    Another thing I missed in my testing was rsync over “local” filesystem and NFS mounted fs. That would give a nice baseline for testing.

  5. Hi:

    I have found that file transfer over an hpn-SSH tunnel is way faster than any other method.
    1) hardware firewall with NAT, got around 200KB/s.
    2) linux firewall with NAT, got around 200KB/s.
    3) bbcp over SSH tunnel, got around 1.0MB/s. (rsync with this tunnel got 200KB/s)
    4) rsync with hpn-ssh tunnel, got around 1.4MB/s. That is about 7 times the speed of regular rsync.
