FTP Usage Guide
FTP (File Transfer Protocol) is an industry-standard protocol and user interface for transferring files between computer systems by means of a series of interactive commands. FTP involves a local client (software you execute to send or receive files) and a remote server (software elsewhere that responds to instructions from your client to accept or deliver files).
FTP features include:
- Use of standard TCP/IP network protocols to move files between machines.
- Support for transfers to or from non-UNIX systems as well as among computers running UNIX.
- Use of IP host addresses as well as domain names to specify transfer targets.
- Interactive login, usually with password, to begin transfers to or from each remote machine (at LC, some transfers are preauthenticated to omit the password).
LC users with special file-transfer needs (such as for batch-oriented command files, extensive tracking of each transfer, or persistent transfers if network problems arise) may prefer to use the NFT file-transfer tool to move files among LC machines. See EZOUTPUT for a comparison of FTP and NFT. Users whose primary interest in FTP is as an interface to LC's High Performance Storage System (archival file storage) may want to consult EZSTORAGE for helpful comparisons and alternative interfaces (such as LSTORAGE, CHMODSTG, and CHGRPSTG). For alternative file-transfer tools that also rely on FTP software daemons to move files (but that offer special services beyond the basic FTP interface and that transfer to or from storage), see the HSI manual and the HTAR Reference Manual. If you prefer to use FTP through a graphical user interface, run Hopper on any LC production machine and select FTP from Hopper's Connect menu.
This manual explains how to run FTP and shows a typical FTP session. Standard FTP commands, server replies, and error codes are easy to look up online and are therefore not covered in this guide. On many LC production machines, a parallel FTP client (PFTP) is now the default, and parallel transfers occur automatically when they are possible. Instructions for invoking a nondefault nonparallel FTP client are also included here, along with advice on using jumbo-frame gigabit Ethernet links. Users who need to transfer files (to FIS) with their data encrypted can try Secure FTP, a special FTP client with very limited server support.
For help, contact the LC Hotline at 925-422-4531 or via e-mail (OCF: email@example.com, SCF: firstname.lastname@example.org).
To run FTP on any LC machine, type
ftp remotehost
where remotehost is either the IP address or the domain name of the computer with which you want to exchange files. The machine on which you run FTP is the "client" or "local" machine, and the machine whose address or name you specify on the execute line is the "server" or "remote" machine (for purposes of describing commands and file transfers below). If run with no remotehost, FTP prompts for input (and you will need to use its interactive open command to specify a target host).
You must log in to your local machine to run FTP, and you must also log in to the specified remote machine at the start of each FTP file-transfer session (when you are prompted for your remote user name and password, which might be different from the local ones). FTP expects file transfers to be done by a series of interactive commands, and it does not allow "third-party" transfers (between two remote machines). On LC production machines, Hopper serves as a graphical controller for FTP.
LC uses its hardware/software security firewall to block direct FTP connections from machines outside the llnl.gov domain to LC machines within llnl.gov. Offsite users must either log on to some llnl.gov production machine, execute FTP there, and then pull external files to that machine (with the get command), or log in to the Lab's Virtual Private Network (VPN) before beginning an FTP session. For more information, see the Access Information section of LC's Computing Web pages.
On all LC production machines, open and secure (but not necessarily on LC's other machines), a parallel FTP client (PFTP) is the default. Parallel file transfers occur automatically when they are possible. However, the extra verbosity of the parallel client can sometimes cause problems (e.g., for scripts). To explicitly invoke a "standard," nonparallel FTP client, either type the special execute line
or else set the environment variable OLD_FTP_CLIENT to any nonnull value. For example:
setenv OLD_FTP_CLIENT 1
and then use the regular execute line shown at the start of this section. (Unless you include this setting in your login file, FTP will revert to the parallel default client with every new login session.)
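For scripts that launch FTP themselves, the same setting can be applied to a single run rather than to the whole login session. The following is a minimal Python sketch, assuming the client is on your path as ftp; the helper names legacy_ftp_env and run_legacy_ftp are hypothetical and not part of any LC tool.

```python
import os
import subprocess


def legacy_ftp_env(base_env=None, value="1"):
    """Return a copy of the environment with OLD_FTP_CLIENT set.

    Hypothetical helper: any nonnull value selects the nonparallel
    client, so an empty value is rejected.
    """
    if not value:
        raise ValueError("OLD_FTP_CLIENT must be set to a nonnull value")
    env = dict(os.environ if base_env is None else base_env)
    env["OLD_FTP_CLIENT"] = value
    return env


def run_legacy_ftp(remotehost):
    """Start one interactive session with the variable set for that run only.

    Assumes the FTP client is installed on the path as "ftp".
    """
    return subprocess.run(["ftp", remotehost], env=legacy_ftp_env())
```

Because only the child process sees the variable, later FTP runs from the same shell still get the parallel default.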
For information on how executing the specialized secure FTP (SFTP) client differs from running standard FTP, consult the SFTP section below.
Suppose you want to bundle a set of files (perhaps including directory trees) and transfer the resulting archive to another LC machine, but you lack enough disk space to run TAR locally and (hence) double your disk usage on the client machine before you invoke FTP for the transfer. LC's special HTAR utility offers a -f option with which you can open a parallel connection to any preauthenticated LC FTP server and transfer files directly into an archive (TAR file) created and filled only on the remote (receiving) machine. Extractions from and even indexing of remote nonstorage archives are also supported.
See the HTAR Reference Manual for details and annotated examples.
One common use for FTP on LC machines is to put files into or get files from archival storage (HPSS, storage.llnl.gov), but not all FTP clients interact equally well with HPSS. If you work on LC Linux/CHAOS systems, you have access to /usr/kerberos/bin/ftp, but you should instead run /usr/local/bin/ftp to reach storage. Under some conditions, the former (but not the latter) client refuses to log you into HPSS or needlessly asks you to "please log in with USER and PASS."
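A script can make the client choice explicit by using the full path whenever the target is storage. The following Python sketch illustrates that idea only; the function names are hypothetical, and treating the short name storage as equivalent to storage.llnl.gov is an assumption.

```python
import shutil

PREFERRED_CLIENT = "/usr/local/bin/ftp"    # recommended above for reaching storage
KERBEROS_CLIENT = "/usr/kerberos/bin/ftp"  # may refuse to log you into HPSS

# Assumption: the short host name refers to the same storage server.
STORAGE_HOSTS = {"storage.llnl.gov", "storage"}


def ftp_command_for(remotehost):
    """Build the execute line as an argument list (hypothetical helper)."""
    if remotehost in STORAGE_HOSTS:
        return [PREFERRED_CLIENT, remotehost]
    return ["ftp", remotehost]


def default_client_is_kerberos():
    """Report whether a bare "ftp" would resolve to the Kerberos client."""
    return shutil.which("ftp") == KERBEROS_CLIENT
```

The returned list can be passed directly to subprocess.run to start the session.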
Parallel FTP service is available between each LC production machine and storage (both OCF and SCF), as well as between pairs of LC production machines themselves. In all cases where parallel service is available, it is automatic. Multiple "Command complete" messages (one for each parallel stripe) reveal that a large file was transferred in parallel. (HTAR also automatically uses parallel transfers but does not execute the PFTP client.)
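If you save FTP output to a log, the per-stripe messages can be counted to confirm that a transfer ran in parallel. The Python sketch below assumes the message layout shown in the sample session later in this guide ("200 Command complete. Address N is ..."); other client versions may print something different, and count_parallel_stripes is a hypothetical helper name.

```python
import re

# Assumed layout, copied from the sample session in this guide.
ADDRESS_LINE = re.compile(r"^200 Command complete\. Address (\d+) is ")


def count_parallel_stripes(transcript):
    """Count the distinct stripe numbers reported in an FTP transcript."""
    stripes = set()
    for line in transcript.splitlines():
        match = ADDRESS_LINE.match(line.strip())
        if match:
            stripes.add(int(match.group(1)))
    return len(stripes)


SAMPLE = """\
200 Command complete (11496780, large, 0, 4, 4194304)
200 Command complete. Address 1 is 18.104.22.168.2356
200 Command complete. Address 2 is 22.214.171.124.2357
200 Command complete. Address 3 is 126.96.36.199.2358
200 Command complete. Address 4 is 188.8.131.52.2359
150 Transfer starting.
226 Transfer complete. (moved = 11496780).
"""

print(count_parallel_stripes(SAMPLE))
```

A result of zero suggests an ordinary, nonparallel transfer.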
Also, the login nodes on most LC production machines are connected to "jumbo-frame gigabit Ethernet links" for fast network traffic. Naturally, the best file-transfer rates occur when you get parallel FTP between a pair of machines that also has the fast jumbo-frame links. For example, NFT automatically routes storage transfers to a cluster's login nodes to take advantage of the jumbo-frame links.
LC has installed jumbo-frame gigabit Ethernet links on both OCF and SCF storage systems and the login nodes of most LC clusters (AIX and Linux/CHAOS). Use of these jumbo-frame network links for faster file transfers is automatic among those LC computers that have them installed. FTP, HSI, HTAR, and NFT all utilize the jumbo-frame gigabit Ethernet links automatically.
The default FTP client on all LC production machines (but not necessarily on special-purpose machines) is a locally developed version that enables you to transfer data in parallel. When parallel FTP transfers are possible, they occur automatically. The FTP server ("daemon") on your destination (target) machine determines whether or not the file transfer is parallel. Automatic parallel transfer applies to all LC production clusters (AIX or Linux/CHAOS) and to storage. For all files over 4 Mbytes, FTP transfers between any LC production machine and storage (in either direction, on both OCF and SCF) are automatically parallel. Transfers originating on machines with jumbo-frame gigabit Ethernet links also automatically use those links for even faster data movement. FTP's parallel command now simply reports the current parallel stripe width and block size.
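As a rough planning aid, the size rule above can be checked before a transfer. This Python sketch assumes that "4 Mbytes" means 4 * 2**20 bytes (which matches the 4194304 block size in the sample session) and that "over" means strictly greater; the function names are hypothetical, and the server still makes the real decision.

```python
import os

# Assumption: "4 Mbytes" is 4 * 2**20 bytes.
PARALLEL_THRESHOLD_BYTES = 4 * 1024 * 1024


def expect_parallel(size_bytes, threshold=PARALLEL_THRESHOLD_BYTES):
    """Guess whether a storage transfer of this size would be parallel."""
    return size_bytes > threshold


def file_expect_parallel(path):
    """Apply the same guess to a local file, using its size on disk."""
    return expect_parallel(os.path.getsize(path))


# File sizes taken from the sample session in this guide.
for name, size in [("nft.ps", 1602470), ("testfile", 5264), ("large", 11496780)]:
    print(name, expect_parallel(size))
```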
The PFTP client offers a number of extra commands (beyond the usual set offered by FTP) to specifically manage parallel file transfers (for example, pget and mpget perform parallel gets). On LC production machines these special PFTP commands are quite unnecessary because parallel transfers occur automatically where they are possible. At other (ASC tri-lab) sites, you may need to remember the special PFTP commands to perform parallel file transfers (especially to storage). See LC's HPSS User Guide for information on extra PFTP commands.
LC's parallel FTP client is more verbose than the standard FTP client during file transfers. Parallel FTP users may want a complete record of each verbose FTP dialog in their batch log files. The child mode execute-line option (-c) causes all interactive output to be sent during batch runs of FTP, and the echo mode (-e) option copies FTP input commands into your batch output. Thus, running FTP with the execute line
ftp -ce remotehost
will preserve all the details of a parallel session even within a batch job. For more information on these and other FTP-related options, consult the FTP man pages.
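A batch job might drive such a session from a script and keep the full dialog in a log file. The following Python sketch is one way to do that, assuming the connection is preauthenticated (so no name or password prompt appears) and that the client reads its commands from standard input; the helper names are hypothetical.

```python
import subprocess


def batch_ftp_command(remotehost):
    """Execute line from this section: child mode plus echo mode."""
    return ["ftp", "-ce", remotehost]


def batch_script(commands):
    """Join FTP commands into stdin text, ending the session with quit."""
    lines = list(commands)
    if not lines or lines[-1] != "quit":
        lines.append("quit")
    return "\n".join(lines) + "\n"


def run_batch(remotehost, commands, log_path):
    """Run the session and write all client output to log_path.

    Assumes no interactive login prompt will appear.
    """
    with open(log_path, "w") as log:
        result = subprocess.run(
            batch_ftp_command(remotehost),
            input=batch_script(commands),
            stdout=log,
            stderr=subprocess.STDOUT,
            text=True,
        )
    return result.returncode
```

For example, run_batch("remotehost", ["cd /var/tmp/jane", "put testfile"], "ftp.log") would send the two commands followed by quit.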
FTP sessions with storage (and with some other target machines) are fully preauthenticated and do not prompt for your user name, while in other cases FTP returns a Name: prompt to which you must reply to continue. Parallel users who want to eliminate this Name: prompt from all sessions (including batch sessions) can install a file called .netrc in their (global) home directory, containing the following lines:
where the last line in the .netrc file is present but blank. This will put you into binary mode every time, which is fine as there is no need to use the default ASCII mode.
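The exact lines of LC's recommended file are not reproduced here, so the following Python sketch is only a guess at a file with the properties just described. It relies on the standard .netrc conventions: a default entry that supplies the login name, and a macdef init macro (run automatically at login) that issues binary and must be terminated by a blank line. The helper names, and the choice of a default entry rather than per-machine entries, are assumptions.

```python
import os


def netrc_text(login_name):
    """Build candidate .netrc contents (an assumption, not LC's exact file)."""
    lines = [
        f"default login {login_name}",  # answers the Name: prompt
        "macdef init",                  # macro run automatically after login
        "binary",                       # switch to binary transfer mode
        "",                             # blank line that ends the macro
    ]
    return "\n".join(lines) + "\n"


def write_netrc(path, login_name):
    """Write the file; owner-only permissions apply when it is newly created."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(netrc_text(login_name))
```

Calling write_netrc(os.path.expanduser("~/.netrc"), "jane") would replace any existing file, so check for one first.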
The following sample session (with annotated steps) shows a typical dialog by which a user (Jane) transfers files interactively using FTP. In this case, the local machine (on which Jane executes the FTP client) is Cab, and the remote machine that files are copied to and from is Depserver. (For an alternative approach on LC production machines, you can use Hopper as a graphical controller for FTP.)
(1) The user runs FTP (on Cab) with the remote machine's domain name as an argument.
ftp depserver.llnl.gov
Connected to depserver.llnl.gov.
220 [NOTICE TO USERS -- very long legal statement]
220 depserver.llnl.gov FTP server (Version LLNL-27...) ready.
202 Command not implemented.
(2) FTP prompts for a user ID and a password to log in to Depserver (some LC machines "preauthenticate" and skip this password step).
Name (depserver.llnl.gov:jane): jane
331 Password required for jane.
Password: [does not echo]
230 User jane logged in.
Remote system type is UNIX.
Using binary mode to transfer files.
(3) At the FTP prompt, change remote directories to /var/tmp/jane (which is not shared among LC machines).
ftp> cd /var/tmp/jane
250 CWD command successful.
(4) Next, get the file nft.ps (copy it from Depserver to Cab).
ftp> get nft.ps
200 PORT command successful.
150 Opening Binary data connection for nft.ps
226 Transfer complete.
1602470 bytes received in 0.579 seconds (2.64 Mbytes/s)
(5) Next, put the file testfile (copy it from Cab to Depserver).
ftp> put testfile
200 PORT command successful.
150 Opening Binary data connection for testfile
226 Transfer complete.
5264 bytes sent in 0 seconds (5.14 Kbytes/s)
(6) The user then transfers an 11-Mbyte file called large from Cab to Depserver. FTP automatically invokes four parallel stripes (each separately reported as an FTP "Command complete" message in the output).
ftp> put large
200 Command complete (11496780, large, 0, 4, 4194304)
200 Command complete. Address 1 is 18.104.22.168.2356
200 Command complete. Address 2 is 22.214.171.124.2357
200 Command complete. Address 3 is 126.96.36.199.2358
200 Command complete. Address 4 is 188.8.131.52.2359
150 Transfer starting.
226 Transfer complete. (moved = 11496780).
11496780 bytes sent in 0.79 seconds (16.3 Mbytes/s)
200 Command complete.
(7) When the file transfers are done and confirmed, quit FTP.
ftp> quit
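The nonparallel steps of this session can also be scripted. The sketch below uses Python's standard ftplib module, which is an ordinary (nonparallel) FTP client, so it would not produce the striped transfer in step 6. The helper names transfer_files and run_session are hypothetical, and the host, directory, and file names are simply those from the session above.

```python
from ftplib import FTP


def transfer_files(ftp, remote_dir, get_name, put_name):
    """Repeat steps 3 to 5 on an already logged-in connection."""
    ftp.cwd(remote_dir)                                # step 3: cd
    with open(get_name, "wb") as out:                  # step 4: get
        ftp.retrbinary(f"RETR {get_name}", out.write)
    with open(put_name, "rb") as src:                  # step 5: put
        ftp.storbinary(f"STOR {put_name}", src)


def run_session(host, user, password):
    """Connect (step 1), log in (step 2), transfer, and quit (step 7)."""
    with FTP(host) as ftp:   # leaving the block ends the session
        ftp.login(user, password)
        transfer_files(ftp, "/var/tmp/jane", "nft.ps", "testfile")
```

Keeping the transfers in a function that accepts the connection object makes it possible to exercise the logic with a stand-in object before pointing it at a real server.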
FTP commands are described in the FTP man pages. Access them by typing man ftp at the command line prompt. Type help at the FTP prompt to see a list of available commands. Information about standard FTP commands and other FTP issues is easily found online using your favorite Web search engine.
When you enter an FTP command, you receive a corresponding reply that indicates whether the command was accepted, was rejected, or is being processed. An FTP reply consists of a three-digit code followed by a brief description of the result (as seen in the sample FTP session above). The exact text accompanying each reply code depends on the command issued. Reply codes and their meanings are easily found online using your favorite Web search engine.
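When reading logs, it helps to know that the first digit of a reply code gives its general class under the FTP standard. The following Python sketch splits a reply line into code and text and names that class; parse_reply is a hypothetical helper, and the class descriptions are paraphrased.

```python
REPLY_CLASSES = {
    "1": "positive preliminary (action started, another reply follows)",
    "2": "positive completion",
    "3": "positive intermediate (more input needed)",
    "4": "transient negative (try again later)",
    "5": "permanent negative",
}


def parse_reply(line):
    """Return (code, reply class, text) for one FTP reply line."""
    code, text = line[:3], line[4:].strip()
    if len(code) != 3 or not code.isdigit() or code[0] not in REPLY_CLASSES:
        raise ValueError(f"not an FTP reply line: {line!r}")
    return int(code), REPLY_CLASSES[code[0]], text


# Reply lines taken from the sample session in this guide.
for reply in ["331 Password required for jane.", "226 Transfer complete."]:
    print(parse_reply(reply))
```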
Standard FTP clients do not encrypt the data that they send to remote hosts, which theoretically allows malicious third parties to intercept and read that data. Secure FTP (SFTP) is a modified client that does encrypt all the files that it sends for greater safety.
SFTP clients reside on all OCF and SCF production machines.
- FIS—LC's File Interchange Service (FIS, at fis.llnl.gov) is the only LC server that now accepts incoming files from SFTP clients. FIS only accepts SFTP transfers from within the LC firewall, so direct SFTP transfers from outside machines by means of OTS or VPN are not accepted.
- Others—No other LC FTP servers accept SFTP transfers. In particular, you cannot store files (at storage.llnl.gov) from any host by running SFTP.
SFTP clients present a different user dialog than do standard FTP clients on LC machines. While some differences are trivial, others require different user responses to open connections or to transfer files successfully. SFTP:
- Does not request your user name (nor present it as a default to which you can simply respond by hitting the RETURN key).
- Checks for a host key for every new host to which you try to connect and, if not found, asks if you want to continue connecting (yes/no) anyway.
- Requests your one-time password (OTP) to open every connection unless you have Kerberos or public key authentication (no default preauthentication occurs, unlike for standard FTP connections among LC machines).
- Prompts for input with sftp>.
SFTP recognizes many of the usual set of FTP control options. Type ? or help at the SFTP prompt to see the list of available commands.
Among the most useful standard FTP options that SFTP does not accept are:
- dir: In most situations, the SFTP alternative option (ls -l) lists files and their properties just as dir does for standard FTP sessions.
- delete: The SFTP alternative rm removes remote files and performs the same functions as delete during standard FTP sessions.
- ascii, binary, parallel, quote, site: SFTP provides no alternative options for these commands. It is supposed to automatically detect ASCII and binary files on arrival and transfer them in the appropriate mode, but you cannot force the mode if inappropriate transfers occur.
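If you reuse FTP command lists with SFTP, a small lookup table can apply the substitutions described above (ls -l for dir, rm for delete) and flag the commands that have no alternative. This Python sketch is illustrative only; to_sftp is a hypothetical helper, and it assumes that any command not listed here is accepted by SFTP unchanged.

```python
# None marks commands for which SFTP provides no alternative.
SFTP_ALTERNATIVES = {
    "dir": "ls -l",
    "delete": "rm",
    "ascii": None,
    "binary": None,
    "parallel": None,
    "quote": None,
    "site": None,
}


def to_sftp(command_line):
    """Translate one FTP command line into its SFTP form."""
    parts = command_line.split()
    if not parts or parts[0] not in SFTP_ALTERNATIVES:
        return command_line  # assumed to work as-is under SFTP
    alternative = SFTP_ALTERNATIVES[parts[0]]
    if alternative is None:
        raise ValueError(f"SFTP has no alternative for {parts[0]!r}")
    return " ".join([alternative] + parts[1:])


print(to_sftp("delete testfile"))
```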
If you prefer not to use your OTP (one-time password) to authenticate every SFTP session, you can create and install a special file called an SSH public key, generated using OpenSSH, on every pair of machines between which you transfer files with SFTP. Generating an appropriate public key, converting it to the required OpenSSH format if necessary, and installing it in the right directories (including those on the open FIS node) is a complex, multistep process. If you need assistance with creating a public key, please contact the LC Hotline.
Note: On OCF, SSH public key authentication is only allowed in limited cases. One is between production clusters using port 622, and the other is when uploading to FIS.