Data Transfer and Management Guide

Portal data can be transferred from your local machine to one of TACC's remote storage systems in two ways: with command line tools (scp, sftp, rsync) or with a graphical user interface (Cyberduck).

Data Transfer Methods

Usage Mode           Transfer Method
-----------------    ---------------
Command Line Tool    scp
Command Line Tool    sftp
Command Line Tool    rsync
Graphical Tool       Cyberduck

What is TACC's Storage Server?

A TACC storage system is a logically defined resource designed to provide data storage and management capabilities to TACC portal users through the portal interface. Each portal is accessible at a hostname (e.g., sub.domain.tacc.utexas.edu), which this document will subsequently refer to as “host”.

Storage systems can be configured for both normal and protected data (e.g., HIPAA) in a secured location, depending on the project requirements established for the portal. This location is exposed as a path to a directory on the secure system (e.g., /secure-server-root/projects/directory_name), which this document will subsequently refer to as /transfer/directory/path.

Storage systems are to be used exclusively for transferring and accessing data through the portal.

Prerequisites for Portal User

There are two prerequisites for accessing a portal and transferring data:

  • A TACC Account
  • Multi-Factor Authentication (MFA) pairing with the TACC Token app

All portal users will need to create a TACC account in the TACC User Portal (which can be accessed at TACC Portal). If you have forgotten your TACC account credentials, please refer to your email for a message titled “TACC Account Request Confirmation” or use the TACC Portal's password reset form or username recovery form.

Access to all TACC resources requires a completed Multi-Factor Authentication pairing with your TACC credentials. To set up MFA, please reference TACC Portal's Multi-Factor Authentication tutorial.

Using Command Line Tools to Transfer and Organize Data

A common method of transferring files between TACC resources and/or your local machine is through the command line.

scp, sftp, & rsync

These three command line tools are secure and can be used to accomplish data transfer. You can run these commands directly from the terminal if your local system runs Linux or macOS.

Note: It is possible to use these command line tools if your local machine runs Windows, but you will need to use an SSH client (e.g., PuTTY).

To simplify the data transfer process, it is recommended that Windows users follow the How to Transfer Data with Cyberduck guide as detailed below.

For users that are new to the command line, using either scp or sftp to transfer data is advised.
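Before choosing a tool, it can help to confirm which of the three are actually installed on your machine. The following is a minimal sketch, assuming a POSIX shell on Linux or macOS; the function name check_tools is hypothetical and is not part of any TACC tooling.

```shell
#!/bin/sh
# Report whether each named command-line tool is available on the PATH.
# Returns 0 if every tool was found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: not found"
            missing=1
        fi
    done
    return "$missing"
}

# Check the three transfer tools discussed in this guide.
check_tools scp sftp rsync || echo "A tool is missing; consider using Cyberduck instead."
```

Any tool reported as "not found" cannot be used from this terminal until it is installed.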

Prerequisites for Data Transfer with Command Line Tools

Before we begin, you will need to know:

  • the path to your data file(s) on your local system
  • the path to your transfer directory on the remote storage server

Determining the Path to Your Data File(s) on Your Local System

In order to transfer your project data, you will first need to know where the files are located on your local system.

To do so, navigate to the location of the files on your computer. This can be accomplished on a Mac by using the Finder application or on Windows with the File Explorer application. Common locations for user data are the user's home directory, the Desktop, and My Documents.

Once you have identified the location of the files, you can right-click on them and select either Get Info (on Mac) or Properties (on Windows) to view the path location on your local system.

Figure 1. Use Get Info and its “Where” field to determine the path of your data file(s)

For example, a file located in a folder named portal-data under Documents would have the following path:

On Mac
/Users/username/Documents/portal-data/my_file.txt
On Windows
\Users\username\My Documents\portal-data\my_file.txt
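If you already have a terminal open on Linux or macOS, the same path can be found without Finder. This is a minimal sketch, assuming a POSIX shell; abs_path is a hypothetical helper name, and the file name is the example used in this guide.

```shell
#!/bin/sh
# Print the absolute path of a file, given a relative or absolute path to it.
abs_path() {
    dir=$(dirname "$1")
    base=$(basename "$1")
    # Change directory inside a subshell so the caller's working directory
    # is left unchanged, then print the physical directory plus file name.
    (cd "$dir" && printf '%s/%s\n' "$(pwd -P)" "$base")
}

# Example: resolve a file name relative to the current directory.
abs_path ./my_file.txt
```

The printed path is the value to use wherever this guide refers to the location of your data file on your local system.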

Determining the Path to Your Transfer Directory

A transfer directory will be established on the remote storage server associated with your portal once your account has been given access to the portal and you have completed the on-boarding procedure. The transfer directory path will be unique for every institution and project.

Example: /corral-secure/projects/A2CPS/submissions/utaustin/

If you are unsure of your transfer directory path, please consult your project PI directly.

How to Transfer Data with scp

scp copies files between hosts on a network. To transfer a file (e.g., my_file.txt) to the remote secure system via scp, open a terminal on your local computer and navigate to the path where your data file is located.

On Mac
localhost$ cd ~/Documents/portal-data/
On Windows
localhost$ cd %HOMEPATH%\Documents\portal-data\

Assuming your TACC username is jdoe and you are affiliated with UT Austin, an scp transfer that pushes my_file.txt from the current directory of your local computer to the remote secure system would look like this:

localhost$ scp ./my_file.txt jdoe@host:/transfer/directory/path

Note: This command will copy your data file directly to your individualized transfer directory on the remote storage system.

If you have not done so already, enter this command in your terminal, replacing the file name, TACC username, and your individualized transfer directory path appropriately.

After entering the command, you will be prompted to log in to the remote secure system by entering the password associated with your TACC account as well as the token value generated from your TACC Token app.

A successful data transfer will generate terminal output similar to this:

my_file.txt     100% ##  #.#          KB/s   ##:##
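The substitution of file name, username, and transfer directory described above can be sketched as a small script that prints the resulting scp command without running it. The variable names and the build_scp_command function are hypothetical, the values are the placeholders used in this guide, and the sketch assumes paths that contain no spaces.

```shell
#!/bin/sh
# Placeholder values from this guide; replace each with your own.
TACC_USER="jdoe"                          # your TACC username
PORTAL_HOST="host"                        # your portal's storage host
TRANSFER_DIR="/transfer/directory/path"   # your transfer directory

# Print (without running) the scp command that uploads one local file.
build_scp_command() {
    printf 'scp %s %s@%s:%s\n' "$1" "$TACC_USER" "$PORTAL_HOST" "$TRANSFER_DIR"
}

build_scp_command ./my_file.txt
# prints: scp ./my_file.txt jdoe@host:/transfer/directory/path
```

Printing the command first gives you a chance to check each substituted value before you are prompted for your password and token.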

If you wish to learn more about scp, you can do so at the online man page for scp or in the file transfer section of the user guide for the appropriate TACC system.

How to Transfer Data with sftp

sftp is a file transfer program that allows you to interactively navigate between your local file system and the remote secure system. To transfer a file (e.g., my_file.txt) to the remote secure system via sftp, open a terminal on your local computer and navigate to the path where your data file is located.

On Mac
localhost$ cd ~/Documents/portal-data/
On Windows
localhost$ cd %HOMEPATH%\Documents\portal-data\

Assuming your TACC username is jdoe and you are affiliated with UT Austin, an sftp transfer that pushes my_file.txt from the current directory of your local computer to the remote secure system would look like this:

localhost$ sftp jdoe@host:/transfer/directory/path
Password:
TACC Token Code:
Connected to host.
Changing to:
  /transfer/directory/path
sftp>

If you have not done so already, enter this command in your terminal, replacing the TACC username and your individualized transfer directory path appropriately.

You are now logged into the remote secure system and have been redirected to your transfer directory. To confirm your location on the server, enter the following command:

sftp> pwd
Remote working directory:
  /transfer/directory/path

To list the files currently in your transfer directory:

sftp> ls
utaustin_dir.txt

To list the files currently in your local directory:

sftp> lls
my_file.txt

Note: The leading l in the lls command denotes that you are listing the contents of your local working directory.

To transfer my_file.txt from your local computer to your transfer directory:

sftp> put my_file.txt
Uploading my_file.txt to /transfer/directory/path
my_file.txt     100% ##  #.#          KB/s   ##:##

To confirm that my_file.txt is now in your transfer directory:

sftp> ls
my_file.txt
utaustin_dir.txt

To exit out of sftp on the terminal:

sftp> bye
localhost$

If you wish to learn more about sftp, you can do so at the online man page for sftp.

How to Transfer Data with rsync

rsync is a file copying tool that can reduce the amount of data transferred by sending only the differences between the source files on your local system and the existing files in your transfer directory. To transfer a file (e.g., my_file.txt) to the remote secure system via rsync, open a terminal on your local computer and navigate to the path where your data file is located.

On Mac
localhost$ cd ~/Documents/portal-data/
On Windows
localhost$ cd %HOMEPATH%\Documents\portal-data\

Assuming your TACC username is jdoe and you are affiliated with UT Austin, an rsync transfer that pushes my_file.txt from the current directory of your local computer to the remote secure system would look like this:

localhost$ rsync ./my_file.txt jdoe@host:/transfer/directory/path

If you have not done so already, enter this command in your terminal, replacing the TACC username and your individualized transfer directory path appropriately.

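By default, rsync prints nothing when a transfer succeeds. To confirm the result, check the command's exit status by running echo $? immediately afterward; a value of 0 means the data transfer was successful.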
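The success check can also be folded into a small wrapper that runs a command and reports on its exit status. This is a minimal sketch, assuming a POSIX shell; run_and_report is a hypothetical helper, and the stand-in commands true and false are used so the example runs without a network connection.

```shell
#!/bin/sh
# Run any command and report success or failure based on its exit status.
run_and_report() {
    if "$@"; then
        echo "Transfer command succeeded (exit status 0)"
    else
        status=$?   # exit status of the command that just failed
        echo "Transfer command failed (exit status $status)"
        return "$status"
    fi
}

# Demonstration with stand-in commands that need no network access:
run_and_report true
run_and_report false || :
```

To apply it to a real transfer, pass the rsync command from this section as the arguments, for example run_and_report rsync ./my_file.txt jdoe@host:/transfer/directory/path.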

If you wish to learn more about rsync and how to synchronize your file transfers, you can do so at the online man page for rsync or in the file transfer section of the user guide for the appropriate TACC system.

How to Transfer Data with Cyberduck

Cyberduck is a free graphical user interface for data transfer and is an alternative to using the command line. With a drag-and-drop interface, it is easy to transfer a file from your local system to the remote secure system. You can use Cyberduck for Windows or macOS.

For Windows

Download and install Cyberduck for Windows on your local machine.

Once installed, click “Open Connection” in the top left corner of your Cyberduck window.

Figure 2. Windows Cyberduck and “Open Connection” setup screen

To set up a connection, type in the server name, host. Add your TACC username and password in the spaces provided. If the “More Options” area is not shown, click the small triangle button to expand the window; this will allow you to enter the path to your transfer directory, /transfer/directory/path, so that when Cyberduck opens the connection you will immediately be in your individualized transfer directory on the system. Click the “Connect” button to open your connection.

Consult Figure 3 below to ensure the information you have provided is correct. If you have not done so already, replace the “Path” with the path to your individualized transfer directory.

Figure 3. Windows “Open Connection” setup screen

Note: You will be prompted to “allow unknown fingerprint…” upon connection. Select “allow” and enter your TACC token value.

Once connected, you can navigate through your remote file hierarchy using the graphical user interface. You may also drag-and-drop files from your local computer into the Cyberduck window to transfer files to the system.

For Mac

Download and install Cyberduck for macOS on your local machine.

Once installed, go to “Bookmark > New Bookmark” to set up a connection.

Note: You cannot use “Open Connection” in the top left corner of your Cyberduck window, because the setup screen on macOS is missing the “More Options” button.

To set up a connection using “New Bookmark”, type in the server name, host. Add your TACC username and password in the spaces provided. If the “More Options” area is not shown, click the small triangle button to expand the window; this will allow you to enter the path to your transfer directory, /transfer/directory/path, so that when Cyberduck opens the connection you will immediately be in your individualized transfer directory on the system. As you fill out the information, Cyberduck will create the bookmark for you. Exit out of the setup screen and click on your newly created bookmark to launch the connection.

Figure 4. macOS “New Bookmark” setup screen

Consult Figure 4 above to ensure the information you have provided is correct. If you have not done so already, replace the “Path” with the path to your individualized transfer directory.

Note: You will be prompted to “allow unknown fingerprint…” upon connection. Select “allow” and enter your TACC token value.

Once connected, you can navigate through your remote file hierarchy using the graphical user interface. You may also drag-and-drop files from your local computer into the Cyberduck window to transfer files to the storage system.

References