Downloading Files

We recommend downloading files from the 4DN data portal as an authenticated user. Anyone can create an account, including people who are not part of the 4DN Network; for more information, see the Account Creation page. If you prefer not to create an account on the portal, all but the most recently released data files are available for unauthenticated public download from the AWS Open Data public buckets. See the section Downloading Released Files from AWS Open Data below for details.

Single File Download

Single files can be downloaded directly with your browser from the web portal after logging in.

Navigate to the File page for the file you are interested in and click the Download button. The file should start downloading automatically. If you have not yet logged in, you will be directed to a sign-in page first.

Batch Download

To download multiple files at once, you first need a metadata.tsv file, which contains the information necessary for download as well as additional helpful information about the selected files. Select and filter the file(s) using the selector check boxes and facet filters on the browse Experiment Sets page, the search Files page, or the page of an Experiment Set containing the desired files. Clicking the Download button then generates a file named 'metadata.tsv' and downloads it to your local machine.

Instructions to download the files included in the metadata.tsv file as well as some general statistics on the files you selected are included at the bottom of the file itself.

You will need an access key; if you do not already have one, create a new one. Note that access keys created for JupyterHub cannot be used for the rest of the portal.

To create a new access key, first log in to the data portal, then click on your account in the upper right and select Profile from the dropdown menu. A button near the bottom of the page lets you add an access key. Save the key and secret; this is typically done by creating a file called keypairs.json in your home directory with the following contents and format (replace the x’s with your own key and secret):

{
    "default": {
        "key": "XXXXXXXX",
        "secret": "xxxxxxxxxxxxxxxx",
        "server": "https://data.4dnucleome.org"
    }
}
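
If you want to read the stored key from a script, the file layout above can be loaded with a few lines of Python. This is only a sketch: the function name load_keypair is hypothetical, and it assumes the file is named keypairs.json, sits in your home directory, and uses the "default" entry shown above.

```python
import json
from pathlib import Path


def load_keypair(path=None, name="default"):
    """Return (key, secret, server) for one entry of a keypairs.json file.

    Assumption: the file follows the layout shown above, with a top-level
    entry (here "default") holding "key", "secret" and "server".
    """
    # Fall back to ~/keypairs.json when no explicit path is given.
    path = Path(path) if path is not None else Path.home() / "keypairs.json"
    with path.open() as handle:
        entry = json.load(handle)[name]
    return entry["key"], entry["secret"], entry["server"]
```

Calling load_keypair() with no arguments reads ~/keypairs.json; pass a path explicitly if you keep the file elsewhere.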

Once the access key is created and stored, a file can be downloaded via curl with the following command:

curl -O -L --user <key>:<secret> <download-url>
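
For more than a handful of files, the same curl invocation can be generated once per download URL. The sketch below builds the argument lists in Python and, optionally, runs them one after another. The function names are hypothetical, and it assumes curl is installed and that you have already collected the download URLs yourself (for example from your metadata.tsv file).

```python
import subprocess


def curl_commands(download_urls, key, secret):
    """Build one 'curl -O -L --user <key>:<secret> <download-url>' per URL."""
    return [
        ["curl", "-O", "-L", "--user", f"{key}:{secret}", url]
        for url in download_urls
    ]


def download_all(download_urls, key, secret):
    """Run the commands sequentially; stops at the first curl failure."""
    for command in curl_commands(download_urls, key, secret):
        # check=True raises CalledProcessError if curl exits non-zero.
        subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems if a URL or secret contains special characters.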

Downloading Released Files from AWS Open Data

As part of the AWS Open Data program, the 4DN-DCIC hosts publicly released data in an open S3 bucket. Some basic information on the data and access provided by 4DN via this program can be found here. Accessing data that resides in these buckets via the mechanisms described in the previous sections works as expected and should be transparent to you. However, if you do not wish to create an account on the 4DN data portal but do want to download specific data files along with informative metadata, you can use the following mechanism.

Select your files of interest and generate a metadata.tsv file for them as described in the Batch Download section above.

If there is a value in the open_data_url column for a file you have selected, then that file lives in the open data bucket and can be directly accessed using that URL:

e.g. curl -O <open_data_url>
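
To collect these URLs for every selected file at once, the metadata.tsv file can be scanned programmatically. The following is a minimal sketch with a hypothetical function name. It assumes the file is tab-separated and that one row contains the column name open_data_url as a header; check the header in your own file and pass a different column name if it is spelled differently. Rows without an https:// value in that column, such as files not in the open data bucket or the instruction lines at the bottom of the file, are skipped.

```python
import csv


def open_data_urls(tsv_path, column="open_data_url"):
    """Return the non-empty open data URLs listed in a metadata.tsv file."""
    urls = []
    index = None  # position of the URL column, unknown until the header row
    with open(tsv_path, newline="") as handle:
        for fields in csv.reader(handle, delimiter="\t"):
            if index is None:
                # Still looking for the header row that names the column.
                if column in fields:
                    index = fields.index(column)
                continue
            # Keep only rows that actually carry a URL in that column.
            if index < len(fields) and fields[index].startswith("https://"):
                urls.append(fields[index])
    return urls
```

Each returned URL can then be passed to curl -O as shown above.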

If you wish to use the AWS command line tools, you will need to convert the provided open data URL into the corresponding s3:// location.

For example:

https://4dn-open-data-public.s3.amazonaws.com/fourfront-webprod/files/34067f2a-0586-44a8-adf8-d4336db309c5/4DNFIFISE78E.fastq.gz

must be converted to:

s3://4dn-open-data-public/fourfront-webprod/files/34067f2a-0586-44a8-adf8-d4336db309c5/4DNFIFISE78E.fastq.gz

to be accessed with the CLI.
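
The conversion shown above can be sketched in Python as follows. The function name is hypothetical, and the code assumes the URL has the form https://<bucket>.s3.amazonaws.com/<path>, as in the example; URLs in any other form are rejected rather than guessed at.

```python
from urllib.parse import urlparse

S3_HOST_SUFFIX = ".s3.amazonaws.com"


def open_data_url_to_s3_uri(url):
    """Convert https://<bucket>.s3.amazonaws.com/<path> to s3://<bucket>/<path>."""
    parsed = urlparse(url)
    host = parsed.netloc
    if parsed.scheme != "https" or not host.endswith(S3_HOST_SUFFIX):
        raise ValueError(f"not a recognized open data URL: {url}")
    # The bucket name is the host without the S3 suffix.
    bucket = host[: -len(S3_HOST_SUFFIX)]
    # parsed.path already starts with "/", so it joins directly.
    return f"s3://{bucket}{parsed.path}"
```

The resulting URI can be passed to a command such as aws s3 cp. Because the bucket is public, the AWS CLI's --no-sign-request option should allow access without configured AWS credentials.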