Reproducible Data Analysis

Overview

  • To ensure reproducible data analysis, 4DN data are processed through pipelines packaged as Docker images and described in Common Workflow Language (CWL).
  • All CWL workflow descriptions are collected in a public GitHub repository, https://github.com/4dn-dcic/pipelines-cwl. The CWL files are versioned according to the release versions of this repository.
  • Individual Docker images are stored on Docker Hub (https://hub.docker.com), and the source files for each image are stored and versioned on GitHub. The CWL files for an individual pipeline can also be found in the Docker source repository of that pipeline on GitHub.
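
To illustrate how a CWL description ties a pipeline step to a specific Docker image, here is a minimal sketch in Python. It assumes a CWL tool written in JSON (CWL accepts JSON as well as YAML) that declares its image through a DockerRequirement entry. The tool, the script name, and the image name "example-org/example-pipeline:v1" are hypothetical and are not taken from the 4DN repositories; the helper docker_images is likewise illustrative only.

```python
import json

# Hypothetical CWL tool description in JSON form.
# The command and image name are placeholders, not actual 4DN pipeline values.
EXAMPLE_CWL = """
{
  "cwlVersion": "v1.0",
  "class": "CommandLineTool",
  "baseCommand": ["run-example.sh"],
  "hints": [
    {"class": "DockerRequirement", "dockerPull": "example-org/example-pipeline:v1"}
  ],
  "inputs": [],
  "outputs": []
}
"""


def docker_images(cwl_doc):
    """Collect dockerPull values from DockerRequirement entries.

    Looks in both "requirements" and "hints", since CWL allows a
    DockerRequirement to appear in either section.
    """
    images = []
    for section in ("requirements", "hints"):
        entries = cwl_doc.get(section, [])
        # CWL permits these sections as a list of objects or as a map
        # keyed by class name; normalize the map form to a list.
        if isinstance(entries, dict):
            entries = [{**value, "class": key} for key, value in entries.items()]
        for entry in entries:
            if entry.get("class") == "DockerRequirement" and "dockerPull" in entry:
                images.append(entry["dockerPull"])
    return images


tool = json.loads(EXAMPLE_CWL)
print(docker_images(tool))
```

Because the image reference, including its tag, is recorded in the workflow description, checking out a given release of the CWL files also identifies which image version each step is meant to run with.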

Tibanna

Pipelines are executed and monitored using Tibanna, a workflow management system for cloud-based data processing.