{"lab": {"status": "current", "display_title": "4DN DCIC, HMS", "uuid": "828cd4fe-ebb0-4b36-a94a-d2e3a36cc989", "title": "4DN DCIC, HMS", "@id": "/labs/4dn-dcic-lab/", "correspondence": [{"contact_email": "cGV0ZXJfcGFya0BobXMuaGFydmFyZC5lZHU=", "@id": "/users/fb287a31-e765-41c5-8c1d-665f8e9f025b/", "display_title": "Peter Park"}], "@type": ["Lab", "Item"], "pi": {"error": "no view permissions"}, "principals_allowed": {"view": ["system.Everyone"], "edit": ["group.admin", "role.lab_submitter", "submits_for.828cd4fe-ebb0-4b36-a94a-d2e3a36cc989"]}}, "body": "Interaction pairs are parsed from the `bam` files using [`pairtools`](https://github.com/mirnylab/pairtools) version 0.2.2. Filtering consists of several commands:\n\n* `pairtools parse`\n   * Produces a [pairsam](https://pairsamtools.readthedocs.io/en/latest/pairsam.html) file from an input `bam` file.\n   * The pairsam file is a pairs file, listing one read pair per line, with additional columns to track the sam-file lines, and a pairtools read classification.\n   * These classifications include information on whether the read aligned to 0, 1, or multiple places in the genome and whether it aligned end-to-end or if it was clipped.\n   * This tool also upper-triangularizes the reads, i.e. if the coordinate of the first read is higher than that of the second, the reads are flipped.\n   * For more details, see the [pairtools documentation](https://pairtools.readthedocs.io/en/latest/parsing.html).\n\n* `pairtools sort`\n   * Produces a sorted `pairsam` file from an input `pairsam` file.\n   * Note that the flipping order and sort order of chromosomes are not identical. 
See [the docs](https://pairtools.readthedocs.io/en/latest/sorting.html#chromosomal-order-for-sorting-and-flipping) for more details.\n\n* `pairtools dedup --mark-dups`\n   * (equivalent to `pairtools markasdup`)\n   * Identify duplicate alignments.\n   * Arbitrarily designate the duplicate status among the two duplicate alignments.\n\n* `pairtools select`\n   * Remove duplicates, multi-mapped reads, and reads non-uniquely mapped at the 5' end.\n\nSource files (v1.1.1\\_dcic\\_4): \n\n* Workflow: https://data.4dnucleome.org/workflows/4DNWFMRGIPB1/\n* CWL: https://github.com/4dn-dcic/iMARGI-Docker/blob/v1.1.1_dcic_4/src/cwl/imargi-processing-bam.cwl", "name": "resources.data-analysis.imargi-processing-pipeline.bam", "award": {"@id": "/awards/2U01CA200059-06/", "project": "4DN", "@type": ["Award", "Item"], "name": "2U01CA200059-06", "display_title": "4D NUCLEOME NETWORK DATA COORDINATION AND INTEGRATION CENTER - PHASE II", "uuid": "71171a4e-dca1-44cb-8375-fafd896c6923", "description": "DCIC: The goals of the 4D Nucleome (4DN) Data Coordination and Integration Center (DCIC) are to collect, store, curate, display, and analyze data generated in the 4DN Network. We have assembled a team of investigators, staff scientists, and developers with a strong track record in analysis of chromatin interaction data, image processing, data visualization, integrative analysis of genomic and epigenomic data, data portal development, large-scale computing, and development of secure and flexible cloud technologies. In the first phase of the 4DN Project, we have developed the 4DN Data Portal as a central resource with tools for data submission, curation, analysis and quality control, visualization, exploration, and download. The portal provides an easy-to-navigate interface for accessing raw and intermediate data files, allows for programmatic access via APIs, and incorporates novel analysis and visualization tools developed by DCIC as well as other Network members. 
In the second phase of the 4DN Project, we will continue to support the research activities by the 4DN Network, and to lead the creation of a well curated 4DN data resource for the scientific community. At the same time, we propose to enhance the utility of the 4DN Scientific Data and the Data Portal in multiple ways: i. We will create a platform to integrate imaging and sequencing data and support the creating of reference nuclear maps in a common coordinate system; ii. We will provide support for 4DN Projects on Human Health and Disease with customized ontology applications and protected data management; iii. We will develop new cloud platform capabilities to bring user analyses to the 4DN Data Portal, and apply cost-efficiency improvements to support increasing data volumes; iv. We will perform regular outreach activities to raise awareness about the data and tools generated by the Network and DCIC. Overall, we will ensure that the data generated in 4DN will have maximal impact for the scientific community.", "status": "current", "center_title": "DCIC - Park", "pi": {"error": "no view permissions"}, "principals_allowed": {"view": ["system.Everyone"], "edit": ["group.admin"]}}, "title": "Parsing", "status": "released", "aliases": ["4dn-dcic-lab:resources.data-analysis.imargi-processing-pipeline.bam"], "options": {"filetype": "md", "collapsible": false, "default_open": true, "convert_ext_links": true}, "date_created": "2021-10-04T15:19:02.775182+00:00", "section_type": "Page Section", "submitted_by": {"error": "no view permissions"}, "last_modified": {"modified_by": {"error": "no view permissions"}, "date_modified": "2024-03-26T18:52:41.840887+00:00"}, "schema_version": "2", "@id": "/static-sections/7d3dfe7b-f35f-4681-aa3e-46bdf3ecde54/", "@type": ["StaticSection", "UserContent", "Item"], "uuid": "7d3dfe7b-f35f-4681-aa3e-46bdf3ecde54", "principals_allowed": {"view": ["system.Everyone"], "edit": ["group.admin", "role.owner", 
"userid.545f1931-792c-4a7e-83b3-3e91baea4e30"]}, "display_title": "Parsing", "external_references": [], "content": "Interaction pairs are parsed from the `bam` files using [`pairtools`](https://github.com/mirnylab/pairtools) version 0.2.2. Filtering consists of several commands:\n\n* `pairtools parse`\n   * Produces a [pairsam](https://pairsamtools.readthedocs.io/en/latest/pairsam.html) file from an input `bam` file.\n   * The pairsam file is a pairs file, listing one read pair per line, with additional columns to track the sam-file lines, and a pairtools read classification.\n   * These classifications include information on whether the read aligned to 0, 1, or multiple places in the genome and whether it aligned end-to-end or if it was clipped.\n   * This tool also upper-triangularizes the reads, i.e. if the coordinate of the first read is higher than that of the second, the reads are flipped.\n   * For more details, see the [pairtools documentation](https://pairtools.readthedocs.io/en/latest/parsing.html).\n\n* `pairtools sort`\n   * Produces a sorted `pairsam` file from an input `pairsam` file.\n   * Note that the flipping order and sort order of chromosomes are not identical. 
See [the docs](https://pairtools.readthedocs.io/en/latest/sorting.html#chromosomal-order-for-sorting-and-flipping) for more details.\n\n* `pairtools dedup --mark-dups`\n   * (equivalent to `pairtools markasdup`)\n   * Identify duplicate alignments.\n   * Arbitrarily designate the duplicate status among the two duplicate alignments.\n\n* `pairtools select`\n   * Remove duplicates, multi-mapped reads, and reads non-uniquely mapped at the 5' end.\n\nSource files (v1.1.1\\_dcic\\_4): \n\n* Workflow: https://data.4dnucleome.org/workflows/4DNWFMRGIPB1/\n* CWL: https://github.com/4dn-dcic/iMARGI-Docker/blob/v1.1.1_dcic_4/src/cwl/imargi-processing-bam.cwl", "filetype": "md", "content_as_html": "<div class=\"markdown-container\"><p>Interaction pairs are parsed from the <code>bam</code> files using <a href=\"https://github.com/mirnylab/pairtools\" rel=\"noopener noreferrer\" target=\"_blank\"><code>pairtools</code></a> version 0.2.2. Filtering consists of several commands:</p>\n<ul>\n<li><code>pairtools parse</code></li>\n<li>Produces a <a href=\"https://pairsamtools.readthedocs.io/en/latest/pairsam.html\" rel=\"noopener noreferrer\" target=\"_blank\">pairsam</a> file from an input <code>bam</code> file.</li>\n<li>The pairsam file is a pairs file, listing one read pair per line, with additional columns to track the sam-file lines, and a pairtools read classification.</li>\n<li>These classifications include information on whether the read aligned to 0, 1, or multiple places in the genome and whether it aligned end-to-end or if it was clipped.</li>\n<li>This tool also upper-triangularizes the reads, i.e. 
if the coordinate of the first read is higher than that of the second, the reads are flipped.</li>\n<li>\n<p>For more details, see the <a href=\"https://pairtools.readthedocs.io/en/latest/parsing.html\" rel=\"noopener noreferrer\" target=\"_blank\">pairtools documentation</a>.</p>\n</li>\n<li>\n<p><code>pairtools sort</code></p>\n</li>\n<li>Produces a sorted <code>pairsam</code> file from an input <code>pairsam</code> file.</li>\n<li>\n<p>Note that the flipping order and sort order of chromosomes are not identical. See <a href=\"https://pairtools.readthedocs.io/en/latest/sorting.html#chromosomal-order-for-sorting-and-flipping\" rel=\"noopener noreferrer\" target=\"_blank\">the docs</a> for more details.</p>\n</li>\n<li>\n<p><code>pairtools dedup --mark-dups</code></p>\n</li>\n<li>(equivalent to <code>pairtools markasdup</code>)</li>\n<li>Identify duplicate alignments.</li>\n<li>\n<p>Arbitrarily designate the duplicate status among the two duplicate alignments.</p>\n</li>\n<li>\n<p><code>pairtools select</code></p>\n</li>\n<li>Remove duplicates, multi-mapped reads, and reads non-uniquely mapped at the 5' end.</li>\n</ul>\n<p>Source files (v1.1.1_dcic_4): </p>\n<ul>\n<li>Workflow: https://data.4dnucleome.org/workflows/4DNWFMRGIPB1/</li>\n<li>CWL: https://github.com/4dn-dcic/iMARGI-Docker/blob/v1.1.1_dcic_4/src/cwl/imargi-processing-bam.cwl</li>\n</ul></div>", "@context": "/terms/", "aggregated-items": {}, "validation-errors": []}