
File name collision when using 1-treatment replicate vs. >1-control replicates, i.e. cannot specify the same sample,replicate pair for each different control replicate #467

Open
@a1ultima

Description of the bug

Background

This produces the same error behaviour and outcome as the related issue #440 (comment).

We have metadata from a large-scale plant ChIP-seq study called ChipHub, in which the ChIP-seq pipeline is run for some samples that have a one-to-many relationship between a single treatment biological replicate and several control biological replicates:

sample,fastq_1,fastq_2,replicate,antibody,control,control_replicate
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT,1
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT,2
WT_INPUT,BLA203A6_S32_L006_R1_001.fastq.gz,,1,,,
WT_INPUT,BLA203A30_S21_L001_R1_001.fastq.gz,,2,,,

In this minimal example, we accommodate the case where the sample and replicate values must be flattened (repeated) for each different control replicate that ChipHub specifies for peak calling, i.e. we want to compare each treatment replicate against each different control replicate.

For clarity, we focus on just the treatment rows, annotated in comments as repeat_i=0 and repeat_i=1 respectively:

sample,fastq_1,fastq_2,replicate,antibody,control,control_replicate
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT,1  // <-- repeat_i=0
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT,2  // <-- repeat_i=1, but different control replicate

The second row is a repeat of the first, in the sense that we want to keep everything equal (sample, replicate, antibody, control) and differ only in which control biological replicate we compare against (e.g. for peak calling):

Matching columns

sample,fastq_1,fastq_2,replicate,antibody,control
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT

Differing columns:

control_replicate
1
2
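To make the collision concrete, here is a minimal Python sketch using the samplesheet from this issue. The naming scheme in `output_id` (`<sample>_REP<replicate>`) and the helper `colliding_ids` are hypothetical and written only for illustration; they are not taken from the pipeline's code. The sketch assumes output names are derived from the sample,replicate pair alone, and shows that such names clash for the two treatment rows, while a name that also carries the control and control_replicate values would not.

```python
import csv
import io
from collections import Counter

# Samplesheet copied from the minimal example in this issue.
SAMPLESHEET = """\
sample,fastq_1,fastq_2,replicate,antibody,control,control_replicate
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT,1
WT_BCATENIN_IP,BLA203A1_S27_L006_R1_001.fastq.gz,,1,BCATENIN,WT_INPUT,2
WT_INPUT,BLA203A6_S32_L006_R1_001.fastq.gz,,1,,,
WT_INPUT,BLA203A30_S21_L001_R1_001.fastq.gz,,2,,,
"""


def output_id(row, include_control=False):
    """Build an output name for one row (hypothetical scheme, for illustration)."""
    name = f"{row['sample']}_REP{row['replicate']}"
    # Optionally append the control and its replicate to disambiguate the name.
    if include_control and row["control"]:
        name += f"_vs_{row['control']}_REP{row['control_replicate']}"
    return name


def colliding_ids(rows, include_control=False):
    """Return the output names that would be produced by more than one row."""
    counts = Counter(output_id(row, include_control) for row in rows)
    return sorted(name for name, n in counts.items() if n > 1)


rows = list(csv.DictReader(io.StringIO(SAMPLESHEET)))

# Names built from sample,replicate only: the two treatment rows clash.
print(colliding_ids(rows))  # ['WT_BCATENIN_IP_REP1']

# Names that also include control,control_replicate: no clash.
print(colliding_ids(rows, include_control=True))  # []
```

Whether the pipeline should disambiguate in this way, or handle the one-to-many case differently, is a design question for the maintainers; the sketch only shows where the duplicate name comes from.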


Command used and terminal output

Relevant files

No response

System information

No response
