
Repeated files and reusing input and filtered data #312

@jamesaliba

Description

I have inputs IPed for several targets. Because I need to run the pipeline separately for every target, the same inputs are re-aligned and re-filtered each time, wasting time, computational power, and disk space. Space in general is also an issue: every run generates 300 GB of files. Could you implement an option that deletes some files after use? Because I cannot run all the .json configs one after the other automatically, I run out of space, so I have to run each JSON alone, delete hundreds of GBs, and then run the next one. As a result I am stuck babysitting the pipeline when I would rather set it and forget it.

TLDR:
1. Run all JSONs combined, so identical samples are processed only once.
2. Auto-delete redundant files as the pipeline goes (it also looks like the same files are being duplicated to serve as inputs for the many jobs, one copy per job).
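As an interim workaround for point 2, the run-delete-run cycle described above can be automated with a small wrapper. This is a minimal sketch, assuming the pipeline is launched once per JSON config and writes its intermediates to a single known directory; the `runner` callable and the `work_dir` layout are placeholders, not the pipeline's real interface.

```python
import shutil
from pathlib import Path

def run_all(configs, runner, work_dir):
    """Run `runner` on each JSON config in sequence, deleting
    `work_dir` (the pipeline's intermediate output) between runs
    so disk usage never accumulates across configs.

    `runner` is a placeholder for the real pipeline invocation,
    e.g. subprocess.run(["pipeline", str(cfg)], check=True).
    """
    for cfg in configs:
        runner(cfg)  # process one target's JSON config
        # Reclaim the intermediate files (the ~300 GB per run)
        # before starting the next config.
        shutil.rmtree(work_dir, ignore_errors=True)
```

This only automates the manual delete step between runs; it does not address point 1 (re-aligning identical input samples), which needs support inside the pipeline itself.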
