Feature request: modify "data/vcf/phased/" folder #8

@bschilder

Hi!

Would it be possible to add an extra parameter to the config that lets users specify the location of the intermediate VCF outputs?

In my case the VCFs are sharded into many smaller VCFs (even within the same chromosome), so I need a way to keep the output from one shard from overwriting the output from another.

Currently, this is hard-coded in the Snakefile as data/vcf/phased/:
https://github.com/search?q=repo%3AProGenNo%2FProHap%20data%2Fvcf%2Fphased&type=code
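To make the request concrete, here is a minimal sketch of how such a parameter could be resolved. Snakemake exposes config values as a plain `config` dictionary, so a helper like the one below could fall back to the current hard-coded path when the key is absent. The key name `phased_vcf_dir` and the helper `resolve_phased_dir` are hypothetical and not part of ProHap.

```python
import os

# Current hard-coded location, used as the fallback.
DEFAULT_PHASED_DIR = "data/vcf/phased/"


def resolve_phased_dir(config, key="phased_vcf_dir"):
    """Return the phased-VCF directory from a config dict.

    `key` is a hypothetical config entry; if it is missing, the
    existing default path is returned unchanged.
    """
    path = config.get(key, DEFAULT_PHASED_DIR)
    # Joining with "" guarantees a trailing separator, so the result
    # can be concatenated with file names the way the literal is.
    return os.path.join(path, "")
```

Each shard could then be run with its own value for this key, for example `{"phased_vcf_dir": "data/vcf/phased_0001234"}`, while existing configs keep the current behavior.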

In the meantime, I wrote a Python script to modify the Snakefile:

import os


def edit_snakefile(input_file=os.path.join(os.environ['PWD'], "ProHap/Snakefile"),
                   output_file="auto",
                   replace_dict=None):
    # "auto" means overwrite the input file in place
    if output_file == "auto":
        output_file = input_file

    with open(input_file, "r") as f:
        snakefile = f.read()

    # Apply each literal old -> new substitution
    if replace_dict:
        for k, v in replace_dict.items():
            snakefile = snakefile.replace(k, v)

    # Passing output_file=None skips writing
    if output_file is not None:
        with open(output_file, "w") as f:
            f.write(snakefile)

    return output_file


edit_snakefile(os.path.join(os.environ['PWD'], "ProHap/Snakefile_chr22"),
               replace_dict={"data/vcf/phased/": "data/vcf/phased_0001234/"})
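For several shards, the same substitution can be applied in a loop that writes one Snakefile per shard and leaves the template untouched. This is only a sketch of the workaround: the function name `write_shard_snakefiles`, the `Snakefile_<shard>` naming, and the `phased_{shard}` directory pattern are assumptions for illustration.

```python
from pathlib import Path


def write_shard_snakefiles(template, shard_ids, out_dir,
                           old="data/vcf/phased/",
                           new_pattern="data/vcf/phased_{shard}/"):
    """Write one Snakefile copy per shard with a shard-specific VCF folder.

    Returns a dict mapping each shard ID to the path of its Snakefile.
    The template file itself is only read, never modified.
    """
    template_text = Path(template).read_text()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = {}
    for shard in shard_ids:
        target = out_dir / f"Snakefile_{shard}"
        # Literal replacement of the hard-coded folder with the shard folder
        target.write_text(template_text.replace(old, new_pattern.format(shard=shard)))
        written[shard] = str(target)
    return written
```

For example, `write_shard_snakefiles("ProHap/Snakefile", ["0001234", "0001235"], "ProHap/shards")` would produce two Snakefiles that write to `data/vcf/phased_0001234/` and `data/vcf/phased_0001235/` respectively.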
