DELFI

finaletoolkit.frag.delfi(input_file: str, autosomes: str, bins_file: str, reference_file: str, blacklist_file: str = None, gap_file: Union(str, GenomeGaps) = None, output_file: str = None, gc_correct: bool = True, merge_bins: bool = True, window_size: int = 5000000, subsample_coverage: float = 2, quality_threshold: int = 30, workers: int = 1, preprocessing: bool = True, verbose: int | bool = False) → pandas.DataFrame

A function that replicates the methodology of Cristiano et al. (2019).

Parameters:
  • input_file (str) – Path string pointing to a BAM file containing paired-end (PE) fragment reads.

  • autosomes (str) – Path string to a .genome file containing only autosomal chromosomes.

  • bins_file (str) – Path string to a BED file containing 100 kb bins for the reference genome of choice. Cristiano et al. uses

  • reference_file (str) – Path string to a .2bit file.

  • blacklist_file (str) – Path string to a BED file containing the genome blacklist.

  • gap_file (str or GenomeGaps) – Path string to a BED4+ file where each interval is a centromere or telomere. A BED file can be used only if the fourth field of each entry corresponding to a telomere or centromere is labeled “telomere” or “centromere”, respectively.

  • output_file (str, optional) – Path to the output TSV file.

  • window_size (int) – Size of non-overlapping windows to cover genome. Default is 5 megabases.

  • subsample_coverage (float, optional) – The depth to which the input BAM file (input_file) is subsampled. Default is 2.

  • workers (int, optional) – Number of worker processes to use. Default is 1.

  • preprocessing (bool, optional) – Cristiano et al. (2019)

  • verbose (int or bool, optional) – Determines how many print statements and loading bars appear in stdout. Default is False.
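The parameters above can be assembled into a call roughly as follows. This is a minimal sketch, not an official example: every file name, the `data` directory, and the helper names `check_gap_bed`, `delfi_kwargs`, and `run` are hypothetical. Only the keyword names and the `finaletoolkit.frag.delfi` import path come from the signature documented here. The gap-file check applies the labeling rule stated for gap_file, on the assumption that every interval in the file must carry one of the two labels.

```python
from pathlib import Path

# Labels the gap_file description requires in the fourth BED field.
VALID_GAP_LABELS = {"centromere", "telomere"}


def check_gap_bed(lines):
    """Return the BED lines whose fourth field is missing or not a valid label."""
    bad = []
    for line in lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4 or fields[3] not in VALID_GAP_LABELS:
            bad.append(line)
    return bad


def delfi_kwargs(sample_dir):
    """Collect keyword arguments for delfi(); all file names are hypothetical."""
    d = Path(sample_dir)
    return dict(
        input_file=str(d / "sample.bam"),
        autosomes=str(d / "autosomes.genome"),
        bins_file=str(d / "bins_100kb.bed"),
        reference_file=str(d / "reference.2bit"),
        blacklist_file=str(d / "blacklist.bed"),
        gap_file=str(d / "gaps.bed"),
        output_file=str(d / "delfi.tsv"),
        window_size=5_000_000,   # documented default, written out for clarity
        subsample_coverage=2,    # documented default
        workers=4,
    )


def run(sample_dir):
    """Validate the gap file, then call delfi(). Requires FinaleToolkit."""
    from finaletoolkit.frag import delfi  # imported lazily; needs the package installed

    kwargs = delfi_kwargs(sample_dir)
    with open(kwargs["gap_file"]) as fh:
        problems = check_gap_bed(fh)
    if problems:
        raise ValueError(f"{len(problems)} gap entries lack a valid label")
    return delfi(**kwargs)  # documented to return a pandas.DataFrame
```

Calling `run("data")` would then produce the DataFrame and, because output_file is set, a TSV on disk. Keeping the argument assembly separate from the call makes it possible to inspect or log the arguments without running the full analysis.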

finaletoolkit.frag.delfi_gc_correct(windows: DataFrame, alpha: float = 0.75, it: int = 8, verbose: bool = False)

Helper function that takes window data and performs GC adjustment.

finaletoolkit.frag.delfi_merge_bins(hundred_kb_bins: DataFrame, gc_corrected: bool = True, add_chr: bool = False, verbose: bool = False)
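Only the signature of delfi_merge_bins is documented here. Going by the parameter descriptions for delfi (100 kb bins and non-overlapping windows of 5 megabases by default), 50 bins fall into each window. The sketch below illustrates that arithmetic in plain Python. The tuple layout and the `merge_counts` helper are hypothetical, and this is not FinaleToolkit's implementation, which operates on a DataFrame and may treat chromosome arms, gaps, and GC-corrected values differently.

```python
def merge_counts(bins, window_size=5_000_000):
    """Sum per-bin counts into non-overlapping windows.

    bins: iterable of (contig, start, stop, count) tuples, e.g. 100 kb bins.
    Returns a dict mapping (contig, window_start) to the summed count.
    """
    merged = {}
    for contig, start, stop, count in bins:
        # Each bin is assigned to the window that contains its start position.
        window_start = (start // window_size) * window_size
        key = (contig, window_start)
        merged[key] = merged.get(key, 0) + count
    return merged


# Three hypothetical 100 kb bins: two in the first 5 Mb window, one in the second.
example_bins = [
    ("chr1", 0, 100_000, 3),
    ("chr1", 4_900_000, 5_000_000, 2),
    ("chr1", 5_000_000, 5_100_000, 7),
]
merged = merge_counts(example_bins)
```

Here `merged` groups the first two bins under `("chr1", 0)` and the third under `("chr1", 5000000)`.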