Windowed Protection Score (WPS)#

finaletoolkit.frag.wps(input_file: str | AlignmentFile, contig: str, start: int | str, stop: int | str, output_file: str = None, window_size: int = 120, fraction_low: int = 120, fraction_high: int = 180, quality_threshold: int = 30, verbose: bool | int = 0) → ndarray#

Return raw Windowed Protection Scores, as specified in Snyder et al. (2016), over the region [start, stop).

Parameters#

input_file : str or pysam.AlignmentFile

A BAM, SAM, or tabix file containing paired-end fragment reads, or the path to such a file. An AlignmentFile must be opened in read mode.

contig : str

start : int

stop : int

output_file : string, optional

window_size : int, optional

Size of window to calculate WPS. Default is k = 120, equivalent to L-WPS.

fraction_low : int, optional

Specifies lowest fragment length included in calculation. Default is 120, equivalent to long fraction.

fraction_high : int, optional

Specifies highest fragment length included in calculation. Default is 180, equivalent to long fraction.

quality_threshold : int, optional

workers : int, optional

verbose : bool, optional

Returns#

scores : numpy.ndarray

NumPy array of shape (n, 2), where column 1 is the coordinate, column 2 is the score, and n is the number of coordinates in the region [start, stop).
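The calculation described above can be illustrated on toy data. The following is a minimal pure-Python sketch, not finaletoolkit's implementation. It assumes the usual reading of Snyder et al. (2016): the score at a position is the number of fragments that completely span a window centered there, minus the number of fragments with an endpoint inside that window. The function name `toy_wps`, the half-open window `[pos - k/2, pos + k/2)`, and the handling of boundary cases are assumptions made for illustration.

```python
def toy_wps(fragments, start, stop, window_size=120,
            fraction_low=120, fraction_high=180):
    """Return a list of (coordinate, score) pairs over [start, stop).

    `fragments` is a list of half-open (frag_start, frag_stop) intervals.
    This is an illustrative sketch; boundary conventions are assumptions.
    """
    half = window_size // 2
    # Keep only fragments whose length lies in [fraction_low, fraction_high].
    kept = [(s, e) for s, e in fragments
            if fraction_low <= e - s <= fraction_high]

    scores = []
    for pos in range(start, stop):
        win_start, win_stop = pos - half, pos + half
        spanning = 0
        ending = 0
        for s, e in kept:
            if s <= win_start and e >= win_stop:
                # Fragment covers the whole window.
                spanning += 1
            elif win_start <= s < win_stop or win_start <= e - 1 < win_stop:
                # First or last base of the fragment falls inside the window.
                ending += 1
        scores.append((pos, spanning - ending))
    return scores


if __name__ == "__main__":
    frags = [(0, 10), (3, 12), (5, 7)]
    for coordinate, score in toy_wps(frags, 4, 7, window_size=4,
                                     fraction_low=2, fraction_high=20):
        print(coordinate, score)
```

The list of pairs mirrors the documented (n, 2) return shape: one row per coordinate in [start, stop), with the coordinate first and the score second.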

finaletoolkit.frag.multi_wps(input_file: AlignmentFile | str, site_bed: str, output_file: str | None = None, window_size: int = 120, interval_size: int = 5000, fraction_low: int = 120, fraction_high: int = 180, quality_threshold: int = 30, workers: int = 1, verbose: bool | int = 0) → ndarray#

Aggregates WPS over the sites in a BED file, following the method described by Snyder et al. (2016).

Parameters#

input_file : str or pysam.AlignmentFile

A BAM, SAM, or tabix file containing paired-end fragment reads, or the path to such a file. An AlignmentFile must be opened in read mode.

site_bed : str

BED file containing the intervals over which to calculate WPS.

output_file : string, optional

window_size : int, optional

Size of window to calculate WPS. Default is k = 120, equivalent to L-WPS.

interval_size : int, optional

Size of each interval specified in the BED file. Should be the same for every interval. Default is 5000.

fraction_low : int, optional

Specifies lowest fragment length included in calculation. Default is 120, equivalent to long fraction.

fraction_high : int, optional

Specifies highest fragment length included in calculation. Default is 180, equivalent to long fraction.

quality_threshold : int, optional

workers : int, optional

verbose : bool, optional

Returns#

scores : numpy.ndarray

NumPy array of shape (n, 2), where column 1 is the coordinate, column 2 is the score, and n is the number of coordinates in the region [start, stop).
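Because `interval_size` should be the same for every interval, it can help to validate the BED file before running anything expensive. The sketch below is a hypothetical pre-check, not part of finaletoolkit: `read_uniform_intervals` parses BED text and rejects intervals of the wrong size, and `sum_by_offset` shows one plausible reading of "aggregates", namely a position-wise sum of per-site scores. Both function names and the choice of summation are assumptions.

```python
def read_uniform_intervals(bed_text, interval_size=5000):
    """Parse BED text and check that every interval has the expected size."""
    intervals = []
    for line in bed_text.splitlines():
        # Skip blank lines and common BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        contig, start, stop = line.split("\t")[:3]
        start, stop = int(start), int(stop)
        if stop - start != interval_size:
            raise ValueError(
                f"{contig}:{start}-{stop} has size {stop - start}, "
                f"expected {interval_size}"
            )
        intervals.append((contig, start, stop))
    return intervals


def sum_by_offset(per_site_scores):
    """Sum equal-length score lists position by position (assumed aggregation)."""
    return [sum(column) for column in zip(*per_site_scores)]


if __name__ == "__main__":
    bed = "chr1\t0\t10\nchr2\t5\t15\n"
    print(read_uniform_intervals(bed, interval_size=10))
    print(sum_by_offset([[1, 2, 3], [4, 5, 6]]))
```

Checking sizes up front means that a single malformed interval is reported with its coordinates instead of surfacing later as a shape mismatch.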

finaletoolkit.frag.adjust_wps(input_file: str, interval_file: str, output_file: str, genome_file: str, median_window_size: int = 1000, savgol_window_size: int = 21, savgol_poly_deg: int = 2, mean: bool = False, subtract_edges: bool = False, edge_size: int = 500, workers: int = 1, verbose: Union(bool, int) = False)#

Adjusts raw WPS data in a BigWig file by applying a median filter and a Savitzky-Golay filter (Savitzky and Golay, 1964).
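The two filtering steps can be sketched on a plain list of scores. This is an illustration under stated assumptions, not the code finaletoolkit runs: it assumes the median step subtracts a running median from each score (as in the adjustment described by Snyder et al.), it fixes the Savitzky-Golay polynomial degree at 2 so that the well-known closed-form coefficients for quadratic smoothing apply, and it leaves the first and last half-window of the smoothed output untouched. All function names are hypothetical.

```python
from statistics import median


def subtract_running_median(scores, window_size=1000):
    """Subtract from each score the median of a window centered on it."""
    half = window_size // 2
    n = len(scores)
    adjusted = []
    for i in range(n):
        # The window is truncated at the ends of the sequence.
        window = scores[max(0, i - half):min(n, i + half + 1)]
        adjusted.append(scores[i] - median(window))
    return adjusted


def savgol_quadratic_coeffs(window_size=21):
    """Closed-form Savitzky-Golay smoothing weights for a degree-2 fit."""
    if window_size % 2 == 0 or window_size < 3:
        raise ValueError("window_size must be an odd integer >= 3")
    m = window_size
    half = m // 2
    denom = 4 * m * (m * m - 4)
    return [3 * (3 * m * m - 7 - 20 * i * i) / denom
            for i in range(-half, half + 1)]


def savgol_smooth(scores, window_size=21):
    """Smooth interior points; edge points are returned unchanged."""
    coeffs = savgol_quadratic_coeffs(window_size)
    half = window_size // 2
    smoothed = list(scores)
    for i in range(half, len(scores) - half):
        smoothed[i] = sum(c * scores[i - half + k]
                          for k, c in enumerate(coeffs))
    return smoothed


if __name__ == "__main__":
    raw = [float((x * 7) % 13) for x in range(60)]
    adjusted = savgol_smooth(subtract_running_median(raw, 11), 5)
    print(adjusted[:5])
```

A degree-2 Savitzky-Golay filter reproduces any quadratic signal exactly in the interior, which makes a convenient sanity check for the coefficients.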

Parameters#

input_file : str

Path string to a BigWig file containing raw WPS data.

interval_file : str

BED format file containing the intervals over which WPS was calculated.

output_file : str

BigWig file to write adjusted WPS to.

genome_file : str

The genome file for the reference genome that the WGS reads were aligned to: a tab-delimited file in which column 1 contains chromosome names and column 2 contains chromosome lengths.

median_window_size : int, optional

Size of median filter window. Default is 1000.

savgol_window_size : int, optional

Size of Savitzky-Golay filter window. Default is 21.

savgol_poly_deg : int, optional

Polynomial degree for the Savitzky-Golay filter. Default is 2.

mean : bool, optional

If True, a mean filter is used instead of a median filter. Default is False.

subtract_edges : bool, optional

If True, the median of the first and last 500 bases in a window is subtracted from the whole interval. Default is False.

workers : int, optional

Number of processes to use. Default is 1.

verbose : bool or int, optional

Default is False.
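Two of the parameters above describe simple data handling that a short sketch can make concrete: the two-column layout of `genome_file` and the edge subtraction controlled by `subtract_edges`. The helpers below are hypothetical and written only for illustration. In particular, pooling the first and last bases into a single median is an assumed reading of the description, and finaletoolkit may compute the baseline differently.

```python
from statistics import median


def read_genome_file(text):
    """Parse tab-delimited 'name<TAB>length' lines into a dict."""
    sizes = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, length = line.split("\t")[:2]
        sizes[name] = int(length)
    return sizes


def subtract_edge_median(scores, edge_size=500):
    """Subtract the pooled median of both edges from every score.

    Assumption: the first and last `edge_size` values are pooled
    before taking a single median.
    """
    if edge_size <= 0:
        raise ValueError("edge_size must be positive")
    edges = list(scores[:edge_size]) + list(scores[-edge_size:])
    baseline = median(edges)
    return [score - baseline for score in scores]


if __name__ == "__main__":
    print(read_genome_file("chr1\t1000\nchr2\t500\n"))
    print(subtract_edge_median([1, 1, 5, 9, 3, 3], edge_size=2))
```

Subtracting an edge baseline recenters each interval so that its flanks sit near zero, which keeps intervals with different overall coverage comparable.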