Windowed Protection Score (WPS)
- finaletoolkit.frag.wps(input_file: str | AlignmentFile, contig: str, start: int | str, stop: int | str, output_file: str = None, window_size: int = 120, fraction_low: int = 120, fraction_high: int = 180, quality_threshold: int = 30, verbose: bool | int = 0) → ndarray
Return raw Windowed Protection Scores, as specified in Snyder et al. (2016), over the region [start, stop).
Parameters
- input_file : str or pysam.AlignmentFile
BAM, SAM or tabix file containing paired-end fragment reads or its path. AlignmentFile must be opened in read mode.
- contig : str
- start : int
- stop : int
- output_file : str, optional
- window_size : int, optional
Size of window to calculate WPS. Default is k = 120, equivalent to L-WPS.
- fraction_low : int, optional
Shortest fragment length included in the calculation. Default is 120, the lower bound of the long fraction.
- fraction_high : int, optional
Longest fragment length included in the calculation. Default is 180, the upper bound of the long fraction.
- quality_threshold : int, optional
- workers : int, optional
- verbose : bool, optional
Returns
- scores : numpy.ndarray
NumPy array of shape (n, 2), where column 1 is the coordinate, column 2 is the score, and n is the number of coordinates in the region [start, stop).
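As a rough illustration of the score itself: Snyder et al. (2016) define the WPS at a position as the number of fragments spanning a window centered there minus the number of fragments with an endpoint inside that window. The following is a minimal sketch of that definition on in-memory fragment coordinates, not finaletoolkit's implementation. `toy_wps` is a hypothetical helper, and the half-open coordinates and exact window boundaries are assumptions of the sketch.

```python
import numpy as np


def toy_wps(fragments, start, stop, window_size=120,
            fraction_low=120, fraction_high=180):
    """Toy WPS over [start, stop) for half-open (frag_start, frag_stop) pairs."""
    half = window_size // 2
    # Keep only fragments whose length falls inside the selected fraction.
    kept = [(s, e) for s, e in fragments
            if fraction_low <= e - s <= fraction_high]
    coords = np.arange(start, stop)
    scores = np.zeros(coords.size, dtype=np.int64)
    for i, pos in enumerate(coords):
        # Half-open window centered on pos (assumed convention).
        win_start, win_stop = pos - half, pos + half
        for s, e in kept:
            if s <= win_start and e >= win_stop:
                scores[i] += 1  # fragment spans the whole window
            elif win_start <= s < win_stop or win_start < e <= win_stop:
                scores[i] -= 1  # fragment has an endpoint inside the window
    # Column 1: coordinate, column 2: score.
    return np.column_stack((coords, scores))


# One 160 bp fragment is kept; the 50 bp fragment is filtered out.
scores = toy_wps([(100, 260), (150, 200)], start=150, stop=210)
```

The result has the same (n, 2) layout described under Returns: coordinates in the first column and scores in the second.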
- finaletoolkit.frag.multi_wps(input_file: AlignmentFile | str, site_bed: str, output_file: str | None = None, window_size: int = 120, interval_size: int = 5000, fraction_low: int = 120, fraction_high: int = 180, quality_threshold: int = 30, workers: int = 1, verbose: bool | int = 0) → ndarray
Aggregates WPS over the sites in a BED file, following the method described by Snyder et al. (2016).
Parameters
- input_file : str or pysam.AlignmentFile
BAM, SAM, or tabix file containing paired-end fragment reads or its path. AlignmentFile must be opened in read mode.
- site_bed : str
BED file containing the intervals over which to calculate WPS.
- output_file : str, optional
- window_size : int, optional
Size of window to calculate WPS. Default is k = 120, equivalent to L-WPS.
- interval_size : int, optional
Size of each interval specified in the BED file. Should be the same for every interval. Default is 5000.
- fraction_low : int, optional
Shortest fragment length included in the calculation. Default is 120, the lower bound of the long fraction.
- fraction_high : int, optional
Longest fragment length included in the calculation. Default is 180, the upper bound of the long fraction.
- quality_threshold : int, optional
- workers : int, optional
- verbose : bool, optional
Returns
- scores : numpy.ndarray
NumPy array of shape (n, 2), where column 1 is the coordinate, column 2 is the score, and n is the number of coordinates in the region [start, stop).
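Because `interval_size` is expected to match every interval in the BED file, it can help to verify the file before running the aggregation. The following is a minimal sketch of such a check. `check_interval_sizes` is a hypothetical helper, not part of finaletoolkit, and it assumes a plain tab-delimited BED file whose first three columns are contig, start, and stop.

```python
def check_interval_sizes(bed_lines, interval_size=5000):
    """Return the BED records whose length differs from interval_size."""
    mismatched = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header or comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        contig, start, stop = line.split("\t")[:3]
        if int(stop) - int(start) != interval_size:
            mismatched.append((contig, int(start), int(stop)))
    return mismatched


# In-memory example; with a real file, pass the open file handle instead.
bad = check_interval_sizes(["chr1\t1000\t6000", "chr2\t0\t4000"])
```

An empty list means every interval has the expected size; any returned record would need to be resized or removed first.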
- finaletoolkit.frag.adjust_wps(input_file: str, interval_file: str, output_file: str, genome_file: str, median_window_size: int = 1000, savgol_window_size: int = 21, savgol_poly_deg: int = 2, mean: bool = False, subtract_edges: bool = False, edge_size: int = 500, workers: int = 1, verbose: bool | int = False)
Adjusts raw WPS data in a BigWig file by applying a median filter and a Savitzky-Golay filter (Savitzky and Golay, 1964).
Parameters
- input_file : str
Path string to a BigWig containing raw WPS data.
- interval_file : str
BED file containing the intervals over which WPS was calculated.
- output_file : str
BigWig file to write adjusted WPS to.
- genome_file : str
The genome file for the reference genome that WGS was aligned to. A tab-delimited file where column 1 contains the chromosome names and column 2 contains the chromosome lengths.
- median_window_size : int, optional
Size of the median filter window. Default is 1000.
- savgol_window_size : int, optional
Size of the Savitzky-Golay filter window. Default is 21.
- savgol_poly_deg : int, optional
Polynomial degree for the Savitzky-Golay filter. Default is 2.
- mean : bool, optional
If True, a mean filter is used instead of a median filter. Default is False.
- subtract_edges : bool, optional
If True, take the median of the first and last 500 bases in a window and subtract it from the whole interval. Default is False.
- workers : int, optional
Number of processes to use. Default is 1.
- verbose : bool or int, optional
Default is False.
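The adjustment can be pictured with SciPy's general-purpose filters. This is a minimal sketch that assumes the running median is subtracted from the raw scores before smoothing, as in the adjustment described by Snyder et al. (2016). `toy_adjust` is a hypothetical helper that works on an in-memory array rather than a BigWig file, and its edge handling is whatever `scipy.ndimage.median_filter` and `scipy.signal.savgol_filter` do by default, which may differ from `adjust_wps`.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import savgol_filter


def toy_adjust(raw_scores, median_window_size=1000,
               savgol_window_size=21, savgol_poly_deg=2):
    """Subtract a running median, then smooth with a Savitzky-Golay filter."""
    raw_scores = np.asarray(raw_scores, dtype=float)
    # Running median as a local baseline; SciPy pads the edges by
    # reflection by default (an assumption of this sketch).
    baseline = median_filter(raw_scores, size=median_window_size)
    centered = raw_scores - baseline
    # Fit a low-degree polynomial in each sliding window to smooth the signal.
    return savgol_filter(centered, savgol_window_size, savgol_poly_deg)


# Synthetic raw scores; a real run would read them from the raw WPS BigWig.
rng = np.random.default_rng(0)
adjusted = toy_adjust(rng.normal(size=5000))
```

Subtracting the local median removes slow drifts in coverage, so the smoothed signal oscillates around zero and peaks are comparable across intervals.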