coolpup.py CLI

Use the coolpup.py command to perform pileups, and plotpup.py to visualize them.

Submodules

coolpup.py command

usage: coolpup.py [-h] [--features_format {bed,bedpe,auto}] [--view VIEW]
                  [--flank FLANK] [--minshift MINSHIFT] [--maxshift MAXSHIFT]
                  [--nshifts NSHIFTS] [--expected EXPECTED] [--ooe OOE]
                  [--mindist MINDIST] [--maxdist MAXDIST]
                  [--ignore_diags IGNORE_DIAGS] [--excl_chrs EXCL_CHRS]
                  [--incl_chrs INCL_CHRS] [--subset SUBSET] [--anchor ANCHOR]
                  [--by_window] [--by_strand] [--by_distance] [--local]
                  [--coverage_norm] [--rescale]
                  [--rescale_flank RESCALE_FLANK]
                  [--rescale_size RESCALE_SIZE]
                  [--clr_weight_name CLR_WEIGHT_NAME] [-p N_PROC] [-o OUTNAME]
                  [--seed SEED] [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}]
                  [--post_mortem] [-v]
                  cool_path features
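
As an illustration of how the options in the usage line combine, the sketch below assembles one possible invocation as an argument list and prints it without running anything. The file names (sample.cool, loops.bedpe, loops_pileup.clpy) are hypothetical placeholders, not files shipped with the tool.

```python
import shlex

# Hypothetical input and output paths; replace with your own files.
cool_path = "sample.cool"
features = "loops.bedpe"

argv = [
    "coolpup.py",
    cool_path,
    features,
    "--features_format", "bedpe",
    "--flank", "100000",
    "--nshifts", "10",
    "-p", "4",
    "-o", "loops_pileup.clpy",
]

# shlex.join quotes each argument so the line can be pasted into a shell.
command = shlex.join(argv)
print(command)
```

Keeping the arguments in a list makes it easy to pass them to subprocess.run later, if desired.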

Positional Arguments

cool_path

Cooler file with your Hi-C data

features
A 3-column bed file or a 6-column double-bed file

i.e. chr1,start1,end1,chr2,start2,end2. Should be tab-delimited.

With a bed file, will consider all cis combinations of intervals. To pileup features along the diagonal instead, use the --local argument.

Can be piped in via stdin; in that case, use “-” as the argument.
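
To make "all cis combinations of intervals" concrete, here is a minimal sketch that enumerates every pair of intervals sharing a chromosome. The function name cis_pairs and the example coordinates are made up for illustration; this is not how coolpup.py itself is implemented.

```python
from collections import defaultdict
from itertools import combinations


def cis_pairs(intervals):
    """Yield every pair of intervals that lie on the same chromosome."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((chrom, start, end))
    for chrom in by_chrom:
        # Sort so the upstream interval always comes first in a pair.
        for left, right in combinations(sorted(by_chrom[chrom]), 2):
            yield left, right


# Made-up example intervals: three on chr1, one on chr2.
bed = [
    ("chr1", 1_000_000, 1_010_000),
    ("chr1", 2_000_000, 2_010_000),
    ("chr1", 5_000_000, 5_010_000),
    ("chr2", 3_000_000, 3_010_000),
]
pairs = list(cis_pairs(bed))
print(len(pairs))
```

The single chr2 interval has no partner on its own chromosome, so it contributes no pairs.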

Named Arguments

--features_format, --basetype

Possible choices: bed, bedpe, auto

Format of the features. Options:

bed: chrom, start, end
bedpe: chrom1, start1, end1, chrom2, start2, end2
auto (default): determined from the file name extension

Has to be explicitly provided if features are piped through stdin

Default: “auto”

--view

Path to a file which defines which regions of the chromosomes to use

--flank, --pad
Flanking of the windows around the centres of specified features

i.e. the final size of the matrix is 2 × flank + res, in bp. Ignored with --rescale; use --rescale_flank instead

Default: 100000
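
The size formula above can be checked with a few lines of arithmetic. This sketch assumes the flank is an exact multiple of the cooler resolution; the helper name window_size and the 10 kb resolution are chosen only for illustration.

```python
def window_size(flank_bp, resolution_bp):
    """Return the pileup window size as (base pairs, bins)."""
    size_bp = 2 * flank_bp + resolution_bp
    size_bins = size_bp // resolution_bp
    return size_bp, size_bins


# Default flank of 100 kb with an assumed 10 kb resolution.
size_bp, size_bins = window_size(100_000, 10_000)
print(size_bp, size_bins)
```

With these numbers the window spans 210000 bp, i.e. 21 bins: 10 on each side of the central bin.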

--minshift

Shortest shift for random controls, bp

Default: 100000

--maxshift

Longest shift for random controls, bp

Default: 1000000

--nshifts

Number of control regions per averaged window

Default: 10
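
The three shift options bound a set of randomly displaced control windows. How coolpup.py draws them is not described here, so the following is only a sketch under the assumption that each control uses a uniformly drawn distance between minshift and maxshift in a random direction; draw_shifts is a hypothetical helper.

```python
import random


def draw_shifts(minshift, maxshift, nshifts, seed=None):
    """Draw signed shift distances (bp) for random control windows."""
    rng = random.Random(seed)
    shifts = []
    for _ in range(nshifts):
        distance = rng.randint(minshift, maxshift)  # inclusive bounds
        direction = rng.choice((-1, 1))  # upstream or downstream
        shifts.append(direction * distance)
    return shifts


# Defaults from the options above; a fixed seed makes the draw repeatable.
shifts = draw_shifts(100_000, 1_000_000, 10, seed=0)
print(shifts)
```

Passing the same seed twice yields the same list, which is the kind of reproducibility the --seed option is meant to provide.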

--expected
File with expected (output of cooltools compute-expected).

If None, don’t use expected and use randomly shifted controls

--ooe
If expected is provided, normalize each snippet individually.

If False, will accumulate all expected snippets, just like for randomly shifted controls

Default: True

--mindist
Minimal distance of interactions to use, bp.

If “auto”, uses 2*flank+2 (in bins) as mindist to avoid the first two diagonals
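
As a worked example of the “auto” rule, the sketch below converts the flank to bins and applies 2*flank+2. It assumes the flank is converted to bins by integer division with the resolution; auto_mindist is a hypothetical helper, not part of the tool.

```python
def auto_mindist(flank_bp, resolution_bp):
    """Return the automatic minimal distance as (bins, base pairs)."""
    flank_bins = flank_bp // resolution_bp
    mindist_bins = 2 * flank_bins + 2
    return mindist_bins, mindist_bins * resolution_bp


# Default flank of 100 kb with an assumed 10 kb resolution.
mindist_bins, mindist_bp = auto_mindist(100_000, 10_000)
print(mindist_bins, mindist_bp)
```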

--maxdist

Maximal distance of interactions to use

--ignore_diags

How many diagonals to ignore

Default: 2

--excl_chrs

Exclude these chromosomes from analysis

Default: “chrY,chrM”

--incl_chrs
Include these chromosomes; default is all.

--excl_chrs overrides this

Default: “all”
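
The interaction of --incl_chrs and --excl_chrs can be sketched as a two-step filter in which exclusion is applied last. This assumes both options take comma-separated names, as the default “chrY,chrM” suggests; select_chromosomes is a hypothetical helper.

```python
def select_chromosomes(all_chroms, incl_chrs="all", excl_chrs="chrY,chrM"):
    """Keep included chromosomes, then drop excluded ones (exclusion wins)."""
    excluded = set(excl_chrs.split(",")) if excl_chrs else set()
    if incl_chrs == "all":
        included = list(all_chroms)
    else:
        wanted = set(incl_chrs.split(","))
        included = [c for c in all_chroms if c in wanted]
    return [c for c in included if c not in excluded]


chroms = ["chr1", "chr2", "chrX", "chrY", "chrM"]
default_selection = select_chromosomes(chroms)
# chrY is requested but also excluded, so it is dropped.
explicit_selection = select_chromosomes(chroms, incl_chrs="chr1,chrY")
print(default_selection, explicit_selection)
```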

--subset
Take a random sample of the bed file.

Useful for files with too many features to run as is, e.g. some repetitive elements. Set to 0 or lower to keep all data

Default: 0
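
The subsetting rule (a positive value samples, 0 or lower keeps everything) can be sketched as follows. The helper subsample is hypothetical, and returning all features when the requested sample is at least as large as the input is an assumption made here for safety.

```python
import random


def subsample(features, subset, seed=None):
    """Return a random sample of `subset` features, or all of them."""
    if subset <= 0 or subset >= len(features):
        return list(features)
    return random.Random(seed).sample(features, subset)


# Made-up features: 1000 intervals of 1 kb on chr1.
features = [("chr1", i * 1_000, i * 1_000 + 1_000) for i in range(1_000)]
sampled = subsample(features, 100, seed=1)
everything = subsample(features, 0)
print(len(sampled), len(everything))
```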

--anchor
A UCSC-style coordinate.

Use as an anchor to create intersections with coordinates in the features

--by_window
Perform by-window pile-ups.

Create a pile-up for each coordinate in the features. Not compatible with --by_strand and --by_distance

Default: False

--by_strand
Perform by-strand pile-ups.

Create a separate pile-up for each strand combination in the features.

Default: False

--by_distance
Perform by-distance pile-ups.

Create a separate pile-up for each distance band using [0, 50000, 100000, 200000, …) as edges.

Default: False
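
To show how half-open distance bands work, the sketch below assigns a distance to a band using only the edges spelled out above. Edges beyond 200000 are elided in the description and are deliberately not guessed, so the last band is reported with an unknown upper edge (None). The names LISTED_EDGES and distance_band are hypothetical.

```python
from bisect import bisect_right

# Only the edges listed in the option description; later ones are not guessed.
LISTED_EDGES = [0, 50_000, 100_000, 200_000]


def distance_band(distance_bp, edges=LISTED_EDGES):
    """Return (lower, upper) for the half-open band containing the distance."""
    if distance_bp < edges[0]:
        raise ValueError("distance must not be negative")
    i = bisect_right(edges, distance_bp) - 1
    lower = edges[i]
    upper = edges[i + 1] if i + 1 < len(edges) else None
    return lower, upper


bands = [distance_band(d) for d in (10_000, 50_000, 150_000, 300_000)]
print(bands)
```

A distance equal to an edge, such as 50000, falls into the band that starts at that edge because the intervals are closed on the left.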

--local

Create local pileups, i.e. along the diagonal

Default: False

--coverage_norm

If clr_weight_name is empty, add coverage normalization using chromosome marginals

Default: False

--rescale
Rescale all features to the same size.

Do not use centres of features and flank, and rather use the actual feature sizes and rescale pileups to the same shape and size

Default: False

--rescale_flank, --rescale_pad

If --rescale, flanking as a fraction of feature length

Default: 1.0

--rescale_size
Size to rescale to.

If --rescale, used to determine the final size of the pileup, i.e. it will be size×size. Due to a technical limitation in the current implementation, has to be an odd number

Default: 99
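
The rescaling options can be illustrated with two small helpers: one extends a feature by a fraction of its own length, the other checks the odd-size requirement. Both function names are hypothetical, and rounding the padding to whole base pairs is an assumption of this sketch.

```python
def rescaled_window(start, end, rescale_flank=1.0):
    """Extend a feature on both sides by a fraction of its own length."""
    length = end - start
    pad = int(round(rescale_flank * length))
    return start - pad, end + pad


def check_rescale_size(rescale_size):
    """Raise if the requested pileup size is not a positive odd integer."""
    if rescale_size <= 0 or rescale_size % 2 == 0:
        raise ValueError("rescale_size has to be a positive odd number")
    return rescale_size


# A made-up 200 kb feature with the default flank fraction of 1.0.
window = rescaled_window(1_000_000, 1_200_000, rescale_flank=1.0)
size = check_rescale_size(99)
print(window, size)
```

With a flank fraction of 1.0 the window is three feature lengths wide, so the feature occupies the middle third of the rescaled pileup.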

--clr_weight_name, --weight_name
Name of the norm to use for getting balanced data.

Provide empty argument to calculate pileups on raw data (no masking bad pixels).

Default: “weight”

-p, --nproc, --n_proc
Number of processes to use.

Each process works on a separate chromosome, so might require quite a bit more memory, although the data are always stored as sparse matrices

Default: 1

-o, --outname
Name of the output file.

If not set, file is saved in the current directory and the name is generated automatically to include important information and avoid overwriting files generated with different settings.

Default: “auto”

--seed

Set specific seed value to ensure reproducibility

-l, --log

Possible choices: DEBUG, INFO, WARNING, ERROR, CRITICAL

Set the logging level

Default: “INFO”

--post_mortem

Enter debugger if there is an error

Default: False

-v, --version

show program’s version number and exit

plotpup.py command

usage: plotpup.py [-h] [--cmap CMAP] [--symmetric SYMMETRIC] [--vmin VMIN]
                  [--vmax VMAX] [--scale {linear,log}] [--cols COLS]
                  [--rows ROWS] [--col_order COL_ORDER]
                  [--row_order ROW_ORDER] [--query QUERY]
                  [--norm_corners NORM_CORNERS] [--score SCORE]
                  [--center CENTER] [--ignore_central IGNORE_CENTRAL]
                  [--quaich] [--dpi DPI] [--output OUTPUT] [--post_mortem]
                  [--input_pups INPUT_PUPS [INPUT_PUPS ...]] [-v]
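
As with coolpup.py, a plotting call can be assembled from the options in the usage line. The sketch below only builds and prints the command; the input file name loops_pileup.clpy is a hypothetical placeholder.

```python
import shlex

argv = [
    "plotpup.py",
    "--input_pups", "loops_pileup.clpy",  # hypothetical pileup file
    "--scale", "log",
    "--cmap", "coolwarm",
    "--dpi", "300",
    "--output", "pup.pdf",
]

# Quote the arguments so the printed line is safe to paste into a shell.
plot_command = shlex.join(argv)
print(plot_command)
```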

Named Arguments

--cmap
Colourmap to use

(see https://matplotlib.org/users/colormaps.html)

Default: “coolwarm”

--symmetric

Whether to make colormap symmetric around 1, if log scale

Default: True
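
One way to make colour limits symmetric around 1 on a log scale is to choose vmin and vmax as reciprocals of each other. The sketch below does this for a given data range; it is an assumption about what "symmetric" means here, not a description of plotpup.py internals, and symmetric_log_limits is a hypothetical helper.

```python
def symmetric_log_limits(data_min, data_max):
    """Return (vmin, vmax) with vmin * vmax == 1, covering the data."""
    if data_min <= 0 or data_max <= 0:
        raise ValueError("log scale needs strictly positive values")
    # Take the larger fold change in either direction as the upper limit.
    vmax = max(data_max, 1.0 / data_min)
    vmin = 1.0 / vmax
    return vmin, vmax


vmin, vmax = symmetric_log_limits(0.5, 4.0)
print(vmin, vmax)
```

Here the four-fold enrichment dominates the two-fold depletion, so the limits become 0.25 and 4.0 and a value of 1 maps to the middle of the colour map.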

--vmin

Value for the lowest colour

--vmax

Value for the highest colour

--scale

Possible choices: linear, log

Whether to use linear or log scaling for mapping colours

Default: “log”

--cols

Which value to map as columns

--rows

Which value to map as rows

--col_order

Order of columns to use, space or comma separated

--row_order

Order of rows to use, space or comma separated

--query

Pandas query to select pups to plot from concatenated input files

--norm_corners
Whether to normalize pileups by their top left and bottom right corners.

0 for no normalization, positive number to define the size of the corner squares whose values are averaged

Default: 0
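
Corner normalization as described above can be sketched in plain Python: average the values in the top left and bottom right squares, then divide the whole matrix by that average. The helper normalize_by_corners is hypothetical, assumes a square matrix, and is not the tool's own implementation.

```python
def normalize_by_corners(matrix, n):
    """Divide a square matrix by the mean of its two n-by-n corner squares."""
    if n <= 0:
        # 0 means no normalization; return an unchanged copy.
        return [row[:] for row in matrix]
    size = len(matrix)
    top_left = [matrix[i][j] for i in range(n) for j in range(n)]
    bottom_right = [
        matrix[i][j]
        for i in range(size - n, size)
        for j in range(size - n, size)
    ]
    values = top_left + bottom_right
    corner_mean = sum(values) / len(values)
    return [[value / corner_mean for value in row] for row in matrix]


# Made-up 3x3 pileup whose corner pixels both equal 2.0.
pileup = [
    [2.0, 1.0, 1.0],
    [1.0, 4.0, 1.0],
    [1.0, 1.0, 2.0],
]
normalized = normalize_by_corners(pileup, 1)
print(normalized)
```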

--score
Whether to calculate score and add it to the top right corner of each pileup.

Will use the ‘coolpup.get_score’ function with ‘center’ and ‘ignore_central’ arguments.

Default: True

--center
How many central pixels to consider when calculating enrichment for off-diagonal pileups.

Default: 3
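
To illustrate what "central pixels" refers to, the sketch below averages the central center×center block of a square pileup. It is a simplified stand-in written for this page, not the actual ‘coolpup.get_score’ function; central_mean and the toy matrix are hypothetical.

```python
def central_mean(matrix, center=3):
    """Average the central `center` x `center` block of a square matrix."""
    size = len(matrix)
    start = (size - center) // 2
    block = [
        matrix[i][j]
        for i in range(start, start + center)
        for j in range(start, start + center)
    ]
    return sum(block) / len(block)


# Toy 5x5 pileup: background of 1.0 with a 3x3 central block of 2.0.
pileup = [[1.0] * 5 for _ in range(5)]
for i in range(1, 4):
    for j in range(1, 4):
        pileup[i][j] = 2.0
score = central_mean(pileup, center=3)
print(score)
```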

--ignore_central
How many central bins to ignore when calculating insulation for local (on-diagonal) non-rescaled pileups.

Default: 3

--quaich
Activate if pileups are named according to the Quaich naming convention, to get information from the file name

Default: False

--dpi

DPI of the output plot. Try increasing if heatmaps look blurry

Default: 300

--output, -o

Where to save the plot

Default: “pup.pdf”

--post_mortem

Enter debugger if there is an error

Default: False

--input_pups

All files to plot

-v, --version

show program’s version number and exit