coolpup.py CLI

Use the coolpup.py command to perform pileups, and plotpup.py to visualize them.

Submodules

coolpup.py command

usage: coolpup.py [-h] [--features_format {bed,bedpe,auto}] [--view VIEW]
                  [--flank FLANK] [--minshift MINSHIFT] [--maxshift MAXSHIFT]
                  [--nshifts NSHIFTS] [--expected EXPECTED] [--not_ooe]
                  [--mindist MINDIST] [--maxdist MAXDIST]
                  [--ignore_diags IGNORE_DIAGS] [--subset SUBSET]
                  [--by_window] [--by_strand]
                  [--by_distance [BY_DISTANCE ...]] [--groupby [GROUPBY ...]]
                  [--ignore_group_order [IGNORE_GROUP_ORDER ...]]
                  [--flip_negative_strand] [--local]
                  [--coverage_norm [COVERAGE_NORM]] [--trans]
                  [--store_stripes] [--rescale]
                  [--rescale_flank RESCALE_FLANK]
                  [--rescale_size RESCALE_SIZE]
                  [--clr_weight_name [CLR_WEIGHT_NAME]] [-o OUTNAME]
                  [-p N_PROC] [--seed SEED]
                  [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--post_mortem]
                  [-v]
                  cool_path features

Positional Arguments

cool_path

Cooler file with your Hi-C data

features
A 3-column bed file or a 6-column double-bed file

i.e. chr1,start1,end1,chr2,start2,end2. Should be tab-delimited.

With a bed file, will consider all combinations of intervals. To pileup features along the diagonal instead, use the --local argument.

Can be piped in via stdin; in that case, use “-” as the features argument.
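As a rough illustration of how the two positional arguments combine with the named ones, the sketch below assembles two command lines as argument lists and prints them in shell-quoted form. The file names (sample.cool, loops.bedpe, loops_pileup.clpy) are hypothetical placeholders; only flags listed in the usage above are used, and nothing is executed.

```python
import shlex

# Hypothetical file names, used only for illustration.
cool_path = "sample.cool"
features = "loops.bedpe"

# Pileup over bedpe features with an explicit format and 4 processes.
pileup_cmd = [
    "coolpup.py", cool_path, features,
    "--features_format", "bedpe",
    "--flank", "100000",
    "--outname", "loops_pileup.clpy",
    "--nproc", "4",
]

# Features piped through stdin: pass "-" and state the format explicitly.
stdin_cmd = ["coolpup.py", cool_path, "-", "--features_format", "bed", "--local"]

# shlex.join produces a string that can be pasted into a POSIX shell.
print(shlex.join(pileup_cmd))
print(shlex.join(stdin_cmd))
```

The lists can be passed to subprocess.run as they are if you prefer to launch the command from Python rather than from a shell.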

Named Arguments

--features_format, --features-format, --format, --basetype

Possible choices: bed, bedpe, auto

Format of the features.

Options:

bed: chrom, start, end

bedpe: chrom1, start1, end1, chrom2, start2, end2

auto (default): determined from the file name extension

Has to be provided explicitly if features are piped through stdin.

Default: “auto”
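The “auto” behaviour can be pictured roughly as follows. This is a minimal sketch, not the actual coolpup.py implementation: the helper name guess_features_format is hypothetical, and the assumption is that only the extensions .bed and .bedpe are recognised.

```python
from pathlib import Path


def guess_features_format(path: str) -> str:
    """Hypothetical helper: map a file name extension to 'bed' or 'bedpe'."""
    if path == "-":
        # No file name is available for stdin, so the format cannot be guessed.
        raise ValueError("features come from stdin; pass --features_format explicitly")
    suffix = Path(path).suffix.lower().lstrip(".")
    if suffix in ("bed", "bedpe"):
        return suffix
    raise ValueError(f"cannot determine features format from {path!r}")


print(guess_features_format("peaks.bed"))
print(guess_features_format("loops.bedpe"))
```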

--view

Path to a file which defines which regions of the chromosomes to use

--flank, --pad
Flanking of the windows around the centres of specified features

i.e. final size of the matrix is 2 × flank+res, in bp. Ignored with --rescale, use --rescale_flank instead

Default: 100000
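The relation between the flank and the final matrix size can be worked through with a small calculation. The sketch assumes that “res” is the bin size of the cooler in bp and that the flank is a multiple of it; the function name is hypothetical.

```python
def pileup_window(flank_bp: int, res_bp: int) -> tuple[int, int]:
    """Return the pileup window size in bp and in bins (2 * flank + res)."""
    size_bp = 2 * flank_bp + res_bp
    size_bins = size_bp // res_bp
    return size_bp, size_bins


# Default flank of 100000 bp with an assumed 10000 bp resolution.
size_bp, size_bins = pileup_window(100_000, 10_000)
print(size_bp, size_bins)  # 210000 21
```

With these numbers the pileup is a 21×21 matrix: 10 bins of flank on each side plus the central bin.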

--minshift

Shortest shift for random controls, bp

Default: 100000

--maxshift

Longest shift for random controls, bp

Default: 1000000

--nshifts

Number of control regions per averaged window

Default: 10
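To show how --minshift, --maxshift, --nshifts and --seed relate to each other, here is a minimal sketch that draws shift distances for one window. It is only an assumption about the general idea: the function name is hypothetical, and coolpup.py may differ in how it samples (for example in the direction of the shift or rounding to bins).

```python
import random


def sample_shifts(minshift: int, maxshift: int, nshifts: int, seed=None) -> list[int]:
    """Draw nshifts shift distances (bp) uniformly between minshift and maxshift."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return [rng.randint(minshift, maxshift) for _ in range(nshifts)]


# Defaults from this page: 100000, 1000000 and 10 control regions per window.
shifts = sample_shifts(100_000, 1_000_000, 10, seed=0)
print(shifts)
```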

--expected
File with expected (output of cooltools compute-expected).

If None, don’t use expected and use randomly shifted controls

--not_ooe, --not-ooe

If expected is provided, will accumulate all expected snippets just like for randomly shifted controls, instead of normalizing each snippet individually

Default: True

--mindist
Minimal distance of interactions to use, bp.

If not provided, uses 2*flank+2 (in bins) as mindist to avoid first two diagonals
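The default can be checked by hand. The sketch below assumes the flank is first converted to bins by dividing by the cooler resolution, and that the result is converted back to bp by multiplying with the resolution; both the function name and that conversion are assumptions.

```python
def default_mindist(flank_bp: int, res_bp: int) -> tuple[int, int]:
    """Return the default minimal distance as (bins, bp): 2 * flank + 2 bins."""
    flank_bins = flank_bp // res_bp
    mindist_bins = 2 * flank_bins + 2
    return mindist_bins, mindist_bins * res_bp


# Default flank of 100000 bp with an assumed 10000 bp resolution.
print(default_mindist(100_000, 10_000))  # (22, 220000)
```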

--maxdist

Maximal distance of interactions to use

--ignore_diags, --ignore-diags

How many diagonals to ignore

Default: 2

--subset
Take a random sample of the bed file.

Useful for files with too many features to run as is, e.g. some repetitive elements. Set to 0 or lower to keep all data

Default: 0

--by_window, --by-window
Perform by-window pile-ups.

Create a pile-up for each coordinate in the features. Not compatible with --by_strand and --by_distance.

Only works with bed format features, and generates pairwise combinations of each feature against the rest.

Default: False

--by_strand, --by-strand
Perform by-strand pile-ups.

Create a separate pile-up for each strand combination in the features.

Default: False

--by_distance, --by-distance
Perform by-distance pile-ups.

Create a separate pile-up for each distance band. If empty, will use default (0,50000,100000,200000,…) edges. Specify edges using multiple argument values, e.g. --by_distance 1000000 2000000
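How edges translate into bands can be sketched with a simple lookup. The example uses the edges from the text (1000000 and 2000000) and assumes half-open bands, i.e. the lower edge is included and the upper edge is not; how coolpup.py treats values exactly on an edge is not stated here, and the function name is hypothetical.

```python
from bisect import bisect_right


def distance_band(distance: int, edges: list[int]):
    """Return the (low, high) band containing distance, or None if outside all bands."""
    i = bisect_right(edges, distance)
    if i == 0 or i == len(edges):
        return None  # below the first edge or at/above the last one
    return edges[i - 1], edges[i]


edges = [1_000_000, 2_000_000]
print(distance_band(1_500_000, edges))  # (1000000, 2000000)
print(distance_band(500_000, edges))    # None
```

Two edges therefore define a single band; each additional edge adds one more band and one more pile-up.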

--groupby
Additional columns of features to use for groupby, space separated.

If the features format is bed, each column should be specified twice, with suffixes ‘1’ and ‘2’: if features have a column ‘group’, specify ‘group1 group2’. Example: --groupby chrom1 chrom2

--ignore_group_order
When using groupby, reorder so that e.g. group1-group2 and group2-group1 will be combined into one and flipped to the correct orientation.

If using multiple paired groupings (e.g. group1-group2 and category1-category2), need to specify which grouping should be prioritised, e.g. “group” or “group1 group2”. For flip_negative_strand, +- and -+ strands will be combined
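The idea of combining group1-group2 with group2-group1 can be pictured as putting each pair into one canonical order and remembering whether the snippet has to be flipped. This is a conceptual sketch only: the function name is hypothetical and sorting the two labels alphabetically is an assumed way of choosing the canonical order.

```python
def canonical_pair(group1: str, group2: str):
    """Return the pair in a fixed order and a flag saying whether it was swapped."""
    if group1 <= group2:
        return (group1, group2), False
    # Swapped pairs are merged with their mirror image, so the snippet must be flipped.
    return (group2, group1), True


print(canonical_pair("A", "B"))  # (('A', 'B'), False)
print(canonical_pair("B", "A"))  # (('A', 'B'), True)
```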

--flip_negative_strand, --flip-negative-strand
Flip snippets so the positive strand always points to bottom-right.

Requires strands to be annotated for each feature (or two strands for bedpe format features)

Default: False

--local

Create local pileups, i.e. along the diagonal

Default: False

--coverage_norm, --coverage-norm

Normalize the final pileup by accumulated coverage as an alternative to balancing. Useful for single-cell Hi-C data. Can be a string: “cis” or “total” to use “cov_cis_raw” or “cov_tot_raw” columns in the cooler bin table, respectively. If they are not present, will calculate coverage with the same ignore_diags as used in coolpup.py and store the result in the cooler. Alternatively, if a different string is provided, will attempt to use a column with that name in the cooler bin table, and will raise a ValueError if it does not exist. If no argument is given following the option string, will use “total”. Only allowed with an empty --clr_weight_name

Default: “”

--trans
Perform inter-chromosomal (trans) pileups.

This ignores all contacts in cis.

Default: False

--store_stripes

Store horizontal and vertical stripes in pileup output

Default: False

--rescale
Rescale all features to the same size.

Do not use centres of features and flank, and rather use the actual feature sizes and rescale pileups to the same shape and size

Default: False

--rescale_flank, --rescale_pad, --rescale-flank, --rescale-pad

If --rescale, flanking in fraction of feature length

Default: 1.0

--rescale_size, --rescale-size
Size to rescale to.

If --rescale, used to determine the final size of the pileup, i.e. it will be size×size. Due to a technical limitation in the current implementation, it has to be an odd number

Default: 99
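The two rescaling options can be illustrated with a short calculation. The sketch assumes that the flank is added on both sides of the feature and is simply feature length × rescale_flank; the function names are hypothetical.

```python
def rescaled_window(start: int, end: int, rescale_flank: float = 1.0) -> tuple[int, int]:
    """Extend a feature on both sides by rescale_flank times its own length."""
    pad = int((end - start) * rescale_flank)
    return start - pad, end + pad


def check_rescale_size(size: int) -> int:
    """Reject even sizes, since --rescale_size has to be an odd number."""
    if size % 2 == 0:
        raise ValueError("rescale_size has to be an odd number")
    return size


# A 50 kb feature with the default flank of 1.0 covers 150 kb in total.
print(rescaled_window(100_000, 150_000))  # (50000, 200000)
print(check_rescale_size(99))             # 99
```

Whatever the original feature length, the extended window is then rescaled to the same size×size matrix.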

--clr_weight_name, --weight_name, --clr-weight-name, --weight-name
Name of the norm to use for getting balanced data.

Provide empty argument to calculate pileups on raw data (no masking bad pixels).

Default: “weight”

-o, --outname, --output
Name of the output file.

If not set, file is saved in the current directory and the name is generated automatically to include important information and avoid overwriting files generated with different settings.

Default: “auto”

-p, --nproc, --n_proc, --n-proc
Number of processes to use.

Each process works on a separate chromosome, so might require quite a bit more memory, although the data are always stored as sparse matrices. Set to 0 to use all available cores.

Default: 1

--seed

Set specific seed value to ensure reproducibility

-l, --log

Possible choices: DEBUG, INFO, WARNING, ERROR, CRITICAL

Set the logging level

Default: “INFO”

--post_mortem, --post-mortem

Enter debugger if there is an error

Default: False

-v, --version

show program’s version number and exit

dividepups.py command

usage: dividepups.py [-h] [-v] [-o OUTNAME] input_pups [input_pups ...]

Positional Arguments

input_pups

Two pileups to divide

Named Arguments

-v, --version

show program’s version number and exit

-o, --outname
Name of the output file.

If not set, file is saved in the current directory and the name is generated automatically.

Default: “auto”

plotpup.py command

usage: plotpup.py [-h] [--cmap CMAP] [--not_symmetric] [--vmin VMIN]
                  [--vmax VMAX] [--scale {log,linear}] [--stripe STRIPE]
                  [--stripe_sort STRIPE_SORT] [--lineplot]
                  [--out_sorted_bedpe OUT_SORTED_BEDPE] [--divide_pups]
                  [--font FONT] [--font_scale FONT_SCALE] [--cols COLS]
                  [--rows ROWS] [--col_order COL_ORDER]
                  [--row_order ROW_ORDER] [--colnames COLNAMES [COLNAMES ...]]
                  [--rownames ROWNAMES [ROWNAMES ...]] [--query QUERY]
                  [--norm_corners NORM_CORNERS] [--no_score] [--center CENTER]
                  [--ignore_central IGNORE_CENTRAL] [--quaich] [--dpi DPI]
                  [--height HEIGHT] [--plot_ticks] [--output OUTPUT]
                  [-l {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--post_mortem]
                  [--input_pups INPUT_PUPS [INPUT_PUPS ...]] [-v]

Named Arguments

--cmap
Colormap to use

(see https://matplotlib.org/users/colormaps.html)

Default: “coolwarm”

--not_symmetric, --not-symmetric, --not_symmetrical, --not-symmetrical

Do not make the colormap symmetric around 1 when using log scale

Default: False

--vmin

Value for the lowest colour

--vmax

Value for the highest colour

--scale

Possible choices: log, linear

Whether to use linear or log scaling for mapping colours

Default: “log”

--stripe

For plotting stripe stackups

--stripe_sort

Whether to sort stripe stackups by total signal (sum), central pixel signal (center_pixel), or not at all (None)

Default: “sum”

--lineplot
Whether to plot the average lineplot above stripes.

This only works for a single plot, i.e. without rows/columns

Default: False

--out_sorted_bedpe

Output bedpe of sorted stripe regions

--divide_pups

Whether to divide two pileups and plot the result

Default: False

--font

Font to use for plotting

Default: “DejaVu Sans”

--font_scale

Font scale to use for plotting

Default: 1

--cols

Which value to map as columns

--rows

Which value to map as rows

--col_order

Order of columns to use, space separated inside quotes

--row_order

Order of rows to use, space separated inside quotes

--colnames

Names to plot for columns, space separated.

--rownames

Names to plot for rows, space separated.

--query
Pandas query to select pups to plot from concatenated input files.

Multiple query arguments can be used. Usage example: --query "orientation == '+-' | orientation == '-+'"
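Because the query contains spaces and quotes, it has to reach plotpup.py as a single argument. The sketch below builds such a command as an argument list and prints a shell-safe version of it. The input file name by_strand.clpy is a hypothetical placeholder, and using “orientation” for --cols assumes that the pileup file has such a column, as in the query example above. Nothing is executed.

```python
import shlex

plot_cmd = [
    "plotpup.py",
    "--input_pups", "by_strand.clpy",  # hypothetical pileup file
    "--query", "orientation == '+-' | orientation == '-+'",
    "--cols", "orientation",
    "--output", "pup.pdf",
]

# shlex.join quotes the query so that the shell passes it on as one argument.
print(shlex.join(plot_cmd))
```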

--norm_corners
Whether to normalize pileups by their top left and bottom right corners.

0 for no normalization, positive number to define the size of the corner squares whose values are averaged

Default: 0
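Corner normalization can be pictured as dividing the whole pileup by the average of its top left and bottom right corner squares. The following is a minimal pure-Python sketch of that idea, not the plotting code itself; the function name is hypothetical and the pileup is assumed to be a square list of lists.

```python
def normalize_by_corners(matrix, corner: int):
    """Divide a square matrix by the mean of its top left and bottom right corners."""
    if corner <= 0:
        return [row[:] for row in matrix]  # 0 means no normalization
    n = len(matrix)
    values = [matrix[i][j] for i in range(corner) for j in range(corner)]
    values += [matrix[i][j] for i in range(n - corner, n) for j in range(n - corner, n)]
    mean = sum(values) / len(values)
    return [[v / mean for v in row] for row in matrix]


pileup = [[2, 1, 1], [1, 4, 1], [1, 1, 2]]
print(normalize_by_corners(pileup, 1))
# [[1.0, 0.5, 0.5], [0.5, 2.0, 0.5], [0.5, 0.5, 1.0]]
```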

--no_score

Do not show the central pixel score in the top left corner

Default: False

--center

How many central pixels to consider when calculating enrichment for off-diagonal pileups.

Default: 3
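One way to read this option is that the score is taken from a square of center × center pixels in the middle of the pileup. The sketch below averages that square; it assumes an odd-sized square matrix and that the score is a plain mean, which may not match the exact calculation used for plotting. The function name is hypothetical.

```python
def central_mean(matrix, center: int = 3) -> float:
    """Average the center x center pixels in the middle of a square matrix."""
    n = len(matrix)
    lo = n // 2 - center // 2
    hi = lo + center
    values = [matrix[i][j] for i in range(lo, hi) for j in range(lo, hi)]
    return sum(values) / len(values)


# 5x5 pileup of ones with an enriched 3x3 centre.
pileup5 = [[2 if 1 <= i <= 3 and 1 <= j <= 3 else 1 for j in range(5)] for i in range(5)]
print(central_mean(pileup5, 3))  # 2.0
```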

--ignore_central

How many central bins to ignore when calculating insulation for local (on-diagonal) non-rescaled pileups.

Default: 3

--quaich

Activate if pileups are named according to the Quaich naming convention, to get information from the file name

Default: False

--dpi

DPI of the output plot. Try increasing if heatmaps look blurry

Default: 300

--height

Height of the plot

Default: 1

--plot_ticks

Whether to plot ticks demarcating the center and flanking regions, only applicable for non-stripes

Default: False

--output, -o, --outname

Where to save the plot

Default: “pup.pdf”

-l, --log

Possible choices: DEBUG, INFO, WARNING, ERROR, CRITICAL

Set the logging level

Default: “INFO”

--post_mortem

Enter debugger if there is an error

Default: False

--input_pups

All files to plot

-v, --version

show program’s version number and exit