crispr_analysis_utils.normalization.counts_per_million
counts_per_million(counts: DataFrame | ndarray, pseudocount: float = 0.0, axis: Literal[0, 1] = 0) -> pd.DataFrame | np.ndarray
Convert raw counts to counts per million.
axis=0 normalizes each column independently, which is the usual layout
for CRISPR screens with guides/features in rows and samples in columns.
axis=1 normalizes each row independently.
Parameters:
- counts (DataFrame | ndarray) – Numeric count matrix as a pandas DataFrame or NumPy array.
- pseudocount (float, default: 0.0) – Non-negative value added to every entry before normalization.
- axis (Literal[0, 1], default: 0) – Dimension to sum over before scaling. Use 0 to normalize each column and 1 to normalize each row.
Returns:
- DataFrame or ndarray – Counts scaled so each selected margin sums to one million.
Examples:
>>> import pandas as pd
>>> from crispr_analysis_utils.normalization import counts_per_million
>>> counts = pd.DataFrame({"sample_a": [100, 300], "sample_b": [50, 50]})
>>> counts_per_million(counts)
   sample_a  sample_b
0  250000.0  500000.0
1  750000.0  500000.0
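The arithmetic behind the axis and pseudocount parameters can be illustrated with plain NumPy. The sketch below is not the library's source: `cpm_sketch` is a hypothetical helper written only from the behavior described on this page, and it assumes the pseudocount is added before the totals are computed. It does not guard against margins that sum to zero; how `counts_per_million` treats that case is not stated here.

```python
import numpy as np


def cpm_sketch(counts, pseudocount=0.0, axis=0):
    """Hypothetical illustration of CPM scaling, not the library implementation."""
    # Add the pseudocount to every entry before computing totals (assumption).
    values = np.asarray(counts, dtype=float) + pseudocount
    # Sum over the chosen axis; keepdims=True keeps the totals broadcastable.
    totals = values.sum(axis=axis, keepdims=True)
    return values / totals * 1_000_000


# Guides in rows, samples in columns, matching the example above.
counts = np.array([[100, 50], [300, 50]])

per_column = cpm_sketch(counts, axis=0)  # each column sums to one million
per_row = cpm_sketch(counts, axis=1)     # each row sums to one million
with_pseudo = cpm_sketch(counts, pseudocount=1.0, axis=0)

print(per_column)
print(per_row.sum(axis=1))
```

With axis=0 the sketch gives the same values as the DataFrame example above. With pseudocount=1.0 the first column becomes [101, 301] before scaling, so its total is 402 rather than 400.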
Source code in src/crispr_analysis_utils/normalization.py