cut / qcut: Binning Continuous Data
cut and qcut partition continuous numeric values into discrete intervals. They are the TypeScript equivalents of pandas.cut and pandas.qcut.
cut: Fixed-Width Binning
Bin values into equal-width (or user-specified) intervals. Pass an integer for automatic bins, or an explicit edge array.
import { cut } from "tsb";
const ages = [5, 18, 25, 35, 50, 70];
const { codes, labels, bins } = cut(ages, 3);
// labels: ["(4.935, 26.667]", "(26.667, 48.333]", "(48.333, 70.0]"]
// bins: [4.935, 26.667, 48.333, 70]
// codes: [0, 0, 0, 1, 2, 2]
console.table(ages.map((a, i) => ({ age: a, bin: labels[codes[i]!] })));
// Separate example: explicit bin edges with custom labels
const scores = [55, 65, 72, 80, 91, 98];
const { codes, labels } = cut(scores, [0, 60, 70, 80, 90, 100], {
labels: ["F", "D", "C", "B", "A"],
include_lowest: true,
});
// codes: [0, 1, 2, 2, 4, 4]  (80 falls in the right-closed interval (70, 80], so it maps to "C")
// labels[codes[0]] → "F"
// labels[codes[5]] → "A"
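To show where the edges in the integer-bins example come from, here is a minimal sketch of equal-width edge computation. It assumes the pandas convention of lowering the first edge by 0.1% of the data range so that the minimum value lands inside the first right-closed interval. The helper name equalWidthEdges is hypothetical and not part of tsb, whose internals may differ; the sketch also rejects the special case where all values are equal.

```typescript
// Illustrative sketch only: equalWidthEdges is a hypothetical helper,
// not tsb's actual implementation.
function equalWidthEdges(values: readonly number[], n: number): number[] {
  const finite = values.filter((v) => Number.isFinite(v));
  if (finite.length === 0 || !Number.isInteger(n) || n < 1) {
    throw new RangeError("need at least one finite value and an integer n >= 1");
  }
  const min = Math.min(...finite);
  const max = Math.max(...finite);
  if (min === max) {
    throw new RangeError("all values are equal; this sketch does not handle that case");
  }
  const width = (max - min) / n;
  const edges: number[] = [];
  for (let i = 0; i <= n; i++) {
    edges.push(min + i * width);
  }
  edges[0] = min - (max - min) * 0.001; // widen the first bin so min is included
  edges[n] = max; // pin the last edge to avoid floating-point drift
  return edges;
}

const sketchEdges = equalWidthEdges([5, 18, 25, 35, 50, 70], 3);
// Rounded to three decimals: [4.935, 26.667, 48.333, 70]
```

With these edges, 50 is greater than 48.333, so it belongs to the third bin rather than the second.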
| Option | Default | Description |
|---|---|---|
| right | true | Intervals closed on right: (a, b]. Set false for [a, b). |
| include_lowest | false | Make lowest interval left-closed: [a, b]. |
| labels | auto | Custom string labels, or false for integer codes. |
| precision | 3 | Decimal places in auto-generated labels. |
| duplicates | "raise" | "drop" to silently remove duplicate bin edges. |
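The effect of right and include_lowest is easiest to see on a single value. The following is a minimal sketch of interval lookup under those two options, following the descriptions in the table above. The function assignBin and the AssignOptions type are hypothetical names used for illustration, not tsb exports.

```typescript
// Illustrative sketch only: assignBin is hypothetical, not a tsb function.
interface AssignOptions {
  right?: boolean; // default true: intervals are (a, b]
  include_lowest?: boolean; // default false: first interval stays open on the left
}

function assignBin(
  value: number,
  edges: readonly number[],
  { right = true, include_lowest = false }: AssignOptions = {},
): number | null {
  if (!Number.isFinite(value) || edges.length < 2) return null;
  // include_lowest only matters for right-closed bins: it admits the first edge.
  if (right && include_lowest && value === edges[0]) return 0;
  for (let i = 0; i < edges.length - 1; i++) {
    const lo = edges[i]!;
    const hi = edges[i + 1]!;
    const inside = right ? value > lo && value <= hi : value >= lo && value < hi;
    if (inside) return i;
  }
  return null; // outside every interval
}

const gradeEdges = [0, 60, 70, 80, 90, 100];
assignBin(80, gradeEdges); // 2: 80 falls in (70, 80]
assignBin(80, gradeEdges, { right: false }); // 3: 80 falls in [80, 90)
assignBin(0, gradeEdges); // null: 0 is outside (0, 60]
assignBin(0, gradeEdges, { include_lowest: true }); // 0: first interval becomes [0, 60]
```

A linear scan keeps the sketch short; a binary search over the edges would be the natural choice for many bins.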
qcut: Quantile-Based Binning
Divide values into bins of (approximately) equal population using quantiles. Useful for creating percentile buckets or roughly equal-sized groups.
import { qcut } from "tsb";
const values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const { codes, labels, bins } = qcut(values, 4);
// labels: ["[1, 3.25]", "(3.25, 5.5]", "(5.5, 7.75]", "(7.75, 10]"]
// codes: [0, 0, 0, 1, 1, 2, 2, 3, 3, 3]
// Bin sizes: 3, 2, 2, 3 (roughly equal population)
const { labels } = qcut(values, [0, 0.1, 0.5, 0.9, 1], {
labels: ["bottom 10%", "lower middle", "upper middle", "top 10%"],
});
// Separate example: decile codes. `data` stands for any number[] defined elsewhere.
const { codes } = qcut(data, 10, { labels: false });
// codes[i] is 0..9, the decile bucket index
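The quantile edges in the quartile example can be reproduced by hand. Below is a minimal sketch of edge computation with linear interpolation between order statistics, which is the default interpolation in pandas. The helper quantileEdges is a hypothetical name, not a tsb export, and it assumes every probability lies in [0, 1].

```typescript
// Illustrative sketch only: quantileEdges is hypothetical, not tsb's implementation.
function quantileEdges(
  values: readonly number[],
  q: number | readonly number[],
): number[] {
  const sorted = values
    .filter((v) => Number.isFinite(v))
    .sort((a, b) => a - b);
  if (sorted.length === 0) throw new RangeError("no finite values to bin");
  // An integer q means q equal-probability bins: 0, 1/q, 2/q, ..., 1.
  const probs: readonly number[] =
    typeof q === "number" ? Array.from({ length: q + 1 }, (_, i) => i / q) : q;
  return probs.map((p) => {
    const pos = p * (sorted.length - 1); // fractional index into the sorted data
    const lo = Math.floor(pos);
    const hi = Math.ceil(pos);
    return sorted[lo]! + (sorted[hi]! - sorted[lo]!) * (pos - lo);
  });
}

const quartileEdges = quantileEdges([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4);
// [1, 3.25, 5.5, 7.75, 10]
```

For ten values, the 0.25 quantile sits at fractional index 2.25, a quarter of the way from 3 to 4, which gives the 3.25 edge shown in the labels.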
BinResult
interface BinResult {
codes: ReadonlyArray<number | null>; // bin index per value; null for NaN or Infinity
labels: readonly string[]; // ordered label per bin
bins: readonly number[]; // bin edge array (labels.length + 1)
}
NaN and Infinity are assigned null in the codes array and are never placed in a bin.
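Because codes can contain null, code that consumes a result has to check for it before indexing into labels. A minimal sketch of a per-bin tally follows. The interface is repeated so the block is self-contained, and both countPerBin and the sample result are hypothetical, hand-written for illustration rather than produced by tsb.

```typescript
// Illustrative sketch only: countPerBin and `sample` are hypothetical.
interface BinResult {
  codes: ReadonlyArray<number | null>;
  labels: readonly string[];
  bins: readonly number[];
}

function countPerBin(result: BinResult): { counts: number[]; missing: number } {
  const counts = new Array<number>(result.labels.length).fill(0);
  let missing = 0;
  for (const code of result.codes) {
    if (code === null) {
      missing++; // value was not placed in any bin
    } else {
      counts[code] = (counts[code] ?? 0) + 1;
    }
  }
  return { counts, missing };
}

// Hand-written result shaped like the interface above.
const sample: BinResult = {
  codes: [0, 0, null, 1, 2, 2],
  labels: ["low", "mid", "high"],
  bins: [0, 10, 20, 30],
};
const tally = countPerBin(sample);
// tally.counts is [2, 1, 2] and tally.missing is 1
```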
cut vs qcut
| | cut | qcut |
|---|---|---|
| Bin width | Equal for an integer bins argument; user-defined for explicit edges | Varies (equal population) |
| Bin count | Determined by bins | Determined by q |
| Best for | Meaningful thresholds (age groups, grade bands) | Percentile buckets, rank-based analysis |
| Left edge of first bin | Open ( unless include_lowest is true | Always closed [ |
# Python pandas
pd.cut([1, 2, 3, 4, 5], 2)
# Interval(0.996, 3.0, closed='right') ...
// tsb equivalent
cut([1, 2, 3, 4, 5], 2)
// codes: [0, 0, 0, 1, 1]
// labels: ["(0.996, 3.0]", "(3.0, 5.0]"]
Both cut and qcut follow pandas semantics exactly:
right-closed by default, linear interpolation for quantiles, and duplicate-edge
handling via duplicates.
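Duplicate edges typically arise in qcut when the data has many tied values, so that several quantiles land on the same number. The sketch below shows one plausible reading of the two duplicates modes, "raise" and "drop", applied to an already-sorted edge array. The helper resolveDuplicateEdges is a hypothetical name, and the error message is an assumption rather than tsb's actual wording.

```typescript
// Illustrative sketch only: resolveDuplicateEdges is hypothetical, not a tsb export.
type DuplicatesMode = "raise" | "drop";

function resolveDuplicateEdges(
  edges: readonly number[],
  mode: DuplicatesMode = "raise",
): number[] {
  // Edges are assumed sorted, so duplicates are always adjacent.
  const unique = edges.filter((e, i) => i === 0 || e !== edges[i - 1]);
  if (unique.length !== edges.length && mode === "raise") {
    throw new Error("bin edges must be unique; pass duplicates: \"drop\" to remove repeats");
  }
  return unique;
}

// Quartile edges of [1, 1, 1, 1, 2, 3] under linear interpolation.
const tiedEdges = [1, 1, 1, 1.75, 3];
const dropped = resolveDuplicateEdges(tiedEdges, "drop");
// [1, 1.75, 3]: four requested bins collapse to two
```

Dropping edges reduces the number of bins, so any custom labels array would need to match the reduced count.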