snap — Compute & Save Statistics¶
The snap subcommand reads result folders produced by kmeans-model, ptep-model, or fgbuster-model, computes statistics (residuals, power spectra, \(r\) estimation), and saves them to lightweight .parquet files for later plotting.
Basic Usage¶
r_analysis snap \
-n 64 \
-r "kmeans_BD10000_TD500_BS500_GAL020" \
-ird results/ \
-o snapshots/my_run.parquet
Run Matching with -r¶
The -r flag controls which result folders are selected. It supports three matching modes depending on the pattern syntax.
Token Matching (Exact)¶
When the pattern contains no regex metacharacters, it is split by _ into tokens. A folder matches if all tokens are present in the folder name (AND logic).
# Match folders containing BOTH "kmeans" AND "BD10000"
r_analysis snap -r "kmeans_BD10000" -ird results/ -o out.parquet
# Match folders containing "BD4000", "TD500", "BS500", and "GAL020"
r_analysis snap -r "BD4000_TD500_BS500_GAL020" -ird results/ -o out.parquet
The folder name is also split by _, and each token from the pattern must match at least one token in the folder name.
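The token rule described above can be sketched in a few lines of Python. This is an illustration of the documented behaviour, not the tool's actual implementation; the function name token_match and the folder names are hypothetical.

```python
def token_match(pattern, folder_name):
    """Return True if every "_"-separated pattern token equals some folder token."""
    pattern_tokens = pattern.split("_")
    folder_tokens = set(folder_name.split("_"))
    # AND logic: all pattern tokens must be present as whole tokens.
    return all(token in folder_tokens for token in pattern_tokens)


# Hypothetical folder names, for illustration only.
folders = [
    "kmeans_c1d1s1_BD10000_TD500_BS500_GAL020",
    "kmeans_c1d1s1_BD4000_TD500_BS500_GAL020",
    "ptep_BD10000_TD500_BS500_GAL040",
]

selected = [name for name in folders if token_match("kmeans_BD10000", name)]
print(selected)
```

Because whole tokens are compared, a pattern token such as BD1000 would not select a folder containing BD10000 under this sketch.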
Regex Matching (Expand)¶
When a token contains regex metacharacters (capture groups with \d, \w, etc.), each unique combination of captured values creates a separate entry in the output.
# Match all runs, extract BD/TD/BS/GAL values — each unique combination becomes a separate kw
r_analysis snap -r "BD(\d+)_TD(\d+)_BS(\d+)_GAL(\d+)" -ird results/ -o out.parquet
For example, if results/ contains:
kmeans_c1d1s1_BD4000_TD500_BS500_..._GAL020_...
kmeans_c1d1s1_BD8000_TD500_BS500_..._GAL020_...
kmeans_c1d1s1_BD4000_TD500_BS500_..._GAL040_...
The pattern BD(\d+)_TD(\d+)_BS(\d+)_GAL(\d+) produces three separate entries:
BD4000_TD500_BS500_GAL020
BD8000_TD500_BS500_GAL020
BD4000_TD500_BS500_GAL040
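One way to picture the expansion is the Python sketch below. It assumes the pattern is split on _ and each piece is matched as a full regular expression against the folder's tokens, which is a guess consistent with the description above rather than the confirmed implementation. The helper expand_key and the shortened folder names are hypothetical.

```python
import re
from collections import defaultdict


def expand_key(pattern, folder_name):
    """Build the entry key for one folder, or return None if it does not match.

    Assumption: the pattern is split on "_" and each piece must fully match
    one of the "_"-separated tokens of the folder name.
    """
    folder_tokens = folder_name.split("_")
    matched = []
    for piece in pattern.split("_"):
        hit = next((tok for tok in folder_tokens if re.fullmatch(piece, tok)), None)
        if hit is None:
            return None
        matched.append(hit)
    return "_".join(matched)


# Hypothetical folder names, shortened from the listing above.
folders = [
    "kmeans_c1d1s1_BD4000_TD500_BS500_GAL020",
    "kmeans_c1d1s1_BD8000_TD500_BS500_GAL020",
    "kmeans_c1d1s1_BD4000_TD500_BS500_GAL040",
]

groups = defaultdict(list)
for folder in folders:
    key = expand_key(r"BD(\d+)_TD(\d+)_BS(\d+)_GAL(\d+)", folder)
    if key is not None:
        groups[key].append(folder)

for key, members in groups.items():
    print(key, len(members))
```

Each distinct key collects the folders that share the same captured values, so two folders differing only in tokens outside the pattern would land in the same entry.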
Partial Matching (Merge Masks)¶
You can use a partial pattern to match folders across different mask configurations. All matched folders are merged into a single entry.
# Match all runs with BD4000_TD500_BS500 regardless of mask
# This effectively merges GAL020 + GAL040 + GAL060 masks → fsky ≈ 60%
r_analysis snap -r "BD4000_TD500_BS500" -ird results/ -o out.parquet
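The merge behaviour can be illustrated with a short Python sketch: a pattern that omits the mask token selects every mask variant, and all of them end up under one key. The folder names are hypothetical and the dictionary layout is only an illustration, not the tool's internal data structure.

```python
# Hypothetical folder names, for illustration only.
folders = [
    "kmeans_c1d1s1_BD4000_TD500_BS500_GAL020",
    "kmeans_c1d1s1_BD4000_TD500_BS500_GAL040",
    "kmeans_c1d1s1_BD4000_TD500_BS500_GAL060",
    "kmeans_c1d1s1_BD8000_TD500_BS500_GAL020",
]

pattern = "BD4000_TD500_BS500"
wanted = pattern.split("_")

# One entry keyed by the pattern, holding every folder that contains all tokens.
merged = {
    pattern: [name for name in folders if all(tok in name.split("_") for tok in wanted)]
}

for key, members in merged.items():
    print(key, "->", len(members), "folders")
```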
Combining Runs¶
--combine¶
Merge all matched result directories into a single entry rather than keeping them separate:
r_analysis snap -r "kmeans_BD4000" "ptep_BD64" \
-ird results/ \
--combine \
--name "combined_run" \
-o out.parquet
--name¶
Set display names for each run group:
r_analysis snap -r "kmeans_BD4000" "ptep_BD64" \
-ird results/ \
--name "K-Means (4000)" "PTEP (64)" \
-o out.parquet
By default, the display name is the matched pattern (e.g., BD4000_TD500_BS500).
For combined runs, give an explicit name so the entry is easier to match when plotting with plot.
See also
For reducing the number of clusters via post-clustering parameter binning, see bin.
All Arguments¶
| Flag | Type | Default | Description |
|---|---|---|---|
| -o | | required | Path to output |
| | | | Which noise realization to use: |
| | | all | Maximum number of noise realizations |
| | flag | | Skip rendering mollview images (faster) |
| --combine | flag | | Merge all matched dirs into one entry |
| --name | | auto | Display names for run groups |
| | | unlimited | Max entries per parquet file (splits into numbered files) |
Plus all common arguments (-n, -r, -ird, etc.).
Output¶
The output is a .parquet file (powered by HuggingFace datasets) containing one row per matched run group. Each row stores:
CMB reconstruction maps and patch assignments
Power spectra (\(C_\ell^{BB}\) observed, templates, residuals)
\(r\) estimation (best fit, confidence bounds, likelihood curve)
Systematic and statistical residual maps
Foreground parameter maps (\(\beta_d\), \(T_d\), \(\beta_s\))
Metadata (keyword, number of clusters, NLL, mask info)
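To check what a snapshot actually contains before plotting, the column names can be read from the file's schema without loading the maps. The sketch below uses pyarrow, which is an assumption about the environment; the helper name snapshot_columns is hypothetical, and the path is the one from the Basic Usage example. No column names are assumed here, since they are not listed on this page.

```python
from pathlib import Path


def snapshot_columns(path):
    """Return the column names of a snapshot file, or None if it does not exist."""
    path = Path(path)
    if not path.exists():
        return None
    # Imported lazily so the function can be defined without pyarrow installed.
    import pyarrow.parquet as pq

    # read_schema only reads the file metadata, not the stored arrays.
    return pq.read_schema(path).names


columns = snapshot_columns("snapshots/my_run.parquet")
print(columns if columns is not None else "snapshot not found")
```

Reading only the schema keeps the check fast even when the rows store full-sky maps.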