Architecture
Current Layout
The repository is organized as a small Python package plus local experiment assets:
euclid_dsps/
    assets.py         Download small DSPS smoke-test assets.
    cli.py            Command-line parser and command dispatch.
    config.py         YAML loading and default normalization.
    cosmos.py         COSMOS-template proxy SED reconstruction.
    filters.py        Euclid/LSST transmission curve loading.
    fit.py            MAP and population optimization.
    io.py             Parquet, row, unit, JSON, and CSV helpers.
    jax_runtime.py    Conservative JAX runtime setup for local WSL/shine.
    likelihood.py     Shared likelihood helpers.
    mcmc.py           NumPyro posterior sampling.
    model.py          Native DSPS boundary.
    nebular.py        Diagnostic-only SSP emission-line tables and crossings.
    performance.py    Runtime, throughput, and device-cost summaries.
    photometry.py     Central AB magnitude and Fnu flux conversions.
    pipeline.py       Deprecated compatibility facade for workflow imports.
    reports.py        Deprecated compatibility facade for reporting imports.
    selection.py      Single-row catalog selection.
    reporting/
        cosmos.py     COSMOS SED diagnostic plots.
        eda.py        EDA report exports.
        fit.py        MAP/population report exports.
        forward.py    Forward-model report exports.
        posterior.py  Posterior report exports.
        workflow.py   Composite workflow report exports.
        core.py       Report tables and plots.
    workflows/
        bayesian.py   Bayesian workflow exports.
        cosmos.py     COSMOS SED reconstruction workflow.
        eda.py        EDA workflow exports.
        forward.py    Forward-model workflow exports.
        map_fit.py    MAP workflow exports.
        population.py Population workflow exports.
        workflow.py   Composite workflow exports.
        core.py       End-to-end CLI workflows.
configs/
    fs2_phz1_science.yaml    Active LSST+Euclid science setup.
    legacy/                  Old config examples, not active workflow defaults.
    smoke_test.yaml          Lightweight smoke-test setup.
scripts/
    quickstart_one_galaxy.py
    convert_euclid_filters.py
Data/       Local data and DSPS assets, not source.
outputs/    Generated run outputs, not source.
The current package has good high-level boundaries. The main cleanup need is not a rewrite; it is reducing module size and documenting contracts so new science experiments stay local to config, model, fit, or reporting layers.
Layer Responsibilities
config.py: Loads YAML, applies defaults, and keeps run setup explicit. It should not read catalog data or call DSPS.
io.py: Owns the catalog contract, meaning parquet reads, required columns, row index handling, truth value transforms, photometry unit conversion, and JSON serialization.
filters.py: Loads exact passbands from ASCII, HDF5, or FITS. Approximate top-hat filters are a fallback for smoke tests only.
model.py: Contains the native DSPS boundary. Other modules should pass normalized dataclasses and parameter dictionaries into this layer rather than importing DSPS directly.
cosmos.py: Reconstructs template-level COSMOS proxy SEDs from sed_cosmos_*, ebv_cosmos_*, ext_curve_cosmos_*, and frac_cosmos_*. It owns SciPIC value-added or LePhare template/extinction loading, attenuation, synthetic photometry, rest-frame absolute-flux normalization, population validation, and COSMOS-vs-DSPS metrics.
jax_runtime.py: Applies config/env JAX runtime choices before JAX-heavy modules are imported. It falls back to CPU automatically when no GPU is found. GPU runs are enabled by changing runtime.jax_platforms and the plugin autoload settings.
fit.py and mcmc.py: Own optimizer and sampler behavior. They should depend on the model boundary and observation dataclasses, not on parquet or report-writing concerns.
nebular.py: Reads line metadata already loaded by model.py and writes diagnostic line/filter crossing artifacts. It must not alter the science likelihood until a no-double-count line model exists.
performance.py: Owns wall-time, throughput, memory, JAX device, and GPU-hour reporting. It should stay lightweight and never require a GPU to import.
workflows/*.py: Compose workflows from the layers above. They may orchestrate, but should avoid complex scientific logic that belongs in model.py, fit.py, or io.py. Focused modules expose stable entry points by workflow type, while core.py keeps the shared implementation and helpers.
reporting/*.py: Own artifact writing. Focused modules expose stable entry points by report type, while core.py keeps the shared plotting/table implementation.
pipeline.py and reports.py: Deprecated compatibility facades retained for existing scripts and notebooks. They contain no workflow or plotting implementation. New source code should import from euclid_dsps.workflows and euclid_dsps.reporting. The facades can be removed once local scripts such as scripts/quickstart_one_galaxy.py and downstream notebooks no longer import them.
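The ordering requirement on jax_runtime.py (runtime choices must be in place before any JAX-heavy module is imported) can be illustrated with a minimal sketch. It is not the module's actual code: the function name apply_jax_runtime, the shape of the runtime mapping, and the rule that an already-exported environment variable wins over the config value are all assumptions made for illustration. JAX_PLATFORMS itself is the environment variable JAX reads to choose its backends.

```python
import os
from typing import Mapping, MutableMapping, Optional


def apply_jax_runtime(
    runtime: Mapping[str, object],
    environ: Optional[MutableMapping[str, str]] = None,
) -> str:
    """Export the platform choice before anything imports JAX.

    `runtime` stands in for the `runtime` block of the YAML config
    (hypothetical shape). Assumed precedence: a JAX_PLATFORMS value that
    is already exported wins over the config value.
    """
    env = os.environ if environ is None else environ
    if "JAX_PLATFORMS" in env:
        return env["JAX_PLATFORMS"]
    requested = str(runtime.get("jax_platforms") or "").strip()
    if requested:
        env["JAX_PLATFORMS"] = requested
    # An empty result means nothing was exported, so backend selection
    # is left entirely to JAX.
    return requested


# Demonstrated on a plain dict so the example does not touch os.environ.
demo_env = {}
chosen = apply_jax_runtime({"jax_platforms": "cpu"}, demo_env)
# import jax  # would only be safe after the call above
```

In real use the call would run at the very top of the CLI entry point, ahead of any import that pulls in JAX, because an environment variable set after the import may not be picked up.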
Design Rules
Keep DSPS imports isolated in model.py.
Keep catalog-specific aliases and truth transforms in config or io.py.
Keep output files deterministic and named with snake_case.
Treat Data/ and outputs/ as local runtime state.
Add tests or smoke commands when changing model, fit, sampling, or catalog contracts.
Prefer new config keys over hidden constants when changing scientific setup.
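The first rule, keeping DSPS imports isolated in model.py, is the kind of contract a small test can enforce. The sketch below is one possible check rather than anything that exists in the repository: the helper names are hypothetical, and it assumes the DSPS package is imported under the top-level name dsps. It parses source text with the standard ast module, so the modules under inspection are never executed.

```python
import ast
from typing import Dict, List, Set, Tuple


def imported_roots(source: str) -> Set[str]:
    """Return the top-level package names imported by Python source text."""
    roots: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from .model import x`.
            roots.add(node.module.split(".")[0])
    return roots


def dsps_import_violations(
    sources: Dict[str, str], allowed: Tuple[str, ...] = ("model.py",)
) -> List[str]:
    """List files, other than the allowed ones, that import `dsps` directly."""
    return sorted(
        name
        for name, text in sources.items()
        if name not in allowed and "dsps" in imported_roots(text)
    )


# In-memory stand-ins for package files; the contents are illustrative only.
example_sources = {
    "model.py": "import dsps\n",
    "fit.py": "from .model import predict\n",
    "mcmc.py": "import dsps as _dsps\n",
}
violations = dsps_import_violations(example_sources)
```

Against a real checkout, the sources mapping could be built by reading every .py file under euclid_dsps/ with pathlib, keyed by path relative to the package so that the three cosmos.py files do not collide.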
Remaining Cleanup
The main architectural risk is the size of the shared implementation modules.
workflows/core.py still owns many orchestration helpers, and
reporting/core.py still owns many plot families. Future refactors should
move those internals while keeping the stable euclid_dsps.workflows and
euclid_dsps.reporting imports.
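
One way to move internals without breaking an import path is a module-level __getattr__ hook (PEP 562), which lets an old module forward names to their new home and warn on each access. The sketch below shows the mechanism on throwaway in-memory modules. The module names demo_pipeline and demo_workflows and the function run_map_fit are hypothetical stand-ins, and nothing here is taken from the actual pipeline.py or reports.py.

```python
import sys
import types
import warnings
from typing import Iterable


def make_forwarding_module(
    old_name: str, target: types.ModuleType, names: Iterable[str]
) -> types.ModuleType:
    """Build a module that forwards `names` to `target`, warning on each access."""
    facade = types.ModuleType(old_name)
    exported = frozenset(names)

    def __getattr__(name: str):
        if name in exported:
            warnings.warn(
                f"{old_name}.{name} moved to {target.__name__}.{name}",
                DeprecationWarning,
                stacklevel=2,
            )
            return getattr(target, name)
        raise AttributeError(f"module {old_name!r} has no attribute {name!r}")

    # PEP 562: a `__getattr__` stored in the module namespace is called
    # whenever normal attribute lookup on the module fails.
    facade.__getattr__ = __getattr__
    sys.modules[old_name] = facade
    return facade


# Stand-in for the module that now holds the implementation.
new_home = types.ModuleType("demo_workflows")
new_home.run_map_fit = lambda row: f"fit:{row}"

old = make_forwarding_module("demo_pipeline", new_home, ["run_map_fit"])
result = old.run_map_fit("row-0")  # forwards to new_home and emits a warning
```

In an ordinary package file the same effect comes from defining def __getattr__(name) at module top level, so the facade holds no implementation of its own and can be deleted once no caller triggers the warning.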