# PEM User Guide

This guide is the main source-tree entry point for PEM users and contributors.

## Start Here

Use this path if you are new to PEM:

1. Compile PEM (see [Compilation](#compilation)).
2. Prepare a simulation directory (see [Run a Chained PCM/PEM Workflow](#run-a-chained-pcmpem-workflow)).
3. Run the workflow and inspect outputs (see [Outputs](#outputs)).
4. Post-process diagnostics (see `toolbox/README.md`).

If you contribute to the source code, continue with [Contributor Coding Guidance](#contributor-coding-guidance).

## PEM Layout

- `src/common`, `src/mars`, `src/generic`: Fortran sources.
- `make_pem_fcm`: PEM build driver based on FCM.
- `compile-example.sh`: example wrapper around `make_pem_fcm`.
- `build_pem`: build/job launcher with lock protection.
- `arch/`: architecture files (`arch-<name>.fcm`, `.path`, optional `.env`).
- `deftank/`: chained workflow templates and helper scripts.
- `datadir/`: workflow input data files (including orbital forcing tables).
- `toolbox/`: post-processing and utility scripts.
- `doc/`: Doxygen configuration, theme, and documentation helpers.

## Glossary

- `PEM`: Planetary Evolution Model.
- `PCM`: Planetary Climate Model.
- `FCM`: Flexible Configuration Management, the build system driven by `make_pem_fcm`.
- `IOIPSL`: I/O library used by LMDZ/PEM workflows.
- `XIOS`: XML I/O Server, the XML-configured output system that writes PCM outputs.

## Compilation

### Prerequisites

Before compiling PEM, verify:

- `fcm` is available in `PATH`.
- your architecture files exist under `arch/`.
- NetCDF and IOIPSL are available.
- XIOS is available when PCM is compiled with `-io xios`.
- `LMDZ.COMMON` is available (default: `../LMDZ.COMMON`).
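
The file-level checks above can be scripted. The following is a minimal sketch, not part of the PEM tree: the function name `pem_check_prereqs` is hypothetical, and it assumes the default `../LMDZ.COMMON` location is relative to the PEM root. NetCDF, IOIPSL, and XIOS are not checked because their locations are site-specific.

```bash
# Sketch: report missing PEM build prerequisites for one architecture.
# Usage: pem_check_prereqs <arch> [pem_root] [lmdz_common_dir]
pem_check_prereqs() {
    local arch="$1" root="${2:-.}"
    local lmdz="${3:-$root/../LMDZ.COMMON}"   # assumed default location
    local missing=0 ext

    # fcm must be reachable through PATH.
    if ! command -v fcm >/dev/null 2>&1; then
        echo "missing: fcm not found in PATH"
        missing=$((missing + 1))
    fi

    # arch-<arch>.fcm and arch-<arch>.path are required; .env is optional.
    for ext in fcm path; do
        if [ ! -f "$root/arch/arch-$arch.$ext" ]; then
            echo "missing: arch/arch-$arch.$ext"
            missing=$((missing + 1))
        fi
    done

    # LMDZ.COMMON must exist as a directory.
    if [ ! -d "$lmdz" ]; then
        echo "missing: LMDZ.COMMON directory ($lmdz)"
        missing=$((missing + 1))
    fi

    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}
```

Called from the PEM root as `pem_check_prereqs <arch>`, it prints one line per missing item and returns a non-zero status if anything is missing.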

### Quick Start

```bash
cp compile-example.sh compile-local.sh
./compile-local.sh
```

### Direct Build Command

```bash
./make_pem_fcm -arch <arch> -p <planet> -d <dimensions> -j <nproc> [mode] [options] pem
```

Required arguments:

1. `<arch>`: architecture identifier used in `arch/arch-<arch>.*`.
2. `<planet>`: `mars` or `generic`.
3. `<dimensions>`: executable tag (for example `8` for 1D or `64x48x32` for 3D).
4. `<nproc>`: number of parallel compile jobs.

Optional build flags:

- `[mode]`: `-prod` (default), `-dev`, or `-debug`.
- `[options]`: `-full` (force full rebuild) or `-clean` (remove build directory and exit).

Build outputs:

- executables in `bin/` (for example `pem_8_mars_seq.e`),
- build products in `lib/`.
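
To avoid typos in the required arguments, the command line can be assembled by a small helper. This is an illustrative sketch only: `pem_build_cmd` is a hypothetical name, and it validates just the `<planet>` and `[mode]` values listed above before printing the command.

```bash
# Sketch: print the make_pem_fcm command line for a given configuration.
# Usage: pem_build_cmd <arch> <planet> <dimensions> <nproc> [mode]
pem_build_cmd() {
    local arch="$1" planet="$2" dims="$3" nproc="$4" mode="${5:--prod}"

    # Only the two documented planet profiles are accepted.
    case "$planet" in
        mars|generic) ;;
        *) echo "error: planet must be 'mars' or 'generic'" >&2; return 1 ;;
    esac

    # Only the three documented build modes are accepted.
    case "$mode" in
        -prod|-dev|-debug) ;;
        *) echo "error: mode must be -prod, -dev, or -debug" >&2; return 1 ;;
    esac

    echo "./make_pem_fcm -arch $arch -p $planet -d $dims -j $nproc $mode pem"
}
```

For example, `pem_build_cmd <arch> mars 64x48x32 8 -debug` prints a command that can be reviewed and then run from the PEM root.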

## Run a Chained PCM/PEM Workflow

### Quick Commands

From your simulation directory:

```bash
./pem_workflow.sh
```

Resume a successful chain:

```bash
./pem_workflow.sh re
```

Detailed setup is documented in `deftank/README.md`.

### Required Files in Simulation Directory

- PCM and PEM executables built with compatible dimensions/options.
- Workflow scripts: `pem_workflow.sh`, `pem_workflow_lib.sh`, `pcm_run.job`, `pem_run.job`.
- Runtime configs: `run_pcm.def`, `run_pem.def`, `callphys.def`, `z2sig.def`, `traceur.def`.
- XIOS XML files: `iodef.xml`, `context_pcm_physics.xml`, `field_def_physics.xml`, `file_def_physics.xml`.
- Start files: `startfi.nc`, plus either `start.nc` or `start1D.txt`.

Optional runtime files:

- orbital forcing file configured by `orbitdata_path` in `run_pem.def`,
- `diagevo.def`,
- `startevo.nc` (restart input).
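
A quick pre-flight check of the simulation directory can catch missing files before a job is submitted. The sketch below is hypothetical (`pem_check_simdir` is not a PEM script). It skips the executables because their names depend on the build configuration, and it assumes `startfi.nc` must be accompanied by either `start.nc` or `start1D.txt`.

```bash
# Sketch: list required workflow files missing from a simulation directory.
# Usage: pem_check_simdir [dir]
pem_check_simdir() {
    local dir="${1:-.}" f missing=0

    # Workflow scripts, runtime configs, XIOS XML files, physics start file.
    for f in \
        pem_workflow.sh pem_workflow_lib.sh pcm_run.job pem_run.job \
        run_pcm.def run_pem.def callphys.def z2sig.def traceur.def \
        iodef.xml context_pcm_physics.xml field_def_physics.xml \
        file_def_physics.xml startfi.nc
    do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=$((missing + 1))
        fi
    done

    # Assumption: one of start.nc or start1D.txt must be present.
    if [ ! -f "$dir/start.nc" ] && [ ! -f "$dir/start1D.txt" ]; then
        echo "missing: start.nc or start1D.txt"
        missing=$((missing + 1))
    fi

    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}
```

Optional files (`diagevo.def`, `startevo.nc`, the orbital forcing file) are deliberately not checked.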

### XIOS Requirement Rationale

XIOS XML files are mandatory for chained runs because PCM outputs consumed by PEM (`xoutdaily4pem*.nc`, `xoutyearly4pem*.nc`) are defined through these XML contexts and file definitions.

### diagevo.def Behavior

- missing file: no PEM diagnostic variables are written,
- empty file: all coded PEM diagnostic variables are written,
- non-empty file: only the listed variables are written.

Edit `diagevo.def` before launching or resuming a cycle.
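
The three cases can be checked from the shell before a launch. This is a minimal sketch with a hypothetical helper name (`diagevo_mode`). It treats "empty" as a zero-byte file, which is an assumption: how PEM handles a file containing only whitespace is not covered here.

```bash
# Sketch: report which diagnostics mode a diagevo.def file selects.
# Prints "none" (file missing), "all" (empty file), or "listed" (has content).
diagevo_mode() {
    local file="${1:-diagevo.def}"
    if [ ! -e "$file" ]; then
        echo "none"      # no PEM diagnostic output
    elif [ ! -s "$file" ]; then
        echo "all"       # zero-byte file: every coded diagnostic
    else
        echo "listed"    # only the variables named in the file
    fi
}
```

Running `diagevo_mode` in the simulation directory shows which behavior the next cycle will use.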

## Outputs

A chained run produces:

- PCM outputs (`restartfi.nc`, `restart.nc`, diagnostics),
- XIOS outputs (`xoutdaily4pem*.nc`, `xoutyearly4pem*.nc`),
- workflow state/log files (`pem_workflow.log`, `pem_workflow.sts`, optional `kill_pem_workflow.sh`),
- PEM outputs (`restartevo.nc`, restart files, `diagevo.nc`).

Subdirectories used during runs:

- `logs/`: run logs,
- `starts/`: restart snapshots,
- `diags/`: diagnostics.
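
After a run, a short summary of which outputs exist helps confirm that each stage produced its files. The sketch below uses a hypothetical helper (`pem_summarize_outputs`) and assumes the files sit at the top level of the simulation directory; files already moved into `starts/` or `diags/` are not counted.

```bash
# Sketch: summarize chained-run outputs found at the top of a directory.
# Usage: pem_summarize_outputs [dir]
pem_summarize_outputs() {
    local dir="${1:-.}" f pattern n

    # Single files named in the output list above.
    for f in restartfi.nc restart.nc restartevo.nc diagevo.nc \
             pem_workflow.log pem_workflow.sts; do
        if [ -e "$dir/$f" ]; then
            echo "$f: present"
        else
            echo "$f: absent"
        fi
    done

    # Count the XIOS files that PEM consumes.
    for pattern in 'xoutdaily4pem*.nc' 'xoutyearly4pem*.nc'; do
        n=$(find "$dir" -maxdepth 1 -name "$pattern" | wc -l)
        echo "$pattern: $((n)) file(s)"
    done
}
```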

## Executable Version and Provenance

PEM embeds build and version-control provenance in the executable at compile time.

Get version/provenance information:

```bash
./pem_<config>.e --version [output_file]
```

Examples:

```bash
./pem_8_mars_seq.e --version
./pem_8_mars_seq.e --version pem_build_info.txt
```

What is reported:

- runtime context (user, machine, command, date),
- compilation command and timestamp,
- per-repository VCS metadata (Git/SVN where available),
- working-copy status and diffs captured at build time.

Implementation path:

- `make_pem_fcm` generates version-report code,
- `config/ppsrc/misc_req/program_options.f90` parses `--version`,
- `config/ppsrc/misc_req/version_control.f90` writes report content.
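
For reproducibility it can be useful to archive the provenance report alongside each run. This is a hedged sketch: `pem_save_provenance` is a hypothetical wrapper, and it assumes that `--version <output_file>` writes the report to that file, as the examples above suggest.

```bash
# Sketch: store the build provenance of a PEM executable next to a run.
# Usage: pem_save_provenance <executable> [output_file]
pem_save_provenance() {
    local exe="$1" out="${2:-pem_build_info.txt}"

    if [ ! -f "$exe" ] || [ ! -x "$exe" ]; then
        echo "error: $exe is not an executable file" >&2
        return 1
    fi

    # Assumption: the optional argument after --version is the report file.
    "$exe" --version "$out" || return 1

    # Confirm that a non-empty report was produced.
    [ -s "$out" ] && echo "provenance written to $out"
}
```

For example, `pem_save_provenance ./pem_8_mars_seq.e logs/pem_build_info.txt` keeps the report with the run logs.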

## Contributor Coding Guidance

This section is based on current PEM source patterns (`src/common`, `src/mars`, `src/generic`).

### Contributor Quick Checklist

Before adding or changing code, check:

1. Does the feature belong in `src/common`, `src/mars`, or `src/generic`?
2. Are new variables local by default, with module-level state only when needed?
3. Are all procedure arguments explicit with `intent(...)`?
4. Is visibility explicit (`private`, `protected`) for new module state, and are constants declared as `parameter`?
5. Are validation/error paths explicit and early (`stop_clean`, input checks)?
6. Is the implementation simple enough for long-term maintenance?

### 1. Module Placement (Common vs Generic vs Mars)

Default rule:

- put planet-independent logic in `src/common`,
- put Mars-only physics/processes in `src/mars`,
- put generic/exoplanet-specific alternatives in `src/generic`.

Examples from current code:

- `src/common/physics.F90`, `src/common/geometry.F90`: shared kernels,
- `src/mars/sorption.F90`, `src/mars/glaciers.F90`, `src/mars/layered_deposits.F90`: Mars-only processes,
- `src/generic/hydrology.F90`: generic alternative physics.

Recommendation:

- if a routine depends on Mars-only modules (for example sorption/layered deposits), keep it out of `src/common`.

Motivation:

- preserves profile isolation,
- reduces accidental cross-profile coupling,
- keeps documentation and dependency analysis readable.

Practical feature routing examples:

- shared numerics/physics utility: `src/common`
- Mars-only cryosphere/sorption behavior: `src/mars`
- generic/exoplanet alternative hydrology or process model: `src/generic`
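
The placement rule can be spot-checked with a text search. The sketch below is a rough heuristic, not a PEM tool: `pem_common_leaks` is a hypothetical name, it only recognizes plain `module <name>` lines with no trailing comment, and it only detects `use <name>` statements at the start of a line.

```bash
# Sketch: list src/common files that `use` a module defined under
# src/mars or src/generic.
# Usage: pem_common_leaks [pem_root]
pem_common_leaks() {
    local root="${1:-.}" profile mod
    for profile in mars generic; do
        [ -d "$root/src/$profile" ] || continue
        # Collect module names: lines of the form "module <name>".
        # "module procedure ..." and "end module ..." do not match.
        grep -rhiE '^[[:space:]]*module[[:space:]]+[a-z_][a-z0-9_]*[[:space:]]*$' \
            "$root/src/$profile" 2>/dev/null |
        awk '{print tolower($2)}' | sort -u |
        while read -r mod; do
            # Report each common file that imports this profile module.
            grep -rliE "^[[:space:]]*use[[:space:]]+${mod}([^a-z0-9_]|\$)" \
                "$root/src/common" 2>/dev/null |
            sed "s|\$| uses $profile module $mod|"
        done
    done
}
```

An empty result means no such import was found by this heuristic. It does not prove the absence of cross-profile coupling.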

### 2. Variable Placement (Local vs Module Scope)

Recommended default:

- keep variables local to subroutines/functions unless persistent cross-call state is required.
- expose module state as `protected` when read access is needed externally.
- keep internal mutable state `private`.

Use module-level state only for:

- simulation fields that represent persistent model state,
- configuration loaded once and reused by many routines.

Motivation:

- local scope improves testability and thread safety,
- controlled module scope reduces side effects and debugging time.

Rule followed in the current sources (`pem.F90` and `planet.F90`):

- keep persistent climate state in module `planet` (values reused across many timesteps),
- keep transient workflow/control variables in program `pem.F90` (one-shot buffers and per-iteration work arrays).

Concrete examples from current sources:

- persistent state in `planet` modules:
	- pressure/surface/soil state (`ps_avg`, `tsurf_avg`, `tsoil_avg`),
	- evolving ice/tracer state (`h2o_ice`, `co2_ice`, `q_co2_ts`, `q_h2o_ts`),
	- profile-specific persistent fields (for example Mars layering/sorption arrays).
- transient state in `pem.F90`:
	- initialization buffers loaded then deallocated,
	- per-iteration temporary arrays allocated for one evolution step then released,
	- workflow control scalars and stop flags.

Other coding recommendations directly aligned with source patterns:

- separate allocation lifecycle in dedicated routines (`allocate_xios_state`, `allocate_deviation_state`, `allocate_startevo_state`, etc.),
- deallocate temporary arrays as soon as their stage ends to limit memory growth on large grids,
- keep profile-specific state isolated (`src/mars` vs `src/generic`) and avoid leaking profile assumptions into `src/common`.

### 3. Subroutine/Function Argument Conventions

Follow current PEM patterns:

- always use explicit `intent(in|out|inout)`,
- prefer explicit shapes or assumed-shape arrays with a clear contract,
- avoid optional arguments unless they improve API clarity for real use cases,
- avoid hidden large temporaries in argument expressions for large arrays.

Motivation:

- prevents misuse at call sites,
- improves compiler diagnostics,
- avoids memory spikes and performance regressions in HPC runs.

### 4. Visibility and API Design (public/private/protected/parameter)

Recommended logic:

- `parameter`: constants,
- `private`: implementation details that callers must not touch,
- `protected`: exported state that must not be modified directly by callers,
- procedures: expose narrow public interfaces (`ini_*`, `end_*`, `set_*`) and hide helpers.

Motivation:

- explicit boundaries improve reliability,
- smaller public APIs are easier to maintain and document.

### 5. Naming, Style, and Comments

Keep the existing PEM style:

- lowercase module/procedure/variable names with underscores,
- `use <module>, only: ...` imports,
- `implicit none` in every module,
- structured module and routine headers (`NAME`, `DESCRIPTION`, `AUTHORS & DATE`, `NOTES`),
- concise inline comments focused on units, assumptions, and non-obvious logic.

Recommendation:

- standardize new parameter names in lowercase underscore style for consistency.

Motivation:

- consistency improves discoverability in search and Doxygen,
- concise scientific comments reduce onboarding time for new contributors.
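
The `implicit none` convention is easy to verify mechanically. This is a minimal sketch with a hypothetical helper name (`pem_missing_implicit_none`). It works at file level only, so a file with several modules passes as soon as one of them contains the statement.

```bash
# Sketch: list Fortran sources under a directory that lack `implicit none`.
# Usage: pem_missing_implicit_none [dir]
pem_missing_implicit_none() {
    local dir="${1:-src}"
    # grep -L prints the names of files that contain no matching line.
    find "$dir" -type f \( -name '*.F90' -o -name '*.f90' \) \
        -exec grep -LiE '^[[:space:]]*implicit[[:space:]]+none' {} +
}
```

Run from the PEM root, it prints nothing when every source file under `src/` contains the statement.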

### 6. Architecture and Maintainability Rules

Recommended defaults:

- one module, one clear responsibility,
- centralize initialization/finalization through `ini_*` / `end_*` orchestration,
- validate critical inputs early and fail with explicit context,
- keep I/O wrappers separated from scientific kernels.

Motivation:

- easier refactoring,
- clearer dependency graphs,
- less risk of cross-cutting regressions.

Source-aligned reliability patterns:

- validate configuration and dimensions before expensive loops,
- guard I/O with explicit checks,
- keep failure messages explicit and actionable.

In PEM this typically appears as `stop_clean(__FILE__, __LINE__, ...)`-style failure points and explicit NetCDF status checks in I/O wrappers.

### 7. Memory vs Compute Trade-off Rules

Recommended defaults:

- prioritize memory safety first on large grids,
- avoid hidden temporary arrays in hot paths,
- allocate once and reuse where practical,
- optimize compute only after profiling confirms bottlenecks.

Motivation:

- PEM workloads can be long-running and grid-heavy,
- memory spikes and hidden temporaries can be more damaging than small compute overhead,
- staged optimization improves reliability while preserving performance gains.

## Code Quality Characteristics

| Characteristic | Recommendation | Motivation in PEM context |
|---|---|---|
| Reliability | Validate dimensions/config values early; fail fast with context. | Prevents silent corruption across long workflows. |
| Robustness | Guard I/O and restart handling with explicit checks and clear errors. | Supports restart/resume in HPC environments. |
| Maintainability | Prefer small procedures with explicit contracts and clear init/end ownership. | Reduces regression risk when extending long-lived climate workflows. |
| Modularity | Keep physics, numerics, I/O, and workflow concerns in separate modules. | Limits coupling and keeps profile-specific code isolated. |
| Readability | Use consistent naming and short comments on units/assumptions. | Scientific intent stays clear for users and reviewers. |
| Simplicity | Prefer straightforward implementations over clever shortcuts. | Easier review, debugging, and long-term maintenance. |
| Discoverability | Keep predictable file/module names and link docs to modules. | Faster onboarding and easier navigation in Doxygen. |
| Efficiency (memory) | Avoid unnecessary temporaries, allocate once when possible. | Prevents memory pressure on large grids. |
| Efficiency (compute) | Keep hot loops simple and avoid hidden expensive operations. | Preserves runtime throughput in chained simulations. |
| Extensibility | Add new physics behind narrow module interfaces and profile boundaries. | New features can be integrated without destabilizing shared kernels. |
| Scalability | Keep algorithms/data structures compatible with larger grids and longer runs. | Grids and run lengths can grow without redesign. |
| Portability | Avoid compiler-specific behavior unless isolated behind architecture files. | Same source can build on local and HPC environments. |
| Usability | Keep command interfaces explicit and documented with required vs optional arguments. | Newcomers can run workflows correctly on first attempt. |

## Related Documents

- `deftank/README.md`: workflow setup and run templates.
- `datadir/README.md`: orbital/input data usage.
- `toolbox/README.md`: post-processing and analysis tools.
- `doc/README.md`: source-tree documentation index and Doxygen workflow.

Online resources:

- Wiki: https://lmdz-forge.lmd.jussieu.fr/mediawiki/Planets/index.php/PEM_(Planetary_Evolution_Model)
- SVN/Trac: https://trac.lmd.jussieu.fr/Planeto
- Change history: `changelog.txt`

## Build Local Doxygen Documentation

From PEM root:

```bash
./doc/build_doc.sh
```

Useful commands:

```bash
./doc/build_doc.sh --check-deps
./doc/build_doc.sh --clean
./doc/build_doc.sh --profile mars
./doc/build_doc.sh --profile generic
./doc/build_doc.sh --profile all
./doc/build_doc.sh --update-warnings-baseline
./doc/build_doc.sh --strict-warnings
```

Direct Doxygen command (static Mars-oriented config):

```bash
doxygen doc/Doxyfile
```

Prefer `./doc/build_doc.sh` for profile-aware generation.
