Quickstart: End-to-End Pipelines
This guide provides a complete overview of how to run node-fdm end-to-end. It covers the abstract workflow used by all architectures and provides a step-by-step walkthrough of the OpenSky 2025 reference implementation.
Configuration
All paths assume you are at the repository root. Each pipeline ships its own configuration file:
- scripts/opensky/config.yaml
- scripts/qar/config.yaml
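As a minimal sketch of the layout above, the helper below resolves a pipeline's configuration file relative to the repository root. The function name config_path and the PIPELINES tuple are hypothetical and are not part of node-fdm; only the two paths come from this guide.

```python
from pathlib import Path

# Pipeline names taken from the two config paths listed in this guide.
PIPELINES = ("opensky", "qar")


def config_path(pipeline: str, repo_root: Path = Path(".")) -> Path:
    """Return scripts/<pipeline>/config.yaml relative to the repository root.

    Hypothetical helper for illustration; not a node-fdm API.
    """
    if pipeline not in PIPELINES:
        raise ValueError(f"unknown pipeline {pipeline!r}; expected one of {PIPELINES}")
    return repo_root / "scripts" / pipeline / "config.yaml"
```

Calling config_path("qar") from the repository root yields scripts/qar/config.yaml, which can then be passed to whichever YAML loader the scripts use.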
General Pattern
Regardless of the data source (ADS-B or QAR), every architecture follows this 7-step logic.
graph LR
subgraph Prep [1. Data Preparation]
direction TB
S1[Collect & Map] --> S2[Decode & Clean]
S2 --> S3[Feature Enrichment]
S3 --> S4[Dataset Split]
end
subgraph Learn [2. Learning]
direction TB
S5[Train ODETrainer]
end
subgraph Eval [3. Deployment]
direction TB
S6[Inference] --> S7[Evaluation & Viz]
end
Prep --> Learn --> Eval
classDef phase fill:#f9f9f9,stroke:#333,stroke-width:1px;
class Prep,Learn,Eval phase;
- Collect and prepare raw data: Ensure inputs map to the architecture's Column definitions.
- Decode, resample, and clean: Build consistent time steps and remove invalid segments.
- Feature enrichment: Add environmental inputs (e.g., ERA5) and compute derived physics quantities.
- Dataset splitting: Generate train/val/test lists pointing to processed parquet files.
- Training: Run ODETrainer, loading the specific model_params from model.py.
- Inference: Load checkpoints with NodeFDMPredictor to generate trajectory rollouts.
- Evaluation: Compute metrics (MAE/MAPE) and generate comparison plots.
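The dataset splitting step can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the function name split_files, the 10 % validation and test fractions, and the fixed seed are all hypothetical. It splits at the file level, so all samples from one processed file stay in a single subset.

```python
import random


def split_files(files, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle processed files deterministically and split them into three lists.

    Hypothetical helper; the fractions and seed are illustrative defaults.
    """
    # Sort first so the result does not depend on directory listing order.
    files = sorted(str(f) for f in files)
    random.Random(seed).shuffle(files)
    n_test = int(len(files) * test_frac)
    n_val = int(len(files) * val_frac)
    return {
        "test": files[:n_test],
        "val": files[n_test:n_test + n_val],
        "train": files[n_test + n_val:],
    }
```

Seeding the shuffle keeps the train/val/test lists reproducible across runs, which matters when checkpoints and evaluation results are compared later.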
OpenSky 2025 (ADS-B) Pipeline
This reference pipeline processes public ADS-B data. The scripts are located in scripts/opensky/.
These steps fetch raw data and transform it into enriched, training-ready tensors.
1. Aircraft Sampling: Builds the database of target aircraft types.
* Input: Trino SQL connection.
* Output: data/aircraft_db.csv.
2. Download Raw Data: Fetches flight history and extended tables.
* Output: data/downloaded_parquet/.
3. Decode & Resample: Decodes specific BDS messages, filters short flights, and resamples to a 4 s time step.
* Note: Handles ADEP/ADES distance computation.
4. Enrichment: Injects ERA5 weather data and smooths signals.
* Output: Enriched files in data/processed_flights/<TYPECODE>/.
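To illustrate what resampling to a 4 s step means in step 3, here is a minimal standard-library sketch that linearly interpolates an irregular series onto a regular grid. The function name resample_linear and the choice of linear interpolation are assumptions; the actual script may use a different method or library.

```python
from bisect import bisect_right


def resample_linear(times, values, step=4.0):
    """Interpolate (times, values) onto a grid spaced `step` seconds apart.

    `times` must be strictly increasing. Hypothetical helper for illustration.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples, with equal-length inputs")
    grid, out = [], []
    k = 0
    # Build grid points as start + k * step to avoid accumulating float error.
    while times[0] + k * step <= times[-1]:
        t = times[0] + k * step
        # Index of the last sample at or before t, clamped so i + 1 stays valid.
        i = min(bisect_right(times, t) - 1, len(times) - 2)
        t0, t1 = times[i], times[i + 1]
        w = (t - t0) / (t1 - t0)
        grid.append(t)
        out.append(values[i] + w * (values[i + 1] - values[i]))
        k += 1
    return grid, out
```

For samples at 0, 2, 5, and 8 seconds, the grid is 0, 4, and 8 seconds, and the value at 4 s is interpolated between the samples at 2 s and 5 s.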
Train the Neural ODE using the architecture defined in node_fdm.architectures.opensky_2025.
5. Train Model
* Action: Uses ODETrainer.
* Output: Checkpoints saved to models/opensky_<TYPECODE>/.
Generate predictions and benchmark against BADA.
6. Inference (Rollouts)
* Output: data/predicted_flights/<TYPECODE>/.
7. Baselines & Metrics: Run these scripts in order to assess performance:
* 07_bada_prediction.py: Computes the physical baseline (requires BADA 4.2 files).
* 08_visualize_predictions.py: Generates overlays (Ground Truth vs Model vs BADA).
* 09_performance_aggregation.py: Computes MAE/MAPE metrics per flight phase.
* 10_dataset_stats.py: Generates coverage statistics.
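The per-phase MAE/MAPE aggregation can be sketched as below. This is not the code of 09_performance_aggregation.py: the function names, the phase labels, and the choice to skip near-zero ground-truth values in MAPE are assumptions made for illustration.

```python
from collections import defaultdict


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred, eps=1e-9):
    """Mean absolute percentage error, in percent.

    Samples whose ground truth is (near) zero are skipped, since the
    ratio is undefined there. This handling is an assumption.
    """
    terms = [abs((t - p) / t) for t, p in zip(y_true, y_pred) if abs(t) > eps]
    if not terms:
        raise ValueError("no non-zero ground-truth samples")
    return 100.0 * sum(terms) / len(terms)


def metrics_by_phase(phases, y_true, y_pred):
    """Group samples by flight-phase label and compute MAE/MAPE per group."""
    groups = defaultdict(lambda: ([], []))
    for phase, t, p in zip(phases, y_true, y_pred):
        groups[phase][0].append(t)
        groups[phase][1].append(p)
    return {
        phase: {"mae": mae(t, p), "mape": mape(t, p)}
        for phase, (t, p) in groups.items()
    }
```

Reporting both metrics per phase is useful because MAE keeps the physical unit of the variable, while MAPE lets variables with different scales be compared.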
General Tips
Single Source of Truth
Always use the pipeline's config.yaml to define paths, typecodes, and shared parameters. Do not hardcode paths in scripts.
Caching
Ensure data/era5_cache exists. Meteorological data download is slow; caching prevents repeated downloads of the same ERA5 fields.
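The caching idea can be illustrated with a small download-once wrapper. Everything here except the data/era5_cache directory is hypothetical: the request dictionary, the download callable, and the hashed file naming are assumptions, and the project's real cache format is not described in this guide.

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("data/era5_cache")


def cache_path(request: dict, cache_dir: Path = CACHE_DIR) -> Path:
    """Map a request to a stable file name, independent of key order."""
    payload = json.dumps(request, sort_keys=True).encode("utf-8")
    return cache_dir / (hashlib.sha256(payload).hexdigest()[:16] + ".bin")


def fetch_cached(request: dict, download, cache_dir: Path = CACHE_DIR) -> bytes:
    """Return cached bytes if present; otherwise call `download` and store the result.

    `download` is a hypothetical callable taking the request and returning bytes.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)  # create the cache directory if missing
    path = cache_path(request, cache_dir)
    if path.exists():
        return path.read_bytes()
    data = download(request)
    path.write_bytes(data)
    return data
```

A second call with the same request reads from disk and never invokes the download function, which is the behavior the tip above relies on.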
Hardware Optimization
If you face memory issues, adjust the following in model_config:
- Decrease batch_size.
- Decrease num_workers.
- Adjust seq_len (sequence length).
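A rough back-of-the-envelope estimate shows why these three settings matter. The sketch assumes dense float32 inputs of shape (batch_size, seq_len, n_features) and a loader that keeps a fixed number of batches in flight per worker; the feature count and the per-worker prefetch depth are illustrative assumptions. It covers input batches only, not activations or gradients, which usually take more memory during training.

```python
def batch_bytes(batch_size, seq_len, n_features, bytes_per_value=4):
    """Bytes for one dense input batch of shape (batch_size, seq_len, n_features)."""
    return batch_size * seq_len * n_features * bytes_per_value


def prefetched_bytes(one_batch_bytes, num_workers, batches_per_worker=2):
    """Host memory held by batches queued across all loader workers.

    The per-worker depth of 2 is an assumption, not a node-fdm setting.
    """
    return one_batch_bytes * num_workers * batches_per_worker


one = batch_bytes(batch_size=64, seq_len=100, n_features=10)
print(one)                                   # input bytes for one batch
print(prefetched_bytes(one, num_workers=4))  # bytes queued by four workers
```

Because the estimate is a plain product, halving batch_size or seq_len halves the per-batch footprint, and lowering num_workers reduces the number of batches held in host memory at once.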
Next Steps
Now that you understand the core building blocks, start applying them by following the step-by-step development workflow:
- 1. Configure Project: Set up paths, typecodes, and key hyperparameters for your pipeline.
- 2. Create an Architecture: Define your custom model's column groups, preprocessing hooks, and layer stack.
- 3. Train a Model: Launch the learning process using the ODETrainer.
- 4. Run Inference: Generate trajectory rollouts and evaluate the model's performance.