Machine Learning for Sunset Temperature Prediction
Generated: 2026-02-08 11:09
Dataset: Rubin Observatory primary mirror temperature, Jun–Dec 2025
Prediction task: Forecast the sunset temperature from the full temperature history up to 3 hours before sunset
1. Introduction
Predicting the temperature of the Rubin Observatory primary mirror at sunset is critical for
operational planning. Thermal gradients across the mirror surface can degrade image quality,
so knowing the sunset temperature 3 hours in advance allows time to implement thermal
control strategies.
This analysis compares five machine learning approaches for sunset temperature forecasting,
using the complete time history of temperature measurements from the start of each day up to
3 hours before sunset. The training data (even days of the month) comprises 90 days, and
the test data (odd days) comprises 92 days.
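The even/odd split by day of month can be sketched as follows. This is a hypothetical reconstruction: the column names ("timestamp", "temp_C") and the synthetic values are assumptions, not the report's actual data schema.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 15-minute telemetry (real data comes from
# the observatory archive; schema here is assumed for illustration).
rng = pd.date_range("2025-06-01", "2025-12-01", freq="15min")
df = pd.DataFrame({
    "timestamp": rng,
    "temp_C": np.random.default_rng(0).normal(10.0, 3.0, len(rng)),
})

# Split by day of month: even days train, odd days test.
train = df[df["timestamp"].dt.day % 2 == 0]
test = df[df["timestamp"].dt.day % 2 == 1]
```

Splitting by calendar day (rather than a contiguous block) keeps both sets spread across the full June–December seasonal range.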
2. Methods Tested
Method 1: 24-Hour Mean Baseline
Simple baseline that predicts sunset temperature as the mean of the last 24 hours of measurements.
Provides a naive persistence forecast assuming recent thermal behavior continues.
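At 15-minute sampling, 24 hours corresponds to 96 samples, so the baseline reduces to a single mean. A minimal sketch (function name is illustrative):

```python
import numpy as np

def mean_baseline(history_temps, samples_per_day=96):
    """Predict the sunset temperature as the mean of the last 24 h
    of 15-minute samples (96 samples per day)."""
    return float(np.mean(history_temps[-samples_per_day:]))
```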
Method 2: Linear Trend Extrapolation (6 hours)
Fits a linear regression to the last 6 hours of temperature data and extrapolates forward to
the sunset time. Captures short-term trends but is sensitive to noise and may extrapolate
poorly beyond the fitting window.
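The fit-and-extrapolate step can be sketched with a first-degree polynomial fit (a plausible implementation; the report's exact code may differ):

```python
import numpy as np

def linear_trend_forecast(times_h, temps, fit_window_h=6.0, horizon_h=3.0):
    """Fit a line to the last `fit_window_h` hours of data and
    extrapolate `horizon_h` hours past the last sample (sunset)."""
    t = np.asarray(times_h, dtype=float)
    y = np.asarray(temps, dtype=float)
    mask = t >= t[-1] - fit_window_h          # keep the trailing 6-hour window
    slope, intercept = np.polyfit(t[mask], y[mask], deg=1)
    return float(slope * (t[-1] + horizon_h) + intercept)
```

Because the slope is estimated from only 6 hours of noisy samples, small slope errors are amplified over the 3-hour extrapolation, consistent with this method's weak test performance.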
Method 3: Fourier Series (4 days)
Models the temperature as a Fourier series with 24-hour fundamental period and 3 harmonics,
fitting to the prior 4 days of data. Captures the diurnal temperature cycle naturally but
struggles with transient weather events and extrapolation instability.
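One way to fit such a series is ordinary least squares on a cosine/sine design matrix; this sketch assumes that formulation (the report may use a different fitting routine):

```python
import numpy as np

def fourier_forecast(times_h, temps, horizon_h=3.0, period_h=24.0, n_harmonics=3):
    """Fit a Fourier series (24 h fundamental plus harmonics) by least
    squares and evaluate it `horizon_h` hours past the last sample."""
    t = np.asarray(times_h, dtype=float)

    def design_row(tt):
        row = [np.ones_like(tt)]
        for k in range(1, n_harmonics + 1):
            w = 2.0 * np.pi * k / period_h
            row += [np.cos(w * tt), np.sin(w * tt)]
        return row

    A = np.column_stack(design_row(t))
    coef, *_ = np.linalg.lstsq(A, np.asarray(temps, dtype=float), rcond=None)
    t_pred = t[-1] + horizon_h
    return float(np.column_stack(design_row(np.array([t_pred]))) @ coef)
```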
Method 4: XGBoost with Engineered Features
Gradient-boosted decision tree ensemble trained on engineered statistical features:
mean, std, min, max, current value, 6-hour trend, and 6-hour mean. Robust to noise
and non-linear relationships, with good generalization via regularization.
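The feature vector listed above can be sketched as a plain NumPy function; the 24-samples-per-6-hours constant follows from the 15-minute sampling, while the exact feature ordering is an assumption. The resulting vectors would then be fed to a gradient-boosted regressor such as xgboost.XGBRegressor.

```python
import numpy as np

def engineered_features(temps, times_h, samples_per_6h=24):
    """Build the statistical feature vector described in the report:
    mean, std, min, max, current value, 6-hour trend, 6-hour mean.
    (Feature order is an assumption for illustration.)"""
    y = np.asarray(temps, dtype=float)
    t6 = np.asarray(times_h, dtype=float)[-samples_per_6h:]
    recent = y[-samples_per_6h:]
    slope = np.polyfit(t6, recent, deg=1)[0]   # 6-hour trend, °C per hour
    return np.array([y.mean(), y.std(), y.min(), y.max(),
                     y[-1], slope, recent.mean()])
```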
Method 5: NBEATSx-Ridge (Windowed Ridge Regression)
Inspired by the NBEATSx neural architecture, this approach uses a sliding window of the last
12 measurements (3 hours) as input features, plus time-of-day and basic statistics. A Ridge
regression model (L2 regularization, alpha=10.0) learns the mapping from windowed history
to sunset temperature. Fast inference, interpretable linear model.
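A minimal sketch of the windowed-Ridge idea, assuming the feature layout described above (12 lags, time of day, window mean and std); function names and the exact feature composition are illustrative, not the report's code:

```python
import numpy as np
from sklearn.linear_model import Ridge

def make_features(window, sunset_hour):
    """Features: last 12 samples (3 h at 15-min cadence), time of day,
    and basic window statistics. Layout is an assumption."""
    w = np.asarray(window, dtype=float)
    return np.concatenate([w, [sunset_hour, w.mean(), w.std()]])

def fit_ridge(X, y, alpha=10.0):
    """L2-regularized linear map from windowed history to sunset temperature."""
    return Ridge(alpha=alpha).fit(X, y)
```

The learned coefficients on the 12 lag features give a direct, interpretable picture of how much each point in the trailing 3-hour window contributes to the forecast.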
3. Results
Performance Comparison
| Method | MAE (°C) | RMSE (°C) | R² |
|---|---|---|---|
| 24-Hour Mean Baseline | 2.0273 | 2.6791 | 0.6852 |
| Linear Trend (6h) | 3.8790 | 4.1906 | 0.2297 |
| Fourier Series (4 days) | 3.6032 | 4.6689 | 0.0439 |
| XGBoost + Engineered Features | 1.1449 | 1.5062 | 0.9005 |
| NBEATSx-Ridge | 0.8945 | 1.1222 | 0.9448 |
Residual Analysis
Residual-plot diagnostics for the best-performing model (NBEATSx-Ridge):
- Actual Temps vs Prediction Errors: Prediction errors are much smaller
than the range of actual temperatures, indicating good predictive skill.
- Residuals vs Predicted: Residuals are randomly scattered around zero
with no clear trend, suggesting the model is unbiased and has captured the relationship well.
- Residuals vs Actual: No systematic pattern, confirming the model performs
equally well across the temperature range.
- Residual Distribution: Approximately normal distribution (slight positive
skew), consistent with well-behaved prediction errors.
- Q-Q Plot: Residuals follow the theoretical normal distribution closely,
with slight deviations in the tails indicating occasional larger errors.
- Residuals vs Time Order: No temporal autocorrelation or drift,
suggesting the model is stable and doesn't have systematic time-dependent biases.
4. Recommendation
Recommended Method: NBEATSx-Ridge
The NBEATSx-Ridge windowed regression achieves the best performance (MAE = 0.89°C, R² = 0.94)
and is recommended for operational use. Key advantages:
- Best accuracy: 22% lower MAE than XGBoost, the next-best method
- Fast inference: Single linear model, no iterative optimization
- Interpretable: Linear weights show importance of each time lag
- Robust: Residuals show no systematic bias or patterns
- Simple deployment: Minimal dependencies (NumPy, scikit-learn)
The typical 3-hour-ahead forecast error is less than 1°C, providing operationally useful
predictions for thermal control planning.
5. Key Insights
- Time series methods benefit from leveraging the full temporal history, not just a single
point 3 hours before sunset.
- Windowed features capture local patterns effectively without requiring explicit periodic
models (like Fourier series).
- XGBoost and NBEATSx-Ridge both outperform the classical time series methods (mean, trend,
Fourier), highlighting the value of engineered and windowed features.
- Linear extrapolation performs poorly due to short-term noise amplification when
extrapolating 3 hours ahead.
- Fourier series struggles because transient weather patterns break the assumption of
strict periodicity.
6. Data Provenance
- Source: Vera C. Rubin Observatory primary mirror thermal telemetry
- Period: June 1 – December 1, 2025
- Location: Cerro Pachón, Chile (lat -30.24°, lon -70.74°, elev 2715 m)
- Sampling: 15-minute intervals
- Training: Even days of month (90 days)
- Testing: Odd days of month (92 days)
- Code: ml_sunset_comparison_v2.py