Machine Learning for Sunset Temperature Prediction
Generated: 2026-02-08 11:09
Dataset: Rubin Observatory primary mirror temperature, Jun–Dec 2025
Prediction task: Forecast the sunset temperature from the full temperature history up to 3 hours before sunset
1. Introduction
Predicting the temperature of the Rubin Observatory primary mirror at sunset is critical for
operational planning. Thermal gradients across the mirror surface can degrade image quality,
so knowing the sunset temperature 3 hours in advance allows time to implement thermal
control strategies.
This analysis compares five machine learning approaches for sunset temperature forecasting,
using the complete time history of temperature measurements from the start of each day up to
3 hours before sunset. The training data (even days of the month) comprises 90 days, and
the test data (odd days) comprises 92 days.
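The even/odd split by day of month can be sketched as follows. This is a hypothetical reconstruction: the column names ("timestamp", "temp_C") and the synthetic values are assumptions, not the report's actual data schema.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 15-minute telemetry (real data comes from
# the observatory archive; schema here is assumed for illustration).
rng = pd.date_range("2025-06-01", "2025-12-01", freq="15min")
df = pd.DataFrame({
    "timestamp": rng,
    "temp_C": np.random.default_rng(0).normal(10.0, 3.0, len(rng)),
})

# Split by day of month: even days train, odd days test.
train = df[df["timestamp"].dt.day % 2 == 0]
test = df[df["timestamp"].dt.day % 2 == 1]
```

Splitting by calendar day (rather than a contiguous block) keeps both sets spread across the full June–December seasonal range.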
2. Methods Tested
Method 1: 24-Hour Mean Baseline
Simple baseline that predicts sunset temperature as the mean of the last 24 hours of measurements.
Provides a naive persistence forecast assuming recent thermal behavior continues.
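At 15-minute sampling, 24 hours corresponds to 96 samples, so the baseline reduces to a single mean. A minimal sketch (function name is illustrative):

```python
import numpy as np

def mean_baseline(history_temps, samples_per_day=96):
    """Predict the sunset temperature as the mean of the last 24 h
    of 15-minute samples (96 samples per day)."""
    return float(np.mean(history_temps[-samples_per_day:]))
```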
Method 2: Linear Trend Extrapolation (6 hours)
Fits a linear regression to the last 6 hours of temperature data and extrapolates forward to
the sunset time. Captures short-term trends but is sensitive to noise and may extrapolate
poorly beyond the fitting window.
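The fit-and-extrapolate step can be sketched with a first-degree polynomial fit (a plausible implementation; the report's exact code may differ):

```python
import numpy as np

def linear_trend_forecast(times_h, temps, fit_window_h=6.0, horizon_h=3.0):
    """Fit a line to the last `fit_window_h` hours of data and
    extrapolate `horizon_h` hours past the last sample (sunset)."""
    t = np.asarray(times_h, dtype=float)
    y = np.asarray(temps, dtype=float)
    mask = t >= t[-1] - fit_window_h          # keep the trailing 6-hour window
    slope, intercept = np.polyfit(t[mask], y[mask], deg=1)
    return float(slope * (t[-1] + horizon_h) + intercept)
```

Because the slope is estimated from only 6 hours of noisy samples, small slope errors are amplified over the 3-hour extrapolation, consistent with this method's weak test performance.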
Method 3: Fourier Series (4 days)
Models the temperature as a Fourier series with 24-hour fundamental period and 3 harmonics,
fitting to the prior 4 days of data. Captures the diurnal temperature cycle naturally but
struggles with transient weather events and extrapolation instability.
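One way to fit such a series is ordinary least squares on a cosine/sine design matrix; this sketch assumes that formulation (the report may use a different fitting routine):

```python
import numpy as np

def fourier_forecast(times_h, temps, horizon_h=3.0, period_h=24.0, n_harmonics=3):
    """Fit a Fourier series (24 h fundamental plus harmonics) by least
    squares and evaluate it `horizon_h` hours past the last sample."""
    t = np.asarray(times_h, dtype=float)

    def design_row(tt):
        row = [np.ones_like(tt)]
        for k in range(1, n_harmonics + 1):
            w = 2.0 * np.pi * k / period_h
            row += [np.cos(w * tt), np.sin(w * tt)]
        return row

    A = np.column_stack(design_row(t))
    coef, *_ = np.linalg.lstsq(A, np.asarray(temps, dtype=float), rcond=None)
    t_pred = t[-1] + horizon_h
    return float(np.column_stack(design_row(np.array([t_pred]))) @ coef)
```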
Method 4: XGBoost with Engineered Features
Gradient-boosted decision tree ensemble trained on engineered statistical features:
mean, std, min, max, current value, 6-hour trend, and 6-hour mean. Robust to noise
and non-linear relationships, with good generalization via regularization.
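The feature vector listed above can be sketched as a plain NumPy function; the 24-samples-per-6-hours constant follows from the 15-minute sampling, while the exact feature ordering is an assumption. The resulting vectors would then be fed to a gradient-boosted regressor such as xgboost.XGBRegressor.

```python
import numpy as np

def engineered_features(temps, times_h, samples_per_6h=24):
    """Build the statistical feature vector described in the report:
    mean, std, min, max, current value, 6-hour trend, 6-hour mean.
    (Feature order is an assumption for illustration.)"""
    y = np.asarray(temps, dtype=float)
    t6 = np.asarray(times_h, dtype=float)[-samples_per_6h:]
    recent = y[-samples_per_6h:]
    slope = np.polyfit(t6, recent, deg=1)[0]   # 6-hour trend, °C per hour
    return np.array([y.mean(), y.std(), y.min(), y.max(),
                     y[-1], slope, recent.mean()])
```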
Method 5: NBEATSx-Ridge (Windowed Ridge Regression)
Inspired by the NBEATSx neural architecture, this approach uses a sliding window of the last
12 measurements (3 hours) as input features, plus time-of-day and basic statistics. A Ridge
regression model (L2 regularization, alpha=10.0) learns the mapping from windowed history
to sunset temperature. Fast inference, interpretable linear model.
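A minimal sketch of the windowed-Ridge idea, assuming the feature layout described above (12 lags, time of day, window mean and std); function names and the exact feature composition are illustrative, not the report's code:

```python
import numpy as np
from sklearn.linear_model import Ridge

def make_features(window, sunset_hour):
    """Features: last 12 samples (3 h at 15-min cadence), time of day,
    and basic window statistics. Layout is an assumption."""
    w = np.asarray(window, dtype=float)
    return np.concatenate([w, [sunset_hour, w.mean(), w.std()]])

def fit_ridge(X, y, alpha=10.0):
    """L2-regularized linear map from windowed history to sunset temperature."""
    return Ridge(alpha=alpha).fit(X, y)
```

The learned coefficients on the 12 lag features give a direct, interpretable picture of how much each point in the trailing 3-hour window contributes to the forecast.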
3. Results
Performance Comparison
| Method | MAE (°C) | RMSE (°C) | R² |
|---|---|---|---|
| 24-Hour Mean Baseline | 2.0273 | 2.6791 | 0.6852 |
| Linear Trend (6h) | 3.8790 | 4.1906 | 0.2297 |
| Fourier Series (4 days) | 3.6032 | 4.6689 | 0.0439 |
| XGBoost + Engineered Features | 1.1449 | 1.5062 | 0.9005 |
| NBEATSx-Ridge | 0.8945 | 1.1222 | 0.9448 |
Residual Analysis
Residual-plot diagnostics for the best-performing model (NBEATSx-Ridge):
- Actual Temps vs Prediction Errors: Prediction errors are much smaller
than the range of actual temperatures, indicating good predictive skill.
- Residuals vs Predicted: Residuals are randomly scattered around zero
with no clear trend, suggesting the model is unbiased and has captured the relationship well.
- Residuals vs Actual: No systematic pattern, confirming the model performs
equally well across the temperature range.
- Residual Distribution: Approximately normal distribution (slight positive
skew), consistent with well-behaved prediction errors.
- Q-Q Plot: Residuals follow the theoretical normal distribution closely,
with slight deviations in the tails indicating occasional larger errors.
- Residuals vs Time Order: No temporal autocorrelation or drift,
suggesting the model is stable and doesn't have systematic time-dependent biases.
4. Recommendation
Recommended Method: NBEATSx-Ridge
The NBEATSx-Ridge windowed regression achieves the best performance (MAE = 0.89°C, R² = 0.94)
and is recommended for operational use. Key advantages:
- Best accuracy: 22% lower MAE than XGBoost, the next-best method
- Fast inference: Single linear model, no iterative optimization
- Interpretable: Linear weights show importance of each time lag
- Robust: Residuals show no systematic bias or patterns
- Simple deployment: Minimal dependencies (NumPy, scikit-learn)
The typical 3-hour-ahead forecast error is less than 1°C, providing operationally useful
predictions for thermal control planning.
5. Key Insights
- Time series methods benefit from leveraging the full temporal history, not just a single
point 3 hours before sunset.
- Windowed features capture local patterns effectively without requiring explicit periodic
models (like Fourier series).
- XGBoost and NBEATSx-Ridge both outperform the classical time series methods (mean, trend,
Fourier), highlighting the value of engineered and windowed features.
- Linear extrapolation performs poorly due to short-term noise amplification when
extrapolating 3 hours ahead.
- Fourier series struggles because transient weather patterns break the assumption of
strict periodicity.
6. Data Provenance
- Source: Vera C. Rubin Observatory primary mirror thermal telemetry
- Period: June 1 – December 1, 2025
- Location: Cerro Pachón, Chile (lat -30.24°, lon -70.74°, elev 2715 m)
- Sampling: 15-minute intervals
- Training: Even days of month (90 days)
- Testing: Odd days of month (92 days)
- Code: ml_sunset_comparison_v2.py