Machine Learning for Sunset Temperature Prediction

Generated: 2026-02-08 11:09
Dataset: Rubin Observatory primary mirror temperature, Jun–Dec 2025
Prediction task: Forecast temperature at sunset using the full time history up to 3 hours before sunset

1. Introduction

Predicting the temperature of the Rubin Observatory primary mirror at sunset is critical for operational planning. Thermal gradients across the mirror surface can degrade image quality, so knowing the sunset temperature 3 hours in advance allows time to implement thermal control strategies.

This analysis compares five machine learning approaches for sunset temperature forecasting, using the complete time history of temperature measurements from the start of each day up to 3 hours before sunset. The training data (even days of the month) comprises 90 days, and the test data (odd days) comprises 92 days.
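The even/odd split can be sketched as follows. The calendar here is illustrative only: it lists every day from 1 June to 31 December 2025, which is more than the 90 + 92 days used in the analysis, so some days were presumably unavailable (an assumption; the report does not say). `split_by_day_parity` is a hypothetical helper name.

```python
from datetime import date, timedelta

def split_by_day_parity(days):
    """Even day-of-month goes to training, odd day-of-month to test."""
    train = [d for d in days if d.day % 2 == 0]
    test = [d for d in days if d.day % 2 == 1]
    return train, test

# Illustrative calendar: every day from 1 Jun to 31 Dec 2025 (not the real data).
start = date(2025, 6, 1)
n_days = (date(2025, 12, 31) - start).days + 1
all_days = [start + timedelta(days=i) for i in range(n_days)]

train_days, test_days = split_by_day_parity(all_days)
print(len(train_days), len(test_days))
```

Splitting on day parity interleaves training and test days across the whole season, so both sets see the same seasonal range of temperatures.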

2. Methods Tested

Method 1: 24-Hour Mean Baseline

Simple baseline that predicts sunset temperature as the mean of the last 24 hours of measurements. Provides a naive persistence-style forecast that assumes recent thermal behavior continues.
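A minimal sketch of this baseline, assuming timestamps are expressed in hours and that the forecast is issued at a cutoff 3 hours before sunset. The function name, argument names, and sample data are hypothetical.

```python
from statistics import fmean

def mean_baseline(times_h, temps_c, cutoff_h, window_h=24.0):
    """Forecast = mean of all samples in the window_h hours up to cutoff_h."""
    recent = [temp for hour, temp in zip(times_h, temps_c)
              if cutoff_h - window_h < hour <= cutoff_h]
    if not recent:
        raise ValueError("no samples inside the averaging window")
    return fmean(recent)

# Synthetic hourly data: 10 °C on the first day, 12 °C on the second.
times_h = [float(i) for i in range(48)]
temps_c = [10.0 if i < 24 else 12.0 for i in range(48)]

forecast = mean_baseline(times_h, temps_c, cutoff_h=47.0)
print(forecast)
```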

Method 2: Linear Trend Extrapolation (6 hours)

Fits a linear regression to the last 6 hours of temperature data and extrapolates forward to the sunset time. Captures short-term trends but is sensitive to noise and may extrapolate poorly beyond the fitting window.
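The fit-and-extrapolate step can be sketched with an ordinary least-squares line. This is an illustration under assumed conventions (times in hours, a cutoff 3 hours before the target time), not the code used in the analysis; all names are hypothetical.

```python
def linear_trend_forecast(times_h, temps_c, cutoff_h, target_h, window_h=6.0):
    """Fit a least-squares line to the last window_h hours and evaluate it at target_h."""
    pts = [(h, t) for h, t in zip(times_h, temps_c)
           if cutoff_h - window_h <= h <= cutoff_h]
    if len(pts) < 2:
        raise ValueError("need at least two samples to fit a line")
    n = len(pts)
    mean_h = sum(h for h, _ in pts) / n
    mean_t = sum(t for _, t in pts) / n
    sxx = sum((h - mean_h) ** 2 for h, _ in pts)
    if sxx == 0.0:
        raise ValueError("all samples share the same timestamp")
    sxy = sum((h - mean_h) * (t - mean_t) for h, t in pts)
    slope = sxy / sxx  # °C per hour
    return mean_t + slope * (target_h - mean_h)

# Synthetic data: 15-minute samples cooling at exactly 0.5 °C per hour.
times_h = [0.25 * i for i in range(49)]        # 0 h .. 12 h
temps_c = [15.0 - 0.5 * h for h in times_h]

forecast = linear_trend_forecast(times_h, temps_c, cutoff_h=12.0, target_h=15.0)
print(forecast)
```

On noise-free linear data the extrapolation is exact; on real data any error in the fitted slope is multiplied by the 3-hour lead time, which is one way to read the method's sensitivity to noise.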

Method 3: Fourier Series (4 days)

Models the temperature as a Fourier series with a 24-hour fundamental period and 3 harmonics, fitted to the prior 4 days of data. Captures the diurnal temperature cycle naturally but struggles with transient weather events and can be unstable when extrapolated.
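One way to fit such a series is linear least squares on a design matrix of a constant plus cosine and sine terms. The sketch below assumes that "3 harmonics" means frequencies of 1, 2, and 3 cycles per 24 hours; the function names and the synthetic signal are illustrative.

```python
import numpy as np

def fourier_design(times_h, n_harmonics=3, period_h=24.0):
    """Columns: constant, then cos and sin for each harmonic of the period."""
    t = np.asarray(times_h, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k / period_h
        cols.append(np.cos(w * t))
        cols.append(np.sin(w * t))
    return np.column_stack(cols)

def fourier_forecast(times_h, temps_c, target_h, n_harmonics=3, period_h=24.0):
    """Least-squares fit of the series, evaluated at target_h."""
    A = fourier_design(times_h, n_harmonics, period_h)
    coef, *_ = np.linalg.lstsq(A, np.asarray(temps_c, dtype=float), rcond=None)
    return float((fourier_design([target_h], n_harmonics, period_h) @ coef)[0])

def synthetic_temp(t):
    """Made-up diurnal signal used only for this example."""
    return 12.0 + 3.0 * np.sin(2 * np.pi * t / 24) + 0.5 * np.cos(4 * np.pi * t / 24)

times_h = np.arange(0.0, 96.0, 0.25)   # 4 days of 15-minute samples
forecast = fourier_forecast(times_h, synthetic_temp(times_h), target_h=99.0)
print(forecast, synthetic_temp(99.0))
```

A purely periodic signal is recovered almost exactly. A passing weather front is not periodic, so the fit absorbs it into the harmonic coefficients and repeats it in the forecast.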

Method 4: XGBoost with Engineered Features

Gradient-boosted decision tree ensemble trained on engineered statistical features: mean, std, min, max, current value, 6-hour trend, and 6-hour mean. Robust to noise and non-linear relationships, with good generalization via regularization.
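The feature list can be sketched as a function that reduces one day's history to a fixed set of numbers. The definitions below are assumptions: the 6-hour trend is taken to be a least-squares slope in °C per hour, and the dictionary keys are invented for illustration.

```python
import numpy as np

def engineered_features(times_h, temps_c, window_h=6.0):
    """Summary statistics of the history up to the forecast cutoff (the last sample)."""
    t = np.asarray(times_h, dtype=float)
    y = np.asarray(temps_c, dtype=float)
    recent = t >= t[-1] - window_h           # samples in the final window_h hours
    slope = np.polyfit(t[recent], y[recent], 1)[0]
    return {
        "mean": float(y.mean()),
        "std": float(y.std()),
        "min": float(y.min()),
        "max": float(y.max()),
        "current": float(y[-1]),
        "trend_6h": float(slope),
        "mean_6h": float(y[recent].mean()),
    }

# Synthetic history: 15-minute samples warming at 0.5 °C per hour for 12 hours.
times_h = np.arange(0.0, 12.25, 0.25)
temps_c = 10.0 + 0.5 * times_h
features = engineered_features(times_h, temps_c)
print(features)
```

Stacking one such row per training day gives a feature matrix that a gradient-boosted regressor, for example `xgboost.XGBRegressor`, could be fitted on. The hyperparameters used in the analysis are not given here.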

Method 5: NBEATSx-Ridge (Windowed Ridge Regression)

Inspired by the NBEATSx neural architecture, this approach uses a sliding window of the last 12 measurements (3 hours) as input features, plus time-of-day and basic statistics. A Ridge regression model (L2 regularization, alpha=10.0) learns the mapping from windowed history to sunset temperature. Fast inference, interpretable linear model.
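A rough sketch of the windowed ridge idea, with a closed-form NumPy solver standing in for whatever library the analysis used. The choice of "basic statistics" (window mean and standard deviation), the unpenalized intercept, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def window_features(temps_c, hour_of_day, n_lags=12):
    """Last n_lags samples, plus time of day and two window statistics."""
    w = np.asarray(temps_c[-n_lags:], dtype=float)
    return np.concatenate([w, [hour_of_day, w.mean(), w.std()]])

def fit_ridge(X, y, alpha=10.0):
    """Closed-form ridge: L2 penalty on the coefficients, not on the intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return coef, y_mean - x_mean @ coef

def predict_ridge(X, coef, intercept):
    return np.asarray(X, dtype=float) @ coef + intercept

# Synthetic training set: 90 "days", each with a 12-sample window.
rng = np.random.default_rng(0)
X = np.array([window_features(15.0 + rng.normal(0.0, 1.0, 12), rng.uniform(14.0, 16.0))
              for _ in range(90)])
y = X[:, 11] - 2.0 + rng.normal(0.0, 0.1, 90)   # target tied to the latest sample

coef, intercept = fit_ridge(X, y, alpha=10.0)
preds = predict_ridge(X, coef, intercept)
print(coef.shape, preds[:3])
```

The window mean is a linear combination of the 12 lagged values, so the unregularized problem would be rank-deficient. The L2 penalty keeps the solve well-posed, which is one practical reason to prefer ridge over plain least squares with overlapping features like these.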

3. Results

Performance Comparison

Model performance comparison
Method                          MAE (°C)   RMSE (°C)   R²
24-Hour Mean Baseline           2.0273     2.6791      0.6852
Linear Trend (6h)               3.8790     4.1906      0.2297
Fourier Series (4 days)         3.6032     4.6689      0.0439
XGBoost + Engineered Features   1.1449     1.5062      0.9005
NBEATSx-Ridge                   0.8945     1.1222      0.9448
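For reference, the reported metrics (MAE, RMSE, and the R² quoted in the recommendation) follow the standard definitions, sketched below on made-up numbers. The function name and sample values are illustrative and are not taken from the dataset.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired observations and predictions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Made-up sunset temperatures (°C) and forecasts.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 12.0, 13.0, 16.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)
print(mae, rmse, r2)
```

R² compares the model against always predicting the mean of the test-set temperatures, so a value near zero means the method does little better than that constant guess.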

Residual Analysis

Comprehensive diagnostics for the best-performing model (NBEATSx-Ridge):

[Figure: Residual analysis]

Residual plot interpretation:

4. Recommendation

Recommended Method: NBEATSx-Ridge

The NBEATSx-Ridge windowed regression achieves the best performance (MAE = 0.89°C, R² = 0.94) and is recommended for operational use. Key advantages:

The typical 3-hour-ahead forecast error is less than 1°C (MAE = 0.89°C, RMSE = 1.12°C), providing operationally useful predictions for thermal control planning.

5. Key Insights

6. Data Provenance