Generative AI for Scholarship

Harvard Data Science Initiative (HDSI) & Faculty of Arts and Sciences (FAS)

Exercise: Telescope Thermal Data Analysis

In this exercise you will use Claude Code to analyze real thermal data from the Vera C. Rubin Observatory primary mirror. The dataset contains temperature measurements at 15-minute intervals from June through December 2025, recorded at the summit in Chile.

Setup: Create Working Directory and Get the Data

First, create a subdirectory for this session's work and navigate to it:

cd ~/GAIsandbox
mkdir -p session3
cd session3

Now download the data file:

curl -O https://astrostubbs.github.io/GenAI-for-Scholarship/data/session3/rubin_mirror_temps.csv

The file contains about 17,600 rows with two columns:

Column        Description
timestamp     Date and time of the measurement (UTC), at 15-minute intervals
temperature   Mirror temperature (°C)

The Exercise

Start Claude Code in your ~/GAIsandbox/session3 directory:

claude

Work through the following steps. For each one, type the prompt (or your own version of it) into Claude Code and observe what it does.

Step 1: Explore the data

Try a prompt like:

"Look at the file rubin_mirror_temps.csv. Tell me the column names
and show me the first few lines of data. Are there any anomalies
in the data file?"

Notice how Claude Code reads the file, summarizes its structure, and flags any issues (missing values, gaps in the time series, etc.) without you writing any code.
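
For comparison, here is a minimal sketch of the kind of check Claude Code might run behind the scenes. It is not the code Claude Code will actually write. It assumes the two column names listed above and assumes the timestamps are ISO 8601 strings such as "2025-06-01 00:15:00"; adjust the parsing if the file uses a different format.

```python
import csv
from datetime import datetime, timedelta


def read_rows(path):
    """Return (timestamps, temps, n_missing) from the two-column CSV."""
    timestamps, temps, n_missing = [], [], 0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            value = (row["temperature"] or "").strip()
            if value == "" or value.lower() == "nan":
                # A dropped row also shows up later as a gap in time.
                n_missing += 1
                continue
            # Assumption: ISO 8601 timestamps, e.g. "2025-06-01 00:15:00".
            timestamps.append(datetime.fromisoformat(row["timestamp"].strip()))
            temps.append(float(value))
    return timestamps, temps, n_missing


def find_gaps(timestamps, expected=timedelta(minutes=15)):
    """Return (start, end) pairs where consecutive samples are too far apart."""
    return [(a, b) for a, b in zip(timestamps, timestamps[1:]) if b - a > expected]
```

Calling read_rows("rubin_mirror_temps.csv") and then find_gaps on the returned timestamps gives a quick list of the stretches where measurements are missing.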

Step 2: Plot the temperature time series

"Write a Python program to plot temperature vs. time from
rubin_mirror_temps.csv. Label the axes with units and give
the plot a descriptive title."

Claude Code will write a Python script, run it, and display the resulting plot. Look at the code it wrote — is it what you would have written?

Step 3: Add a histogram

"Add a histogram of the temperature values. Show it as a
separate figure."

This builds on the previous step. Notice how Claude Code remembers the context of what it has already done.

Step 4: Find the extremes

"When did the maximum and minimum temperatures occur?
Report the dates, times, and values."
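
If you want to check the answer by hand, the lookup is short. This is a sketch under the assumption that the timestamps and temperatures are already loaded into two parallel lists; the function name find_extremes is made up for illustration.

```python
from datetime import datetime


def find_extremes(timestamps, temps):
    """Return ((time_of_min, min_temp), (time_of_max, max_temp))."""
    pairs = list(zip(timestamps, temps))
    # On ties, min() and max() return the earliest matching sample.
    coldest = min(pairs, key=lambda p: p[1])
    warmest = max(pairs, key=lambda p: p[1])
    return coldest, warmest
```

Comparing this against what Claude Code reports is a quick way to confirm that it read the timestamps correctly.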

Step 5: Mark sunset on the plot

"Modify the plotting program to show only the most recent
week of data. For each day, estimate the time of sunset at
Cerro Pachon, Chile (latitude -30.24, longitude -70.74) and
mark it on the time series plot with a large red dot."

This is a more complex request — Claude Code needs to figure out how to compute sunset times (it will likely use an astronomy library like astropy or ephem), interpolate the temperature at that time, and add the markers to the plot. Watch how it breaks the problem into pieces.
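
To see what the pieces look like, here is a rough sketch that uses only the Python standard library. It is not what Claude Code will necessarily produce (a library such as astropy would be more accurate). It uses a standard low-order approximation for the solar declination and the equation of time, and it ignores the summit elevation and the local horizon, so expect errors of a few minutes. The function names are hypothetical, and the timestamps are assumed to be sorted, naive datetimes in UTC.

```python
import bisect
import math
from datetime import date, datetime, timedelta


def sunset_utc(day, lat_deg, lon_deg):
    """Approximate sunset as a naive UTC datetime; longitude is positive east."""
    n = day.timetuple().tm_yday
    gamma = 2.0 * math.pi * (n - 1) / 365.0  # fractional year in radians
    # Equation of time in minutes (low-order Fourier fit).
    eqtime = 229.18 * (0.000075 + 0.001868 * math.cos(gamma)
                       - 0.032077 * math.sin(gamma)
                       - 0.014615 * math.cos(2 * gamma)
                       - 0.040849 * math.sin(2 * gamma))
    # Solar declination in radians.
    decl = (0.006918 - 0.399912 * math.cos(gamma) + 0.070257 * math.sin(gamma)
            - 0.006758 * math.cos(2 * gamma) + 0.000907 * math.sin(2 * gamma)
            - 0.002697 * math.cos(3 * gamma) + 0.00148 * math.sin(3 * gamma))
    lat = math.radians(lat_deg)
    # Hour angle when the Sun is 90.833 degrees from the zenith
    # (this allows for refraction and the solar radius).
    cos_ha = (math.cos(math.radians(90.833)) / (math.cos(lat) * math.cos(decl))
              - math.tan(lat) * math.tan(decl))
    ha_deg = math.degrees(math.acos(cos_ha))  # fails during polar day or night
    minutes = 720.0 - 4.0 * (lon_deg - ha_deg) - eqtime
    return datetime(day.year, day.month, day.day) + timedelta(minutes=minutes)


def interpolate_temp(timestamps, temps, when):
    """Linearly interpolate the temperature at `when` (timestamps sorted)."""
    i = bisect.bisect_left(timestamps, when)
    if i < len(timestamps) and timestamps[i] == when:
        return temps[i]
    if i == 0 or i == len(timestamps):
        raise ValueError("requested time is outside the measured range")
    t0, t1 = timestamps[i - 1], timestamps[i]
    frac = (when - t0) / (t1 - t0)  # timedelta / timedelta gives a float
    return temps[i - 1] + frac * (temps[i] - temps[i - 1])
```

With the coordinates from the prompt, sunset_utc(date(2025, 12, 21), -30.24, -70.74) gives a time late in the 23:00 UTC hour, which is a useful sanity check on whatever Claude Code computes.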

Step 6: Fourier analysis

"Take the Fourier transform of the temperature time series
and report the dominant periods, in days. Plot the power
spectrum."

Look at the periods Claude Code reports as dominant. Do they make physical sense? What period would you expect to see most prominently for outdoor temperature data? Is the answer you got consistent with that expectation?
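
For reference, a bare-bones version of this analysis might look like the sketch below. It assumes NumPy is installed and that the series is evenly sampled every 15 minutes with no gaps, which may not hold for the real file. It subtracts the mean and does nothing else to the data, so it is a starting point for the discussion in Step 7 and not a recommended method.

```python
import numpy as np


def dominant_periods(temps, dt_days=15 / 1440, top=3):
    """Return the `top` strongest (period_in_days, power) pairs.

    Assumption: samples are evenly spaced by `dt_days` with no gaps.
    """
    x = np.asarray(temps, dtype=float)
    x = x - x.mean()                             # remove the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2          # one-sided power spectrum
    freqs = np.fft.rfftfreq(x.size, d=dt_days)   # cycles per day
    # Skip bin 0 (zero frequency), then sort the rest by descending power.
    order = np.argsort(power[1:])[::-1][:top] + 1
    return [(float(1.0 / freqs[i]), float(power[i])) for i in order]
```

A pure once-per-day sinusoid sampled for a whole number of days returns a single dominant period of 1 day. Real data with gaps or a slow drift will behave differently.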

Step 7: Think critically about the result

Pause here. Look at the dominant periods Claude Code reported. Are we asking the right question of the data? Discuss with your neighbors — what would you expect to see, and does the answer match? If not, think about why, and then ask Claude Code a better question.

Step 8: Practice resuming your session

Claude Code automatically saves your conversation history in the ~/.claude/ directory, organized by the working directory where you ran it. Let's test this.

First, quit Claude Code by typing /exit or pressing Ctrl-D. Then restart it:

claude -r

You'll see a list of your recent sessions. Select the one for ~/GAIsandbox/session3. The session resumes with full context — you can continue asking questions about the data, request modifications to the plots, or reference anything from earlier in the conversation. Try asking:

"What was the maximum temperature we found earlier?"

Claude Code remembers. This makes it easy to work in stages — do some analysis, take a break, come back later and continue where you left off.

Step 9: Generate a lab notebook entry

Good research practice means documenting what you did. Ask Claude Code to write up the session as a self-contained HTML lab notebook:

"Create a NEW HTML file called lab_notebook.html that documents
the analysis we just did. Include:
- A title with today's date
- A summary of each analysis step and what we found
- All the plots, embedded directly in the HTML as base64 images
- Links to each Python source file used
- A section at the end listing the data file and its provenance

The HTML should be self-contained — openable in any browser
with no external dependencies."

Open lab_notebook.html in your browser to see the result. This is a reproducible record of your analysis session.
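
The "embedded as base64" requirement is what makes the file self-contained. A minimal sketch of the idea, using only the standard library, is shown here. The function name png_to_img_tag is hypothetical, and Claude Code may build the HTML differently.

```python
import base64
import html


def png_to_img_tag(png_bytes, alt_text):
    """Return an <img> tag with the PNG inlined as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    alt = html.escape(alt_text, quote=True)
    return f'<img alt="{alt}" src="data:image/png;base64,{encoded}">'
```

Reading a plot with open(path, "rb") and passing the bytes to this function gives a tag that can be pasted into any HTML page, with no separate image file to keep track of.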

Using Plan Mode for Complex Tasks

For complex, multi-step tasks, Claude Code offers Plan Mode. In Plan Mode, Claude explores your codebase and designs an implementation approach first, presenting you with a detailed plan before executing anything. This is especially useful for tasks that require:

Multiple implementation steps
Exploring trade-offs between approaches
Installing new dependencies or libraries
Significant changes to existing code

The Plan Mode Workflow

1. Enter Plan Mode: Press Shift+Tab until the mode indicator in the status area shows Plan Mode (or type /plan). Shift+Tab cycles through the available modes, so it may take more than one press.
2. Submit your prompt: Describe what you want to accomplish. Claude will explore the relevant files, think through the approach, and present a detailed plan with a step-by-step implementation strategy.
3. Review and approve: Read the plan carefully. If it looks good, approve it. If you want changes, you can discuss modifications while still in Plan Mode.
4. Exit Plan Mode to execute: Press Shift+Tab again to cycle out of Plan Mode. Claude will then execute the approved plan, writing code, installing packages, running tests, etc.

Think of Plan Mode as the "design" phase and normal mode as the "implementation" phase. This two-phase workflow prevents Claude from diving into implementation before you've agreed on the approach.

Step 10 (Advanced): Machine Learning Prediction

Now let's use Plan Mode for a challenging task: predict the temperature at sunset three hours ahead of time, and compare several machine learning methods for doing so.

Here's the prompt to use in Plan Mode:

I want you to act as an expert in machine learning methods. The goal is
to predict the temperature at sunset on any given day, using the entire
time history of temperature measurements up to 3 hours before sunset.

For each day:
- Input: all temperature measurements from the start of that day up to
  3 hours before sunset
- Target: the actual temperature at sunset

Break the data file into 2 sections: use even days of the month for
training, and odd days of the month for testing and validation.

The metric to optimize is MAE (mean absolute error) between predicted
and actual sunset temperature.

First, suggest several approaches that might work well for this problem,
considering that we have rich time series history as input. Then
implement and compare at least 3 different methods. Make a quantitative
performance comparison and give me a recommendation on what I should use.

In Plan Mode, Claude will explore the data, think through the approach, and present you with a detailed plan covering steps like:

Computing sunset times for all dates in the dataset
Extracting the temperature measurements up to 3 hours before sunset as features
Splitting data into even/odd days for train/test
Implementing multiple ML models (Random Forest, Prophet, etc.)
Training and evaluating each on the test set
Comparing performance metrics

Review the plan. If it looks good, approve it and exit Plan Mode (press Shift+Tab again). Claude will then execute the plan: installing packages, writing code, training models, and producing comparison plots and metrics.

This demonstrates the full power of an agentic coding assistant on a real research problem — planning, implementing, and evaluating a solution end-to-end.
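
It helps to have a simple baseline in mind before looking at the model comparison. The sketch below shows the even/odd split and the MAE metric from the prompt, plus a deliberately naive baseline that is not one of the methods Claude will propose: take the last temperature measured before the 3-hour cutoff and add the average change to sunset seen on the training days. All function names are hypothetical.

```python
from datetime import date


def split_even_odd(days):
    """Even days of the month go to training, odd days to testing."""
    train = [d for d in days if d.day % 2 == 0]
    test = [d for d in days if d.day % 2 == 1]
    return train, test


def mean_absolute_error(predicted, actual):
    """Mean of |predicted - actual| over paired values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def fit_offset(last_temps, sunset_temps):
    """Average change from the last pre-cutoff reading to sunset (training days)."""
    return sum(s - l for l, s in zip(last_temps, sunset_temps)) / len(last_temps)


def predict_with_offset(last_temps, offset):
    """Baseline prediction: last pre-cutoff reading plus the fitted offset."""
    return [t + offset for t in last_temps]
```

Any model in the comparison should beat this baseline on the test days. If it does not, the extra complexity is not buying anything.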

Step 11: Create a Summary Report

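After running the ML comparison, you should have PNG files with the results. In this example they are named ml_comparison_v2.png (model performance comparison) and ml_residuals_v2.png (comprehensive residual analysis). If Claude Code chose different file names in your session, substitute them in the prompt below. Now create a professional HTML report to document your findings.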

Ask Claude Code:

"Create a NEW HTML file called ml_summary.html that presents the machine
learning comparison results. This should be a separate report from the
lab notebook. Include:

1. Title: 'Machine Learning for Sunset Temperature Prediction'
2. Introduction: Brief description of the prediction problem and why it matters
3. Methods: Summarize each of the ML approaches tested and what they do
4. Results: Embed both PNG files (ml_comparison_v2.png and ml_residuals_v2.png)
5. Performance table: Show MAE, RMSE, and R² for each method
6. Recommendation: Which method should be used and why?
7. Interpretation: What do the residual plots tell us about model quality?

The HTML should be self-contained with embedded images and professional
styling."

This creates a polished summary suitable for sharing with collaborators or including in research documentation. Notice how Claude Code can take the raw analysis results and transform them into a publication-ready format.
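
When you read the performance table, it helps to know exactly what the other two metrics measure. The sketch below gives the standard definitions of RMSE and R² in plain Python. It is for checking the report's numbers by hand and is not taken from the code Claude Code generates.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error: penalizes large misses more than MAE does."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(predicted, actual):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

An R² near 1 means the model explains most of the day-to-day variation in sunset temperature. A value near 0 means it does no better than always predicting the mean.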

Session Management

Claude Code saves all sessions locally in ~/.claude/, organized by the directory where you ran it. This means:

claude -c — continues the most recent session in the current directory
claude -r — shows a picker to resume any previous session
Sessions include full conversation history and context
You can work in stages across multiple sessions

The session files themselves are stored under ~/.claude/projects/, in one subdirectory per working directory, as JSONL files (the exact layout may change between versions). The intended way to access them is through claude -c and claude -r.

Best Practice: Organize Your Work by Directory

Since Claude Code sessions are tied to the working directory, use your directory structure to organize different projects and analyses. For example:

~/GAIsandbox/
  session1/             # Session 1 work
  session2/             # Session 2 work
  session3/             # This session - thermal analysis
  my-project/           # Your own research project

When you navigate to a project directory and run claude -c, you automatically resume in the context of that project. This keeps different analyses separate and makes it easy to switch between them. Think of each directory as a separate "lab notebook" with its own conversation history. For this exercise, all work will be in ~/GAIsandbox/session3/, keeping it separate from other sessions.

Useful Claude Code Commands

Claude Code has several built-in commands (slash commands) that are useful during your session. Type them at the prompt:

Command        Description
/help          Show help and available commands
/permissions   View and manage tool permissions (Bash, Edit, Write, etc.)
/cost          Show API cost for the current session
/model         Show or change which model you're using (opus, sonnet, haiku)
/plan          Enter Plan Mode (same as Shift+Tab)
/exit          Exit Claude Code (same as Ctrl-D)
/clear         Clear the conversation history
/tasks         Show task list if you're using task tracking

The /permissions command is particularly useful if Claude Code gets blocked on certain operations. You can grant permissions for specific tools or commands during your session.

Check Your API Usage

After completing the exercises, it's helpful to see how much API usage they consumed. In Claude Code, type:

/cost

This will show you the total API cost for your current session, broken down by model usage and token counts. This helps you understand the typical costs of using Claude Code for research tasks and plan your budget accordingly.

For Instructors

Reference implementations, teaching notes, and solutions are available on a separate page:

View Instructor Solutions

Post-Session Survey

Please take a moment to share your feedback — it helps us improve future sessions.

Take the Post-Session Survey