
Generative AI for Scholarship

Harvard Data Science Initiative (HDSI) & Faculty of Arts and Sciences (FAS)

Week 2 — The AI-Empowered Coder

Friday, February 27, 2026 · 4:00–5:30 pm · Northwest Building, Room B103

This session focuses on using AI for interactive data analysis with Python notebooks — the kind of exploratory, iterative work that researchers do every day: loading data, writing analysis code, making plots, and debugging along the way.

What We'll Cover

Note on scope: Tools like VS Code and GitHub Copilot are designed for software production — building applications, managing large codebases, and writing production-quality code. This session is about something different: using AI to help with interactive data analysis in notebooks, which is how most researchers explore data, prototype analyses, and generate results.

Two Architectures for AI-Assisted Notebooks

This session covers two different ways to use AI while writing Python code. Understanding the architecture helps you choose the right tool for the job.

Architecture 1: Google Colab + Gemini — everything runs in the cloud

In Exercise 1, your browser connects to Google Colab, which runs your code on Google's servers. Gemini AI is built in — everything happens in the cloud. Your laptop is just the display. Colab with Gemini is free through your Harvard Google account.

Architecture 2: Local Jupyter + AI via Harvard HUIT proxy

In Exercise 2, JupyterLab runs on your own machine, so your code and data stay local. When you ask the AI for help, the query goes through Harvard's secure HUIT proxy to an LLM (Claude, Gemini, or others) via AWS Bedrock, and the response comes back the same way. The same HUIT API key works for all models available through Bedrock. Your data is protected by Harvard's enterprise agreement. API calls through HUIT are not free — they are billed to your PI's HUIT account, so coordinate with your advisor.
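To make the routing concrete, here is a minimal sketch of the kind of chat request that flows through the proxy to Bedrock. The field names follow the Anthropic Messages API shape; the model ID is illustrative, and the actual endpoint URL and authentication header are covered in the Exercise 2 setup guides, so they are omitted here.

```python
import json

def build_chat_request(prompt, model="anthropic.claude-3-5-sonnet-20240620-v1:0",
                       max_tokens=1024):
    """Anthropic-Messages-style payload; the HUIT proxy forwards requests
    like this to AWS Bedrock and relays the model's reply back to you.
    The model ID above is illustrative -- use one listed in the setup guides."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("Explain this pandas error: KeyError: 'flux'")
print(json.dumps(payload, indent=2))
```

Because every Bedrock model is reached through the same proxy, switching models is just a change of the `model` string; your HUIT API key stays the same.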

Prerequisites

Before the session, please complete the setup steps below.

Setting Up for Session 2

Follow these steps to get the demonstration notebooks ready:

How to download: Control-click (Mac) or right-click (Windows) each button and select "Download Linked File" or "Save Link As..."

- Notebook 1: AI in Colab
- Notebook 2: File I/O with Google Drive
- Notebook 3: Data Browsing
- Data file: sdss_photometry.csv

  1. Download all three notebooks and the data file using the buttons above.
  2. Upload the notebooks to your Google Drive.
  3. Upload the data file (for Notebook 3) to Google Drive as well.
  4. Open the notebooks in Colab.

Session Exercises

This session covers two ways to integrate AI with Python programming:

Exercise 1: Colab with Gemini

Google Colab has built-in Gemini integration, allowing you to get AI assistance while writing code. The three notebooks listed above walk you through progressively more involved workflows.

Note: Each notebook includes detailed markdown cells that walk you through every step. Follow along during the session, or work through them at your own pace afterward.

Exercise 2: Local Jupyter Notebook with AI Integration

API Key for This Exercise
The key will be posted shortly before the session and will work until midnight on the day of the session.
Retrieve API Key (Harvard Key required)

While Colab is convenient, many researchers prefer running Jupyter notebooks locally on their laptops for better control, offline access, and integration with local files. You can set up an AI chat assistant (Jupyternaut) directly inside JupyterLab, powered by Claude through Harvard's HUIT Bedrock proxy.

Mac Setup Guide Windows Setup Guide
Step-by-step instructions for configuring Jupyter AI with Harvard's API endpoint

The setup guides (Mac | Windows) cover installation, environment configuration, Jupyternaut settings, available models, and troubleshooting.

Try it out: Once Jupyternaut is configured and responding, copy and paste this prompt into the Jupyternaut chat:

Can you make me a new notebook that simulates a damped harmonic oscillator? I want dynamic plots and adjustable parameters with slider controls.

Review the code Jupyternaut generates. Does it run? Do the sliders work? This is a good test of whether the AI can produce a complete, functional notebook from a single natural-language request.
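For reference, the physics at the core of such a notebook fits in a few lines. This sketch uses the closed-form underdamped solution rather than a numerical ODE solver; in a notebook, the AI would typically wrap the function in `ipywidgets.interact(...)` for sliders and plot with matplotlib, both omitted here to keep the sketch dependency-light.

```python
import numpy as np

def damped_oscillator(t, amplitude=1.0, gamma=0.2, omega=2.0, phase=0.0):
    """Underdamped solution x(t) = A * exp(-gamma * t) * cos(omega * t + phase).

    In a notebook you would expose amplitude, gamma, and omega as slider
    controls (e.g. via ipywidgets.interact) and plot x against t.
    """
    return amplitude * np.exp(-gamma * t) * np.cos(omega * t + phase)

t = np.linspace(0.0, 20.0, 1001)
x = damped_oscillator(t)
print(x[0])  # displacement at t = 0 equals the amplitude
```

If the AI's generated notebook disagrees with this closed form for the same parameters, that is a useful debugging conversation to have with it.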

Next prompt: Now try this one, using the same data file from Exercise 1:

Now make another notebook that loads in and does some sensible analysis on sdss_photometry.csv

How does the AI decide what "sensible analysis" means for an astronomy dataset it hasn't seen before? Does it figure out what the columns are? What plots and statistics does it choose?
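For SDSS photometry, "sensible analysis" usually starts with summary statistics on the magnitude columns and a color index. The sketch below uses a synthetic stand-in DataFrame because the real column names in sdss_photometry.csv are not shown here; the `g` and `r` band names are assumptions, so adapt them to whatever the AI finds in the actual file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for sdss_photometry.csv (column names are assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "g": rng.normal(18.0, 1.0, 500),  # g-band magnitude
    "r": rng.normal(17.5, 1.0, 500),  # r-band magnitude
})

# A typical first step: summary statistics, then a g-r color index,
# a standard diagnostic for separating stellar populations.
print(df.describe())
df["g_r"] = df["g"] - df["r"]
print(df["g_r"].mean())
```

Comparing the AI's choices against a baseline like this makes it easier to judge whether it actually understood the dataset.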

Advantages of local notebooks with AI: your code and data stay on your machine, you can work with local files directly, and you keep full control of your Python environment.

Cost Management Reminder

Using Harvard's API endpoint bills charges to your PI's HUIT account. Set monthly spending limits when registering your API key, monitor usage regularly, and coordinate with your advisor about appropriate spending.
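A back-of-envelope estimate helps when setting those limits. The per-token prices below are placeholders, not Harvard's actual rates; check current Bedrock pricing and your HUIT billing dashboard before relying on any numbers.

```python
# PLACEHOLDER prices (USD per 1,000 tokens) -- assumptions for illustration
# only; look up the real rates for your chosen model.
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015

def estimate_cost(input_tokens, output_tokens):
    """Rough cost of one or more API calls, given token counts."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# e.g. a long chat session: ~200k tokens in, ~50k tokens out
print(f"${estimate_cost(200_000, 50_000):.2f}")  # → $1.35 at the assumed rates
```

Note that chat assistants resend conversation history with each turn, so input tokens accumulate faster than you might expect.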

Looking Ahead: Session 3 — Claude Code CLI

Architecture 3: Claude Code CLI — AI agent at the OS level

Next week, we'll explore Claude Code, a command-line AI agent that operates at the operating system level — not inside a notebook. It can read and write files anywhere on your machine, run shell commands, execute Python scripts, use git, and manage entire projects autonomously. Like the local Jupyter setup, it routes AI queries through Harvard's secure HUIT proxy.

Claude Code represents a more autonomous approach to AI-assisted programming — you describe what you want to accomplish, and it plans and executes the entire workflow.

No setup required before Session 3 — we'll walk through installation together during that session.

Post-Session Survey

Please take a moment to share your feedback — it helps us improve future sessions.

Take the Post-Session Survey