
Generative AI for Scholarship

Harvard Data Science Initiative (HDSI) & Faculty of Arts and Sciences (FAS)

Week 2 — The AI-Empowered Coder

Friday, February 27, 2026 · 4:00–5:30 pm · Northwest Building, Room B103

This session focuses on using AI for interactive data analysis with Python notebooks — the kind of exploratory, iterative work that researchers do every day: loading data, writing analysis code, making plots, and debugging along the way.

What We'll Cover

Note on scope: Tools like VS Code and GitHub Copilot are designed for software production — building applications, managing large codebases, and writing production-quality code. This session is about something different: using AI to help with interactive data analysis in notebooks, which is how most researchers explore data, prototype analyses, and generate results.

Two Architectures for AI-Assisted Notebooks

This session covers two different ways to use AI while writing Python code. Understanding the architecture helps you choose the right tool for the job.

Architecture 1: Google Colab + Gemini — everything runs in the cloud

In Exercise 1, your browser connects to Google Colab, which runs your code on Google's servers. Gemini AI is built in — everything happens in the cloud. Your laptop is just the display. Colab with Gemini is free through your Harvard Google account.

Architecture 2: Local Jupyter + AI via Harvard HUIT proxy

In Exercise 2, JupyterLab runs on your own machine, so your code and data stay local. When you ask the AI for help, the query goes through Harvard's secure HUIT proxy to an LLM (Claude, Gemini, or others) via AWS Bedrock, and the response comes back the same way. The same HUIT API key works for all models available through Bedrock. Your data is protected by Harvard's enterprise agreement. API calls through HUIT are not free — they are billed to your PI's HUIT account, so coordinate with your advisor.
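To make the routing concrete, here is a minimal sketch of the kind of chat request that flows through the proxy to Bedrock. The field names follow the Anthropic Messages API shape; the model ID is illustrative, and the actual endpoint URL and authentication header are covered in the Exercise 2 setup guides, so they are omitted here.

```python
import json

def build_chat_request(prompt, model="anthropic.claude-3-5-sonnet-20240620-v1:0",
                       max_tokens=1024):
    """Anthropic-Messages-style payload; the HUIT proxy forwards requests
    like this to AWS Bedrock and relays the model's reply back to you.
    The model ID above is illustrative -- use one listed in the setup guides."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("Explain this pandas error: KeyError: 'flux'")
print(json.dumps(payload, indent=2))
```

Because every Bedrock model is reached through the same proxy, switching models is just a change of the `model` string; your HUIT API key stays the same.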

Prerequisites

Before the session, please complete the setup steps below.

Setting Up for Session 2

Follow these steps to get the demonstration notebooks ready:

How to download: Control-click (Mac) or right-click (Windows) each button and select "Download Linked File" or "Save Link As..."

- Notebook 1: AI in Colab
- Notebook 2: File I/O with Google Drive
- Notebook 3: Data Browsing
- Data file: sdss_photometry.csv

  1. Download all three notebooks and the data file using the buttons above.
  2. Upload the notebooks to your Google Drive.
  3. Upload the data file (for Notebook 3) to Google Drive as well.
  4. Open the notebooks in Colab.

Session Exercises

This session covers two ways to integrate AI with Python programming:

Exercise 1: Colab with Gemini

Google Colab has built-in Gemini integration, allowing you to get AI assistance while writing code. The three notebooks listed above walk you through progressively more involved workflows.

Note: Each notebook includes detailed markdown cells that walk you through every step. Follow along during the session, or work through them at your own pace afterward.

Exercise 2: Local Jupyter Notebook with AI Integration

API Key for This Exercise
The key will be posted shortly before the session and will work until midnight on the day of the session.
Retrieve API Key (Harvard Key required)

While Colab is convenient, many researchers prefer running Jupyter notebooks locally on their laptops for better control, offline access, and integration with local files. You can set up an AI chat assistant (Jupyternaut) directly inside JupyterLab, powered by Claude through Harvard's HUIT Bedrock proxy.

Mac Setup Guide Windows Setup Guide
Step-by-step instructions for configuring Jupyter AI with Harvard's API endpoint

The setup guides (Mac | Windows) cover installation, environment configuration, Jupyternaut settings, available models, and troubleshooting.

Try it out: Once Jupyternaut is configured and responding, copy and paste this prompt into the Jupyternaut chat:

Can you make me a new notebook that simulates a damped harmonic oscillator? I want dynamic plots and adjustable parameters with slider controls.

Review the code Jupyternaut generates. Does it run? Do the sliders work? This is a good test of whether the AI can produce a complete, functional notebook from a single natural-language request.
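For reference, the physics at the core of such a notebook fits in a few lines. This sketch uses the closed-form underdamped solution rather than a numerical ODE solver; in a notebook, the AI would typically wrap the function in `ipywidgets.interact(...)` for sliders and plot with matplotlib, both omitted here to keep the sketch dependency-light.

```python
import numpy as np

def damped_oscillator(t, amplitude=1.0, gamma=0.2, omega=2.0, phase=0.0):
    """Underdamped solution x(t) = A * exp(-gamma * t) * cos(omega * t + phase).

    In a notebook you would expose amplitude, gamma, and omega as slider
    controls (e.g. via ipywidgets.interact) and plot x against t.
    """
    return amplitude * np.exp(-gamma * t) * np.cos(omega * t + phase)

t = np.linspace(0.0, 20.0, 1001)
x = damped_oscillator(t)
print(x[0])  # displacement at t = 0 equals the amplitude
```

If the AI's generated notebook disagrees with this closed form for the same parameters, that is a useful debugging conversation to have with it.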

Next prompt: Now try this one, using the same data file from Exercise 1:

Now make another notebook that loads in and does some sensible analysis on sdss_photometry.csv

How does the AI decide what "sensible analysis" means for an astronomy dataset it hasn't seen before? Does it figure out what the columns are? What plots and statistics does it choose?
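For SDSS photometry, "sensible analysis" usually starts with summary statistics on the magnitude columns and a color index. The sketch below uses a synthetic stand-in DataFrame because the real column names in sdss_photometry.csv are not shown here; the `g` and `r` band names are assumptions, so adapt them to whatever the AI finds in the actual file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for sdss_photometry.csv (column names are assumptions).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "g": rng.normal(18.0, 1.0, 500),  # g-band magnitude
    "r": rng.normal(17.5, 1.0, 500),  # r-band magnitude
})

# A typical first step: summary statistics, then a g-r color index,
# a standard diagnostic for separating stellar populations.
print(df.describe())
df["g_r"] = df["g"] - df["r"]
print(df["g_r"].mean())
```

Comparing the AI's choices against a baseline like this makes it easier to judge whether it actually understood the dataset.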

Advantages of local notebooks with AI: your code and data stay on your machine, you can work with local files directly, and you keep full control of your Python environment.

Cost Management Reminder

Using Harvard's API endpoint bills charges to your PI's HUIT account. Set monthly spending limits when registering your API key, monitor usage regularly, and coordinate with your advisor about appropriate spending.
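A back-of-envelope estimate helps when setting those limits. The per-token prices below are placeholders, not Harvard's actual rates; check current Bedrock pricing and your HUIT billing dashboard before relying on any numbers.

```python
# PLACEHOLDER prices (USD per 1,000 tokens) -- assumptions for illustration
# only; look up the real rates for your chosen model.
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015

def estimate_cost(input_tokens, output_tokens):
    """Rough cost of one or more API calls, given token counts."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# e.g. a long chat session: ~200k tokens in, ~50k tokens out
print(f"${estimate_cost(200_000, 50_000):.2f}")  # → $1.35 at the assumed rates
```

Note that chat assistants resend conversation history with each turn, so input tokens accumulate faster than you might expect.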

Looking Ahead: Session 3 — Claude Code CLI

Architecture 3: Claude Code CLI — AI agent at the OS level

Next week, we'll explore Claude Code, a command-line AI agent that operates at the operating system level — not inside a notebook. It can read and write files anywhere on your machine, run shell commands, execute Python scripts, use git, and manage entire projects autonomously. Like the local Jupyter setup, it routes AI queries through Harvard's secure HUIT proxy.

Claude Code represents a more autonomous approach to AI-assisted programming — you describe what you want to accomplish, and it plans and executes the entire workflow.

No setup required before Session 3 — we'll walk through installation together during that session.

Post-Session Survey

Please take a moment to share your feedback — it helps us improve future sessions.

Take the Post-Session Survey