Rosa Del Mar

Daily Brief

Issue 75 2026-03-16

Agent-Assisted Data Work Packaged As End-To-End Curriculum

General
Sources: 1 • Confidence: High • Updated: 2026-04-12 10:16

Key takeaways

  • A three-hour workshop titled "Coding agents for data analysis", aimed at data journalists, was delivered at NICAR 2026.
  • Workshop participants collectively spent $23 worth of Codex tokens.
  • A workshop workflow configured Datasette to serve static content from a visualization folder and used Claude Code to iteratively create interactive visualizations directly in that folder.
  • The workshop demonstrated using Claude Code and OpenAI Codex to explore, analyze, and clean data.
  • The handout covers setup for Claude Code and Codex, database questioning, data exploration and cleaning, visualization creation, and agent-assisted scraping.

Sections

Agent-Assisted Data Work Packaged As End-To-End Curriculum

  • A three-hour workshop titled "Coding agents for data analysis", aimed at data journalists, was delivered at NICAR 2026.
  • The workshop demonstrated using Claude Code and OpenAI Codex to explore, analyze, and clean data.
  • The handout covers setup for Claude Code and Codex, database questioning, data exploration and cleaning, visualization creation, and agent-assisted scraping.
  • Workshop exercises used Python and SQLite, and some exercises used Datasette.
  • The handout was designed to be useful beyond in-person attendees, and the author expects it to apply beyond data journalism to general data exploration.
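The "database questioning" exercises described above center on plain Python and SQLite. A minimal sketch of that exercise style follows; the `trees` table, its columns, and the sample data are illustrative assumptions, not taken from the workshop materials.

```python
# Sketch of the "ask questions of a database" exercise pattern:
# plain Python + SQLite (table name and data are hypothetical).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trees (id INTEGER PRIMARY KEY, species TEXT);
    INSERT INTO trees (species) VALUES ('oak'), ('oak'), ('maple');
""")

# In the workshop, an agent would generate a query like this
# from a natural-language question such as "which species is most common?"
rows = conn.execute(
    "SELECT species, COUNT(*) AS n FROM trees GROUP BY species ORDER BY n DESC"
).fetchall()
print(rows)  # → [('oak', 2), ('maple', 1)]
```

The same database can also be browsed interactively by pointing Datasette at a SQLite file, which is how some exercises were run.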

Controlled Rollout Pattern For Hands-On Agent Tooling (Environment + Spend Governance)

  • Workshop participants collectively spent $23 worth of Codex tokens.
  • The workshop used GitHub Codespaces to distribute a budget-restricted OpenAI Codex API key to attendees.
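One way the distribution step could work: Codespaces exposes repository or organization secrets as environment variables inside every attendee's container, so exercise code only needs to read the variable. The sketch below assumes a secret named `OPENAI_API_KEY`; the brief does not specify the actual secret name or enforcement details.

```python
# Hypothetical helper for the spend-governance pattern: the instructor adds
# a budget-restricted key as a Codespaces secret, and Codespaces injects it
# as an environment variable in each attendee's container.
import os

def workshop_key() -> str:
    """Return the budget-restricted API key, or fail with a clear message."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; ask the instructor")
    # Spend is capped server-side by the key's budget limit, not by this code.
    return key
```

Because the cap lives on the key itself, attendees can run agents freely while the total cohort spend stays bounded (here, $23 across the whole workshop).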

Agent-In-The-Loop Visualization Iteration Within A Running Data App

  • A workshop workflow configured Datasette to serve static content from a visualization folder and used Claude Code to iteratively create interactive visualizations directly in that folder.
  • Claude Code generated a heat map visualization for a trees database using Leaflet and Leaflet.heat.
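The workflow above amounts to: query point data out of SQLite, write a self-contained Leaflet + Leaflet.heat page into the folder Datasette serves as static content, and reload in the browser. A minimal sketch, assuming a `trees` table with `latitude`/`longitude` columns (the actual schema and the agent's generated code are not given in the source):

```python
# Sketch of the agent-in-the-loop visualization pattern: read coordinates
# from a SQLite trees database and emit a Leaflet.heat page into the
# static folder Datasette is configured to serve (schema is assumed).
import json
import sqlite3
from pathlib import Path

def write_heatmap(db_path: str, out_dir: str = "static") -> Path:
    conn = sqlite3.connect(db_path)
    points = conn.execute(
        "SELECT latitude, longitude FROM trees "
        "WHERE latitude IS NOT NULL AND longitude IS NOT NULL"
    ).fetchall()
    conn.close()
    html = f"""<!DOCTYPE html>
<html><head>
<link rel="stylesheet" href="https://unpkg.com/leaflet@1.9.4/dist/leaflet.css">
<script src="https://unpkg.com/leaflet@1.9.4/dist/leaflet.js"></script>
<script src="https://unpkg.com/leaflet.heat/dist/leaflet-heat.js"></script>
<style>#map {{ height: 100vh; }}</style>
</head><body><div id="map"></div>
<script>
const points = {json.dumps(points)};
const map = L.map("map").setView(points[0] || [0, 0], 12);
L.tileLayer("https://tile.openstreetmap.org/{{z}}/{{x}}/{{y}}.png").addTo(map);
L.heatLayer(points, {{radius: 20}}).addTo(map);
</script></body></html>"""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / "heatmap.html"
    path.write_text(html)
    return path
```

An agent iterating "directly in that folder" just rewrites `heatmap.html` on each round of feedback while Datasette keeps serving it.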

Unknowns

  • How many participants attended, and what was the per-attendee distribution of token spend across exercises?
  • What objective outcomes were observed (task completion time, error rates, learning gains, or quality of produced analyses/visualizations) compared to a non-agent baseline?
  • How often did agent outputs require manual correction, and what kinds of failures (logic errors, data misinterpretation, security issues, dependency problems) occurred during the exercises?
  • What were the specific parameters of the budget-restricted API key (limits, enforcement behavior, participant friction), and did the restriction affect the learning experience?
  • Is there evidence of reuse or adoption of the handout beyond the original workshop (downloads, forks, citations, or follow-on trainings)?

Investor overlay

Read-throughs

  • Early indicator of demand for agent-assisted training products that package end-to-end data workflows; reuse beyond the workshop would support vendors offering coding agents, developer tools, or data journalism tooling.
  • Operational pattern suggests a pathway for enterprises or educators to deploy agent tooling with spend governance using hosted environments and budget-restricted API keys, implying potential demand for cost controls and admin features.
  • Agent-in-the-loop visualization iteration inside a running data app hints at workflow acceleration for analytics and visualization creation, a potential usage driver for tools that integrate agents with lightweight data stacks such as SQLite and Datasette.

What would confirm

  • Evidence of adoption beyond the original workshop, such as downloads, forks, citations, or follow-on trainings using the handout and workflow.
  • Measured outcomes versus a non-agent baseline, including task completion time, error rates, learning gains, or quality of analyses and visualizations produced.
  • Token spend distribution and cohort size, plus details on budget-restricted API key limits and enforcement, showing the spend governance pattern is scalable without harming the learning experience.

What would kill

  • Low reuse of the handout and workflow beyond the initial workshop, indicating limited transferability or insufficient perceived value.
  • Frequent need for manual correction of agent outputs, with recurring failures such as logic errors, data misinterpretation, security issues, or dependency problems during exercises.
  • Budget-restricted API keys causing significant participant friction or blocking progress, suggesting governance constraints reduce usability and weaken the case for broader rollout.

Sources

  1. 2026-03-16 simonwillison.net