I wrote a post about a BI tool called Codatum a while back, but lately I've been relying even more heavily on Claude Code as my local analysis agent.
Anyone who does data analysis knows these small but persistent frustrations:
- Forgetting when, in which notebook, and under what conditions a particular analysis was done
- Digging through old files every time you need to write a similar query
- The expressive limits of BI tool charts. Requests like "put this axis on a log scale and overlay an average line" or "add a text annotation here" constantly run into tool constraints — and clicking through UI controls is tedious in its own right
- As you drill down from analysis A to B to C, it's hard to capture the hypotheses and thinking along the way, and even harder to reconstruct the path afterward in a structured form
- The friction of moving assets — queries, charts, tables, insights — between tools when pulling together a report
Some of these pain points were still open issues with Codatum, so I decided to build my own end-to-end environment using Claude Code.
Here's a walkthrough of the analysis repository I'm actually using, with the specific patterns that make it work.
Repository Structure
The analysis repo is organized so that assets can be reused across products.
.claude/commands/          Skills (shared across all products)
    analysis.md
    retrospect.md
    notion-export.md
lib/                       Shared libraries (chart.py, etc.)
products/                  Per-product assets
    {product-a}/
        CLAUDE.md          BQ project config
        context/           Domain knowledge (business.md / metrics.md)
        queries/           Human-reviewed SQL templates
        schema/            Table schema definitions
        reports/           Per-analysis sessions
            _template/
            YYYY-MM-DD_TOPIC/
                CLAUDE.md
                queries/
                report.ipynb
                output/
    {product-b}/           (Same structure, different BQ project)
The three Skills in .claude/commands/ are generalized to work for any product's analysis. Calling /analysis {product-a} spins up the environment for one product; /analysis {product-b} switches to the other. The context/, queries/, schema/, and reports/ directories are isolated per product, so knowledge, queries, and schemas never get mixed.
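To make the per-product isolation concrete, here is a minimal sketch of how a product name could be resolved to its own set of directories. The names `ProductPaths` and `resolve_product` are hypothetical and not taken from the actual repo; only the directory layout comes from the structure described above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProductPaths:
    """Locations of one product's isolated assets (hypothetical helper)."""

    root: Path

    @property
    def config(self) -> Path:
        return self.root / "CLAUDE.md"

    @property
    def context(self) -> Path:
        return self.root / "context"

    @property
    def queries(self) -> Path:
        return self.root / "queries"

    @property
    def schema(self) -> Path:
        return self.root / "schema"

    @property
    def reports(self) -> Path:
        return self.root / "reports"


def resolve_product(repo_root: Path, product: str) -> ProductPaths:
    """Return the asset paths for `product`, or raise if it has no directory."""
    root = repo_root / "products" / product
    if not root.is_dir():
        known = sorted(p.name for p in (repo_root / "products").glob("*") if p.is_dir())
        raise ValueError(f"unknown product {product!r}; known products: {known}")
    return ProductPaths(root)
```

Because every path is derived from a single product root, a query or context file from one product cannot be picked up by accident while working on another.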
chart.py in lib/ is a shared visualization helper used across all products, with Japanese fonts and color schemes already configured.
Automating Analysis with Skills — and Building Up Feedback
Here are the key Skills I have configured in this repo.
/analysis
This skill handles the full setup for an analysis session. It automatically walks through 8 steps, including data quality checks, creating a new session folder, surfacing past related sessions, displaying available base queries from queries/, fetching BigQuery schemas, and running a QA step in which a separate agent independently validates the queries. After I call /analysis, all I have to do is write down the goal of the session.
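Two of these steps, creating the session folder and surfacing related past sessions, are plain file operations. The sketch below shows one way they might look, assuming sessions are created by copying reports/_template/ and that "related" simply means the folder name contains a keyword. Both function names are hypothetical, and the real skill is a prompt file rather than Python.

```python
import shutil
from datetime import date
from pathlib import Path


def create_session(reports_dir: Path, topic: str, today: date) -> Path:
    """Copy reports/_template/ to reports/YYYY-MM-DD_TOPIC/ and return the new path."""
    session = reports_dir / f"{today.isoformat()}_{topic}"
    if session.exists():
        raise FileExistsError(f"session already exists: {session}")
    shutil.copytree(reports_dir / "_template", session)
    return session


def related_sessions(reports_dir: Path, keyword: str) -> list[str]:
    """List past session folders whose name contains `keyword`, newest first."""
    names = [
        p.name
        for p in reports_dir.iterdir()
        if p.is_dir() and p.name != "_template" and keyword.lower() in p.name.lower()
    ]
    # ISO dates sort lexicographically, so a reverse sort puts the newest first.
    return sorted(names, reverse=True)
```

Putting the ISO date at the front of the folder name is what makes the "newest first" ordering a one-line sort.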
/notion-export
Once analysis is done, /notion-export exports the report to Notion. It reads the product tag and document tag from products/{product-a}/CLAUDE.md, calls lib/notion_export.py, and automatically creates a page in the Notion database. The whole process — converting the notebook to HTML via jupyter nbconvert and posting it — is handled by a single skill invocation. No more context-switching between tools.
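The conversion half of that export can be sketched as a thin wrapper around the `jupyter nbconvert` CLI. This is an illustration only: the function names are hypothetical, the actual contents of lib/notion_export.py are not shown in this post, and the Notion upload step is deliberately left out.

```python
import subprocess
from pathlib import Path


def nbconvert_command(notebook: Path, output_dir: Path) -> list[str]:
    """Build the nbconvert call that renders `notebook` as HTML into `output_dir`."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",
        "--output-dir", str(output_dir),
        str(notebook),
    ]


def export_html(notebook: Path, output_dir: Path) -> Path:
    """Run nbconvert and return the path of the HTML file it writes."""
    subprocess.run(nbconvert_command(notebook, output_dir), check=True)
    # By default nbconvert names the output after the notebook file.
    return output_dir / f"{notebook.stem}.html"
```

For a session folder, `export_html(Path("report.ipynb"), Path("output"))` would leave the rendered page at output/report.html, ready for whatever upload step follows.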
/retrospect
Calling this skill after finishing an analysis prompts Claude to categorize the session's findings into four buckets:
- Business knowledge (gotchas, data characteristics) → products/{product-a}/context/business.md
- Metrics and calculation logic → products/{product-a}/context/metrics.md
- Reusable SQL → saved as products/{product-a}/queries/*.sql
- Workflow improvements → reflected in .claude/commands/analysis.md
For each item, Claude presents a multiple-choice Q&A: "apply as-is / apply with edits / recategorize / skip." That's all I need to decide.
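The four buckets and four possible answers amount to a small routing table. Here is a sketch of that logic, assuming each finding carries a category label. `Finding`, `route`, and the category keys are hypothetical names; only the destinations and the four decisions come from the description above.

```python
from dataclasses import dataclass
from typing import Optional

# Destination per bucket, as listed above; {product} is filled in per product.
DESTINATIONS = {
    "business": "products/{product}/context/business.md",
    "metrics": "products/{product}/context/metrics.md",
    "sql": "products/{product}/queries/",
    "workflow": ".claude/commands/analysis.md",
}


@dataclass
class Finding:
    category: str
    text: str


def route(
    finding: Finding,
    product: str,
    decision: str,
    edited_text: Optional[str] = None,
    new_category: Optional[str] = None,
) -> Optional[tuple[str, str]]:
    """Return (destination, text) for a reviewed finding, or None when skipped."""
    if decision == "skip":
        return None
    category, text = finding.category, finding.text
    if decision == "apply with edits":
        if edited_text is None:
            raise ValueError("'apply with edits' needs edited_text")
        text = edited_text
    elif decision == "recategorize":
        if new_category is None:
            raise ValueError("'recategorize' needs new_category")
        category = new_category
    elif decision != "apply as-is":
        raise ValueError(f"unknown decision: {decision!r}")
    return DESTINATIONS[category].format(product=product), text
```

Keeping the human decision as an explicit argument mirrors the design: Claude proposes the category, but nothing is written anywhere until one of the four answers has been chosen.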
If I notice "this exclusion condition is always needed," it gets automatically included in data_quality_check.sql from the next session onward. The system is built so that Claude Code gets smarter with every analysis I run.
The sections below go into what each directory and skill is actually doing under the hood.
Stacking Analysis History into CLAUDE.md
"Which session covered what, and what exclusion conditions did we apply?" — this is the information that's easiest to forget. In this repo, I create a reports/YYYY-MM-DD_TOPIC/CLAUDE.md for each analysis and keep it updated with the goal, tables used, findings, and notes. Claude Code reads this file at the start of a session and can immediately pick up from where things left off. No more wondering "where did I write those data quality exclusion conditions for that analysis?"
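As an illustration, a session CLAUDE.md with those four parts could be rendered as below. The section layout and the function name `render_session_notes` are assumptions; the post only specifies that the file holds the goal, tables used, findings, and notes.

```python
def _bullets(items: list[str]) -> list[str]:
    """Render items as a bullet list, with a placeholder when the list is empty."""
    return [f"- {item}" for item in items] or ["(none yet)"]


def render_session_notes(
    goal: str, tables: list[str], findings: list[str], notes: list[str]
) -> str:
    """Build the text of a per-session CLAUDE.md (assumed layout)."""
    sections = [
        ("Goal", [goal]),
        ("Tables used", _bullets(tables)),
        ("Findings", _bullets(findings)),
        ("Notes", _bullets(notes)),
    ]
    blocks = ["\n".join([f"# {title}", *body]) for title, body in sections]
    return "\n\n".join(blocks) + "\n"
```

Exclusion conditions fit naturally under Notes, which is what lets a later session find them without searching through old notebooks.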
Building Up Human-Reviewed Queries in queries/
The inefficiency of writing queries from scratch every time is solved by the products/{product-a}/queries/ directory. This is where I accumulate human-reviewed SQL templates — things like base_data.sql (the master query that every analysis starts from) and stockout_avoid_rate.sql (the logic for calculating stockout avoidance rate) — all vetted for accuracy and reusability. By writing explicitly in products/{product-a}/CLAUDE.md to "reference queries/*.sql as the starting point for analysis," Claude Code picks these up without being told to each time. I've also made it a rule to always run data_quality_check.sql before stockout avoidance rate analysis, automating data quality assurance as well.
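The "quality check first" rule can be expressed as a small ordering function over the reviewed templates. This is a sketch under two assumptions: the templates are plain .sql files in queries/, and the rule is generalized to every analysis query, whereas the post only states it for the stockout avoidance rate analysis. `load_templates` and `run_order` are hypothetical names.

```python
from pathlib import Path

QUALITY_CHECK = "data_quality_check.sql"


def load_templates(queries_dir: Path) -> dict[str, str]:
    """Read every reviewed *.sql template, keyed by file name."""
    return {
        p.name: p.read_text(encoding="utf-8")
        for p in sorted(queries_dir.glob("*.sql"))
    }


def run_order(templates: dict[str, str], target: str) -> list[str]:
    """Return the file names to run for `target`, with the quality check first."""
    if target not in templates:
        raise KeyError(f"no reviewed template named {target!r}")
    if QUALITY_CHECK not in templates:
        raise KeyError(f"{QUALITY_CHECK} is required before any analysis query")
    if target == QUALITY_CHECK:
        return [QUALITY_CHECK]
    return [QUALITY_CHECK, target]
```

With the three files mentioned above in queries/, `run_order(templates, "stockout_avoid_rate.sql")` yields the quality check followed by the stockout query.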
Writing Charts as Code
Drag-and-drop in BI tools hits expressive limits quickly. In this repo, I use lib/chart.py as a shared visualization helper, with fonts and styles already standardized, and have Claude write the Python code. When I say "add an average line to this chart and switch the x-axis to monthly," Claude immediately produces the code for a change that would be tedious or impossible to make through a BI tool's UI.
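For a sense of what such a helper might contain, here is a minimal matplotlib sketch covering the requests mentioned earlier: a log-scale axis, an average line, and a text annotation. The function `line_chart_with_average` is hypothetical and is not the actual lib/chart.py; font and color configuration is omitted.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt


def line_chart_with_average(x, y, title, log_y=False, note=None):
    """Plot y over x, overlay the mean as a dashed line, optionally annotate the peak."""
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(x, y, marker="o", label="value")

    # Overlay the arithmetic mean as a horizontal dashed line.
    mean = sum(y) / len(y)
    ax.axhline(mean, linestyle="--", color="gray", label=f"average ({mean:.1f})")

    if log_y:
        ax.set_yscale("log")

    if note is not None:
        # Place the note just above the highest point.
        peak = max(range(len(y)), key=lambda i: y[i])
        ax.annotate(note, xy=(x[peak], y[peak]), xytext=(0, 12),
                    textcoords="offset points", ha="center")

    ax.set_title(title)
    ax.legend()
    return fig, ax
```

Calling `fig.savefig("output/chart.png")` on the returned figure writes the chart into the session's output/ folder. Each of the three tweaks is a single line of code, which is the point of the comparison with UI controls.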
Managing Drill-Down Chains with a Series CLAUDE.md
The problem of "I drilled down from analysis A to B to C, but I can't reconstruct why I went to B afterward" is solved with a CLAUDE.md at the series level. For example, the "Shonai - Stockout Avoidance Rate" series has four sub-analyses, but the parent directory's CLAUDE.md consolidates the reasoning for each sub-analysis, shared filters, and an executive summary of key findings. Even when reviewing weeks later, a single document lets me trace the entire hypothesis chain that led to C.
I also like how the structure grows as you drill down — each step becomes its own folder: 01_overview/, 02_daily_discount_exclusion/, 03_yogurt_noodle_sku/. If I want to know why we narrowed in on yogurt noodle SKUs, I just open 03_'s CLAUDE.md. The analysis path is literally visible as a directory tree.
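Because the path is encoded in numbered folders, reconstructing it is a short directory walk. The sketch below assumes each step folder has a zero-padded numeric prefix and that the first non-empty line of its CLAUDE.md states why the step exists. That second convention, and the name `drilldown_chain`, are assumptions for illustration.

```python
import re
from pathlib import Path

# Matches step folders such as 01_overview or 03_yogurt_noodle_sku.
STEP_DIR = re.compile(r"^\d+_.+$")


def drilldown_chain(series_dir: Path) -> list[tuple[str, str]]:
    """Return (step folder, first non-empty line of its CLAUDE.md) in drill-down order."""
    chain = []
    # Zero-padded prefixes mean a plain sort follows the drill-down order.
    for step in sorted(series_dir.iterdir()):
        if not (step.is_dir() and STEP_DIR.match(step.name)):
            continue
        notes = step / "CLAUDE.md"
        reason = ""
        if notes.is_file():
            lines = notes.read_text(encoding="utf-8").splitlines()
            reason = next((ln.strip() for ln in lines if ln.strip()), "")
        chain.append((step.name, reason))
    return chain
```

Printing the result for a series gives a one-screen outline of the hypothesis chain, one line per step.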
Centralizing Queries, Charts, and Insights in a Notebook
The workflow of "run the SQL, move the results to a BI tool, build a chart, paste it into a doc" means switching tools at every step, and a Jupyter Notebook eliminates it entirely. report.ipynb holds the SQL code, the resulting charts, and the analysis narrative all in one place. There's no need to hunt for "which query produced this chart?" because one notebook contains every asset for the analysis. Exporting to HTML via jupyter nbconvert and pushing to Notion is also scripted, so the cost of sharing is close to zero.
How a New Analysis Actually Flows
Here's how all of this comes together in practice with a concrete example.
Say the question is: "Store A's stockout rate got worse than last month — I want to understand why." Calling /analysis is all it takes to kick off this entire flow, and sharing with the team via Notion is just as painless.
I've now reached a point where I can confidently hand off the bulk of my day-to-day analysis work to an autonomous agent — and honestly, because of that, I'm running more analyses than ever.






