Turning Claude Code into Your Ultimate Data Analyst

I wrote a post about a BI tool called Codatum a while back, but lately I've been relying even more heavily on Claude Code as my local analysis agent.

Anyone who does data analysis knows these small but persistent frustrations:

Forgetting when, in which notebook, and under what conditions a particular analysis was done
Digging through old files every time you need to write a similar query
The expressive limits of BI tool charts: requests like "put this axis on a log scale and overlay an average line" or "add a text annotation here" constantly run into tool constraints, and clicking through UI controls is tedious in its own right
As you drill down from analysis A to B to C, it's hard to capture the hypotheses and thinking along the way, and even harder to reconstruct the path afterward in a structured form
The friction of moving assets — queries, charts, tables, insights — between tools when pulling together a report

Some of these pain points were still open issues with Codatum, so I decided to build my own end-to-end environment using Claude Code.

Here's a walkthrough of the analysis repository I'm actually using, with the specific patterns that make it work.

Repository Structure

The analysis repo is organized so that assets can be reused across products.

.claude/commands/         Skills (shared across all products)
  analysis.md
  retrospect.md
  notion-export.md
lib/                      Shared libraries (chart.py, etc.)
products/                 Per-product assets
  {product-a}/
    CLAUDE.md             BQ project config
    context/              Domain knowledge (business.md / metrics.md)
    queries/              Human-reviewed SQL templates
    schema/               Table schema definitions
    reports/              Per-analysis sessions
      _template/
      YYYY-MM-DD_TOPIC/
        CLAUDE.md
        queries/
        report.ipynb
        output/
  {product-b}/            (Same structure, different BQ project)

The three Skills in .claude/commands/ are generalized to work for any product's analysis. Calling /analysis {product-a} spins up the environment for one product; /analysis {product-b} switches to the other. The context/, queries/, schema/, and reports/ directories are isolated per product, so knowledge, queries, and schemas never get mixed.

chart.py in lib/ is a shared visualization helper used across all products, with Japanese fonts and color schemes already configured.

Automating Analysis with Skills — and Building Up Feedback

Here are the key Skills I have configured in this repo.

/analysis

This skill handles the full setup for an analysis session. It automatically walks through 8 steps, including data quality checks, creating a new session folder, surfacing past related sessions, displaying available base queries from queries/, fetching BigQuery schemas, and running a QA step in which a separate agent independently validates the queries. After I call /analysis, all I have to do is write down the goal of the session.
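The folder-creation and past-session steps might look something like the sketch below. It is only an illustration: the post does not show what the skill actually runs, so the function names (session_name, create_session, related_sessions), the slug rule, and the keyword match on folder names are all assumptions.

```python
import re
import shutil
from datetime import date
from pathlib import Path


def session_name(topic: str, day: date) -> str:
    """Build a folder name in the YYYY-MM-DD_TOPIC form used under reports/."""
    # Assumed slug rule: collapse every run of non-word characters into "_".
    slug = re.sub(r"\W+", "_", topic).strip("_")
    return f"{day.isoformat()}_{slug}"


def create_session(product_dir: Path, topic: str, day: date) -> Path:
    """Copy reports/_template into a new dated session folder."""
    reports = product_dir / "reports"
    target = reports / session_name(topic, day)
    # copytree raises FileExistsError if the session folder already exists.
    shutil.copytree(reports / "_template", target)
    return target


def related_sessions(product_dir: Path, keyword: str) -> list[str]:
    """List past session folders whose name mentions the keyword, oldest first."""
    reports = product_dir / "reports"
    return sorted(
        p.name
        for p in reports.iterdir()
        if p.is_dir() and p.name != "_template" and keyword.lower() in p.name.lower()
    )
```

Because the folder names start with an ISO date, a plain sort is enough to return matches in chronological order.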

/notion-export

Once analysis is done, /notion-export exports the report to Notion. It reads the product tag and document tag from products/{product-a}/CLAUDE.md, calls lib/notion_export.py, and automatically creates a page in the Notion database. The whole process — converting the notebook to HTML via jupyter nbconvert and posting it — is handled by a single skill invocation. No more context-switching between tools.
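As a rough sketch of the conversion half of that skill, the helper below only builds the nbconvert command line. The function name nbconvert_command is hypothetical; --to html and --output-dir are standard nbconvert options. The Notion upload is left out because the post does not show what lib/notion_export.py does internally.

```python
from pathlib import Path


def nbconvert_command(session_dir: Path) -> list[str]:
    """Command that renders report.ipynb to HTML inside the session's output/ folder."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",
        "--output-dir", str(session_dir / "output"),
        str(session_dir / "report.ipynb"),
    ]
```

The list can be passed to subprocess.run(..., check=True), so a failed conversion stops the export before anything is posted.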

/retrospect

Calling this skill after finishing an analysis prompts Claude to categorize the session's findings into four buckets:

Business knowledge (gotchas, data characteristics) → products/{product-a}/context/business.md
Metrics and calculation logic → products/{product-a}/context/metrics.md
Reusable SQL → saved as products/{product-a}/queries/*.sql
Workflow improvements → reflected in .claude/commands/analysis.md

For each item, Claude presents a multiple-choice Q&A: "apply as-is / apply with edits / recategorize / skip." That's all I need to decide.

If I notice "this exclusion condition is always needed," it gets added to data_quality_check.sql and is applied automatically from the next session onward. The system is built so that the context Claude Code works from gets richer with every analysis I run.
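The routing behind the four buckets and the four answers could be sketched as below. This is an assumption about the shape of the logic, not the skill's actual contents: the Finding class, the route function, the short category keys, and the sample finding are all made up for illustration. Only the destination paths come from the list above.

```python
from dataclasses import dataclass
from typing import Optional

# Destinations from the four buckets; {product} is filled in per call.
DESTINATIONS = {
    "business": "products/{product}/context/business.md",
    "metrics": "products/{product}/context/metrics.md",
    "sql": "products/{product}/queries/",
    "workflow": ".claude/commands/analysis.md",
}


@dataclass
class Finding:
    category: str  # one of the DESTINATIONS keys
    text: str


def route(
    finding: Finding,
    product: str,
    decision: str,
    new_category: Optional[str] = None,
    edited_text: Optional[str] = None,
) -> Optional[tuple[str, str]]:
    """Return (destination, text) for a reviewed finding, or None when skipped."""
    if decision == "skip":
        return None
    category, text = finding.category, finding.text
    if decision == "recategorize":
        if new_category is None:
            raise ValueError("recategorize needs new_category")
        category = new_category
    elif decision == "edit":
        if edited_text is None:
            raise ValueError("edit needs edited_text")
        text = edited_text
    elif decision != "apply":
        raise ValueError(f"unknown decision: {decision}")
    return DESTINATIONS[category].format(product=product), text


# Hypothetical finding, routed as-is to the business knowledge file.
finding = Finding("business", "Exclude test stores from stockout counts")
print(route(finding, "product-a", "apply"))
```

Keeping the decision handling separate from the file writes makes it easy to review the proposed destination before anything is changed on disk.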

The sections below go into what each directory and skill is actually doing under the hood.

Stacking Analysis History into CLAUDE.md

"Which session covered what, and what exclusion conditions did we apply?" — this is the information that's easiest to forget. In this repo, I create a reports/YYYY-MM-DD_TOPIC/CLAUDE.md for each analysis and keep it updated with the goal, tables used, findings, and notes. Claude Code reads this file at the start of a session and can immediately pick up from where things left off. No more wondering "where did I write those data quality exclusion conditions for that analysis?"

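A per-session CLAUDE.md with those four parts could be generated from something like the following. The section headings, the function name render_session_notes, and the "(none yet)" placeholder are assumptions; the post names only the kinds of information kept in the file (goal, tables used, findings, notes).

```python
def render_session_notes(goal: str, tables: list[str], findings: list[str], notes: list[str]) -> str:
    """Render the text of a session-level CLAUDE.md with four fixed sections."""

    def bullets(items: list[str]) -> str:
        # An empty section stays visible so it is obvious what still needs filling in.
        return "\n".join(f"- {item}" for item in items) if items else "- (none yet)"

    return (
        f"# Goal\n{goal}\n\n"
        f"# Tables used\n{bullets(tables)}\n\n"
        f"# Findings\n{bullets(findings)}\n\n"
        f"# Notes\n{bullets(notes)}\n"
    )
```

Rewriting the file from structured values on each update keeps the sections in a fixed order, which makes later sessions easier to skim.
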
Building Up Human-Reviewed Queries in `queries/`

The inefficiency of writing queries from scratch every time is solved by the products/{product-a}/queries/ directory. This is where I accumulate human-reviewed SQL templates — things like base_data.sql (the master query that every analysis starts from) and stockout_avoid_rate.sql (the logic for calculating stockout avoidance rate) — all vetted for accuracy and reusability. By writing explicitly in products/{product-a}/CLAUDE.md to "reference queries/*.sql as the starting point for analysis," Claude Code picks these up without being told to each time. I've also made it a rule to always run data_quality_check.sql before stockout avoidance rate analysis, automating data quality assurance as well.
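To make the "templates first, quality check first" rule concrete, here is a small sketch. The helper names load_query and run_order are hypothetical, and in the actual repo the rule is stated in CLAUDE.md for Claude to follow rather than enforced by code like this.

```python
from pathlib import Path

QUALITY_CHECK = "data_quality_check"


def load_query(queries_dir: Path, name: str) -> str:
    """Read a reviewed template such as base_data.sql from queries/."""
    path = queries_dir / f"{name}.sql"
    if not path.is_file():
        available = sorted(p.stem for p in queries_dir.glob("*.sql"))
        raise FileNotFoundError(f"{name}.sql not found; available: {available}")
    return path.read_text(encoding="utf-8")


def run_order(requested: list[str]) -> list[str]:
    """Put the data quality check first, exactly once, ahead of the requested queries."""
    rest = [name for name in requested if name != QUALITY_CHECK]
    return [QUALITY_CHECK, *rest]
```

For example, run_order(["base_data", "stockout_avoid_rate"]) puts data_quality_check ahead of the two requested templates.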

Writing Charts as Code

Drag-and-drop in BI tools hits expressive limits quickly. In this repo, I use lib/chart.py as a shared visualization helper, with fonts and styles already standardized, and have Claude write the Python code. When I say "add an average line to this chart and switch the x-axis to monthly," Claude immediately produces the code for it, a change that is slow or impossible to make through a BI tool's UI.
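The data side of a request like that is small once it is code. The sketch below covers only the two calculations involved (monthly buckets and the height of the average line) using the standard library; the plotting call itself is omitted because the post does not show the API of lib/chart.py. The function names and the sample values are illustrative.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_means(daily: dict[date, float]) -> dict[str, float]:
    """Collapse daily values into one mean per YYYY-MM bucket, in date order."""
    buckets: dict[str, list[float]] = defaultdict(list)
    for day in sorted(daily):
        buckets[day.strftime("%Y-%m")].append(daily[day])
    return {month: mean(values) for month, values in buckets.items()}


def average_line(series: dict[str, float]) -> float:
    """Height of the horizontal reference line: the mean of the plotted points."""
    return mean(series.values())


# Illustrative values only.
daily = {date(2025, 1, 1): 0.5, date(2025, 1, 2): 0.75, date(2025, 2, 1): 0.25}
monthly = monthly_means(daily)
print(monthly, average_line(monthly))
```

Note that this average is taken over the monthly points rather than over the raw daily values; which one is wanted is exactly the kind of detail that is easier to state in code than in a chart UI.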

Managing Drill-Down Chains with a Series CLAUDE.md

The problem of "I drilled down from analysis A to B to C, but I can't reconstruct why I went to B afterward" is solved with a CLAUDE.md at the series level. For example, the "Shonai - Stockout Avoidance Rate" series has four sub-analyses, but the parent directory's CLAUDE.md consolidates the reasoning for each sub-analysis, shared filters, and an executive summary of key findings. Even when reviewing weeks later, a single document lets me trace the entire hypothesis chain that led to C.

I also like how the structure grows as you drill down — each step becomes its own folder: 01_overview/, 02_daily_discount_exclusion/, 03_yogurt_noodle_sku/. If I want to know why we narrowed in on yogurt noodle SKUs, I just open 03_'s CLAUDE.md. The analysis path is literally visible as a directory tree.
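Because each step is a numbered folder, the chain can also be read back programmatically. A minimal sketch, assuming each step folder follows the NN_name pattern and that the first line of its CLAUDE.md states why the step exists (both are assumptions, as is the function name drilldown_chain):

```python
import re
from pathlib import Path

STEP_PATTERN = re.compile(r"^(\d+)_(.+)$")


def drilldown_chain(series_dir: Path) -> list[tuple[int, str, str]]:
    """Return (step number, step name, first line of its CLAUDE.md) in drill-down order."""
    chain = []
    for child in series_dir.iterdir():
        match = STEP_PATTERN.match(child.name)
        if not (child.is_dir() and match):
            continue  # skips the series-level CLAUDE.md and any unnumbered folders
        notes = child / "CLAUDE.md"
        lines = notes.read_text(encoding="utf-8").splitlines() if notes.is_file() else []
        summary = lines[0] if lines else ""
        chain.append((int(match.group(1)), match.group(2), summary))
    return sorted(chain)
```

Sorting on the parsed number rather than the folder name keeps the order correct even if the zero padding is inconsistent.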

Centralizing Queries, Charts, and Insights in a Notebook

"Run the SQL, move the results to a BI tool, build a chart, paste it into a doc": that multi-step, tool-switching workflow is eliminated by Jupyter Notebook. report.ipynb holds the SQL code, the resulting charts, and the analysis narrative all in one place. There's no need to hunt for "which query produced this chart?" because one notebook contains every asset for the analysis. Exporting to HTML via jupyter nbconvert and pushing to Notion is also scripted, so the cost of sharing is nearly zero.

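For a sense of how little is needed to keep narrative, query, and chart code in one file, a notebook is just JSON. The sketch below assembles a minimal nbformat 4 document with the standard library; the cell layout and the names build_report and dump_report are assumptions, not the contents of the author's _template.

```python
import json


def build_report(sql: str, narrative: str) -> dict:
    """Assemble a minimal nbformat 4 notebook: narrative, the query, then a chart cell."""

    def markdown(text: str) -> dict:
        return {"cell_type": "markdown", "metadata": {}, "source": text}

    def code(text: str) -> dict:
        return {
            "cell_type": "code",
            "metadata": {},
            "source": text,
            "execution_count": None,
            "outputs": [],
        }

    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        # repr() keeps the SQL a valid Python string literal whatever quotes it contains.
        "cells": [markdown(narrative), code(f"QUERY = {sql!r}"), code("# chart code goes here")],
    }


def dump_report(notebook: dict) -> str:
    """Serialize the notebook so it can be written to report.ipynb."""
    return json.dumps(notebook, indent=1, ensure_ascii=False)
```

Writing the result of dump_report to report.ipynb gives a file that Jupyter and nbconvert can open, with the query and the chart code that uses it sitting next to each other.
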
How a New Analysis Actually Flows

Here's how all of this comes together in practice with a concrete example.

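Say the question is: "Store A's stockout rate got worse than last month, and I want to understand why." I call /analysis with the product name, and the skill runs the setup described above: data quality checks, a new session folder, related past sessions, the base queries in queries/, and the BigQuery schemas. I write down the goal, work through the analysis with Claude in report.ipynb, and run /retrospect to feed what I learned back into the repo. Sharing with the team is just as painless: one /notion-export call.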

I've now reached a point where I can confidently hand off the bulk of my day-to-day analysis work to an autonomous agent — and honestly, because of that, I'm running more analyses than ever.

Author

矢本真丈 (Masatake Yamoto)

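Father of three. Born in 1987 in Hirosaki, Aomori Prefecture; now lives in Osaka Prefecture. Completed a master's degree in applied chemistry at Tohoku University. While in graduate school, he was affected by the Great East Japan Earthquake of March 11, 2011. After graduating, he worked on resource investment at Marubeni, on the Innovation Tohoku project with Google at the general incorporated association RCF, and on launching an e-commerce service for mothers at Smarby Inc. (now Stripe International), among other roles, before founding 10X, Inc. in June 2017, where he serves as Representative Director.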
