The Box in the Closet

Imagine a shoebox. Not a tidy one -- the kind you find at the back of a closet after moving apartments. Inside it: napkin sketches, Post-it notes, torn-out magazine pages, a few Polaroids, and a receipt with something important scrawled on the back. You know there is a brilliant idea somewhere in that box. You wrote it down once, maybe in 2022. But finding it means dumping everything onto the floor and reading every single piece.

Now imagine that box is digital. It lives across four apps, two cloud drives, a folder of screenshots, and a notebook you scanned last summer. You have five years of notes. Thousands of them. And you can never find anything.

This is not a failure of memory. It is a failure of structure. The notes exist. The knowledge is there. But it is scattered like confetti after a parade -- each piece had meaning once, but now it is just a mess on the ground.

This article shows you how to sort that confetti back into the original sentences. In one afternoon, you will turn years of chaotic notes into a structured, connected knowledge base -- not just a folder of markdown files, but a living Obsidian vault where every note links to its neighbors and every idea is one search away. The tool that makes this possible is an AI agent running in your terminal. Think of it as a librarian who can read your handwriting -- except this one also reads screenshots, PDFs, voice memo transcripts, and writes proper wikilinks.

Why Obsidian is the Destination

There is a reason Obsidian went from a niche tool for personal knowledge management enthusiasts to an estimated 1.5 million monthly active users and 2,700+ community plugins. The reason is almost absurdly simple: an Obsidian vault is just a folder of markdown files.

No proprietary database. No API required to access your own notes. No sync server you do not control. Just .md files in a directory on your filesystem.

In the AI CLI era, this architecture becomes a superpower. When your knowledge base is a folder, organizing it with an AI agent is as simple as:

cd ~/vault && claude "organize my notes"

No plugin installation. No API key configuration. No OAuth dance. The agent reads files and writes files. Obsidian reads those same files and renders them with backlinks, graph views, and full-text search. Two tools, same folder, zero integration work.
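To make the "just a folder" point concrete, here is a minimal sketch of creating a note from the terminal with nothing but standard shell tools. The helper name new_note and the two frontmatter fields it writes are illustrative assumptions, not part of Obsidian or Claude Code; the sketch also assumes the title contains no slashes or double quotes.

```shell
# Hypothetical helper: write a minimal Obsidian-style note into a vault folder.
# Usage: new_note <vault-dir> <title> <body>
new_note() {
  vault=$1
  title=$2
  body=$3
  mkdir -p "$vault"
  # The filename is simply the title plus .md, so a [[title]] wikilink
  # elsewhere in the vault can resolve to this file.
  {
    printf '%s\n' '---'
    printf 'title: "%s"\n' "$title"
    printf 'date: %s\n' "$(date +%Y-%m-%d)"
    printf '%s\n' '---'
    printf '\n# %s\n\n%s\n' "$title" "$body"
  } > "$vault/$title.md"
}
```

Calling new_note ~/vault "Caching Ideas" "See also [[API Design Patterns]]." produces a file that Obsidian picks up like any other note, which is exactly the property the agent relies on.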

This is why we are not just organizing notes into markdown. We are organizing them into an Obsidian vault -- with [[wikilinks]], YAML frontmatter, tags, and a folder structure that Obsidian's graph view can visualize. The organized notes become immediately searchable, immediately connected, immediately useful.

What You Will Build

By the end of this afternoon, you will have:

An Obsidian vault containing all your organized notes, with proper frontmatter, tags, and wikilinks.
A set of organized markdown files grouped by topic, not by source app or date -- each one a first-class Obsidian note.
A graph of connections where you can visually see how "API design decisions" links to notes originally scattered across Apple Notes, Notion exports, and a photo of a whiteboard.

The key insight: the AI does not organize by filename or date. It reads the content and understands what each note is about. A file named meeting-2024-03-15.txt gets classified under "Project Alpha - API decisions" because the agent read what is inside. A photo of a whiteboard gets classified under "Product roadmap Q3" because the agent looked at the image and understood the diagram. And because the output lands in an Obsidian vault, every connection becomes a clickable link in a visual graph.

This is the difference between shelving books by cover color and shelving them by subject -- then drawing threads between every book that references another.

Prerequisites

Claude Code installed and authenticated. If you have not set it up yet, follow the first hour tutorial -- it takes ten minutes.
Obsidian installed. Free for personal use. If you already have a vault, great -- we will output into it. If not, Obsidian creates one when you open it for the first time.
Your notes, exported. More on this in the next section.
Basic terminal comfort. You will type commands and read output. No programming required.

Step 1: The Great Export

Before the AI can read your notes, they need to exist as files on your computer. Every note-taking app has an export function. Here is how to get everything out.

Apple Notes: Open Notes, select all (Cmd+A), then File > Export as PDF. Each note becomes a PDF file.

Notion: Go to Settings > Export all workspace content. Choose Markdown & CSV format. You get a zip file with one markdown file per page.

Google Keep: Use Google Takeout (takeout.google.com). Select Keep, export, and download. Notes come out as HTML files.

Existing Obsidian vaults: If you already use Obsidian but your vault is a mess, you do not need to export anything. Point the agent at the vault itself; it writes the reorganized notes to a new subfolder and never touches the originals.

Logseq: Your notes are already markdown files in a folder. Just copy them.

Voice memos: Export from your phone to your computer. They arrive as .m4a or .mp3 files.

Screenshots and photos of handwritten notes: Copy them from your phone or camera roll into the folder.

Create a single directory and put everything in it:

mkdir -p ~/notes-raw
# Copy all your exported notes here
# Notion exports, Apple Notes PDFs, screenshots, voice memos -- everything

After exporting, your folder probably looks like this:

~/notes-raw/
  meeting-2024-03-15.txt
  Untitled 47.pdf
  IMG_3921.png
  project-ideas.md
  voice-memo-2025-01-08.m4a
  Screenshot 2024-11-02 at 3.41.22 PM.png
  api-design-notes.html
  random-thoughts.md
  book-notes-atomic-habits.pdf
  IMG_4102.jpg
  quarterly-review-prep.txt
  research-links.md
  whiteboard-photo.heic
  ... (hundreds more)

A graveyard of good intentions. Filenames that meant something once. Formats that made sense in whichever app created them. Zero structure connecting any of it.

This is your starting material. The AI agent will read every single file and deliver the results straight into a vault that can actually use them.
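Before handing the folder to the agent, it can help to know what mix of formats is in there, since images, PDFs, and audio are handled differently from plain text. The following is a small sketch; the function name count_by_extension is made up for this example, and it assumes filenames do not contain newlines.

```shell
# Count files per extension (lowercased) in a directory, most common first.
# Usage: count_by_extension <dir>
count_by_extension() {
  find "$1" -type f |
    awk -F/ '{
      name = $NF                              # basename of the path
      n = split(name, parts, ".")
      # No dot, or only a leading dot (hidden file): no extension.
      ext = (n > 1 && parts[1] != "") ? tolower(parts[n]) : "(none)"
      count[ext]++
    }
    END { for (e in count) printf "%d %s\n", count[e], e }' |
    sort -k1,1nr -k2,2
}
```

Running count_by_extension ~/notes-raw prints one line per extension with its file count, so a folder dominated by .m4a or .heic files is visible before the agent starts.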

Step 2: Prepare the Obsidian Vault

If you already have an Obsidian vault, you can output directly into it. If not, create one:

mkdir -p ~/vault

Open Obsidian and select "Open folder as vault" pointing to ~/vault. That is it. Obsidian now watches this folder. Any markdown file the AI agent writes here will appear in Obsidian instantly -- with wikilinks resolved, tags indexed, and the graph view updated.

This is the architectural elegance of Obsidian that matters here. There is no import step. There is no sync delay. The agent writes a .md file. Obsidian sees it. Done.

Step 3: Configure CLAUDE.md

Before the agent starts, it needs instructions. A CLAUDE.md file in your working directory acts as a standing order -- like telling the librarian "here is how I want my library organized" before they touch a single book.

Create this file in ~/notes-raw/:

# Note Organization Rules

## Goal

Read every file in this directory. Classify each note by its CONTENT, not by
its filename, source app, or file format. Build a structured knowledge base
as organized markdown files in an Obsidian vault.

## Output Destination

Write all organized files to: ~/vault/organized/

## Topic Categories

Classify notes into these top-level topics (create new ones if needed):
- projects/ -- notes related to specific projects, grouped by project name
- learning/ -- book notes, course notes, tutorials, concepts studied
- ideas/ -- brainstorms, product ideas, feature proposals, shower thoughts
- meetings/ -- meeting notes, discussion summaries, action items
- reference/ -- how-to guides, code snippets, configuration recipes
- personal/ -- journal entries, reflections, life plans, health notes
- research/ -- academic research, literature reviews, data analysis notes

## Classification Rules

1. READ the content of every file to determine its topic. Never classify by filename alone.
2. For images (PNG, JPG, HEIC): describe the visual content. If it is a photo of handwritten notes or a whiteboard, transcribe the text.
3. For PDFs: read the full document and extract the core topic.
4. For voice memos (M4A, MP3): transcribe the audio and classify by content.
5. If a single note covers multiple topics, place it under the PRIMARY topic and add cross-references to the others.
6. When uncertain, ask: "What would I search for to find this note?" That search term is the topic.

## Obsidian Output Format

For each topic, create a markdown file at:
  ~/vault/organized/{topic}/{subtopic}.md

Each markdown file MUST include:

### Frontmatter (YAML)
---
title: "Descriptive Title"
date: YYYY-MM-DD
lastUpdated: "YYYY-MM-DDTHH:MM:SS"
sources:
  - original-filename-1.txt
  - original-filename-2.png
tags:
  - topic-tag
  - subtopic-tag
---

### Content
- A clear title (H1) matching the frontmatter title
- The actual content, cleaned up and formatted as readable markdown
- Wikilinks to related notes: use [[note-name]] format
  Example: "See also [[API Design Patterns]] and [[Sprint Notes Q3]]"
- Tags inline where relevant: #project-alpha, #api-design

### Wikilink Rules
- Link to other organized notes using [[filename]] (without path or .md extension)
- When referencing a specific section, use [[filename#section]]
- Create links even if the target note does not exist yet (Obsidian handles this gracefully)
- Prefer descriptive link text: [[api-rate-limiting|rate limiting strategy]]

## Index

Generate ~/vault/organized/INDEX.md as a Map of Content (MOC) that:
- Lists every topic and subtopic with [[wikilinks]]
- Shows how many notes are in each category
- Includes a "Key Connections" section highlighting notes that bridge multiple topics
- Uses tags: #MOC #index

## Safety

- NEVER delete or modify original files in ~/notes-raw/.
- Write all output to ~/vault/organized/.
- If a file cannot be read, log it in ~/vault/organized/SKIPPED.md with the reason.

This is your organization brain. Adjust the topic categories to match your life. A student might add courses/ and exams/. A researcher might add papers/ and experiments/. The categories should reflect how you think, not how your apps think.

The important difference from a plain markdown dump: every output file gets proper Obsidian frontmatter, wikilinks to related notes, and inline tags. The moment these files land in your vault, Obsidian's graph view lights up with connections you never saw before.
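The Safety rule in the CLAUDE.md above is an instruction, not a guarantee, so it is reasonable to check it yourself. One way, sketched here under the assumption that the POSIX cksum utility is available, is to snapshot the raw folder before the run and compare afterwards. The function names snapshot_dir and verify_unchanged are hypothetical.

```shell
# Print one "checksum size path" line per file under a directory, sorted by path.
# Usage: snapshot_dir <dir>
snapshot_dir() {
  ( cd "$1" && find . -type f -exec cksum {} + | sort -k3 )
}

# Compare a saved snapshot file against the directory's current state.
# Usage: verify_unchanged <dir> <snapshot-file>; returns 0 if nothing changed.
verify_unchanged() {
  snapshot_dir "$1" | diff "$2" - >/dev/null
}
```

Take the snapshot after CLAUDE.md is in place, for example snapshot_dir ~/notes-raw > ~/notes-raw.snapshot, and run verify_unchanged ~/notes-raw ~/notes-raw.snapshot once the agent is done. Any added, removed, or modified file makes the check fail.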

Step 4: Let the Agent Read Everything

Now comes the part that would take a human three weekends. The AI agent does it while you make coffee.

cd ~/notes-raw
claude

In the Claude Code prompt:

Read every file in this directory. Follow the CLAUDE.md rules exactly.
For each file, read its content, classify it by topic, and write the
organized output to ~/vault/organized/.
Use Obsidian wikilinks and frontmatter as specified in CLAUDE.md.
Start with a scan of all files and show me the proposed classification
before writing anything.

The agent scans the directory, reads each file, and presents a classification plan:

Found 847 files. Proposed classification:

projects/alpha-api/        -- 34 notes (from meetings, screenshots, text files)
projects/website-redesign/ -- 22 notes
learning/programming/      -- 67 notes (book notes, tutorials, snippets)
learning/design/           -- 18 notes
ideas/product-features/    -- 41 notes
ideas/side-projects/       -- 29 notes
meetings/2024/             -- 156 notes
meetings/2025/             -- 89 notes
reference/dev-setup/       -- 45 notes
reference/recipes/         -- 12 notes
personal/journal/          -- 78 notes
personal/health/           -- 23 notes
research/user-interviews/  -- 31 notes
research/market-analysis/  -- 19 notes

Unclassified: 14 files (3 corrupted, 11 ambiguous)

Proceed with writing organized files?

Review the plan. If the agent put your cooking recipes under reference/recipes/ and you would rather have them under personal/cooking/, say so:

Move cooking-related notes to personal/cooking/ instead of reference/recipes/.
Also split meetings/ by project, not by year.
Proceed with the rest.

The agent adjusts and writes. Each organized markdown file looks like this:

---
title: "Project Alpha - API Design Decisions"
date: 2024-11-02
lastUpdated: "2026-03-22T11:48:54"
sources:
  - meeting-2024-03-15.txt
  - IMG_3921.png
  - api-design-notes.html
  - Screenshot 2024-11-02 at 3.41.22 PM.png
tags:
  - project-alpha
  - api-design
  - architecture
---

# Project Alpha - API Design Decisions

## REST vs GraphQL Decision (March 2024)

Team decided on REST for the public API, GraphQL for internal services.
Key reasons:
- Client team more familiar with REST
- GraphQL complexity not justified for CRUD-heavy public endpoints
- Internal services need flexible querying across microservices

See [[GraphQL Internal Services Setup]] for implementation details.

## Authentication Flow (November 2024)

Whiteboard sketch (transcribed from IMG_3921.png):
- OAuth 2.0 with PKCE for mobile clients
- API key + HMAC for server-to-server
- JWT with 15-minute expiry, refresh tokens in HttpOnly cookies

Related: [[OAuth PKCE Implementation Notes]]

## Rate Limiting Strategy

From screenshot of Slack conversation:
- 100 requests/minute per API key for free tier
- 1000 requests/minute for paid tier
- Sliding window algorithm, not fixed window

---

**See also:**
- [[Project Alpha Sprint Notes]]
- [[API Design Patterns]]
- [[User Interview - Developer Experience|Developer experience feedback]]

Notice what happened. Four files -- a text file, a whiteboard photo, an HTML export, and a screenshot -- became one coherent document about API design decisions. The agent read the image, transcribed the whiteboard diagram, extracted the relevant content from the HTML, and wove it all together under a topic that makes sense.

But now look at what Obsidian does with this. Open the graph view. "Project Alpha - API Design Decisions" shows connections to "Sprint Notes," "API Design Patterns," and "Developer Experience Feedback." Click any link. Follow the thread. The notes are not just organized -- they are networked.

A note titled meeting-2024-03-15.txt told you nothing. A wikilinked note titled "Project Alpha - API Design Decisions" tells you everything and leads you to everything related.
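Since every organized file is supposed to carry frontmatter, a quick structural check can catch notes where the agent skipped it. This is a sketch only: it looks for the title, date, and tags keys between the opening and closing --- lines and does not parse YAML properly. The function name missing_frontmatter_keys is an assumption for illustration.

```shell
# Print each required frontmatter key that a note is missing.
# Prints nothing (and returns 0) when title, date, and tags are all present.
# Usage: missing_frontmatter_keys <note.md>
missing_frontmatter_keys() {
  awk '
    NR == 1 && $0 != "---" { exit }          # no frontmatter block at all
    NR > 1 && $0 == "---"  { exit }          # closing delimiter: stop reading
    /^title:/ { seen["title"] = 1 }
    /^date:/  { seen["date"]  = 1 }
    /^tags:/  { seen["tags"]  = 1 }
    END {
      n = split("title date tags", required, " ")
      for (i = 1; i <= n; i++)
        if (!(required[i] in seen)) { print required[i]; missing = 1 }
      exit missing
    }
  ' "$1"
}
```

Looping this over the .md files in ~/vault/organized/ gives a short list of notes to send back to the agent for repair. Files such as INDEX.md, which deliberately omit a date, will be reported too.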

Step 5: Handle the Tricky Formats

Most notes are text. But the interesting ones are often not.

Photos of Handwritten Notes

The agent sees images. When it encounters a photo of a notebook page, it does not just classify the photo -- it reads the handwriting and transcribes it. Your scrawled "check distributed caching -- Redis vs Memcached??" becomes searchable text inside the organized file, tagged with #infrastructure and wikilinked to [[Caching Architecture Decisions]].

Not every handwriting sample is legible. If the agent cannot read it confidently, it says so in the output and includes the original image path for manual review.

Screenshots

Screenshots of Slack conversations, browser tabs, error messages, code snippets -- the agent reads the text in the image and classifies by content. A screenshot of a Stack Overflow answer about PostgreSQL indexing ends up in reference/database/, not in a generic screenshots/ folder. It gets a wikilink to [[PostgreSQL Performance Tuning]] and a #database tag.

PDFs

Exported notes from Apple Notes and other apps often arrive as PDFs. The agent reads PDFs natively. A 12-page PDF of meeting notes from a workshop gets split into its constituent topics and cross-referenced accordingly.

Voice Memos

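For voice memos, the agent classifies a transcript of the audio. If your setup cannot read audio files directly, run a speech-to-text tool over the recordings first and put the transcripts next to them. A rambling 8-minute voice note about three different ideas becomes three separate entries in three different topic files, each attributed back to the source audio.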

Step 6: Build the Map of Content

After the agent finishes organizing, it generates ~/vault/organized/INDEX.md -- an Obsidian Map of Content (MOC):

---
title: "Personal Knowledge Base"
tags:
  - MOC
  - index
---

# Personal Knowledge Base

Last organized: 2026-03-23
Total notes processed: 847
Successfully classified: 833
Skipped (see [[SKIPPED]]): 14

## Topics

### Projects (56 notes)
- [[Project Alpha API Decisions]] -- API design, sprint notes, architecture
- [[Website Redesign]] -- wireframes, copy, feedback

### Learning (85 notes)
- [[Programming Notes]] -- languages, patterns, tutorials
- [[Design Notes]] -- UI principles, typography, color theory

### Ideas (70 notes)
- [[Product Feature Ideas]] -- feature proposals, user requests
- [[Side Project Plans]] -- weekend project plans, prototypes

### Meetings (245 notes)
- [[Project Alpha Sprint Notes]]
- [[Design Review Notes]]
- [[One-on-One Notes]]

### Reference (45 notes)
- [[Dev Setup Guides]] -- machine config, dotfiles, tool guides
- [[API Design Patterns]]

### Personal (113 notes)
- [[Journal]]
- [[Health Notes]]
- [[Cooking Recipes]]

### Research (50 notes)
- [[User Interview Summaries]]
- [[Market Analysis]]

## Key Connections

- [[Project Alpha API Decisions]] bridges to [[API Design Patterns]]
  and [[User Interview - Developer Experience]]
- [[Side Project Recipe App]] connects to notes in both [[Cooking Recipes]]
  and [[React Patterns]]
- [[Health Notes - Sleep]] connects to [[Journal - Productivity Reflections]]

Open this file in Obsidian. Every [[link]] is clickable. The graph view from this single MOC file shows your entire knowledge base as a visual network. Clusters form around projects. Bridges appear between ideas you never consciously connected.

This is something grep cannot do. This is something a folder of flat text files cannot do. The wikilinks turn a pile of organized notes into a navigable knowledge graph.
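Because wikilinks are plain text, you can audit them without opening Obsidian. The sketch below extracts the link targets from a note and reports the ones with no matching file in the vault. It assumes links resolve by filename (target plus .md), which is the rule the CLAUDE.md above asks the agent to follow; the function names are hypothetical, and grep -o is a common extension rather than strict POSIX.

```shell
# List the distinct wikilink targets in a note, one per line.
# Strips the |alias and #section parts, so [[note#section|text]] yields "note".
# Usage: wikilink_targets <note.md>
wikilink_targets() {
  grep -o '\[\[[^]]*]]' "$1" |
    sed -e 's/^\[\[//' -e 's/]]$//' -e 's/|.*$//' -e 's/#.*$//' |
    sort -u
}

# Print targets that have no matching <target>.md anywhere under the vault.
# Usage: unresolved_links <note.md> <vault-dir>
unresolved_links() {
  wikilink_targets "$1" | while IFS= read -r target; do
    [ -n "$target" ] || continue
    if [ -z "$(find "$2" -type f -name "$target.md" | head -n 1)" ]; then
      printf '%s\n' "$target"
    fi
  done
}
```

Running unresolved_links ~/vault/organized/INDEX.md ~/vault lists every MOC entry that points at a note the agent never wrote. Unresolved links are allowed by the rules, so treat the output as a to-do list rather than an error report.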

Supercharging with Obsidian CLI

Once your vault is organized, Obsidian's official CLI (released February 2026 with v1.12) gives you powerful ways to interact with it from the terminal -- and to chain with AI agents.

# Search your vault without opening the app
obsidian search query="rate limiting"

# List all tags across the vault
obsidian tags

# Find orphan notes -- notes with no incoming or outgoing links
obsidian orphans

# See backlinks for a specific note
obsidian backlinks "Project Alpha API Decisions"

The killer feature for AI workflows: every command supports format=json output. This means an AI agent can query your vault programmatically:

# Agent can find relevant notes before answering a question
obsidian search query="authentication flow" format=json | claude -p "Based on these notes, summarize our auth architecture"

The performance numbers matter here. Obsidian CLI searches the vault index at 60x the speed of raw grep and consumes 70,000x fewer tokens than having an AI agent scan every file. For a vault with thousands of notes, the difference is between a 2-second answer and a 10-minute token-burning expedition.

Going Deeper: MCP Server

For workflows where the AI agent needs to read, search, and write to your vault in a single conversation, there is an MCP server option:

npm install -g obsidian-mcp-server

This gives Claude Code direct vault access via the Model Context Protocol (MCP) -- search, read, create, and link notes without shelling out to the CLI. It is useful for complex multi-step operations like "find all notes about Project Alpha, identify gaps, and draft three new notes to fill them."

To be clear: you do not need the MCP server. Your vault is just a folder. cd ~/vault && claude already works. The MCP server is an optimization, not a requirement. Start without it. Add it when your workflow demands it.

Scripting It for Repeat Runs

New notes accumulate. You do not want to redo this manually every quarter. Here is a script that runs the classification incrementally and outputs directly into your Obsidian vault:

#!/usr/bin/env bash
set -euo pipefail

NOTES_DIR="${1:?Usage: organize-notes.sh <notes-directory> [vault-directory]}"
VAULT_DIR="${2:-$HOME/vault}"
ORGANIZED_DIR="$VAULT_DIR/organized"
PROCESSED_LOG="$ORGANIZED_DIR/.processed-files.txt"

mkdir -p "$ORGANIZED_DIR"
touch "$PROCESSED_LOG"

# Find files not yet processed (CLAUDE.md holds the rules; it is not a note)
NEW_FILES=$(comm -23 \
  <(find "$NOTES_DIR" -maxdepth 1 -type f ! -name 'CLAUDE.md' | sort) \
  <(sort "$PROCESSED_LOG"))

NEW_COUNT=$(echo "$NEW_FILES" | grep -c '.' || true)

if [[ "$NEW_COUNT" -eq 0 ]]; then
  echo "No new files to process."
  exit 0
fi

echo "Found $NEW_COUNT new files to classify."

claude -p "You are a note organizer. Read CLAUDE.md in $NOTES_DIR for rules.

These are NEW files that need classification:
$NEW_FILES

Read each file, classify by content, and APPEND to the existing organized
markdown files in $ORGANIZED_DIR/. Use Obsidian wikilinks and frontmatter.
Update INDEX.md with any new entries and new [[wikilinks]].
Do not reprocess files already in the organized/ directory.

After processing, list the files you processed (one per line, full path)."

# This sketch marks every candidate file as processed once the agent exits
# successfully. For stricter bookkeeping, capture the agent's output and log
# only the paths it reports.
echo "$NEW_FILES" >> "$PROCESSED_LOG"

echo "Done. $NEW_COUNT new notes classified into $VAULT_DIR."

Run it monthly. Or weekly. Or whenever you dump a new batch of notes into the folder. Each run only processes what is new -- and every new note arrives in your vault with wikilinks to existing notes already in place.
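The script asks the agent to list the files it processed, and that list can drive the log instead of assuming every candidate succeeded. A minimal sketch of that parsing step follows; the function name record_processed and the three-file interface are assumptions for illustration, and it relies on the agent printing each path on its own line exactly as it was given.

```shell
# Append to the log only those candidate paths that the agent reported back.
# Usage: record_processed <candidates-file> <agent-output-file> <log-file>
record_processed() {
  # -F: fixed strings, -x: whole-line match, -f: read patterns from the file.
  # A candidate is logged only if some output line equals it exactly.
  grep -Fxf "$2" "$1" >> "$3" || true
}
```

To wire it in, capture the agent's reply in a variable, write the candidate list and the reply to two temporary files, and call record_processed with those files and the processed log. Files the agent skipped or failed on then stay out of the log and are retried on the next run.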

CLAUDE.md for Your Vault

Once the initial organization is done, consider adding a CLAUDE.md to the vault itself. This teaches any AI agent that enters the vault how your knowledge base works:

# Vault: Personal Knowledge Base

## Structure
- organized/ -- AI-classified notes from raw exports
- projects/ -- active project notes (manually maintained)
- daily/ -- daily notes (Obsidian daily notes plugin)
- templates/ -- note templates

## Conventions
- Filenames: kebab-case, descriptive (e.g., api-rate-limiting-strategy.md)
- Tags: lowercase, hyphenated (#project-alpha, #api-design)
- Wikilinks: always use [[descriptive name]], never raw file paths
- Frontmatter: every note MUST have title, date, and tags

## Tag Taxonomy
- #MOC -- Map of Content (index notes)
- #project-{name} -- project-specific tags
- #status/active, #status/archived, #status/idea
- #type/meeting, #type/reference, #type/journal

## When Adding Notes
- Always check for existing related notes and add [[wikilinks]]
- Update the relevant MOC if one exists
- Use templates/ for consistent structure

Now when you run cd ~/vault && claude "add my notes from today's architecture meeting", the agent already knows your naming conventions, tag taxonomy, and linking style. The new note arrives as a native citizen of your vault, not a foreign import.
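Conventions like these drift unless something checks them. As one example, the kebab-case filename rule can be linted with a few lines of shell. This is a sketch: the function name non_kebab_notes is made up, and the pattern treats kebab-case as lowercase letters and digits separated by single hyphens.

```shell
# List markdown files under a directory whose names are not kebab-case
# (lowercase letters and digits separated by single hyphens).
# Usage: non_kebab_notes <dir>
non_kebab_notes() {
  find "$1" -type f -name '*.md' | while IFS= read -r path; do
    name=$(basename "$path" .md)
    if ! printf '%s\n' "$name" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*$'; then
      printf '%s\n' "$path"
    fi
  done
}
```

Running non_kebab_notes ~/vault flags files such as "Sprint Notes Q3.md". Uppercase names like INDEX.md and CLAUDE.md are flagged as well, so either exempt them or decide that they follow a separate rule.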

What Actually Changed

Before: 847 files across four apps. No structure. No way to search by topic. You knew you wrote something about API rate limiting once, but finding it meant opening three apps and scrolling through hundreds of notes with useless titles.

After: An Obsidian vault organized by what you think about, not what app you happened to use. A graph view that shows connections between ideas. Wikilinks that let you follow a thread from "API design" to "user interview feedback" to "sprint planning notes" in three clicks. A search -- either in Obsidian or from the terminal with obsidian search -- that finds everything in seconds.

The AI did not just move files into folders. It read every note, understood the content, built the structure a human would build, and wired it together with links a human would add -- if that human had time to read 847 notes in one sitting and remember every connection between them. You did not have that time. The agent did.

The notes were always there. The knowledge was always there. It just needed a librarian who could also draw the map.

For more on batch file processing with AI agents, see the complete file organizer guide. If you have not installed Claude Code yet, start with the first hour walkthrough.

Danny Huang


Organize Years of Scattered Notes into an Obsidian Vault with AI CLI in One Afternoon