Document Processing
Extract text from PDFs, Word docs, Excel sheets, and PowerPoints directly into Obsidian
Process documents in 30 seconds
- Drag any document into chat or note
- Choose extraction mode (full text, summary, key points)
- Get markdown ready to edit, link, and search
Turn locked content into living notes!
What document processing does for you
- Research papers: PDF → Searchable notes with AI summaries
- Meeting docs: Word files → Action items and key decisions
- Data tables: Excel → Markdown tables for analysis
- Presentations: PowerPoint → Structured notes with slide content
Supported formats
What you can process
Format | Extensions | Extracts |
---|---|---|
PDF | .pdf | Text, structure, pages |
Word | .docx, .doc | Text, tables, formatting |
Excel | .xlsx, .xls | All sheets, tables, values |
PowerPoint | .pptx, .ppt | Slides, speaker notes |
Text | .txt, .rtf, .csv | Direct content |
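As a rough illustration of how format detection by extension could work, here is a minimal Python sketch. The `SUPPORTED_FORMATS` table mirrors the extensions listed above (with `.pdf` assumed for the PDF row); `detect_format` is a hypothetical helper, not part of SystemSculpt, and the plugin may detect formats differently.

```python
from pathlib import PurePath
from typing import Optional

# Extension-to-format table based on the supported formats listed above.
# ".pdf" is assumed for the PDF row.
SUPPORTED_FORMATS = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".doc": "Word",
    ".xlsx": "Excel",
    ".xls": "Excel",
    ".pptx": "PowerPoint",
    ".ppt": "PowerPoint",
    ".txt": "Text",
    ".rtf": "Text",
    ".csv": "Text",
}


def detect_format(filename: str) -> Optional[str]:
    """Return the format family for a filename, or None if unsupported."""
    suffix = PurePath(filename).suffix.lower()  # case-insensitive match
    return SUPPORTED_FORMATS.get(suffix)


print(detect_format("Report_Q1_2024.pdf"))
print(detect_format("photo.png"))
```

A lookup like this only checks the file name; it does not confirm that the file contents match the extension.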
Format details
PDF capabilities:
- Multi-page documents
- Preserves headings
- Maintains paragraphs
- Tables when possible
- Note: Scanned PDFs need OCR first
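One way to guess whether a PDF is scanned is to look at how much text an extractor returned per page. The sketch below assumes you already have one string per page from whatever PDF tool you use; `likely_needs_ocr` and its 20-character threshold are illustrative assumptions, not SystemSculpt behavior.

```python
from typing import List


def likely_needs_ocr(page_texts: List[str], min_chars_per_page: int = 20) -> bool:
    """Guess whether a PDF is scanned, based on text already extracted per page.

    page_texts holds one string per page. The default threshold of 20
    characters per page is an arbitrary assumption; tune it for your files.
    """
    if not page_texts:
        return True  # nothing was extracted at all
    total_chars = sum(len(text.strip()) for text in page_texts)
    return total_chars / len(page_texts) < min_chars_per_page
```

If this returns `True`, running OCR before processing is the safer choice.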
Word features:
- Bold and italic preserved
- Lists maintained
- Tables converted
- Basic structure kept
Excel conversion:

    | Product | Q1  | Q2  | Q3  |
    |---------|-----|-----|-----|
    | Widget  | 100 | 150 | 200 |

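To make the Excel conversion concrete, here is a minimal sketch that turns rows of cell values into a Markdown pipe table. `rows_to_markdown_table` is a hypothetical helper written for illustration; it assumes the first row is the header and does not reflect how SystemSculpt reads spreadsheets internally.

```python
from typing import Sequence


def rows_to_markdown_table(rows: Sequence[Sequence[object]]) -> str:
    """Render rows (first row is the header) as a Markdown pipe table."""
    if not rows:
        return ""
    # Convert every cell to text and escape pipes so they do not split cells.
    cleaned = [[str(cell).replace("|", "\\|") for cell in row] for row in rows]
    header, body = cleaned[0], cleaned[1:]
    lines = ["| " + " | ".join(header) + " |"]
    lines.append("|" + "|".join("---" for _ in header) + "|")
    for row in body:
        # Pad short rows and trim long ones to match the header width.
        padded = (row + [""] * (len(header) - len(row)))[: len(header)]
        lines.append("| " + " | ".join(padded) + " |")
    return "\n".join(lines)


table = rows_to_markdown_table([
    ["Product", "Q1", "Q2", "Q3"],
    ["Widget", 100, 150, 200],
])
print(table)
```

Only cell values are rendered here, which matches the note later in this guide that formulas appear as results only.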
PowerPoint structure:

    ## Slide 1: Title
    Content...

    ## Slide 2: Main Points
    - Bullet point
    - Another point

    Speaker Notes: Remember to emphasize...

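The slide layout above can be sketched as a small rendering function. The `Slide` type and `slides_to_markdown` are hypothetical names used only for this example; the sketch assumes slide titles, bullets, and speaker notes have already been read from the presentation.

```python
from typing import List, NamedTuple, Tuple


class Slide(NamedTuple):
    # Hypothetical container for already-extracted slide content.
    title: str
    bullets: Tuple[str, ...] = ()
    speaker_notes: str = ""


def slides_to_markdown(slides: List[Slide]) -> str:
    """Render slides as numbered level-2 headings with bullets and notes."""
    parts = []
    for number, slide in enumerate(slides, start=1):
        lines = [f"## Slide {number}: {slide.title}"]
        lines.extend(f"- {bullet}" for bullet in slide.bullets)
        if slide.speaker_notes:
            lines.append("")  # blank line before the notes
            lines.append(f"Speaker Notes: {slide.speaker_notes}")
        parts.append("\n".join(lines))
    return "\n\n".join(parts)


markdown = slides_to_markdown([
    Slide("Title"),
    Slide("Main Points", ("Bullet point", "Another point"), "Remember to emphasize..."),
])
print(markdown)
```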
How to process
Method 1: Drag and drop
Fastest approach:
- Drag document into chat/note
- SystemSculpt detects format
- Choose processing option
- Done!
Method 2: Command palette
Cmd/Ctrl + P → "Process Document"
→ Select file → Choose options
Method 3: Right-click
File Explorer → Right-click document
→ "Process with SystemSculpt"
Processing options
Extraction modes
Mode | What you get | Best for |
---|---|---|
Full Text | Complete content | Archives, reference |
Summary | AI-condensed version | Quick review |
Key Points | Main ideas only | Busy professionals |
Structured | Headings preserved | Study notes |
Output options
Where to put extracted text:
- Create new note
- Insert in current note
- Copy to clipboard
- Add to chat context
Custom prompts
Extract specific content:
"Extract only the methodology section"
"Find all dates and deadlines"
"Get financial data as tables"
"List all action items mentioned"
Real-world workflows
Research workflow
PDF papers → Knowledge base:
- Collect PDFs in folder
- Select all → Process batch
- Choose "Structured + Summary"
- Each becomes searchable note
- AI helps synthesize findings
Example result:

    # Paper Title - Smith et al. 2024

    ## Summary
    [AI-generated overview]

    ## Introduction
    [Extracted text...]

    ## Methods
    [Extracted text...]

    ## Key Findings
    - Finding 1
    - Finding 2

    ## My Notes
    - [ ] Follow up on methodology
    - [ ] Compare with Jones 2023

Meeting documentation
Agenda → Minutes workflow:
- Process agenda (Word/PDF)
- Add notes during meeting
- Process minutes after
- AI extracts all action items
Financial reports
Excel → Analysis:
- Drop spreadsheet
- Get markdown tables
- Ask AI to:
  - Calculate trends
  - Identify anomalies
  - Create summaries
  - Generate insights
Course materials
Slides → Study notes:
- Process all PowerPoints
- Organize by lecture
- AI creates:
  - Study guides
  - Practice questions
  - Key concepts list
  - Flashcards
Advanced techniques
Batch processing
Multiple files at once:
Select 10 PDFs → Right-click
→ Process all → Creates 10 notes
Consistent naming:
- Original: Report_Q1_2024.pdf
- Output: Report_Q1_2024 - Extracted.md
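The naming pattern above is easy to reproduce if you want to predict or post-process note names in a script. This is a minimal sketch; `extracted_note_name` is a hypothetical helper, and the " - Extracted" suffix is taken from the example above rather than from any documented setting.

```python
from pathlib import PurePath


def extracted_note_name(original_filename: str, suffix: str = " - Extracted") -> str:
    """Build the note name for a processed document, following the pattern above."""
    stem = PurePath(original_filename).stem  # file name without its extension
    return f"{stem}{suffix}.md"


# A batch of documents maps to one note name each.
batch = ["Report_Q1_2024.pdf", "Report_Q2_2024.pdf"]
note_names = [extracted_note_name(name) for name in batch]
print(note_names)
```

Two source files that share a stem (for example a `.pdf` and a `.docx` with the same name) would produce the same note name, so a real batch script would need its own collision handling.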
Combine with AI
After extraction:
- Drop into chat
- Ask questions:
  - "Summarize the main arguments"
  - "What actions are required?"
  - "Compare with [other doc]"
  - "Create outline for presentation"
Smart organization
Documents/
├── Original/
│   ├── contract-v1.pdf
│   └── contract-v2.pdf
├── Extracted/
│   ├── contract-v1.md
│   └── contract-v2.md
└── Analysis/
    └── contract-comparison.md
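If you keep this layout, the location of each extracted note follows directly from the original file's path. The sketch below assumes the folder names `Original` and `Extracted` from the tree above; `extracted_path_for` is a hypothetical helper for your own scripts, not a plugin feature.

```python
from pathlib import PurePosixPath


def extracted_path_for(original: str) -> str:
    """Map Documents/Original/<name>.<ext> to Documents/Extracted/<name>.md."""
    path = PurePosixPath(original)
    if path.parent.name != "Original":
        raise ValueError(f"expected a file inside an 'Original' folder: {original}")
    # Swap the sibling folder and replace the extension with .md.
    return str(path.parent.parent / "Extracted" / (path.stem + ".md"))


print(extracted_path_for("Documents/Original/contract-v1.pdf"))
```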
Tips for best results
Before processing
✅ Check file:
- Under 50MB size
- Has selectable text (PDFs)
- Not corrupted
- Supported format
✅ Prepare docs:
- OCR scanned PDFs first
- Save Word as .docx
- Clean up Excel data
- Add speaker notes to slides
Quality optimization
PDFs:
- Text-based, not scanned
- Clear formatting
- Avoid complex layouts
Word docs:
- Use styles for structure
- Clean formatting
- Avoid track changes
Excel sheets:
- Clear headers
- Simple table structure
- Remove formulas
- Delete empty rows
PowerPoints:
- Include speaker notes
- Use slide titles
- Keep text in shapes
- Avoid heavy graphics
Limitations & workarounds
What's not extracted
Content | Workaround |
---|---|
Images | Add descriptions manually |
Charts | Screenshot separately |
Formulas | Shows results only |
Macros | Ignored |
Comments | Not included |
Size limits
- Max file: 50MB
- Recommended: Under 10MB
- Large files: Split first or process in sections
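The size limits above can be checked before you drop a file in. This minimal sketch treats 1 MB as 1024 × 1024 bytes and treats exactly 50MB as still allowed; both are assumptions, since the limits above do not specify them. `size_advice` is a hypothetical helper.

```python
# Limits taken from the list above; the byte conversion is an assumption.
MAX_BYTES = 50 * 1024 * 1024
RECOMMENDED_BYTES = 10 * 1024 * 1024


def size_advice(size_bytes: int) -> str:
    """Classify a file size against the maximum and recommended limits."""
    if size_bytes > MAX_BYTES:
        return "too large: split first or process in sections"
    if size_bytes > RECOMMENDED_BYTES:
        return "allowed, but above the recommended size"
    return "ok"


print(size_advice(5 * 1024 * 1024))
print(size_advice(60 * 1024 * 1024))
```

For a file on disk, `os.path.getsize(path)` gives the byte count to pass in.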
Format issues
"Can't extract text":
- PDF might be scanned
- Run OCR first
- Try a different extraction tool
"Formatting lost":
- Complex layouts are simplified during extraction
- Expect some manual cleanup
- Focus on content
Integration examples
With templates

    ## Document: {{filename}}
    Processed: {{date}}
    Type: {{document-type}}

    ### Extracted Content
    <!-- Extraction appears here -->

    ### AI Analysis
    - Summary:
    - Key points:
    - Actions needed:

    ### Related
    - [[Original file]]
    - [[Project notes]]

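To show how the `{{filename}}`, `{{date}}`, and `{{document-type}}` placeholders might be filled, here is a minimal sketch using a regular expression. It is not the template engine used by SystemSculpt or any Obsidian plugin; `fill_template` is a hypothetical helper, and the values passed in are made up for the example.

```python
import re
from typing import Mapping

# Matches {{name}} placeholders; names may contain letters, digits, "_" and "-".
PLACEHOLDER = re.compile(r"\{\{\s*([\w-]+)\s*\}\}")


def fill_template(template: str, values: Mapping[str, str]) -> str:
    """Replace known {{placeholders}}; leave unknown ones untouched."""
    def substitute(match: "re.Match[str]") -> str:
        key = match.group(1)
        return values.get(key, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


header = fill_template(
    "## Document: {{filename}}\nProcessed: {{date}}\nType: {{document-type}}",
    # Illustrative values only.
    {"filename": "contract-v1.pdf", "date": "2024-01-15", "document-type": "PDF"},
)
print(header)
```

Leaving unknown placeholders in place makes missing values easy to spot in the resulting note.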
With embeddings
- Process documents
- Embeddings index content
- Semantic search finds everything
- Connect related documents
With chat
Research assistant:
You: [Drop 5 PDFs into chat]
You: "Compare the methodologies"
AI: [Analyzes all 5 documents]
You: "Create literature review outline"
AI: [Generates comprehensive outline]
Privacy & security
- Secure upload: Encrypted transmission
- No storage: Files are deleted after processing
- Your data: Never used for training
- Quick process: Usually under 1 minute
Troubleshooting
Issue | Fix |
---|---|
"Processing failed" | Check format, size, connection |
"No text found" | Run OCR on the PDF; check that the file is not empty |
"Takes too long" | File is large; split it or try a smaller one |
"Wrong content" | Check extraction mode |
Next steps
- Audio Features - Transcribe recordings too
- Premium Overview - All premium benefits
- Try it: Drag a PDF into Obsidian now!
📄 Start simple: Try with a small PDF first, then tackle your document backlog!