Audio Features

Turn any audio into searchable, editable text within Obsidian

Transcribe in 30 seconds

  1. Drag audio file into chat or note
  2. Click "Transcribe" when prompted
  3. Get text with timestamps, speakers, and formatting

Works with recordings, meetings, lectures, and podcasts: anything with speech!

What audio transcription does for you

  • Meeting notes: Record → Transcribe → AI extracts action items
  • Lecture capture: Never miss important points while note-taking
  • Interview documentation: Full transcripts with speaker labels
  • Voice journaling: Speak thoughts, get formatted notes
  • Content creation: Draft by speaking, polish with AI

Quick start

Record & transcribe

1. Click 🎙️ in ribbon (or Cmd/Ctrl + R)
2. Speak your thoughts
3. Stop recording
4. Transcription starts automatically
5. Edit or save as note

Transcribe existing files

1. Drag audio file to chat/note
2. Select "Transcribe"
3. Choose format:
   - Standard (paragraphs)
   - Timestamped (SRT)
4. Use transcript immediately

Supported formats

Primary formats

Format   Extension   Best for                Quality
M4A      .m4a        Apple devices, voice    ⭐⭐⭐⭐⭐
MP3      .mp3        Universal, podcasts     ⭐⭐⭐⭐
WAV      .wav        Pro recording           ⭐⭐⭐⭐⭐
WebM     .webm       Browser recording       ⭐⭐⭐⭐
OGG      .ogg        Open source             ⭐⭐⭐⭐

Recording sources

Works with:

  • Phone recordings (Voice Memos, etc.)
  • Zoom/Teams/Meet recordings
  • Podcast files
  • WhatsApp/Telegram voice notes
  • Professional recordings

File limits:

  • Max size: 200MB
  • Max length: 2 hours
  • Larger files: Split first
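
The limits above can be checked before you upload, and an oversized recording can be planned into segments. The following is a minimal sketch, not the plugin's own validation: the function names are hypothetical, the size and duration are assumed to be known already, and 1 MB is assumed to mean 1024 × 1024 bytes.

```python
import math

# Limits as documented above (the byte interpretation of "MB" is an assumption).
MAX_SIZE_BYTES = 200 * 1024 * 1024   # "Max size: 200MB"
MAX_DURATION_S = 2 * 60 * 60         # "Max length: 2 hours"


def within_limits(size_bytes: int, duration_s: float) -> bool:
    """Return True if a file fits both documented limits."""
    return size_bytes <= MAX_SIZE_BYTES and duration_s <= MAX_DURATION_S


def split_plan(duration_s: float, max_segment_s: float = MAX_DURATION_S) -> list[tuple[float, float]]:
    """Return (start, end) offsets in seconds for equal-length segments,
    each no longer than max_segment_s."""
    if duration_s <= 0:
        return []
    count = math.ceil(duration_s / max_segment_s)
    length = duration_s / count
    return [(i * length, min((i + 1) * length, duration_s)) for i in range(count)]


# A 2.5-hour recording is planned as two 75-minute segments.
print(split_plan(2.5 * 3600))
```

The plan only accounts for duration. A file that is short but larger than the size limit would still need to be re-encoded or split further.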

Transcription options

Output formats

Standard transcription

Clean paragraphs with proper punctuation.
Perfect for notes and documentation.

Timestamped (SRT)

00:00:00 --> 00:00:05
Welcome everyone to today's meeting.

00:00:05 --> 00:00:12
Let's start with our quarterly review.
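
If you want to post-process a timestamped transcript, the layout shown above can be read into start, end, and text entries. This is a minimal sketch that assumes exactly that layout (a timing line in HH:MM:SS form followed by text, with blank lines between entries). Standard SRT files also carry a numeric index line and millisecond timestamps, which this sketch does not handle. The function names are hypothetical.

```python
import re

# Matches a timing line such as "00:00:05 --> 00:00:12".
CUE_RE = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\s*-->\s*(\d{2}):(\d{2}):(\d{2})\s*$")


def to_seconds(h: str, m: str, s: str) -> int:
    """Convert hour, minute, and second strings to total seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s)


def parse_cues(text: str) -> list[tuple[int, int, str]]:
    """Parse blank-line-separated entries into (start_s, end_s, text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if not lines:
            continue
        match = CUE_RE.match(lines[0])
        if not match:
            continue  # skip entries that do not start with a timing line
        parts = match.groups()
        cues.append((to_seconds(*parts[:3]), to_seconds(*parts[3:]), " ".join(lines[1:])))
    return cues


sample = """00:00:00 --> 00:00:05
Welcome everyone to today's meeting.

00:00:05 --> 00:00:12
Let's start with our quarterly review."""

for start, end, line in parse_cues(sample):
    print(start, end, line)
```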

Language support

Auto-detected:

  • English, Spanish, French, German
  • Italian, Portuguese, Dutch, Russian
  • Chinese, Japanese, Korean
  • 50+ languages total

Pro tip: Specify language in settings for faster, more accurate results

Real-world workflows

Meeting workflow

Markdown
## Team Meeting - {{date}}

Recording: [[meeting-audio.m4a]]

### Transcript
[Transcript appears here after processing]

### AI Summary
- Key decisions...
- Action items...
- Next steps...

### Tasks
- [ ] @John - Review proposal (mentioned at 00:15:23)
- [ ] @Sarah - Send report (mentioned at 00:28:45)
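
Task lines written in the style of the template above can be collected with a short script, for example to build a follow-up list. This is a minimal sketch that assumes the exact line shape used in the template (`- [ ] @Owner - Task (mentioned at HH:MM:SS)`); the function name and the returned dictionary keys are hypothetical.

```python
import re

# Matches "- [ ] @John - Review proposal (mentioned at 00:15:23)".
TASK_RE = re.compile(
    r"^- \[( |x)\] @(\w+) - (.+?) \(mentioned at (\d{2}:\d{2}:\d{2})\)\s*$"
)


def extract_tasks(note: str) -> list[dict]:
    """Return one dict per task line found in a meeting note."""
    tasks = []
    for line in note.splitlines():
        match = TASK_RE.match(line.strip())
        if match:
            done, owner, task, timestamp = match.groups()
            tasks.append(
                {"owner": owner, "task": task, "timestamp": timestamp, "done": done == "x"}
            )
    return tasks


note = """### Tasks
- [ ] @John - Review proposal (mentioned at 00:15:23)
- [ ] @Sarah - Send report (mentioned at 00:28:45)
"""

for item in extract_tasks(note):
    print(item["owner"], item["timestamp"], item["task"])
```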

Lecture notes workflow

  1. Record lecture on phone
  2. Add the file to your Obsidian vault
  3. Transcribe with timestamps
  4. Ask AI to:
    • Summarize key concepts
    • Create study guide
    • Generate quiz questions
    • Explain complex topics

Interview workflow

  1. Record interview (phone/recorder)
  2. Get speaker-separated transcript
  3. Use AI to:
    • Extract key quotes
    • Identify themes
    • Create article outline
    • Generate summary

Voice journaling workflow

Morning routine:
1. Record thoughts (2-5 minutes)
2. Auto-transcribe
3. AI organizes into:
   - Gratitude items
   - Today's priorities
   - Ideas to explore
   - Mood tracking

Advanced features

Transcription providers

SystemSculpt API (default)

  • Included with premium
  • Optimized models
  • No extra setup

Custom providers

Yaml
Settings → Transcription:
- Provider: OpenAI/Groq/Custom
- API Key: [Your key]
- Model: [Speech-to-text model offered by your provider]

Quality optimization

Best recording practices:

  • Quiet environment
  • Speak clearly
  • 6-12 inches from mic
  • Minimize background noise

File preparation:

  • Trim silence
  • Normalize volume
  • Use lossless formats when possible
  • Split very long files

Batch processing

Multiple files at once:

  1. Select all audio files
  2. Right-click → "Transcribe with SystemSculpt"
  3. Each file produces its own transcript

Useful for interview series and lecture courses.
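
Before a batch run it can help to check which selected files have one of the primary extensions listed under "Supported formats". This is a minimal sketch with a hypothetical function name. It treats only those five extensions as supported, which is an assumption: the plugin may accept other formats as well.

```python
from pathlib import PurePath

# Extensions from the "Primary formats" table; other formats may also work.
PRIMARY_EXTENSIONS = {".m4a", ".mp3", ".wav", ".webm", ".ogg"}


def partition_selection(names: list[str]) -> tuple[list[str], list[str]]:
    """Split file names into (primary-format audio, everything else)."""
    supported, skipped = [], []
    for name in names:
        is_primary = PurePath(name).suffix.lower() in PRIMARY_EXTENSIONS
        (supported if is_primary else skipped).append(name)
    return supported, skipped


audio, other = partition_selection(["intro.m4a", "Lecture-02.MP3", "notes.md", "raw.flac"])
print("Transcribe:", audio)
print("Check first:", other)
```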

Tips & tricks

Speed up workflow

Shortcuts and time-savers:

  • Cmd/Ctrl + R: Start/stop recording
  • Drag & drop for instant processing
  • Create templates for common formats

Smart organization:

Audio/
├── Recordings/
│   └── 2024-01-15-meeting.m4a
├── Transcripts/
│   └── 2024-01-15-meeting.md
└── Summaries/
    └── 2024-01-15-action-items.md
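
The layout above follows a naming pattern that is easy to compute from a recording's file name. This is a minimal sketch with a hypothetical function name. It assumes recordings start with a YYYY-MM-DD date, transcripts reuse the recording's name with an .md extension, and summaries are named after the date plus "action-items", as in the example tree.

```python
import re
from pathlib import PurePosixPath

# Leading date such as "2024-01-15-" in a recording name.
DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})-")


def organized_paths(recording_name: str, root: str = "Audio") -> dict[str, str]:
    """Return vault-relative paths for a recording, its transcript, and its summary."""
    name = PurePosixPath(recording_name).name
    stem = PurePosixPath(name).stem
    match = DATE_PREFIX.match(stem)
    date = match.group(1) if match else stem  # fall back to the full stem if undated
    base = PurePosixPath(root)
    return {
        "recording": str(base / "Recordings" / name),
        "transcript": str(base / "Transcripts" / f"{stem}.md"),
        "summary": str(base / "Summaries" / f"{date}-action-items.md"),
    }


for kind, path in organized_paths("2024-01-15-meeting.m4a").items():
    print(kind, path)
```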

Accuracy tips

DO:

  • Speak one at a time
  • Use a good microphone
  • Record in quiet space
  • Process soon after recording

DON'T:

  • Record in noisy environments
  • Have multiple people talk over each other
  • Use heavily compressed files
  • Expect 100% accuracy (always review)

Integration ideas

With templates:

Markdown
## Voice Note - {{date:HH:mm}}
![[voice-{{date}}.m4a]]

### Transcript
<!-- Transcription appears here -->

### Key Points
- 

### Next Actions
- [ ] 

With AI chat:

  1. Transcribe meeting
  2. Drop into chat
  3. "Extract all action items with owners"
  4. "Create project timeline from discussion"
  5. "Identify risks mentioned"

Common issues

Problem                  Solution
"Transcription failed"   Check internet, file format, size limits
"Poor accuracy"          Improve audio quality, reduce noise
"Wrong language"         Manually specify in settings
"Takes too long"         Large files need time, try smaller segments

Processing details

How it works

  1. Audio uploads securely to SystemSculpt
  2. AI models process speech-to-text
  3. Text returns formatted
  4. Original deleted from servers
  5. You get transcript in Obsidian

Privacy & security

  • Encrypted transmission
  • No permanent storage
  • Process completes in minutes
  • Your data stays yours

Performance

  • 5-min recording: ~30 seconds
  • 30-min recording: ~2 minutes
  • 1-hour recording: ~3-5 minutes
  • Varies with server load

Next steps


🎙️ Pro tip: Start with short recordings to test your setup, then tackle longer content!