Open-source Hermes skill turns videos into reports

The author says they built an open-source Hermes skill called Video Report Nemotron. The skill takes a YouTube or Bilibili URL or a local video file and produces a Markdown, HTML, or PDF report. It uses existing subtitles when they are available and falls back to local automatic speech recognition (ASR) on Apple Silicon when they are missing. For visual reports, it can capture frames, run OCR, and add relevant screenshots to the report.
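The subtitles-first behavior described above can be sketched in a few lines of Python. This is only an illustration of the fallback order, not the skill's actual code: the function names `get_transcript`, `fetch_subtitles`, and `run_local_asr` are hypothetical, and the two helpers are passed in as plain callables so the sketch does not assume any particular downloader or ASR library.

```python
from typing import Callable, Optional


def get_transcript(
    video: str,
    fetch_subtitles: Callable[[str], Optional[str]],  # hypothetical subtitle lookup
    run_local_asr: Callable[[str], str],              # hypothetical local ASR step
) -> tuple[str, str]:
    """Return (transcript, source), preferring existing subtitles over ASR."""
    subtitles = fetch_subtitles(video)
    # Treat missing or whitespace-only subtitles as "unavailable".
    if subtitles and subtitles.strip():
        return subtitles, "subtitles"
    # Only run the slower speech-recognition step when nothing usable was found.
    return run_local_asr(video), "asr"


if __name__ == "__main__":
    # Stub helpers stand in for real subtitle download and ASR.
    text, source = get_transcript(
        "talk.mp4",
        fetch_subtitles=lambda v: None,
        run_local_asr=lambda v: "transcribed speech",
    )
    print(source, "->", text)
```

Checking subtitles first is the cheaper path, since speech recognition only has to run when no usable subtitle text exists.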

Key points

  • Video Report Nemotron is a Hermes skill for turning videos or video URLs into structured reports.
  • It tries to use existing subtitles before running ASR.
  • If subtitles are unavailable, it can use local Apple Silicon ASR.
  • It can produce Markdown, HTML, and PDF report files.
  • For visual reports, it uses frame capture and OCR to add relevant screenshots.
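As a rough illustration of the last two points, the sketch below builds a Markdown report and attaches a screenshot to a section only when the frame's OCR text overlaps with that section's text. The source does not say how the skill decides which screenshots are relevant, so the word-overlap rule here is an assumption, and `Frame`, `pick_frame`, and `render_markdown` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Frame:
    path: str      # image file for a captured frame
    ocr_text: str  # text that an OCR step extracted from that frame


def _words(text: str) -> set[str]:
    # Lowercase, strip basic punctuation, and ignore very short words.
    cleaned = (w.strip(".,:;!?()").lower() for w in text.split())
    return {w for w in cleaned if len(w) > 3}


def pick_frame(section_text: str, frames: list[Frame], min_overlap: int = 2) -> Optional[Frame]:
    """Return the frame sharing the most words with the section, or None."""
    section_words = _words(section_text)
    best, best_score = None, 0
    for frame in frames:
        score = len(section_words & _words(frame.ocr_text))
        if score > best_score:
            best, best_score = frame, score
    # Assumed relevance rule: require at least min_overlap shared words.
    return best if best_score >= min_overlap else None


def render_markdown(title: str, sections: list[tuple[str, str]], frames: list[Frame]) -> str:
    """Render (heading, body) sections as Markdown, adding matching screenshots."""
    lines = [f"# {title}", ""]
    for heading, body in sections:
        lines += [f"## {heading}", "", body, ""]
        frame = pick_frame(body, frames)
        if frame is not None:
            lines += [f"![{heading}]({frame.path})", ""]
    return "\n".join(lines)
```

A section with no sufficiently similar frame simply gets no image, which keeps unrelated screenshots out of the report.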

Quick term guide

open-source
Software whose code is shared publicly so others can inspect, use, or change it.
Hermes skill
A small add-on module that gives the Hermes AI agent a new capability or behavior.
Markdown
A simple text format for headings, lists, links, and other basic document structure.
local app
An app that runs on your own computer instead of only on a website.
Apple Silicon
Apple's own line of chips (M1, M2, M3, M4, M5) used in Macs, known for performance and efficiency.
screenshot
A digital image that shows exactly what is visible on a computer screen.
hermes-agent
A likely name for Nous Research’s agent-style AI tool or service.
podcasts
Audio shows people can listen to online.