Free Magento 2 module exports your store catalog into AI-readable files for RAG

A new open-source module for Magento 2 stores automatically exports product catalogs and CMS pages into formats that AI tools can read. Connect these files to a RAG system and an AI can answer questions using your real store data. It's a practical tool for store owners already running Magento.

RAG (Retrieval-Augmented Generation) is a technique where an AI looks up documents you provide before writing an answer, which grounds the answer in your actual data and tends to make it more accurate. This module takes a Magento 2 store's product catalog and CMS pages (like About or Policy pages) and exports them as llms.txt, llms-full.txt, or streaming JSONL files. Different AI tools prefer different formats, and the streaming JSONL export handles large catalogs efficiently because records are processed one line at a time instead of being loaded into memory all at once.
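To illustrate why a line-delimited format keeps memory use low, here is a minimal Python sketch of a consumer reading a JSONL export one record at a time. The field names (sku, name) and the sample records are assumptions made for illustration; the module's actual output schema is not described here.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line.

    Only the current line is held in memory, so the size of the
    whole file does not matter.
    """
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Hypothetical sample data standing in for an exported catalog file.
sample_export = io.StringIO(
    '{"sku": "BOOT-1", "name": "Waterproof hiking boot"}\n'
    '{"sku": "TENT-2", "name": "Two-person tent"}\n'
)

for record in iter_jsonl(sample_export):
    print(record["sku"], record["name"])
```

With a real export, the same function could be given an open file handle, for example `open(path, encoding="utf-8")`, in place of the in-memory sample.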

Once exported, these files can feed an AI search engine or chatbot so it can answer questions like "recommend a waterproof hiking boot" using real inventory data. If you don't use Magento, the module is not directly applicable, but it is still a concrete example of how RAG pipelines are built for e-commerce data.
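As a rough sketch of the retrieval step, the Python example below picks the catalog records most relevant to a shopper's question and places them in a prompt for an AI model. It uses plain keyword overlap as a stand-in for the embedding search a real RAG system would typically use, and the records, field names, and function names are all hypothetical.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, records: List[Dict[str, str]], k: int = 2) -> List[Dict[str, str]]:
    """Return up to k records sharing the most words with the query.

    Keyword overlap is a simplification; it is not embedding search.
    """
    query_tokens = tokenize(query)
    scored = []
    for rec in records:
        overlap = len(query_tokens & tokenize(rec["name"] + " " + rec["description"]))
        if overlap:
            scored.append((overlap, rec))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [rec for _, rec in scored[:k]]


def build_prompt(query: str, context: List[Dict[str, str]]) -> str:
    """Combine retrieved records and the question into one prompt string."""
    lines = [f'- {r["sku"]}: {r["name"]}. {r["description"]}' for r in context]
    return "Store data:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


# Hypothetical catalog records, as they might look after loading an export.
catalog = [
    {"sku": "BOOT-1", "name": "Trail hiking boot", "description": "Waterproof leather boot for hiking"},
    {"sku": "SAND-2", "name": "Beach sandal", "description": "Lightweight sandal for summer"},
    {"sku": "JACK-3", "name": "Rain jacket", "description": "Waterproof shell jacket"},
]

question = "recommend a waterproof hiking boot"
print(build_prompt(question, retrieve(question, catalog)))
```

The prompt would then be sent to whichever AI model the store uses, so the answer draws on the retrieved records instead of the model's general knowledge.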

Key points

  • Automatically converts a Magento 2 store's product catalog and CMS pages into AI-readable files
  • Supports three export formats: llms.txt, llms-full.txt, and streaming JSONL
  • Connect the output to a RAG system so an AI can answer questions from your store data
  • Streaming JSONL handles large catalogs without high memory use
  • Open-source and free to install or modify

Quick term guide

open-source
Software whose code is shared publicly so others can inspect, use, or change it.
CMS pages
Non-product pages on a store site, such as About Us, Shipping Policy, or FAQ pages.
RAG (Retrieval-Augmented Generation)
A technique where an AI searches an external knowledge base for relevant information before generating its answer.
streaming JSONL
A file format that stores one JSON record per line, so data can be written and read line by line instead of all at once and large files use much less memory.
streaming
Here it means data is written and read one record at a time, rather than being loaded into memory all at once.
search engine
A tool that finds information relevant to a query. Google and Bing search the web; here it means an AI-powered search that finds products and pages in a store's own data.
RAG pipeline
The full process of splitting documents into chunks, converting them to embeddings, storing them, and searching them at query time.
e-commerce
Buying and selling products online.