DEV Community
•
2026-04-12 13:39
The Haystack converter that handles 91+ file formats without a cloud API
Haystack already has converters for PDF, DOCX, and HTML. If you're building a RAG pipeline, you've probably used at least two of them. But if you've ever tried to build an indexing pipeline that handles everything a user might throw at it (PDFs, scanned invoices, spreadsheets, PowerPoint decks, images, archives), you know the pain. You end up wiring together three or four different convert...