Summary
A Reddit discussion highlights how hard it remains to extract data accurately from complex PDF tables, particularly borderless or many-column tables in financial documents, even with Vision-Language Models (VLMs). The original poster tried several open-source tools and found none of them reliable; the only tool reported to work well was LandingAI, a paid service.
What happened
A user on r/MachineLearning started a discussion about the difficulty of extracting tables from PDFs and converting them to Markdown. The two main pain points are tables without borders and tables with more than five or six columns, especially in financial documents.
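For context, the target format itself is simple; the hard part is recovering the cells. The sketch below covers only the final step, assuming some upstream extractor has already produced rows of cell strings. The function name `rows_to_markdown` and the sample data are hypothetical illustrations, not taken from the discussion or from any of the tools it mentions.

```python
def rows_to_markdown(rows):
    """Render rows of cells (first row is the header) as a Markdown pipe table."""
    if not rows:
        return ""
    # Use the widest row as the column count so ragged rows still align.
    width = max(len(row) for row in rows)

    def fmt(row):
        # Escape literal pipes and pad short rows with empty cells.
        cells = [str(cell).replace("|", "\\|").strip() for cell in row]
        cells += [""] * (width - len(cells))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines += [fmt(row) for row in body]
    return "\n".join(lines)


# Hypothetical extractor output: the last row is missing a cell.
sample = [["Item", "Q1", "Q2"], ["Revenue", "10", "12"], ["Cost", "7"]]
print(rows_to_markdown(sample))
```

Padding ragged rows keeps the output valid Markdown, but it cannot tell which column a missing cell belonged to. That is precisely the information that tends to get lost when a table has no borders or many narrow columns.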
Key details
- The user tried several open-source tools, including `docling`, `granite-docling`, and `marker`, but none handled complex table structures reliably.
- The only tool the user found to perform well on these tables is LandingAI, which is a paid service.
- The discussion points to a perceived gap in the open-source ecosystem for advanced table extraction, despite recent progress in VLMs.
What to watch
The thread reflects an ongoing practical problem in document AI: demand for robust, open-source table extraction remains high, particularly for borderless and many-column layouts. Open-source VLMs or specialized table extraction models that handle these cases reliably would be valuable to the AI and data science communities.