python-camelot tag description
Camelot is a Python library that makes it easy for anyone to extract tabular data from PDF files.
Why Camelot?
- You are in control. Many other libraries and tools either produce good output or fail outright, with nothing in between; Camelot lets you tweak table extraction. (This matters because everything in the real world, including PDF table extraction, is fuzzy.)
- Bad tables can be discarded based on metrics like accuracy and whitespace, without having to inspect each table manually.
- Each table is a pandas DataFrame, which seamlessly integrates into ETL and data analysis workflows.
- Export to multiple formats, including JSON, Excel and HTML.
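The points above can be sketched roughly as follows. This is a minimal illustration, not a recommended pipeline: the helper names (`is_good_table`, `extract_good_tables`, `export_tables`), the threshold values, and the page selection are assumptions made for the example. It relies on `camelot.read_pdf`, each table's `parsing_report` dictionary (which includes `accuracy` and `whitespace` percentages), and the `df` attribute that holds the pandas DataFrame.

```python
def is_good_table(report, min_accuracy=90.0, max_whitespace=30.0):
    """Decide whether to keep a table, given its parsing report.

    `report` is a dict with 'accuracy' and 'whitespace' keys (percentages).
    The thresholds are illustrative assumptions, not Camelot defaults.
    """
    return (report["accuracy"] >= min_accuracy
            and report["whitespace"] <= max_whitespace)


def extract_good_tables(pdf_path, flavor="lattice"):
    """Return DataFrames for the tables on page 1 that pass the filter."""
    # Imported lazily so the filter above is usable without Camelot installed.
    import camelot

    # flavor="lattice" suits tables with ruled lines; "stream" suits
    # tables whose cells are separated by whitespace.
    tables = camelot.read_pdf(pdf_path, pages="1", flavor=flavor)
    return [t.df for t in tables if is_good_table(t.parsing_report)]


def export_tables(pdf_path, out_prefix):
    """Write each kept table to JSON and HTML via pandas (hypothetical helper)."""
    for i, df in enumerate(extract_good_tables(pdf_path)):
        df.to_json(f"{out_prefix}_{i}.json")
        df.to_html(f"{out_prefix}_{i}.html")
```

Calling `export_tables("report.pdf", "table")` (with `report.pdf` standing in for a real file) would write one JSON and one HTML file per table that clears both thresholds. If too many tables are dropped, loosening the thresholds or switching `flavor` is usually the first thing to try.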