@kba:
It would be very useful to have a transformation that extracts any tables from PAGE-XML to CSV.
Thoughts:
- each TableRegion needs its own CSV, so it's not immediately clear how this fits with the page→page converter paradigm
  (e.g. for page→text, one could simply paste the CSV in the middle of the plaintext, but maybe creating a multitude of output files is usually better)
- CSV may already be too coarse (no multi-span, no header distinction)
- perhaps better to transfer this to the ocr-fileformat subrepo?
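
To make the points above concrete, here is a minimal sketch of what such an extraction could look like. It is not an existing converter in this repository: the function name `tables_to_csv` and the helpers are hypothetical, and it assumes that each table cell is a `TextRegion` directly inside a `TableRegion`, carrying a `Roles/TableCellRole` element with `rowIndex` and `columnIndex` attributes and a region-level `TextEquiv/Unicode`. It returns one CSV string per `TableRegion`, leaving open whether those end up as separate files or get pasted into a text output.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace so the sketch works across PAGE versions."""
    return tag.rsplit("}", 1)[-1]


def children(elem, name):
    """Direct children of elem whose local tag name equals name."""
    return [child for child in elem if local(child.tag) == name]


def cell_position(region):
    """Return (row, col) from Roles/TableCellRole, or None if absent.

    Assumption: the attributes are named rowIndex and columnIndex.
    """
    for roles in children(region, "Roles"):
        for role in children(roles, "TableCellRole"):
            return int(role.get("rowIndex")), int(role.get("columnIndex"))
    return None


def cell_text(region):
    """Region-level text, taken from TextEquiv/Unicode if present."""
    for equiv in children(region, "TextEquiv"):
        for unicode_elem in children(equiv, "Unicode"):
            return unicode_elem.text or ""
    return ""


def tables_to_csv(page_xml):
    """Map each TableRegion id to a CSV string (hypothetical helper)."""
    root = ET.fromstring(page_xml)
    result = {}
    for table in root.iter():
        if local(table.tag) != "TableRegion":
            continue
        cells = {}
        for region in children(table, "TextRegion"):
            position = cell_position(region)
            if position is not None:
                cells[position] = cell_text(region)
        n_rows = max((row for row, _ in cells), default=-1) + 1
        n_cols = max((col for _, col in cells), default=-1) + 1
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        for row in range(n_rows):
            # Missing cells become empty fields; spans and header flags are dropped.
            writer.writerow([cells.get((row, col), "") for col in range(n_cols)])
        result[table.get("id")] = buffer.getvalue()
    return result
```

Note that the sketch ignores `rowSpan`, `colSpan` and `header` entirely, which is exactly the information loss mentioned above: a spanned cell shows up only at its top-left position, and header rows are indistinguishable from data rows.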