Started by guest, 16 May 2012 08:34 - 1 reply
Posted on 16 May 2012 - 08:34
Hi.
I would like to extract text from a PDF file. The PDF contains a table, and when I use pdftotext the text is read but out of order: it outputs all the cells of column 1, then all the cells of column 2, and so on. I cannot reorder it afterwards, because pdftotext does not even put a CR (carriage return) character after each row, so all I get is one very long string with no line breaks.
If I convert the file with Adobe Acrobat (File > Save As > TXT), it converts correctly.
I read on the French forum (via Google Translate) about "ABBYY PDF Transformer". It is a DLL that I can use in WinDev, but it costs $1600.
Thanks.
Posted on 24 October 2015 - 17:43
I think you need a good OCR component to read text from the PDF. You could also try the free trial packages of some third-party PDF converters: http://www.pqscan.com/convert-pdf/ I hope you succeed. Good luck.
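Before paying for a component, it may also be worth trying pdftotext's `-layout` option, which asks it to keep the physical layout of the page as best it can, so the cells of one table row tend to stay on one line. The following is only a minimal Python sketch under some assumptions: pdftotext is installed and on the PATH, the cells in the output are separated by at least two spaces, and the function names (`build_command`, `split_row`, `extract_rows`) are made up for illustration. It may not work for every table.

```python
import re
import subprocess


def build_command(pdf_path, txt_path):
    # -layout: keep the physical layout so table rows stay on one line.
    # -enc UTF-8: request UTF-8 output so the file can be read predictably.
    return ["pdftotext", "-layout", "-enc", "UTF-8", pdf_path, txt_path]


def split_row(line):
    # Assumption: cells are separated by runs of two or more whitespace
    # characters; a single space is treated as part of the cell text.
    return [cell for cell in re.split(r"\s{2,}", line.strip()) if cell]


def extract_rows(pdf_path, txt_path):
    # Run pdftotext, then read the text file back one row per line.
    subprocess.run(build_command(pdf_path, txt_path), check=True)
    with open(txt_path, encoding="utf-8", errors="replace") as handle:
        # Skip blank lines (including page-break lines with only a form feed).
        return [split_row(line) for line in handle if line.strip()]
```

For example, `extract_rows("table.pdf", "table.txt")` (both file names are placeholders) would return a list of rows, each row being a list of cell strings. If a cell is empty or two columns are very close together, the two-space rule will mis-split the row, and the column positions would have to be handled differently.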