Read text from pdf. - WINDEV 2024 - Developer forum

Read text from pdf.

Started by guest, 16 May 2012 08:34 - 1 reply

guest

Posted on 16 May 2012 - 08:34

Hi.

I would like to extract text from a PDF file.
The PDF contains a table, and when I use pdftotext, the text is extracted but out of order: it reads all the cells of column 1, then all the cells of column 2, and so on. I cannot reorder it, because pdftotext does not even put a CR (carriage return) character after each row. So after running pdftotext I only get one very long string without any CR.

If I convert the file using Adobe Acrobat (File > Save As > txt), it converts correctly.

I read in the French forum (with Google Translate) about "ABBYY PDF Transformer". It's a DLL that I can use in WINDEV, but it costs $1,600.

Thanks.

peterwalll

Posted on 24 October 2015 - 17:43

I think you need a good OCR component to read text from a PDF. You could try the free trial versions of some third-party PDF converters, for example: http://www.pqscan.com/convert-pdf/
I wish you success. Good luck.
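
For anyone who finds this thread with the same column-order problem, one thing that may be worth trying before buying a converter is the `-layout` option of pdftotext, which asks the tool to keep the physical layout of the page so that each table row stays on its own line. Below is a minimal sketch in Python, assuming pdftotext (Xpdf or Poppler) is installed and on the PATH. The helper names `build_command`, `split_rows` and `extract_table` are hypothetical, and splitting cells on runs of two or more spaces is only a heuristic that may not suit every table. It has not been tried on the original poster's file.

```python
import re
import subprocess


def build_command(pdf_path):
    """Build the pdftotext call (hypothetical helper).

    "-layout" keeps the physical page layout; "-" sends the text to stdout.
    """
    return ["pdftotext", "-layout", pdf_path, "-"]


def split_rows(layout_text):
    """Split layout-preserved text into rows of cells (hypothetical helper).

    Assumption: columns are separated by at least two spaces, and a single
    space only ever occurs inside a cell.
    """
    rows = []
    for line in layout_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines between rows or pages
        rows.append(re.split(r"\s{2,}", stripped))
    return rows


def extract_table(pdf_path):
    """Run pdftotext on a file and return its rows (requires pdftotext)."""
    result = subprocess.run(
        build_command(pdf_path),
        capture_output=True,
        text=True,
        check=True,  # raise if pdftotext exits with an error
    )
    return split_rows(result.stdout)
```

The same command line (`pdftotext -layout input.pdf output.txt`) can be launched from any language that can start an external program, so the Python wrapper is only there to illustrate the idea.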
