<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><category>pcsoft.us.windev</category><copyright>Copyright 2026, PC SOFT</copyright><lastBuildDate>24 Oct 2015 17:43:14 Z</lastBuildDate><pubDate>16 May 2012 08:34:35 Z</pubDate><description>Hi.&#13;
&#13;
I would like to extract text from a PDF file.&#13;
There is a table inside the PDF, and when I use pdftotext, the text is extracted but out of order: it reads all the cells of column 1, then all the cells of column 2, and so on. I cannot re-sort it, because pdftotext does not even put a CR (carriage return) character after each row, so the output is one very long string with no CRs.&#13;
&#13;
If I convert the file using Adobe Acrobat (File &gt; Save As &gt; txt), it converts correctly.&#13;
&#13;
&#13;
I read on the French forum (with Google Translate) about "ABBYY PDF Transformer". It's a DLL that I can use in WinDev, but it costs $1600.&#13;
&#13;
Thanks.</description><ttl>30</ttl><generator>WEBDEV</generator><language>en_US</language><link>https://forum.pcsoft.fr/es-ES/pcsoft.us.windev/36217-read-text-from-pdf/read.awp</link><title>Read text from pdf.</title><managingEditor>moderateur@pcsoft.fr (El moderador)</managingEditor><webMaster>webmaster@pcsoft.fr (El webmaster)</webMaster><item><author>peterwalll</author><category>pcsoft.us.windev</category><comments>https://forum.pcsoft.fr/es-ES/pcsoft.us.windev/36217-read-text-from-pdf-54628/read.awp</comments><pubDate>24 Oct 2015 17:43:14 Z</pubDate><description>I think you should have a fine OCR component to read text from pdf and check some free trial packages of some 3rd party pdf conv…</description><guid isPermaLink="true">https://forum.pcsoft.fr/es-ES/pcsoft.us.windev/36217-read-text-from-pdf-54628/read.awp</guid><link>https://forum.pcsoft.fr/es-ES/pcsoft.us.windev/36217-read-text-from-pdf-54628/read.awp</link><source url="https://forum.pcsoft.fr/es-ES/pcsoft.us.windev/36217-read-text-from-pdf/read.awp">Read text from pdf.</source><title>Re: Read text from pdf.</title></item></channel></rss>
