
PDF ExtractText using Physical Layout

Thread ID: 94100
Created: Apr 21, 2010 01:12 PM UTC
Updated: Oct 14, 2013 04:25 AM UTC
Platform: ASP.NET Web Forms (Classic)
Replies: 3
Tags: PDF
HY
Asked On April 21, 2010 01:12 PM UTC

Hi,

Does anyone know how to extract text from a PDF document according to its physical layout (WYSIWYG)?


I am working with a PDF document from a vendor in which the content is not created in a linear fashion.

For example, the content in the PDF file can be:
Name: Apple ***
Age : 21 ***
Sex : Male ***


When I use the ExtractText function, I get the following string:
Name: ***
Age : ***
Sex : ***
Apple
21
Male


What I want to get is:
Name: Apple ***
Age : 21 ***
Sex : Male ***

Any advice is appreciated.
Thanks.

Regards
HY

Angappan G [Syncfusion]
Replied On April 26, 2010 10:33 AM UTC

Hi,

Thank you for your interest in Essential Studio.

Essential PDF supports text extraction, where the glyphs are extracted in the order in which they are stored in the document structure. Please have a look at the sample in the link below, where the contents of the document are extracted in a linear fashion.
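
For readers wondering how stored order differs from physical layout, here is a minimal, library-independent sketch in Python. It is not Essential PDF code and does not use any Syncfusion API. It assumes you can already obtain each text fragment together with its page coordinates; the `Fragment` type, the `layout_lines` function, the `y_tolerance` parameter, and the coordinate values are all hypothetical and exist only for illustration. The idea is to group fragments whose baselines share roughly the same y value into one line, then read each line from left to right.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    """A piece of text and the position where it starts (hypothetical type)."""
    text: str
    x: float
    y: float


def layout_lines(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[str]:
    """Rebuild visual lines from positioned fragments."""
    # PDF user space usually has y growing upward, so sorting by
    # descending y visits the top of the page first.
    ordered = sorted(fragments, key=lambda f: -f.y)
    rows: List[List[Fragment]] = []
    for frag in ordered:
        # Treat the fragment as part of the current row when its y value
        # is within the tolerance of the row's first fragment.
        if rows and abs(rows[-1][0].y - frag.y) <= y_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    # Within each row, read from left to right.
    return [" ".join(f.text for f in sorted(row, key=lambda f: f.x)) for row in rows]


# Fragments in stored order, mimicking the output described in the question.
# The coordinates are made up for this example.
stored = [
    Fragment("Name:", 50, 700), Fragment("***", 200, 700),
    Fragment("Age :", 50, 680), Fragment("***", 200, 680),
    Fragment("Sex :", 50, 660), Fragment("***", 200, 660),
    Fragment("Apple", 100, 700), Fragment("21", 100, 680), Fragment("Male", 100, 660),
]

for line in layout_lines(stored):
    print(line)
```

With the made-up coordinates above, the fragments come back grouped as the three lines from the question. A real document would likely need more care, for example a tolerance scaled to the font size and column spacing derived from the x gaps.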

Sample Link:
http://files.syncfusion.com/samples/PDF.Windows/TextExtractionSample.zip

Please try this and let us know if you have any queries.

Regards,
Angappan.

Robert Titular
Replied On October 9, 2013 07:30 PM UTC

I have the same request. When I tried out the attached solution, there was no difference in the output text file. I'm looking to have the text in the physical layout of the original document.

I have v11.3.0.30 of the PDF library.

Is this possible? I just tried the trial version of Aspose's PDF .NET product and was able to extract the text to match the physical layout of the original document.


Praveenkumar H [Syncfusion]
Replied On October 14, 2013 04:25 AM UTC

Hi Robert,

Thank you for using Syncfusion products.

Please provide us with the sample input file.
It will help us investigate this further.

With Regards,
Praveen
