Winnovative Software Solutions HTML to PDF Converter



PDF To Text Converter for .NET

The Winnovative PDF to Text Converter can be used in any type of .NET application to extract the text from a PDF document. Integrating it into an existing .NET application is straightforward, and no installation is required to run the converter. The downloaded archive contains the .NET assembly and a ready-to-use sample console application. The full C# source code of the sample application is available in the Samples folder, and it can be built with Visual Studio 2005, 2008 or 2010. The result of a conversion is a .NET String object that you can, for example, use in search operations or save to a file on disk.

The PDF to Text Converter does not require Adobe Reader or other third party tools.

Features

.NET development library and C# samples
Extract text from PDF stream or a PDF file
Extract text preserving the original PDF layout
Extract text in PDF reading order
Does not require Adobe Reader or other third party tools
Specify the range of pages to be extracted
Save the extracted text in HTML format and add description meta tags
Add the title, keywords and author from the PDF description to the HTML meta tags
Mark the page breaks in the extracted text with a special character
Extract text from password protected PDF documents


Code Sample for PDF to Text Conversion

The code below is taken from the command line sample available for download in the PDF to Text Converter archive. The sample constructs an instance of the PdfToTextConverter class and sets the converter properties based on the command line arguments. It then calls the ConvertToText method to extract the text from the source PDF document and saves the resulting text to a file on disk, in text or HTML format, using UTF-8 encoding.
    // create the PDF to text converter
    PdfToTextConverter pdfToTextConverter = new PdfToTextConverter();

    // set the converter options from the values parsed from the command line
    pdfToTextConverter.Layout = layout;
    pdfToTextConverter.MarkPageBreaks = markPageBreaks;
    pdfToTextConverter.AddHtmlMetaTags = addHtmlMetaTags;
    pdfToTextConverter.UserPassword = userPassword;

    // build the output file path: same folder and base name as the source PDF
    string outFileName = System.IO.Path.Combine(System.IO.Path.GetDirectoryName(srcPdfFile),
            System.IO.Path.GetFileNameWithoutExtension(srcPdfFile));

    // use the .html extension when HTML meta tags are added, .txt otherwise
    if (addHtmlMetaTags)
        outFileName += ".html";
    else
        outFileName += ".txt";

    // extract the text from the PDF
    string extractedText = pdfToTextConverter.ConvertToText(srcPdfFile);

    // write the resulting string to the output file, next to the source PDF,
    // using UTF-8 encoding
    System.IO.File.WriteAllText(outFileName, extractedText, System.Text.Encoding.UTF8);
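
Because the conversion result is an ordinary .NET string, it can be searched with the standard string APIs. The following is a minimal sketch of such a search, not part of the Winnovative sample. It assumes that page breaks were marked during extraction and that the marker is a form feed character ('\f'); the page above only says "a special character", so pass the actual marker if it differs. The class and method names (ExtractedTextSearch, FindPagesContaining) are hypothetical, and the input string is a stand-in for the value returned by ConvertToText.

```csharp
using System;
using System.Collections.Generic;

public static class ExtractedTextSearch
{
    // Returns the 1-based numbers of the pages whose text contains the term.
    // Assumption: pages are separated by a form feed; the real marker used by
    // the converter may differ, so it can be overridden through pageBreak.
    public static List<int> FindPagesContaining(string extractedText, string term, char pageBreak = '\f')
    {
        var pages = new List<int>();
        string[] pageTexts = extractedText.Split(pageBreak);
        for (int i = 0; i < pageTexts.Length; i++)
        {
            // case-insensitive ordinal search within a single page
            if (pageTexts[i].IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0)
                pages.Add(i + 1);
        }
        return pages;
    }

    public static void Main()
    {
        // Stand-in for the string returned by ConvertToText with page breaks marked.
        string extractedText = "Invoice 1001\fTerms and conditions\fInvoice 1002";
        List<int> hits = FindPagesContaining(extractedText, "invoice");
        Console.WriteLine(string.Join(", ", hits)); // prints "1, 3"
    }
}
```

Splitting on the marker first keeps the page numbers available to the caller, which a plain IndexOf over the whole string would lose.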
 

Licensing

 
The LicenseKey property of the PdfToTextConverter class should be set to the license key string you received after purchasing the product. The default demo license key allows you to extract text only from the first two pages of a PDF document. If you need a fully featured evaluation for a limited period of time, you can use the demo license key obtained after downloading the product, or request a new one by email from the product support team.
