<?xml version="1.0" encoding="UTF-8"?><!-- generator="bbPress" -->

<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
>

<channel>
<title>doPDF Forum Topic: Search in PDF</title>
<link>http://www.dopdf.com/forum/</link>
<description>Discussion forum for doPDF users.</description>
<language>en</language>
<pubDate>Sat, 18 May 2013 13:02:46 +0000</pubDate>

<item>
<title>Softland on "Search in PDF"</title>
<link>http://www.dopdf.com/forum/topic/search-in-pdf#post-4677</link>
<pubDate>Mon, 24 Oct 2011 13:55:58 +0000</pubDate>
<dc:creator>Softland</dc:creator>
<guid isPermaLink="false">4677@http://www.dopdf.com/forum/</guid>
<description>&#60;p&#62;Hello,&#60;/p&#62;
&#60;p&#62;Unfortunately, the PDF format supports only Latin characters by default. Other characters are added to the PDF as embedded CID font subsets with Unicode CMaps. To extract all the characters correctly, you have to use a text search module that can read this type of text from PDF files.&#60;/p&#62;
&#60;p&#62;Thank you for understanding.
&#60;/p&#62;</description>
</item>
<item>
<title>aras on "Search in PDF"</title>
<link>http://www.dopdf.com/forum/topic/search-in-pdf#post-4676</link>
<pubDate>Mon, 24 Oct 2011 09:12:50 +0000</pubDate>
<dc:creator>aras</dc:creator>
<guid isPermaLink="false">4676@http://www.dopdf.com/forum/</guid>
<description>&#60;p&#62;Hi all,&#60;/p&#62;
&#60;p&#62;My client's requirement is to support PDF search (non-English) in the search module of his e-learning website. When I try to extract the contents of a PDF for indexing, some of the characters are dropped during extraction (there are empty spaces in those places when I view the indexed contents in Luke). I am getting this problem for languages like Tamil and Hindi.&#60;/p&#62;
&#60;p&#62;The client is adamant that he wants PDF search.&#60;/p&#62;
&#60;p&#62;What is the solution for this? Please give me a ray of light or some guidelines.&#60;/p&#62;
&#60;p&#62;Thanks and Regards,&#60;br /&#62;
aras
&#60;/p&#62;</description>
</item>

</channel>
</rss>
