PDF Processor

PROJECT: PDF Processing

OVERVIEW: I have a large number of large catalogs that are stored as PDF files. Some of the PDFs can be several hundred pages and more than 50MB.

I need these files processed so that we can make better use of them in our shopping system. Your code will need to be able to:

1. extract, process and then write out individual pages of the PDF

2. search for strings within the pages and report where they were found

3. insert hyperlinks around the found strings in the PDF file


INPUTS:

1. A source PDF file, {source}.pdf. It may be several hundred pages long and over 50MB in size.

2. A SearchString definition. This search string is a 'regular expression' used to locate part-numbers within the PDF. (see [url removed, login to view]).

3. A URLtemplate string. This template will be used to insert links into the pdf. The found partnumber will be inserted into the template string and then added back into the pdf document. For example:

URLtemplate = [url removed, login to view]{partnumber}
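To illustrate how the SearchString and URLtemplate inputs fit together, here is a minimal Python sketch. The posting does not show the real pattern or the real URL, so the regular expression and the shop.example.com address below are hypothetical placeholders, and the function name build_links is made up for this example. It works on text that has already been extracted from a page; the PDF handling itself is not shown.

```python
import re
from urllib.parse import quote

# Hypothetical values: the real pattern and URL are not given in the posting.
SEARCH_STRING = r"\b[A-Z]{2}-\d{4}\b"
URL_TEMPLATE = "https://shop.example.com/item?pn={partnumber}"


def build_links(page_text, search_string=SEARCH_STRING, url_template=URL_TEMPLATE):
    """Return (partnumber, url) pairs for every match in page_text, in order."""
    pattern = re.compile(search_string)
    links = []
    for match in pattern.finditer(page_text):
        part = match.group(0)
        # Percent-encode the part number so it is safe inside a URL.
        url = url_template.replace("{partnumber}", quote(part, safe=""))
        links.append((part, url))
    return links
```

Each returned pair gives the matched text and the link target that would be attached to it in the PDF.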


OUTPUTS:

1. A .csv spreadsheet which lists each partnumber found in the first column and a comma-separated list of the pages where it was found in the second column. This file is written to {sourcefilename}[url removed, login to view] in a subfolder called {sourcefilename}_pages

2. A subfolder {sourcefilename}_pages where each page of the source is written out as {source}_{page}.pdf. Before a page is written out, it is searched for SearchString, and each match is turned into a hyperlink built from URLtemplate.

3. The complete {source} document is also searched for SearchString, and each match is turned into a hyperlink built from URLtemplate. This processed source file is written as {source}.pdf in the {sourcefilename}_pages subfolder.
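The .csv output in item 1 could be built roughly as follows. This is only a sketch under assumptions: it takes the text of each page as a plain string (extraction from the PDF is left to whatever PDF library is chosen), it numbers pages from 1, and the function names are invented for this example.

```python
import csv
import io
import re
from collections import defaultdict


def build_page_index(page_texts, search_string):
    """Map each part number to the sorted list of 1-based pages where it appears."""
    pattern = re.compile(search_string)
    pages_by_part = defaultdict(set)
    for page_number, text in enumerate(page_texts, start=1):
        for match in pattern.finditer(text):
            pages_by_part[match.group(0)].add(page_number)
    # Sort by part number so the spreadsheet rows come out in a stable order.
    return {part: sorted(pages) for part, pages in sorted(pages_by_part.items())}


def write_index_csv(index, out_file):
    """Write one row per part number: the part number, then its page list."""
    writer = csv.writer(out_file)
    for part, pages in index.items():
        # The page list is a single field; csv quotes it when it contains commas.
        writer.writerow([part, ",".join(str(p) for p in pages)])
```

Because the page list is itself comma-separated, the csv module wraps it in quotes, so a spreadsheet program still reads it as the second column.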


REQUIREMENTS:

1. I prefer the code to be written in ASP/VB, but I will also consider other languages, provided that you also help with installing the language on my server.

2. The code should be written in 2 parts: one as a [url removed, login to view] library that can be included into other programs, and the second as a [url removed, login to view] which will ask the user for the inputs and then produce the outputs
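The two-part split might look something like the sketch below. It is written in Python rather than ASP/VB purely for illustration, and it uses a command-line parser as a stand-in for the front end that asks the user for the inputs. The function names are hypothetical, and the sketch assumes that {source} and {sourcefilename} both mean the file name without its .pdf extension and that pages are numbered from 1.

```python
import argparse
from pathlib import PurePosixPath


def plan_outputs(source_pdf, page_count):
    """Library part: compute the output paths described under the outputs."""
    source = PurePosixPath(source_pdf)
    # Subfolder named {sourcefilename}_pages next to the source file.
    folder = source.with_name(source.stem + "_pages")
    pages = [folder / f"{source.stem}_{n}.pdf" for n in range(1, page_count + 1)]
    return {"folder": folder, "processed": folder / source.name, "pages": pages}


def parse_args(argv=None):
    """Front-end part: collect the three inputs from the user."""
    parser = argparse.ArgumentParser(description="Process a PDF catalog")
    parser.add_argument("source", help="path to the source PDF")
    parser.add_argument("search_string", help="regular expression for part numbers")
    parser.add_argument("url_template", help="URL containing {partnumber}")
    return parser.parse_args(argv)
```

Keeping the path and search logic in the library part means another program can include it without going through the front end.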

3. I will provide you with a .pdf document. You will have the code hosted on an ASP server (I can provide this if you like). You will provide a url where the code can be demonstrated.

4. The project should be completed in no more than 10 days from the start of the project.


If you are interested in bidding on this project, please provide the following:

1. Brief description of your experience with ASP and working with PDF files

2. If you have a company website, please provide the URL

3. The day you will be able to start the project

4. When you expect the project to be finished

5. What your fee will be for the project

6. Your contact information

Regards, Andy



About the employer:
( 0 reviews ) Ridgefield, United States

Project ID: #113969