Develop a script to generate OCR with the ALTO standard for PDF and other outputs

The main scope of the project is to create a Python, Bash or Perl script that runs OCR over a master folder of TIFF files or a PDF file.

The workflow will be:

1. Read a master directory of uncompressed TIFF (.tiff) files

2. Start OCR processing

3. Generate Output files:

3.1. Output #1: Master PDF 1.4 (PDF/A-1)

3.2. Output #2: Access PDF 1.4 (PDF/A-1), never larger than 20 MB

3.3. Output #3: ALTO XML with all the OCR information

3.4. Output #4: TXT with all the OCR text results

3.5. Output #5: TXT with all the OCR text results

4. Create an XML file with information about the process

4.1. The XML has to record every step of the process and any failure that occurs during it.
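
A minimal Python sketch of steps 1 to 3 could look like the following. It assumes Tesseract 4.1 or later as the OCR engine (the tool links in this posting are hidden, so that choice is an assumption and would need to be validated), and the function names and the `eng` language default are hypothetical. Tesseract's own PDF output should not be assumed to be PDF/A-1 compliant, so a separate conversion and validation step would still be needed for outputs 3.1 and 3.2.

```python
import subprocess
from pathlib import Path

# 20 MB limit from point 3.2 (assumption: 1 MB = 1024 * 1024 bytes).
MAX_ACCESS_PDF_BYTES = 20 * 1024 * 1024


def find_tiffs(master_dir):
    """Step 1: return the TIFF files in master_dir, sorted by name."""
    folder = Path(master_dir)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )


def build_ocr_command(tiff_path, out_dir, lang="eng"):
    """Step 2: build a Tesseract call that writes PDF, ALTO XML and TXT.

    Assumes the `tesseract` binary is on PATH and ships the `alto` config.
    """
    out_base = Path(out_dir) / Path(tiff_path).stem
    return ["tesseract", str(tiff_path), str(out_base),
            "-l", lang, "pdf", "alto", "txt"]


def within_access_limit(pdf_path):
    """Point 3.2: check that an access PDF does not exceed the size limit."""
    return Path(pdf_path).stat().st_size <= MAX_ACCESS_PDF_BYTES


def run_ocr(master_dir, out_dir):
    """Run OCR page by page and collect (file name, exit code, stderr)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    results = []
    for tiff in find_tiffs(master_dir):
        proc = subprocess.run(build_ocr_command(tiff, out_dir),
                              capture_output=True, text=True)
        results.append((tiff.name, proc.returncode, proc.stderr.strip()))
    return results
```

Collecting the exit code and error output per page keeps a failure on one image from stopping the whole batch, and gives the process record in point 4 something to report.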

The project scope is to produce a script in Bash, Perl or Python, based on any open-source OCR tools, that performs those tasks. To complete the project, the script has to be tested, checked and documented in detail. The requirements above are fixed. During development, the developer has to give us a report of the fields and values we have to work with, and we will help with their definition. This can become a long-term project, with many more image processing tasks, if the results are good.

Some valid open-source tools that can be used are:
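
As an illustration of the process record in point 4, here is a hedged Python sketch that writes one XML element per step with an ok/failed status. The element and attribute names (`process`, `step`, `status`, `failures`) are placeholders invented for this example; the real fields and values would be agreed during development.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def build_process_log(steps):
    """Build the process record from (name, ok, message) tuples."""
    root = ET.Element("process")
    root.set("created", datetime.now(timezone.utc).isoformat())
    failures = 0
    for name, ok, message in steps:
        step = ET.SubElement(root, "step", name=name,
                             status="ok" if ok else "failed")
        if message:
            step.text = message  # e.g. the error output of a failed step
        if not ok:
            failures += 1
    root.set("failures", str(failures))
    return root


def write_process_log(root, path):
    """Write the record as indented UTF-8 XML (ET.indent needs Python 3.9+)."""
    ET.indent(root)
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
```

For example, `build_process_log([("read_master_dir", True, ""), ("ocr:page2.tif", False, "unreadable image")])` yields a record with two `step` elements and `failures="1"`.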

- [login to view URL]

- [login to view URL]

- [login to view URL]

Reference documentation links:

- [login to view URL]

- [login to view URL]

- [login to view URL]

We can provide image samples and give much more detail about the script to be developed.

Other options can be used, but they need to be validated by us before development starts.

The project will be divided into two milestones, defined as follows:

Milestone 1 is to complete:

- Point 1

- Point 2

- Point 3.1

- Point 3.3

Milestone 2 is to complete the project:

- Improve performance and correct any errors from Milestone 1

- Point 3.2

- Point 3.4

- Point 3.5

- Point 4

- Documentation

- Testing

Feel free to ask any questions.

Skills: Python, XML, Shell Script, OCR, PDF

About the Client:
(26 reviews) Marín, Spain

Project ID: #32722135

14 freelancers are bidding an average of $564 per hour for this job

(39 reviews)

Hi Hiring manager I am Full Stack Principal Architect,Expert Image processing,OCR Most popular spells that I use: C#, Python, .NET, WPF, WCF, VSTO, SQL, OCR, PHP, Java, React.js, Node.js, Laravel. I have more than 15

$750 USD in 4 days
(35 reviews)

Note: I am available to start right now! Dear Hiring Manager, I have read out your job description and the requirements for the job.I am a professional Artificial intelligence and machine learning programmer . I

$250 USD in 1 day
(30 reviews)

Hello, I read your project details and really interested in your mentioned job. I have 5+ years’ experience doing similar jobs related to these skills Python, Shell Script, PDF, XML and OCR. I think its doable job, and

$750 USD in 8 days
(8 reviews)

Hi there, How r u? I have had a look and i am sure that i can handle this project well as i have experience in XML, PDF, OCR, Shell Script and Python. I have worked on similar projects before too. Please initiate the c

$750 USD in 24 days
(1 review)
(2 reviews)

How are you? We have AI & Data Science,Django team who are highly experienced in Machine learning and Deep Learning and can deliver products as per your requirements. We have done many real time projects like Semantic

$700 USD in 7 days
(5 reviews)

Dear, sir. How are you? ~~~ Computer Vision Professional is here. ~~~ I've a good interest about your project as a computer vision professional who has been specializing in this field for over 8 years. I recently devel

$700 USD in 7 days
(8 reviews)

----------------Professional OCR Expert! Best Result in Time!----------- Dear sir. I've read your project description very carefully. I've extensive experience in OCR, so I believe that I can provide excellent result i

$500 USD in 7 days
(2 reviews)

Hi! I am an expert Python engineer. I am familiar with Python and I have a lot of work experiences in OCR, XML, Shell Script, Python and PDF. I can start right away. I want to discuss for this project in detail. Plea

$500 USD in 5 days
(2 reviews)

Hi, I'm Aafreen Khan! Hope you’re doing well. I'll complete your project in the way you'll fall in love with because I've been working as a Full Stack Web & Software developer for 5 years. I provide end to end solutio

$500 USD in 7 days
(0 reviews)

Hello to Spain, your project sounds very interesting for me and i could already realize numerous comparable projects. I have a lot of experience with your required tasks like: - OCR (OpticalDocumentRecognition

$251 USD in 3 days
(0 reviews)

Greetings! I have reviewed your project details and as you need. I believe that I can assist you with this project. I have been working as a website developer for the past 8+ years and I have impressed a lot of my cl

$500 USD in 7 days
(0 reviews)

Hello, Jorge! I have read your project requirements very carefully and with great interest. I am a python expert with 5 years of experience. Recently, I have performed tasks to interpret pdf documents with python. http

$500 USD in 5 days
(0 reviews)