1) The user opens a web page with a file upload form
2) The user uploads a scanned document
3) A backend script extracts the data from the document
4) A script saves the extracted data to a MySQL database
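As a rough illustration of step 4, here is a minimal sketch of how extracted invoice data could be stored. It uses Python's built-in sqlite3 module as a stand-in so the example runs without a database server; the real project targets MySQL. The table names, column names, and the save_invoice helper are all assumptions made for illustration and are not part of the job description.

```python
import sqlite3

# Hypothetical schema. Amounts are stored as text here to avoid float rounding;
# in MySQL a DECIMAL column would be the usual choice.
SCHEMA = """
CREATE TABLE invoices (
    id INTEGER PRIMARY KEY,
    invoice_number TEXT NOT NULL,
    supplier TEXT,
    sum_without_vat TEXT,
    sum_with_vat TEXT
);
CREATE TABLE invoice_items (
    id INTEGER PRIMARY KEY,
    invoice_id INTEGER NOT NULL REFERENCES invoices(id),
    name TEXT,
    unit TEXT,
    quantity TEXT,
    price TEXT,
    total TEXT
);
"""

# Values taken from example document #2 in this posting.
EXAMPLE_INVOICE = {
    "invoice_number": "ABT 2930",
    "supplier": "AB Transystems SIA",
    "sum_without_vat": "6.66",
    "sum_with_vat": "8.06",
}
EXAMPLE_ITEMS = [
    {"name": "JTH 48B M10x30 regulējošās kājiņas", "unit": "gabals",
     "quantity": "4", "price": "0.590", "total": "2.39"},
]


def save_invoice(conn, invoice, items):
    """Insert one invoice and its product rows; return the new invoice id."""
    cur = conn.execute(
        "INSERT INTO invoices (invoice_number, supplier, sum_without_vat, sum_with_vat) "
        "VALUES (?, ?, ?, ?)",
        (invoice["invoice_number"], invoice["supplier"],
         invoice["sum_without_vat"], invoice["sum_with_vat"]),
    )
    invoice_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO invoice_items (invoice_id, name, unit, quantity, price, total) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [(invoice_id, i["name"], i["unit"], i["quantity"], i["price"], i["total"])
         for i in items],
    )
    conn.commit()
    return invoice_id


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
    conn.executescript(SCHEMA)
    print(save_invoice(conn, EXAMPLE_INVOICE, EXAMPLE_ITEMS))
```

With a MySQL driver the overall shape would likely stay the same, although the placeholder style is typically %s rather than ? and the column types would need adjusting.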
Please see the link below for scanned document examples:
[login to view URL]
Data from example document #2 that should be extracted:
Invoice Number - Pavadzīme Nr. ABT 2930
Supplier - Piegādātājs AB Transystems SIA
Supplier address - [login to view URL] Maskavas iela 227, Rīga, LV-1019
Supplier Registration Number - Reģ. Nr. 40003741261
VAT number - PVN Nr - LV40003741261
Bank account number - Konts - LV91HABA0551009805564
Sum without VAT - Summa bez PVN: 6.66
Sum with VAT - Summa ar PVN (EUR) 8.06
Product fields (product name, unit, quantity, price, total) - Preču nosaukums, Mērv, Daudz, Cena, Summa
(JTH 48B M10x30 regulējošās kājiņas gabals 4 0,590 2,39)
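To make the list above more concrete, here is a minimal sketch of how these fields might be pulled out of the OCR text with regular expressions. It assumes the OCR step has already produced plain text; SAMPLE_TEXT is a hypothetical rendering of example document #2, and every pattern, field name, and helper function is an assumption for illustration. Real OCR output will vary in spacing and line order, and other templates would likely need their own patterns.

```python
import re
from decimal import Decimal

# Hypothetical OCR output for example document #2 (labels as listed above).
SAMPLE_TEXT = """\
Pavadzīme Nr. ABT 2930
Piegādātājs AB Transystems SIA
Reģ. Nr. 40003741261
PVN Nr LV40003741261
Konts LV91HABA0551009805564
Summa bez PVN: 6.66
Summa ar PVN (EUR) 8.06
"""

# Field name -> pattern with exactly one capture group (all patterns assumed).
PATTERNS = {
    "invoice_number": r"Pavadzīme Nr\.\s*(.+)",
    "supplier": r"Piegādātājs\s+(.+)",
    "registration_number": r"Reģ\. Nr\.\s*(\d+)",
    "vat_number": r"PVN Nr\.?\s*-?\s*(LV\d+)",
    "bank_account": r"Konts\s*-?\s*([A-Z]{2}\d{2}[A-Z0-9]+)",
    "sum_without_vat": r"Summa bez PVN:?\s*([\d.,]+)",
    "sum_with_vat": r"Summa ar PVN(?:\s*\(EUR\))?:?\s*([\d.,]+)",
}

# Product row: free-text name, unit, quantity, price, total (read from the right).
PRODUCT_LINE = re.compile(
    r"^(?P<name>.+?)\s+(?P<unit>\S+)\s+(?P<quantity>\d+(?:[.,]\d+)?)"
    r"\s+(?P<price>\d+[.,]\d+)\s+(?P<total>\d+[.,]\d+)$"
)


def parse_amount(raw):
    """Convert '2,39' or '2.39' to a Decimal."""
    return Decimal(raw.replace(",", "."))


def extract_fields(text):
    """Return a dict of header fields; a field is None when its label is not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            fields[name] = None
            continue
        value = match.group(1).strip()
        fields[name] = parse_amount(value) if name.startswith("sum_") else value
    return fields


def parse_product_line(line):
    """Split one product row into its parts, or return None if it does not fit."""
    match = PRODUCT_LINE.match(line.strip())
    if match is None:
        return None
    row = match.groupdict()
    for key in ("quantity", "price", "total"):
        row[key] = parse_amount(row[key])
    return row


if __name__ == "__main__":
    print(extract_fields(SAMPLE_TEXT))
    print(parse_product_line("JTH 48B M10x30 regulējošās kājiņas gabals 4 0,590 2,39"))
```

Keeping the patterns in a dictionary is one way to support several templates: each template could carry its own set of patterns while the extraction loop stays the same.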
There will be multiple scanned document templates with different designs/looks.
If you are interested in this job, please provide us with the following information:
a) Which programming language/libraries will you be using?
b) For multiple templates, will the same data extraction logic/pattern be applied, or will it need to be customized for each template?
c) What would be the minimum requirements for the scanned document in terms of quality and dimensions (px) for the script to work?
d) When can you start work on this project?
For this project, it would be best to use an existing solution. I would suggest using Apache Tika ([login to view URL]).
12 freelancers are bidding an average of €519 for this job
Hello, I have read the details provided and I am confident I can provide quality work. Please contact me to discuss the project deadline and a few other details.