Extract some info from PDF files, capture this info and load it into DB
$2-8 USD / hour
Closed
Posted over 5 years ago
Currently, I need to manually load information from approx. 1000 PDF files/day (invoices, purchase orders, etc.). Those files follow approximately 60 different templates and reside on a Windows Server in a folder tree like: "d:\Folder\Folder\ Folder\001\pdffile00[1...n].pdf"
I need to do:
1. Connect to the Windows server and pull the files to a Linux server, preserving the directory structure
2. Once the files are on the Linux server, extract the info I need (between 20 and 60 fields per file) and load it into a table in MariaDB.
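A minimal sketch of step 1 in Java, assuming the Windows folder is already reachable from the Linux server as a mounted share (for example over SMB/CIFS). The class name `PdfTreeMirror` and the method `mirror` are hypothetical and only illustrate how the relative path of each file can be kept intact.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: mirrors all PDF files from one directory tree into another.
final class PdfTreeMirror {

    /** Copies every .pdf under sourceRoot to the same relative path under targetRoot. */
    static int mirror(Path sourceRoot, Path targetRoot) throws IOException {
        List<Path> pdfs;
        // Collect the PDF files first so the directory stream is closed before copying.
        try (Stream<Path> walk = Files.walk(sourceRoot)) {
            pdfs = walk.filter(Files::isRegularFile)
                       .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".pdf"))
                       .collect(Collectors.toList());
        }
        for (Path src : pdfs) {
            // relativize() yields e.g. Folder/001/pdffile001.pdf, which is re-rooted at targetRoot.
            Path dest = targetRoot.resolve(sourceRoot.relativize(src));
            Files.createDirectories(dest.getParent());
            Files.copy(src, dest, StandardCopyOption.REPLACE_EXISTING);
        }
        return pdfs.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: PdfTreeMirror <mounted-source-dir> <target-dir>");
            return;
        }
        int copied = mirror(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied " + copied + " PDF file(s)");
    }
}
```

Files that are not PDFs are skipped, and existing copies are overwritten, so the sketch could be re-run daily; whether overwriting is desirable depends on how the source folders are managed.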
Other considerations:
1. Pulling the files could be done with a shell script
2. To read the files, a Java application is preferred because we will later need to integrate it into OpenKM as an extension
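A rough sketch of how the Java reading step might map one template to table columns, assuming the text of each PDF has already been extracted by a PDF library. The class `TemplateFieldExtractor`, the field names, the regular expressions and the `invoices` table are all hypothetical; a real template would define its own 20 to 60 fields.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical per-template extractor: one regex with one capture group per column.
final class TemplateFieldExtractor {
    private final Map<String, Pattern> fieldPatterns = new LinkedHashMap<>();

    /** Registers a column and the regex whose first capture group holds its value. */
    TemplateFieldExtractor field(String column, String regex) {
        fieldPatterns.put(column, Pattern.compile(regex));
        return this;
    }

    /** Returns column -> value in registration order; a missing field maps to null. */
    Map<String, String> extract(String text) {
        Map<String, String> row = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> e : fieldPatterns.entrySet()) {
            Matcher m = e.getValue().matcher(text);
            row.put(e.getKey(), m.find() ? m.group(1).trim() : null);
        }
        return row;
    }

    /**
     * Builds a parameterized INSERT for use with a JDBC PreparedStatement.
     * Table and column names are concatenated, so they must come from trusted
     * template configuration, never from the PDF content.
     */
    static String insertSql(String table, Map<String, String> row) {
        String columns = String.join(", ", row.keySet());
        String marks = String.join(", ", Collections.nCopies(row.size(), "?"));
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ")";
    }

    public static void main(String[] args) {
        // Sample text stands in for the output of a PDF text extractor.
        String text = "Invoice No: INV-1001\nTotal: 1,250.00 USD";
        TemplateFieldExtractor invoiceTemplate = new TemplateFieldExtractor()
                .field("invoice_no", "Invoice No:\\s*(\\S+)")
                .field("total", "Total:\\s*([0-9.,]+)");
        Map<String, String> row = invoiceTemplate.extract(text);
        System.out.println(row);
        System.out.println(insertSql("invoices", row));
    }
}
```

With about 60 templates, one such extractor per template could be kept in configuration, and the extracted values would be bound to the `?` placeholders in column order before the statement is executed against MariaDB.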
Please send your offer in man-hours and $/hour.
Hi Friend,
I have extensive experience in Java development, and I have worked on pulling files from an FTP server, parsing them, and loading the data into a database. I reviewed your requirements and they look good to me. I will do this automation job for you.
Apart from that, I have also worked on OpenKM customization and will do that integration when required. You can see my experience on the Java/J2EE platform in my profile. Please feel free to contact me.
$7/hour and $40 per man-day
Thanks,
Hello,
I have done similar work recently: I read PDF files, extracted exam data, and calculated results for students, then loaded all the data into a database.
So I am confident about your work, but as you said there are around 60 templates, so it will take time.
Looking forward to hearing from you.
Thanks,
mahavir
Hi, sir!
I had a close look at your project.
I am an experienced programmer and I'm sure I can complete your project ASAP.
If you award this project to me, I'll complete it on time.
I promise high-quality and punctual work.
Please contact me.
Best regards.
The following are the services I will provide if we work together.
1. Daily or weekly results.
2. Clean and robust code after completion.
3. Responsive communication for more than 10 hours each day.
I have 8 years of experience in Java J2EE Spring web application development, including Eclipse RCP plugin development. I have experience in PDF extraction using the Java iText and PDFBox APIs. We can do this as a Spring Boot project. The only part I need to analyze further is connecting to the Windows machine from Linux and the copying script. If interested, we can discuss further.
Expert in shell scripting, Python and Big Data
I am already working on PDF editing in a Java Spring project using the PDFBox library. For the last 6 months I have been adding and modifying PDF annotations and extracting data from PDFs.
Let me know and I can show you a demo of simple PDF data extraction. If you have a PDF file, send it to me and I will show you the data extracted from it.
Thanks
Hi, I have been a Java developer for almost 5 years and I know shell scripting as well. Extracting records from PDFs is easy if the PDFs aren't locked.
So tell me if you want this done.
Have a nice time