Hi, my name is Paul, and I'm a Python developer working on cloud application backends and solution architecture. As it happens, I have already written a script that converts PDF files to Excel sheets and extracts column-oriented BI data. That script does not use OCR, but I also have some experience with PyTesseract, which could come in handy here. I'm also a machine learning enthusiast and have been experimenting with it for fun lately, and a friend who holds a Master's degree in the field has given me some tips and advice. I hope we can work something out. Cheers.