gradio==4.44.1
pytesseract
python-docx
camelot-py[cv]  # for digital-table parsing
pdf2image  # for fallback OCR on images
Pillow
rapidfuzz
pdfplumber
openai