Hyper-Intelligent Document Processing AI: Example Program
This script uses Google's Document AI and the OpenAI API to go beyond simple extraction. It sends an invoice to a Document AI processor, builds a small record of key fields (hard-coded in this simplified version), and then asks GPT-4 to flag any unusual payment terms.
Program begins
# Requires: pip install google-cloud-documentai openai python-dotenv
import os

from google.cloud import documentai
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()  # Load API keys from .env file

# Configure clients
docai_client = documentai.DocumentProcessorServiceClient()
openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# 1. Process Document with DocAI
PROJECT_ID = "your-google-cloud-project-id"
LOCATION = "us"
PROCESSOR_ID = "your-invoice-processor-id"
file_path = "invoice.pdf"

with open(file_path, "rb") as image:
    image_content = image.read()

document = {"content": image_content, "mime_type": "application/pdf"}
name = f"projects/{PROJECT_ID}/locations/{LOCATION}/processors/{PROCESSOR_ID}"
request = {"name": name, "raw_document": document}
result = docai_client.process_document(request=request)
# to_json is exposed on the message class, not on the instance
document_json = documentai.Document.to_json(result.document)

# 2. Extract structured data from DocAI response (simplified)
# In a real scenario, you would parse the full entity list from document_json
invoice_data = {
    "supplier_name": "ABC Supplies Ltd.",
    "invoice_total": "$12,345.67",
    "due_date": "2023-12-01",
    "payment_terms": "Net 90"  # Let's say this is unusual
}

# 3. Use Generative AI for Analysis & Anomaly Detection
prompt = f"""Analyze the following invoice data and point out any unusual or non-standard terms for a typical B2B transaction.
Focus on payment terms, large amounts, or missing information.
Invoice Data:
{invoice_data}
Provide a concise summary for an accounts payable specialist."""

response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful financial analyst assistant."},
        {"role": "user", "content": prompt}
    ],
    max_tokens=200
)
analysis = response.choices[0].message.content

print("*** Generative AI Analysis ***")
print(analysis)

# Output might be: "Warning: The payment terms 'Net 90' are unusually long for this industry.
# Standard terms are typically Net 30. The invoice total of $12,345.67 is high and may require
# managerial approval according to company policy."
end
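Step 2 of the program hard-codes invoice_data and notes that a real script would parse the entity list from document_json. The sketch below shows one way that parsing might look. It is an illustration under assumptions, not part of the original script: the helper name extract_invoice_fields, its wanted and min_confidence parameters, and the sample JSON are all hypothetical. It assumes the JSON string has a top-level "entities" list whose items carry "type", "mentionText" and "confidence" keys, and it also accepts "mention_text" in case field names were preserved. The entity type names your processor returns may differ.

```python
import json


def extract_invoice_fields(document_json, wanted, min_confidence=0.5):
    """Return {entity type: mention text}, keeping the most confident match per type.

    Hypothetical helper: assumes document_json is a JSON string with a
    top-level "entities" list, as described in the lead-in.
    """
    doc = json.loads(document_json)
    best = {}  # entity type -> (text, confidence)
    for entity in doc.get("entities", []):
        etype = entity.get("type")
        if etype not in wanted:
            continue
        # Accept either camelCase or snake_case key (assumption about serialization).
        text = entity.get("mentionText", entity.get("mention_text", ""))
        confidence = entity.get("confidence", 0.0)
        if confidence < min_confidence:
            continue  # skip low-confidence extractions
        if etype not in best or confidence > best[etype][1]:
            best[etype] = (text, confidence)
    return {etype: text for etype, (text, _) in best.items()}


# Made-up sample standing in for a real document_json string.
sample_json = json.dumps({
    "entities": [
        {"type": "supplier_name", "mentionText": "ABC Supplies Ltd.", "confidence": 0.97},
        {"type": "payment_terms", "mentionText": "Net 30", "confidence": 0.40},
        {"type": "payment_terms", "mentionText": "Net 90", "confidence": 0.91},
        {"type": "total_amount", "mentionText": "12,345.67", "confidence": 0.88},
        {"type": "line_item", "mentionText": "Widgets x 10", "confidence": 0.95},
    ]
})

wanted_types = {"supplier_name", "total_amount", "due_date", "payment_terms"}
fields = extract_invoice_fields(sample_json, wanted_types)
print(fields)
```

In the program above, the returned dictionary could take the place of the hard-coded invoice_data. A field missing from the result (due_date in this sample) is worth passing along explicitly, since the prompt asks the model to look for missing information.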