How to Process PDF Files

How to optimize the value of your content when uploading PDF files into your Knowledge Graph Workflow.

How can PDFs add value to your Knowledge Graph Workflow?

You can upload PDF files to your Knowledge Bank in the Knowledge Graph Workflow. The information in these files can be utilized to help answer end-user questions through the Agent.

Agents can now ingest and index images to analyze visual information alongside text.

Before you upload your PDF files, it's important to follow the guidelines below. These guidelines will ensure that the Mindset platform can read and comprehend your document effectively, allowing you to derive maximum value from your content.

PDF Guidelines for Optimal Processing:

Password-Protected Documents: Unfortunately, we are unable to process any document that has password protection enabled. Please remove any password protection before uploading your PDF documents.

AI-Based Image Ingestion: We use AI processing to correctly understand PDF content. While this significantly enhances our ability to process and extract information from various document formats, there are potential issues to be aware of:

  • Handwritten Notes: PDFs containing handwritten notes or annotations may not be accurately interpreted by the AI, leading to potential gaps in the extracted information.

  • Low-Resolution Images: Images with low resolution or poor quality might result in inaccurate text extraction. Ensure your PDF images are of high quality for the best results.

Always review the transcription of your content to ensure that data has been accurately captured. If you see issues, consider updating the source content to improve the AI detection of important elements.

Following these guidelines helps the Mindset platform read and comprehend your PDF documents accurately.
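If you are preparing many files, the password-protection guideline above can be pre-checked on your own machine before uploading. The sketch below is a rough heuristic and not part of the Mindset platform: it assumes an encrypted PDF references an /Encrypt entry in its trailer and simply looks for that marker in the raw bytes. The function names and the folder path are hypothetical, and the check may also flag files that only carry permission restrictions.

```python
from pathlib import Path


def looks_password_protected(pdf_bytes: bytes) -> bool:
    """Heuristic check: encrypted PDFs normally declare an /Encrypt entry.

    This is an assumption-based shortcut, not a full PDF parser, so treat
    a True result as "inspect this file", not as a definitive verdict.
    """
    return b"/Encrypt" in pdf_bytes


def find_protected_pdfs(folder: str) -> list:
    """Return the names of PDFs in `folder` that appear to be encrypted."""
    return sorted(
        path.name
        for path in Path(folder).glob("*.pdf")
        if looks_password_protected(path.read_bytes())
    )


if __name__ == "__main__":
    # "to_upload" is a hypothetical folder name used for illustration.
    for name in find_protected_pdfs("to_upload"):
        print(f"Remove password protection before uploading: {name}")
```

Any file the script lists is worth opening in your PDF editor to confirm and remove the protection before upload.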


Understanding Content Segmentation: Audio/Visual vs. PDFs

Our platform efficiently organizes audio and visual files by specific topics, allowing end-users to easily locate relevant content.

PDFs are handled more like written transcripts of lectures. Instead of splitting by topic, our system groups every 10 sentences into a 'chapter,' with a 2-sentence overlap between consecutive chapters to maintain continuity.

Our current approach organizes PDFs in a sentence-based structure, keeping 'chapters' uniform in length without compromising informativeness.
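To make the sentence-based segmentation concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only, not the platform's actual implementation: it assumes the text has already been split into sentences, and it interprets the 2-sentence overlap as the last two sentences of one chapter being repeated at the start of the next. The function name chunk_sentences and its parameters are hypothetical.

```python
from typing import List


def chunk_sentences(
    sentences: List[str], size: int = 10, overlap: int = 2
) -> List[List[str]]:
    """Group sentences into 'chapters' of up to `size` sentences.

    Each chapter after the first begins with the last `overlap`
    sentences of the previous chapter (an assumed interpretation).
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chapters = []
    start = 0
    while start < len(sentences):
        chapters.append(sentences[start:start + size])
        if start + size >= len(sentences):
            break  # this chapter already reached the end of the text
        start += step
    return chapters


# Example: a 26-sentence document.
sentences = [f"Sentence {i}." for i in range(1, 27)]
chapters = chunk_sentences(sentences)
for number, chapter in enumerate(chapters, start=1):
    print(number, chapter[0], "to", chapter[-1], f"({len(chapter)} sentences)")
```

Under these assumptions the 26-sentence example yields three chapters covering sentences 1 to 10, 9 to 18, and 17 to 26, so a passage that straddles a chapter boundary still appears intact in at least one chapter.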

If you'd like to read more about setting up your Knowledge Graph Workflow, check out Knowledge Graph workflow FAQs.
