How to Successfully Process PDF Files

How to optimise the value of your content when uploading PDF files into your Knowledge Graph Workflow.

How can PDFs add value to your Knowledge Graph Workflow?

You can upload PDF files to your Knowledge Bank in the Knowledge Graph Workflow. The information in these files can be utilised to help answer end-user questions through the Knowledge Assistant.

Before you upload your PDF files, it's important to follow the guidelines below. These guidelines will ensure that the Mindset platform can read and comprehend your document effectively, allowing you to derive maximum value from your content.

Text-Based PDFs: Please ensure the PDFs you upload to your Knowledge Bank have selectable text, meaning you can highlight the text when you open the PDF. If you cannot, the text is probably part of an image, and the platform may not be able to read and process it.

Images in PDFs: If the PDF you are uploading contains images with words or information on them, such as an infographic or poster, the platform will not be able to process those images or the text inside them.

Tables & Multi-Column Layouts: If the document you are uploading into your application has tables or sections where the text is displayed in columns or grids, the platform may not be able to process this text accurately. Please note, this includes any part of the document where text is in rows and columns, even if no lines divide them.

Password-protected documents: Unfortunately, we are unable to process any document that has password protection enabled. Please remove any password protection before uploading your PDF documents.
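As a rough illustration of the password guideline above, the sketch below checks a file before upload. It is not part of the Mindset platform: the function names are hypothetical, and it relies on a simple heuristic, namely that encrypted PDFs carry an /Encrypt entry in their trailer. It can raise false alarms (for example, on files that only restrict printing or editing), and it does not check for selectable text, which needs a dedicated PDF library.

```python
from pathlib import Path


def preflight_warnings(data: bytes) -> list[str]:
    """Return a list of warnings for the raw bytes of a candidate PDF.

    Heuristic only: hypothetical helper, not a Mindset API.
    """
    # A PDF file starts with a "%PDF-" header near the top of the file.
    if b"%PDF-" not in data[:1024]:
        return ["File does not look like a PDF."]

    warnings = []
    # Encrypted PDFs reference an encryption dictionary via "/Encrypt".
    # Scanning the raw bytes is crude and may produce false positives.
    if b"/Encrypt" in data:
        warnings.append("File appears to be password protected or encrypted.")
    return warnings


def preflight_file(path: str) -> list[str]:
    """Read a file from disk and run the byte-level checks on it."""
    return preflight_warnings(Path(path).read_bytes())
```

An empty list from preflight_file("my-document.pdf") (a placeholder file name) means neither check fired. It does not guarantee that the document will process cleanly.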


Understanding Content Segmentation: Audio/Visual vs. PDFs

Our platform efficiently organises audio and visual files by specific topics, allowing end-users to easily locate relevant content.

For PDFs, the organisation can be likened to written transcripts of lectures. Instead of topics, our system marks every 10 sentences as a new 'chapter,' with a careful 2-sentence overlap between chapters to maintain connection and continuity.

While we continue to improve PDF handling, our current approach organises PDFs by sentence count rather than by topic. This keeps 'chapters' uniform in length while keeping them informative.
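To make the sentence-based chaptering described above concrete, here is a minimal sketch. It is an illustration, not the platform's actual implementation. It assumes each 'chapter' holds up to 10 sentences and that the last 2 sentences of one chapter are repeated at the start of the next. The function names and the naive sentence splitter are hypothetical.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(sentences: list[str], size: int = 10, overlap: int = 2) -> list[list[str]]:
    """Group sentences into chapters of `size`, sharing `overlap` sentences."""
    if size <= overlap:
        raise ValueError("size must be greater than overlap")
    step = size - overlap  # each new chapter starts 8 sentences later
    chapters = []
    for start in range(0, len(sentences), step):
        chapters.append(sentences[start:start + size])
        # Stop once the current chapter reaches the end of the document.
        if start + size >= len(sentences):
            break
    return chapters


# Example: a 25-sentence document.
text = " ".join(f"Sentence {i}." for i in range(1, 26))
chapters = chunk_sentences(split_sentences(text))
```

Under these assumptions, a 25-sentence document yields three chapters, and the final two sentences of each chapter reappear as the first two of the next. This is why an answer drawn from a PDF may quote text that spans a chapter boundary.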

If you'd like to read more about setting up your Knowledge Graph Workflow, check out How to configure the Knowledge Graph Banks, or Knowledge Graph workflow FAQs.
