Vault eTMF users often upload large numbers of documents. The TMF Bot can automatically classify new documents and set their metadata, saving your organization time and effort. Auto-classification with the TMF Bot can help reduce the number of classification errors and surface potential issues sooner, thus increasing compliance. The TMF Bot can also automatically set the Study value for documents.

If your Vault is configured to use the TMF Bot for auto-classification, Vault analyzes documents added to the Document Inbox and populates their Document Type, Subtype, and Classification fields. The current status of each document is listed in the TMF Bot column. This column is empty when the TMF Bot is not enabled or when the TMF Bot's confidence level is lower than the Prediction Confidence Threshold (PCT).

Once auto-classification is complete, you can review the classification before completing the document and removing it from the Document Inbox.

If configured to use TMF Bot for metadata extraction, Vault automatically populates the Study, Study Country, and Site fields on documents, including any documents created via email processors or API.

How to use Auto-Classification or Metadata Extraction

Once an Admin has deployed an Auto-Classification Trained Model and a Metadata Extraction model in your eTMF Vault, no additional action is needed on your part. The following methods of adding documents to the Document Inbox result in auto-classification and automatic population of the Study, Study Country, or Study Site fields by the TMF Bot when possible:

The TMF Bot then follows these steps to queue, auto-classify, and automatically populate the Study field on uploaded documents:

  1. Vault checks the origin of each file and assigns it to a classification queue:
    • Documents uploaded via API, Vault Loader, FTP, or email are placed in a bulk processing queue, ensuring that large imports do not slow down typical auto-classification processes.
    • All other documents, including those uploaded via Vault Mobile, are placed in an express processing queue.
  2. The TMF Bot automatically scans each added document. If you navigate to the Document Inbox, you can see the progress for each document in the TMF Bot field. If you cannot see the TMF Bot field, you can add it as a column in your Document Inbox. Each document will list one of the following statuses:
    • Express Queued…: The TMF Bot is waiting to process the document from the express queue.
    • Bulk Queued…: The TMF Bot is waiting to process the document from the bulk queue.
    • Done: The file has finished processing.
  3. If the TMF Bot can auto-classify the document, the document has its Type, Subtype, Classification, or Study fields populated, and the Tags field will include the TMF Bot Auto-classified tag.
  4. If a Metadata Extraction model is deployed, the TMF Bot scans the file name and content to identify a Study > Study Country > Study Site hierarchy match in your Clinical Operations Vault. To populate Study, Study Country, and Study Site metadata, TMF Bot verifies or applies the following:
    • The user must have permissions on the matched Studies.
    • These Studies are not in study migration mode.
    • There is no ambiguity with another matched hierarchy.
    • If parent matches are found, only their children are considered as possible matches.
    • If there are multiple parent matches, their children matches can be used to determine a single parent.
    • If parents are not found, children must be unique to retrieve the hierarchy.

If the TMF Bot does not identify a hierarchy match, or if any of the rules above are violated, the TMF Bot leaves the Study, Study Country, and Study Site fields empty for that document.
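
The queue assignment in step 1 can be pictured with a minimal sketch. This is an illustration only, not Vault's implementation: the origin labels (`"api"`, `"vault_loader"`, and so on) and the function name `assign_queue` are hypothetical, chosen to mirror the upload methods named above.

```python
from enum import Enum


class Queue(Enum):
    EXPRESS = "Express Queued"
    BULK = "Bulk Queued"


# Origins that the article says are routed to the bulk queue.
# These string labels are hypothetical; Vault's internal identifiers
# are not documented here.
BULK_ORIGINS = {"api", "vault_loader", "ftp", "email"}


def assign_queue(origin: str) -> Queue:
    """Return the processing queue for a document based on its origin."""
    # Anything that is not a known bulk origin (for example, Vault Mobile)
    # falls through to the express queue.
    return Queue.BULK if origin.lower() in BULK_ORIGINS else Queue.EXPRESS
```

Routing on origin alone keeps large imports from delaying documents that users upload interactively.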

For example, imagine a study, AVEG 027, that has the following sites in it: F1, US4, and US245.

  • Matches on the name “F1” will be completely ignored.
  • Matches on “US4” will only be considered if the Bot also finds a match for “AVEG 027”.
  • Matches on “US245” can be actioned, even if a match for “AVEG 027” is not found.
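
The parent and child rules behind this example can be sketched as follows. This is a simplified illustration under stated assumptions, not the TMF Bot's actual logic. It covers only the Study and Study Site levels. The second study `OTHER 001` is invented so that `US4` exists under more than one study. The article does not say why matches on `F1` are ignored, so that name is simply hard-coded as a placeholder for whatever rule Vault applies.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical data: study name -> site names. "OTHER 001" is invented so
# that "US4" is not unique across studies; the article does not state this.
STUDY_SITES = {
    "AVEG 027": {"F1", "US4", "US245"},
    "OTHER 001": {"US4"},
}

# Placeholder: the article says matches on "F1" are ignored but not why.
IGNORED_SITE_NAMES = {"F1"}


def resolve(study_hits: set[str], site_hits: set[str]) -> Optional[tuple[str, str]]:
    """Return (study, site) when exactly one hierarchy fits, else None."""
    usable_sites = site_hits - IGNORED_SITE_NAMES
    # If parents matched, only their children are candidates. Otherwise
    # every study is searched, so the child must be unique to resolve.
    parents = study_hits or set(STUDY_SITES)
    candidates = [
        (study, site)
        for study in sorted(parents)
        for site in sorted(usable_sites & STUDY_SITES.get(study, set()))
    ]
    # Any ambiguity means nothing is populated.
    return candidates[0] if len(candidates) == 1 else None
```

With this data, `resolve(set(), {"US4"})` returns `None` because two studies contain that site, while `resolve({"AVEG 027"}, {"US4"})` and `resolve(set(), {"US245"})` each resolve to a single hierarchy.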

While the time to process each document can vary, Vault aims to process each file within five (5) seconds.

Accepting Auto-Classifications and Metadata Extractions

Once documents have a value of Done in the TMF Bot field, use the checkboxes to select documents, then click Complete to enter any necessary document fields. Note that you can only complete documents with the same classification in bulk.
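
Because bulk completion requires a shared classification, it can help to think of the finished documents as groups. The sketch below is illustrative only: the dictionary keys (`name`, `tmf_bot`, `classification`) are hypothetical stand-ins for the Document Inbox columns, not Vault field names.

```python
from collections import defaultdict


def group_for_bulk_complete(docs):
    """Group documents whose TMF Bot status is Done by classification.

    Each doc is a dict with the hypothetical keys "name", "tmf_bot",
    and "classification". Documents the Bot could not classify end up
    under the key None.
    """
    groups = defaultdict(list)
    for doc in docs:
        # Queued documents are skipped until processing finishes.
        if doc["tmf_bot"] == "Done":
            groups[doc["classification"]].append(doc["name"])
    return dict(groups)
```

Each resulting group corresponds to one set of documents that could be selected and completed together.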

Once completed, the uploaded documents are available for additional processing. Document tags will indicate whether the document was Auto-classified or had any metadata set by the TMF Bot.

Rejecting an Auto-Classification or Metadata Extraction

If you find that TMF Bot applied an incorrect classification, you can navigate to the document and select Reclassify as normal. When you manually reclassify a document, Vault tags the document as TMF Bot Misclassified.

If you find that TMF Bot populated an incorrect Study, Study Country, or Study Site for a document, you can navigate to the document and manually update those fields.

Auto-Classification Limitations

  1. Some document classifications may not be available to the TMF Bot. This is often because there were not enough documents to train the TMF Bot on that classification or because the classifications were deliberately excluded from training.
  2. The TMF Bot only auto-classifies documents if it is confident in its selection. Confidence is typically low when a document could plausibly be classified as two or more different document types.
  3. Some categories of documents cannot be auto-classified. These include:
    • Audio or Video files
    • Non-text files, such as ZIP files, statistical files, or database files
    • Non-English files
    • Files where Vault cannot extract text, for example, if the text is too blurry or if the file is password-protected or encrypted.
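
The confidence rule in item 2 can be illustrated with a short sketch. It assumes, hypothetically, that the model produces one score between 0 and 1 per candidate classification and that a single threshold is applied. The function name and the score format are illustrative and do not describe the TMF Bot's internals.

```python
from __future__ import annotations

from typing import Optional


def pick_classification(scores: dict[str, float], threshold: float) -> Optional[str]:
    """Return the top-scoring classification, or None if it is below threshold."""
    if not scores:
        # No usable text (for example, an encrypted file) means no prediction.
        return None
    label, score = max(scores.items(), key=lambda item: item[1])
    # A score lower than the threshold leaves the document unclassified.
    return label if score >= threshold else None
```

When a document fits two types about equally well, the scores split between them, neither reaches the threshold, and the document is left for manual classification.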