Scanning and Text Recognition (OCR)

Paper documents can be added to M-Files by using a network scanner or a local scanner. For more information on network scanning, refer to Scanner Sources. When using local scanning, the scanner must be directly connected to the computer that is used to add the scanned file to M-Files. The scanning functions can be accessed by pressing the Alt key and then opening the Operations menu.

Note: Scanner integration in M-Files Desktop uses the TWAIN and WIA technologies. Only scanners that can be equipped with a TWAIN or WIA driver are supported.

If the M-Files OCR (optical character recognition) module is enabled, M-Files offers to convert the scanned file to a searchable PDF by means of character recognition once the scanning is completed. You can run the character recognition or skip it. You can also define advanced settings for the character recognition.

Note: The M-Files OCR module is an M-Files add-on product available for an extra fee. It can be activated with a license code. For more information, see Enabling the M-Files OCR Module and Managing Server Licenses. M-Files uses an OCR engine offered by IRIS. For M-Files OCR module purchase inquiries, please contact our sales team at [email protected].

You can convert an image file to a searchable PDF as well. The optical character recognition is performed on the image file to enable full-text searching across the file. After the conversion, you can find, for example, a contract document converted from an image by performing a search using the names of the contracting parties or any other text included in the original image file.

M-Files also automatically suggests the character recognition if you drag an image file to M-Files, provided that you have the M-Files OCR module installed. M-Files does not suggest the character recognition for PDF files, because performing the optical character recognition on an already searchable PDF reduces the quality and increases the size of the PDF file. You must convert non-searchable PDF files into searchable PDF files manually via the context menu of the PDF file.

Optical character recognition can be performed on the following file formats:
  • TIF
  • TIFF
  • JPG
  • JPEG
  • BMP
  • PNG
  • PDF
TIFF files using an alpha channel or JPEG compression are not supported.
Note: If text recognition is performed on an image file that was not saved and returned to M-Files, the file is saved only as a PDF. Otherwise, the original image file can be found in the document version history.
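
The format rules above can be summarized in a short sketch. This is only an illustration in Python, not part of M-Files or its API: the names OCR_EXTENSIONS, can_run_ocr, and is_ocr_suggested are hypothetical, and the check looks at the file extension only, so it does not detect the unsupported TIFF variants (alpha channel or JPEG compression).

```python
from pathlib import Path

# Extensions listed in this section as valid input for character recognition.
# Hypothetical helper names; this is not an M-Files API.
OCR_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".bmp", ".png", ".pdf"}


def can_run_ocr(filename: str) -> bool:
    """Return True if the file extension is one of the listed OCR formats."""
    return Path(filename).suffix.lower() in OCR_EXTENSIONS


def is_ocr_suggested(filename: str, ocr_module_enabled: bool = True) -> bool:
    """Return True if character recognition would be suggested on drag and drop.

    PDF files are excluded because they must be converted manually
    via the context menu of the file.
    """
    if not ocr_module_enabled:
        return False
    return can_run_ocr(filename) and Path(filename).suffix.lower() != ".pdf"
```

For example, under these assumptions is_ocr_suggested("photo.jpg") is true, while is_ocr_suggested("scan.pdf") is false even though can_run_ocr("scan.pdf") is true.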

Importing Image Files as Searchable PDFs

To import an image file to the vault as a searchable PDF:
  1. Drag and drop an image file to M-Files.
  2. Optional: In the Conversion to Searchable PDF dialog, check the Use automatic language detection checkbox to set M-Files to automatically detect the document language.
  3. Optional: In the Conversion to Searchable PDF dialog, click Advanced to improve the quality of the text recognition by selecting primary and secondary language options to match the language used in the image.
    Opening the advanced options disables the option to use automatic language detection.
  4. Click Convert to start the conversion.
  5. Once the conversion is complete, the New Document dialog appears. Finish importing the image by filling in the metadata and clicking Create.
The image file is imported to the vault as a searchable PDF, allowing you to locate it by using the M-Files search functions.

Converting an Image File Stored in M-Files to a Searchable PDF

  1. In M-Files, locate the image file that you want to convert to a searchable PDF.
  2. Right-click the file and select Scanning and Text Recognition (OCR) > Convert to Searchable PDF... from the context menu.
  3. Optional: In the Conversion to Searchable PDF dialog, check the Use automatic language detection checkbox to set M-Files to automatically detect the document language.
  4. Optional: In the Conversion to Searchable PDF dialog, click Advanced to improve the quality of the text recognition by selecting primary and secondary language options to match the language used in the image.
    Opening the advanced options disables the option to use automatic language detection.
  5. Click Convert to start the conversion.
The image file is converted into a searchable PDF and any textual content in the image can be found using the search functions of M-Files.