In the past, we’ve discussed the importance of adding quality metadata for several reasons. Of course, better metadata will improve all aspects of search performance. Also, the right identifiers and tags can provide you with business intelligence, audit support, ideas for extra revenue streams and more. The real gold mine is in converting unstructured data into structured data.
Adding this additional information to describe your documents may seem like a big job, and that’s certainly true if you don’t use the best tools. Not to fear. Today’s artificial intelligence will help create and add better metadata with less effort. Plus, it will work with all types of files, including text, graphics, audio, and video.
Despite the obvious benefits of having high-quality metadata, the task of adding it to countless files and records might seem like an incredibly time-consuming and tedious process. You may not have the manpower to comb through hundreds of thousands of text, image, and video files to select relevant keywords. Even if you do have the resources, that might not seem like the best use of your people. Advances in AI and machine learning can minimize human effort and still produce excellent results. In that way, AI is finally living up to the hype.
This list provides a basic overview of the types of AI technology that systems use to help with metadata creation:
In the past, people associated indexing mostly with text documents. Modern AI isn’t limited to text files. Case in point: face recognition in Las Vegas casinos can identify known cheaters and card-counters in video images. You may have also seen examples of this on popular social networks, like when Facebook knows that a family reunion photo has Aunt Wanda in it and suggests a tag. Language processing can extract meaning from speech in audio files. Combining these techniques can also extract tags from video files.
Thus, you can use AI to help create and add metadata to text, graphics, and video files. For instance, today’s search engines can index and categorize .MP3 and .JPG files as well as .HTML and .PDF files. An intelligent information management system can do the same thing inside of your organization.
Consider some examples from CMSWire of using intelligent systems to categorize various types of files:
Images: The healthcare field has relied heavily on image recognition technology for all sorts of medical scans. Other industries can use this tech to help categorize scanned documents, including handwriting. If you have deposited a hand-written check in the ATM, you have probably seen this kind of image recognition at work.
Audio: Common examples of intelligent speech processing include Amazon Alexa and similar home systems. You have probably also used voice-to-text to compose text messages or request searches on your mobile phone. This same technology can find patterns in your company’s audio recordings.
Video: Analyzing video files combines the AI tech that’s used to process images, text, and audio. For example, you might tag everybody at a meeting by running facial recognition on a recording. Similarly, you may set time indexes in a video to make it easier to find the exact moment when a certain topic was discussed.
AI can help reduce effort and, in some cases, improve the quality of your metadata. Mostly, intelligent systems make projects possible that you would lack the time or funds to complete if you had to do them manually. Even better, these systems learn as they work, so they can provide increasingly useful results over time. Since the machines never get tired or bored, they can also help minimize the kinds of mistakes that people are prone to making.
Here’s a simple example: How fast could the fastest data worker look through 500 documents to find instances of Social Security numbers and then tag those documents as sensitive? Maybe a few days. Intelligent information management AI can do it in minutes, if not seconds. That’s the kind of power we’re talking about here.
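To make the Social Security number example concrete, here is a minimal Python sketch of pattern-based tagging. It is only an illustration, not how any particular product works: the function name `tag_sensitive`, the sample documents, and the assumption that numbers appear in the `123-45-6789` format are all made up for this example. A real system would likely combine patterns like this with context checks and machine learning.

```python
import re

# Assumption: Social Security numbers are written as 123-45-6789.
# Other formats (no dashes, spaces) would need additional patterns.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def tag_sensitive(documents):
    """Return a dict mapping each document name to a list of metadata tags."""
    tags = {}
    for name, text in documents.items():
        # Tag the document as sensitive if at least one SSN-like string appears.
        tags[name] = ["sensitive"] if SSN_PATTERN.search(text) else []
    return tags


# Hypothetical sample documents, invented for illustration.
sample_docs = {
    "onboarding_form.txt": "Employee SSN: 123-45-6789, start date 2021-03-01.",
    "meeting_notes.txt": "Call the vendor at 555-867-5309 about the renewal.",
}

result = tag_sensitive(sample_docs)
print(result)
```

The phone number and the date in the sample text are not tagged because they do not fit the three-two-four digit grouping, which hints at why simple rules scale to hundreds of documents in seconds but still need human verification for edge cases.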
You should still involve various stakeholders to determine which kinds of metadata you need, so you can create rules within the system and verify results. You can use these rules to help direct both the intelligent software and your quality control teams. Basically, the higher the risk attached to specific information, the more you may need to rely on people to check the AI’s output against governance rules and verify its quality.
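One way to picture risk-based governance rules is a small lookup that decides when a person should verify what the AI has tagged. The sketch below is hypothetical: the tag names, the risk levels, the `needs_human_review` function, and the choice to treat unknown tags as low risk are all assumptions for illustration, and your stakeholders would define the real rules.

```python
# Hypothetical governance rules: metadata tag -> risk level.
RISK_BY_TAG = {"sensitive": "high", "contract": "medium", "newsletter": "low"}

# Ordering used to compare risk levels.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


def needs_human_review(tags, threshold="medium"):
    """Return True when any tag's risk level meets or exceeds the threshold."""
    # Assumption: tags missing from the rules are treated as low risk.
    highest = max(
        (RISK_ORDER[RISK_BY_TAG.get(tag, "low")] for tag in tags),
        default=0,
    )
    return highest >= RISK_ORDER[threshold]


print(needs_human_review(["sensitive"]))   # high-risk tag, routed to a person
print(needs_human_review(["newsletter"]))  # low-risk tag, left to the system
```

Raising or lowering `threshold` is one simple way to shift more or less of the verification work onto your quality control team.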
You might prioritize various kinds of information, so you can devote more time to the specific documents that carry the most value and associated risks. Also, you might start testing your smart systems with low-priority information, so both you and your AI system can learn to work together better.
You don’t have to wait for future technology to involve intelligent computer systems in information management. Here at M-Files, we’re eager to offer you a free trial or a walk-through to answer your questions.