Knowledge mining will drive the next wave of AI

Where data mining has been restrictive, knowledge mining widens the length, breadth and density of intelligence models.
25 November 2019

Knowledge Mining is so new, it hasn’t even got an acronym yet. Source: Shutterstock

If 2019 taught us anything, it was that every technology vendor, large and small, had to have a stance on Artificial Intelligence (AI) and the software automation advantages it can deliver.

Some vendors got so excited about AI and the Machine Learning (ML) that allows intelligence engines to get smarter, they forgot to talk about so-called digital transformation. But that was just for a while, not for long, obviously.

Industry spin and subterfuge notwithstanding, AI may now have another chapter to deliver… and it comes in the shape of Knowledge Mining. But before we understand what it is, let’s remember how we got here.

Data mining roots

Knowledge Mining stems from Data Mining, a term that was popularized in the nineties and carried us through the millennium. Data Mining is an interdisciplinary process incorporating statistics, mathematical modelling, pattern recognition and other aspects of information analytics.

In basic terms, Data Mining involves sifting through massive data sets to establish patterns and create what are known as ‘association rules’ (rather like an IF/THEN statement) that direct action based upon the data relationships discovered. People do still talk about Data Mining, but AI has in many cases displaced the term.
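To make the IF/THEN idea concrete, here is a minimal sketch of how an association rule might be scored. The shopping-basket data, the function names and the thresholds are all hypothetical and chosen purely for illustration; real data mining tools use far more efficient algorithms than this brute-force pass over item pairs.

```python
from itertools import combinations

# Hypothetical toy data: each set is one customer's basket (not from the article).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated chance of seeing the consequent given the antecedent."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


def mine_rules(transactions, min_support=0.3, min_confidence=0.7):
    """Brute-force search for single-item rules of the form IF a THEN b."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        # Try the rule in both directions: IF a THEN b, and IF b THEN a.
        for ante, cons in ((a, b), (b, a)):
            s = support({ante, cons}, transactions)
            if s < min_support:
                continue  # the pair is too rare to be worth a rule
            c = confidence({ante}, {cons}, transactions)
            if c >= min_confidence:
                rules.append((ante, cons, s, c))
    return rules


if __name__ == "__main__":
    for ante, cons, s, c in mine_rules(transactions):
        print(f"IF {ante} THEN {cons} (support={s:.2f}, confidence={c:.2f})")
```

Support measures how often the items appear together at all, while confidence measures how reliably the THEN part follows the IF part. The two thresholds are assumptions that a practitioner would tune to the data set at hand.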

Widening our narrow models

While Data Mining has been useful, information scientists argue that it was restricted to creating comparatively narrow AI models, i.e. it was useful for doing (and learning) one specific thing, such as tracking one type of image, categorizing one work process or some other defined and essentially discrete task.

Knowledge Mining widens the length, breadth and density of the intelligence model being constructed.

Data Mining centers on the processing of relatively well-structured information sets, often held in databases where information is nicely deduplicated, verified and parsed into appropriate fields. Knowledge Mining goes deeper in that it involves the ingestion of massive datasets spanning structured, semi-structured and unstructured information.

Knowledge Mining also embraces a more complex level of business logic and is capable of understanding where connected information streams come together to form real-world business processes.

According to John ‘JG’ Chirapurath, general manager, Azure Data & AI at Microsoft, “More than two-thirds (68 percent) of respondents to a recent Harvard Business Review Analytic Services survey believe knowledge mining is important to achieving their companies’ strategic goals in the next 18 months.”

Chirapurath points to the challenge at hand on the road to Knowledge Mining. The central issue with ‘old’ information mining techniques was that by the time the data was identified, classified and ratified, it was only fit for archiving. Knowledge Mining goes further in its use of metadata: the ‘information about information’ that metadata delivers speeds up the entire analytics process from the start.

This is of particular importance when we look at the ingestion of unstructured data into Knowledge Mining engines. Where that unstructured data comes in the form of videos, voicemails, emails, images or some other traditionally multi-form-factor shape, we need to know what it relates to faster than manual classification by human beings allows.

Real-time anomaly detection

Only when we can track information automatically and sidestep manual work can we start to use Knowledge Mining for things like real-time anomaly detection.

“With knowledge mining, it is now possible to train a system to recognize the key data to extract from a statement – whether it is in a PDF, a scanned document, or spreadsheet format – and to do it consistently. The same is true for more complex processes, such as allocating invoices to the right account or pulling data from investment documents, which can vary in their presentation, and using that data to validate investment terms,” wrote Chirapurath in an original article.

Knowledge Mining is predicted to have the most impact on enterprise organizations working in financial services, healthcare, manufacturing and legal services. As we enter the early stages of this technology, we can reasonably suggest that most customers won’t ‘do’ the mining themselves; it is more likely that they will buy it as a service from a cloud provider.

Awareness of Knowledge Mining is still comparatively new, so much so that most people aren’t even saying KM for short. Oops, we just did, so now you have the knowledge.