Have the Days of the Citizen Data Scientist Arrived? Vertica from Micro Focus
The move of machine learning from a research area within academic institutions to the business sphere has been relatively quick. But like many new technologies, artificial intelligence in all its variations (deep learning and cognitive processing, among others) has yet to find widespread use in most organizations.
That’s because until recently, companies required dedicated teams of data specialists: database administrators, data scientists and engineers, and developers skilled in the variations of coding languages and frameworks needed to extract insights from large data pools.
Questions put to newly assembled data teams were slow to be answered, and the answers often diverged from what the questioner intended. To this day, it is as if a translation layer between business-focused professionals and data teams is missing. Data specialists often lack knowledge of business strategy, and decision-makers are not trained in the intricacies of data science. No wonder, therefore, that many AI projects in the enterprise fail to produce timely or effective results.
Part of the problem is that, on a practical level, data professionals have to work from manageable subsets of the organization’s total data. Building and testing machine learning models requires many manual processes (such as data sanitizing and normalization) that consume both time and processor cycles. Models are therefore fine-tuned on limited information and, once in production, can produce results that are both inaccurate and out of date.
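To make those manual steps concrete, here is a minimal Python sketch of sanitizing and normalizing a small batch of values. It is illustrative only: the function names, the cleaning rule (drop anything that cannot be read as a number), and the choice of z-score scaling are assumptions made for this example, not a description of how any particular platform does the work.

```python
from statistics import mean, pstdev


def sanitize(values):
    """Keep only entries that can be read as numbers (hypothetical cleaning rule)."""
    clean = []
    for value in values:
        try:
            clean.append(float(value))
        except (TypeError, ValueError):
            # Missing or malformed entries are dropped in this sketch.
            continue
    return clean


def normalize(values):
    """Rescale values to zero mean and unit variance (z-score scaling)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: nothing to scale.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up sample with a missing value and a malformed entry.
raw = ["10", None, "20", "n/a", 30]
cleaned = sanitize(raw)
scaled = normalize(cleaned)
```

Because the mean and spread are computed from whatever data is supplied, a model tuned on a small subset is scaled against that subset's statistics rather than those of the full data pool, which is one way the limited-information problem described above shows up in practice.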
Today’s machine learning platforms designed with business focus firmly in mind solve many of these problems. They collate and sanitize data on the entirety of the organization’s data resources in an abstraction layer between the data and query layers. That means complete data sets are normalized and available for interpretation on-the-fly.
Additionally, the query layer in platforms like Micro Focus’s Vertica is presented in such a way as to allow, even encourage, business professionals to address the data in ways that are business focused. This top-down approach to machine learning means an organizational shift from a SQL query (or “hard code”) mentality. Multiple stakeholders are empowered to unearth their own insights from the big data that is the vital resource many organizations already possess but are unable to exploit.
The commercial mindset behind this shift in focus away from the bottom-up data analysis approach is epitomized by new pricing models. Vertica, for example, doesn’t charge its platform’s users by the raw processing power queries require. Charging by processing power discourages the broader use of machine learning in an enterprise, penalizes experimentation, and stops queries from being adequately refined. Instead, clients pay according to the infrastructure size needed to run queries and smart algorithms on full data sets continuously. As the size of the data addressed increases, so does the charge, of course. But it’s a model that opens up machine learning to professionals and decision-makers from right across the enterprise.
Now, instead of data science teams having to learn the minutiae of every business function to serve the needs of each, the presentation and query layer becomes a collaborative space where data analysts and stakeholders work together towards the same strategic goals. The days of the “citizen data scientist” are rapidly approaching.
Across the enterprise, Vertica is enabling different teams to access the data they need, choose the data models that are best suited for their purpose, and fine-tune the algorithms to produce meaningful results. Cyber security teams are already leveraging AI to detect anomalous network patterns, while anti-fraud officers can spot identity theft with facial recognition, handwriting, or voiceprint verification.
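As a rough illustration of what detecting anomalous network patterns can mean at its simplest, the Python sketch below flags traffic samples that sit far from the average. The function name, the threshold, and the sample figures are all invented for this example, and a plain z-score test stands in for the far richer models a security team would actually deploy.

```python
from statistics import mean, pstdev


def find_anomalies(samples, threshold=2.0):
    """Return indexes of samples more than `threshold` standard deviations from the mean."""
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        # No variation at all, so nothing stands out.
        return []
    return [i for i, s in enumerate(samples) if abs(s - mu) / sigma > threshold]


# Hypothetical requests-per-minute readings with one obvious spike.
requests_per_minute = [120, 118, 125, 119, 122, 121, 950, 117, 123, 120]
anomalies = find_anomalies(requests_per_minute)
print(anomalies)  # indexes of the readings flagged as unusual
```

A single large spike also inflates the mean and standard deviation it is measured against, which is why production systems tend to favor more robust statistics or learned models over this kind of one-line test.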
There are also use cases for marketing and in customer-facing areas, like sentiment analysis (in text and speech) and uncovering market trends in internally and externally available data. Practically, too, operations can be improved through better logistics (route planning, traffic analysis) and by spotting where plant and machinery might need maintenance. Dozens of practical uses are already possible, with many more evolving right now.
Once machine learning is out of the relatively closed confines of the data science labs, this transformational technology can be put to work across many areas of every business. Here on Tech HQ, we’ll be looking at some of these areas where current practices using business-focused AI solutions produce massive value in short timescales.
Until then, if what you’ve read here has interested you, your data team, or any of the many business function owners in your organization, reach out to Vertica today. Read a recently published O’Reilly report to learn how you can deploy machine learning in minutes, not months.