One Size Does Not Fit All
Along with crypto assets (think bitcoin and blockchain), machine learning (ML) is currently among the hottest technology topics. However, this isn’t the first time that we’ve heard this tune as there have been numerous peaks and valleys in ML interest over the past 30 or so years as new developments spiked interest, and subsequent challenges cooled enthusiasm. While it may turn out to be different this time, experience suggests that a healthy dose of skepticism and a heaping serving of pragmatism are called for to cut through the hype and get to the root of the opportunity presented by the advancements in ML. How can you determine whether or not ML is the best and most effective way to address your business problems, and, if it is, whether or not your team is in a position to make the most of these powerful capabilities?
Is This the Golden Age of ML?
The field of machine learning has existed as an academic discipline since the late 1950s, and its roots go back further than that, deep into the origins of statistics. ML has followed something of a boom-and-bust pattern over the years, as new discoveries and advancements led to spikes in interest that were often followed by the bitter reality of real-world limitations. But ML may well be entering what will turn out to be a golden age, due in large part to the advent of inexpensive computing power, advancements in effective data storage, an increase in alternative data, and powerful tools from the likes of Google and others.
As is often the case, however, advances in one area reveal the limitations and obstacles in another, particularly when it comes to overlaying new technology on top of existing processes and systems. In addition to understanding and addressing the particular nuances of ML, it is imperative to examine and incorporate the second order impacts and challenges that must be overcome in order to achieve success. While the outcomes that ML can produce are exciting, the work and preparation that are necessary to make the results happen require clear and concerted effort.
Three Questions to Ask Before Diving In
Too often, the advent of new tools and techniques elicits a knee-jerk reaction, resulting in companies spinning up new, multi-disciplinary teams for exploratory projects. The companies that are most successful at leveraging ML understand from the outset whether the approach in question fits the situation at hand, whether the data is ready, and whether there is the requisite vision to maximize both results and opportunities.
1. Is machine learning the right approach for the job?
A fundamental question that is often overlooked is whether or not machine learning is even appropriate for the task at hand. There is a tendency to assume that the latest and most powerful techniques will yield the best results, but the reality is that traditional reporting or statistical analysis may do the job just fine. It’s important to distinguish whether or not you are simply trying to understand and explain behaviors or outcomes, or if you are trying to make predictions about what will happen in the future.
For example, if your business has seen a spike in revenue or customers and you are looking to understand if one segment is more responsible than the others for this shift, then traditional reporting analytics will be more than sufficient for generating insights. However, if the same business now wants to suggest product recommendations for their existing customers based on purchasing history, machine learning is the appropriate approach for the job.
In any case, it is incumbent upon the decision maker to understand which business questions need to be addressed, and whether those questions call for explanation or prediction, so that the proper analytic or ML approach can be deployed.
2. Does your data support machine learning?
If it is determined that ML is indeed the appropriate approach, the next step is to ascertain whether or not the existing data is sufficient to support the desired project. One thing to note: this assessment needs to be performed on a project-by-project basis. Just because a dataset was sufficient for a previous ML project does not mean that same dataset will be valid for a new model. With that said, questions about data fall broadly into the categories of quantity and quality.
While it may seem that there is an overabundance of data in today’s world, it is often the case that not enough of the right data is available for a model to learn well enough to make solid decisions about the specific questions on the table. Forecasting of any kind requires sufficient history (multiple years’ worth of data in many cases) to account for complex patterns such as seasonality or year-over-year trends. Unfortunately, there is no way around this, so businesses that do not currently store enough historical data may need to begin doing so before attempting ML models.
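As a rough illustration of the quantity question for forecasting, the sketch below checks whether a set of observation dates covers every calendar month in at least two distinct years. The function name `has_enough_history` and the two-year default are assumptions made for this example, not a rule from any library or standard; the right threshold depends on the seasonality of your own data.

```python
from collections import defaultdict
from datetime import date


def has_enough_history(observation_dates, min_years=2):
    """Return True if every calendar month was observed in at least
    `min_years` distinct years (an illustrative rule of thumb)."""
    years_by_month = defaultdict(set)
    for d in observation_dates:
        years_by_month[d.month].add(d.year)
    return all(len(years_by_month[m]) >= min_years for m in range(1, 13))


# Hypothetical data: one observation per month.
two_full_years = [date(y, m, 1) for y in (2022, 2023) for m in range(1, 13)]
eighteen_months = two_full_years[:18]

enough = has_enough_history(two_full_years)
not_enough = has_enough_history(eighteen_months)
```

With only eighteen months of data, July through December have been seen just once, so a model has no second example from which to separate a seasonal pattern from a one-off event.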
Regarding quality, the completeness of the dataset is important, and there are three types of completeness that need to be assured.
- Determine how frequently specific data elements are missing in your dataset. For example, is the date of birth always available for your customers, or was this an optional field in the form? Also, if manual data entry is part of the process, were there checks in the data collection process to ensure that quality is preserved? In the end, there isn’t a one-size-fits-all solution for determining whether too much data is absent. The variance of the field in question and the accuracy threshold that the model’s output must meet also come into play in determining data completeness. Sometimes the solution is to improve the data collection method before attempting ML.
- It may seem obvious, but the value you would like to predict must be present in the dataset used to build a model. For example, to predict the sale price of a home based on the number of rooms, location, or other features, the dataset used to build the model must include the transaction price. It would be virtually impossible to accurately predict a future sale price without actual examples of real-world sale prices.
- The number of examples you have for a given feature value or target value matters. Returning to the home example, if there are no recorded sales of houses with 12 bedrooms, a model will not be able to predict an accurate sale price for that kind of house.
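The three completeness checks above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the records, the field names (`bedrooms`, `sale_price`), and the helper functions `missing_rate` and `count_by_value` are all hypothetical and exist only for illustration.

```python
from collections import Counter


def missing_rate(rows, field):
    """Fraction of rows in which `field` is absent or None."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(field) is None)
    return missing / len(rows)


def count_by_value(rows, field):
    """Number of examples available for each observed value of `field`."""
    return Counter(row[field] for row in rows if row.get(field) is not None)


# Hypothetical home-sale records; one is missing the target value.
homes = [
    {"bedrooms": 3, "sale_price": 250000},
    {"bedrooms": 3, "sale_price": 265000},
    {"bedrooms": 4, "sale_price": 310000},
    {"bedrooms": 2, "sale_price": None},
]

target_missing = missing_rate(homes, "sale_price")   # checks 1 and 2
bedroom_counts = count_by_value(homes, "bedrooms")   # check 3
twelve_bedroom_examples = bedroom_counts[12]
```

Here a quarter of the records lack the value to be predicted, and there are no 12-bedroom examples at all, so a price prediction for such a house would be an extrapolation rather than something learned from data.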
Undertaking a data assessment phase before diving into ML can save you from taking on a challenge that your data is not yet ready to support, speed the process to beginning ML projects in cases where the data can be enhanced, and improve outcomes due to adherence to best practices.
3. Do you have the necessary vision to follow through?
As any sports enthusiast knows, great tennis serves, home runs, and golf drives are all about the follow-through. ML projects are the same way, but that follow-through is often overlooked or assumed, then eventually forgotten, turning an insightful ML project into a wasted investment. The next question is, “What does follow-through look like for a machine learning project?”
There are two main steps involved in the follow-through of a machine learning model:
- Incorporate the model into your data processing pipeline. This step is like plugging a light bulb into a lamp and flipping the switch. By plugging the model in, new predictions are made as new data is collected. This process should be automated: machine learning models that are built to only be called on demand will be underutilized or, more often than not, forgotten entirely. Just as you won’t get any light from a lamp that isn’t turned on, no insight will flow from a model that isn’t running.
- Return to the “why” behind the model. Too often, the original impetus for the ML project is altered or forgotten as the project progresses. It is important to continue to revisit the motivating questions that are driving your efforts. Why are you spending time and energy to come up with these predictions? What is the ideal outcome of learning this information? How does your business improve from knowing this information? These questions might seem obvious, but capitalizing on the answers often is not. Answering these questions with a programmatic solution is the difference between simply storing predictions in a table in a database and acting on those predictions.
As one last example, let’s look at a churn prediction model, where we’ve predicted which customers are most likely to cancel their subscription. What is the reason for developing this model? Knowing this information means that a business can take action to prevent the departure of customers before it happens. If you know which of your customers are most likely to leave in the near future, what would you do to prevent it? Perhaps your solution is to send targeted email campaigns to these customers. Integrating your model with a digital marketing campaign will enable your model to drive business outcomes as part of your marketing strategy. This churn prevention marketing campaign is a prime example of follow-through on your machine learning model, resulting in a significant impact on the business.
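To make the churn example concrete, here is a minimal sketch of what acting on predictions, rather than just storing them, might look like. Everything in it is an assumption for illustration: the churn scores, the 0.7 threshold, and the `send_campaign` callback, which stands in for whatever marketing system you actually integrate with.

```python
def select_at_risk(scores, threshold=0.7):
    """Return customer ids whose predicted churn probability meets the
    threshold, ordered from highest to lowest risk."""
    at_risk = [(cid, p) for cid, p in scores.items() if p >= threshold]
    at_risk.sort(key=lambda pair: pair[1], reverse=True)
    return [cid for cid, _ in at_risk]


def run_follow_through(scores, send_campaign, threshold=0.7):
    """Hand every at-risk customer to the campaign callback."""
    targets = select_at_risk(scores, threshold)
    for customer_id in targets:
        send_campaign(customer_id)
    return targets


# Hypothetical model output: customer id -> predicted churn probability.
scores = {"c1": 0.91, "c2": 0.35, "c3": 0.74}

# A list's append method stands in for a real email or marketing API call.
contacted = []
targets = run_follow_through(scores, contacted.append)
```

Running a step like this on a schedule, each time new scores are produced, is one way to keep the model plugged into the pipeline instead of waiting for someone to call it on demand.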
Reap the Benefits of Machine Learning Done Right
Out in the field, machine learning is producing unique and heretofore unimagined results. However, it is most often the case that the potential outcomes from ML exercises are not being maximized due to the failure to adopt the necessary vision to realize the full value an ML model encapsulates. At Maven Wave, we partner with our clients to assist in designing and managing change initiatives in a way that goes beyond technology to fully embrace all facets and tenets of transformational initiatives, such as machine learning. As one of just a few companies worldwide that have achieved Google’s Machine Learning Partner Specialization, Maven Wave is well equipped to help companies develop innovative solutions rooted in machine learning. Contact us to get started!
About the Authors
Debbie Callahan is a Managing Director with over 25 years of data and analytics consultative business development experience. At Maven Wave, she is responsible for helping our clients leverage analytics to accelerate innovations and deliver better business outcomes. Previously, she worked for Clarity Solution Group and Knightsbridge Solutions (acquired by Hewlett Packard), focusing exclusively on data and analytic solutions. Callahan is recognized for her client-focused philosophy; persistence in meeting clients’ unique organizational challenges; commitment to excellence; and a partnering model that ensures ease of doing business. She has a Master of Science Degree in Management and Organizational Behavior from Benedictine University and a Bachelor of Science Degree in Marketing from Illinois State University.
Annie Castner is a Senior Consultant with 4 years of IT consulting expertise. At Maven Wave, she develops machine learning models to help her clients change the way they do business. Her work focuses on the initial data analysis, prep, storage, and finally operationalizing finished models. Prior to joining Maven Wave, Annie served as a Senior Consultant at Clarity Solution Group, where she worked in a variety of data analytics and data engineering roles. She attained her M.S. and B.S. in Applied Math from the University of Notre Dame.