Machine Learning Analytics: Understanding Data with Stories

by Jon Reilly, March 7, 2021

At the Heart of Every Good Business is Good Decision Making

The great pursuit of every business is optimized decision-making - finding the exact right path to maximize growth in an incredibly complex and competitive marketplace. Any good decision is built on a foundation of business intelligence: understanding the customer, the opportunity, the competition, and the strengths and weaknesses of your team and its ability to execute.

A big part of the understanding needed for good decision-making is derived from experience - years on the job absorbing information and learning what works and what does not. But experience is subject to individual interpretation, can drift over time, and can often be wrong. Even when it is sound, markets evolve at the speed of technology, so hard-won knowledge can go out of date quickly. Fortunately, there is another method of understanding that helps drive better decisions - and that’s the understanding of your data.

Entire industries are built on helping businesses make sense of their data - from business consultancies (market size of $241 billion in the US alone) to data analytics software solutions (market size of $26 billion) - and no stone is left unturned as businesses work to turn their data into a competitive advantage. Unfortunately, extracting the critical information from the ever-growing mass of data, both qualitative and quantitative, that each company generates is an incredibly complicated undertaking.

Machine Learning is Changing Analytics

Data is exploding at such a furious pace that artificial intelligence is fast becoming the only way to make sense of it all. Machine learning at its core is the use of software and computing power to recognize and learn patterns. Machine learning algorithms can sift through mountains of data to find insights and surface the signal from the noise. Machine learning is the future of analytics. 

But there’s a problem. Machine learning usually works like a black box - data in, predictions out - and users have no understanding of what’s driving a model. And if a business user doesn’t understand a process, they generally don’t trust the result. Data scientists have developed highly technical metrics to evaluate machine learning models, but most people don’t know how to think about F-scores and ROC curves.
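For readers curious about what an F-score actually measures, here is a minimal Python sketch. It is not how Akkio evaluates models; the function name and the example counts are hypothetical and chosen only for illustration. The F-score (more precisely, the F1 score) combines precision (how many flagged cases were right) and recall (how many real cases were found) into a single number between 0 and 1.

```python
def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    predicted_positive = true_positives + false_positives
    actual_positive = true_positives + false_negatives
    # With nothing predicted or nothing to find, treat the score as 0.
    if predicted_positive == 0 or actual_positive == 0:
        return 0.0
    precision = true_positives / predicted_positive  # share of flagged cases that were correct
    recall = true_positives / actual_positive        # share of real cases that were found
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only:
# 30 correct flags, 10 false alarms, 20 missed cases.
score = f_score(true_positives=30, false_positives=10, false_negatives=20)
print(f"F1 score: {score:.3f}")
```

A model only scores well if it is strong on both precision and recall, which is part of why the number is hard to read at a glance.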

It is becoming increasingly easy to take advantage of the power of machine learning to both understand your data and even automate certain types of high-value business decisions. Here at Akkio, we are working to democratize access to machine learning - and our new no-code data stories feature is a big step down that path. Now anyone can train a machine learning model in minutes to surface the drivers of key business outcomes. Let’s take a look at how that works with some example datasets. 

Direct Mail Bank Campaign

The direct mail bank campaign dataset is the classic lead scoring challenge. It contains over forty-one thousand customer records, each the target of a marketing campaign. It also includes 20 different demographic and financial data points on each customer. To examine some of the patterns in the demographic data, we first trained a model with “age” as the prediction target. Once we have a machine learning model that predicts “age” based on all the other factors, we can explore the data story for age. 

Age Patterns:

The data stories for age make a lot of sense - retired married people who completed 4 years of post-high school education are, on average, much older than single high school students. Next, let’s look at the patterns related to subscriptions.

Here we see that older people with technical jobs and cell phones are much more likely to subscribe to the service than younger people working blue-collar jobs. It’s also interesting to see the duration field, which records the length of time spent on the phone with the bank discussing the offer (in seconds). If you spend a long time on the phone, you are more likely to subscribe. The bank can use this new knowledge to target their campaign to an older audience. 

HR Employee Attrition and Review

Here is a second example - an HR dataset of 1,470 employees from IBM. Each record contains data on the employee’s department, salary, employment history, demographics, and performance review scores. Each record is tagged for “attrition.” This dataset can be used to build machine learning models that explore the relationships between HR data, reviews, and employee turnover (a rare-case classification task: about 17% of records are tagged as attrition). Here is the Attrition data story.

From this data story, we learn that employees who face long working hours or punishing travel schedules, and those who are either fresh into the workforce or have changed jobs frequently in the past, are far more likely to leave the company. Now the HR team can better screen resumes to avoid investing in new employees who are unlikely to stick around.
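The 17% attrition rate noted for this dataset is worth a quick sanity check before trusting any model's accuracy number. The sketch below uses only the two figures quoted in this article (1,470 employees, 17% attrition); the function name is hypothetical. It shows that a "model" that always predicts an employee will stay is already right about 83% of the time, while finding none of the people who actually leave.

```python
def majority_baseline_accuracy(positive_rate: float) -> float:
    """Accuracy of always predicting the more common class."""
    return max(positive_rate, 1.0 - positive_rate)


total_employees = 1470   # figure quoted in the article
attrition_rate = 0.17    # figure quoted in the article

# Approximate number of employees tagged as attrition.
approx_leavers = round(total_employees * attrition_rate)

# Accuracy achieved by always predicting "stays".
baseline = majority_baseline_accuracy(attrition_rate)

print(f"Approximate leavers: {approx_leavers}")
print(f"Always-predict-stays accuracy: {baseline:.0%}")
```

This is why accuracy alone can be misleading on rare-case problems, and why it helps to see which factors drive the rare outcome.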

Telco Customer Churn

Another example - two months of historical data on customer churn for a telco. Seven thousand records contain demographic information and details on the services the customer subscribes to, contract terms, and payment details (methods and amounts). Each record is tagged with whether the customer churned in the last two months. Here is the data story for churn.

Much of this is unsurprising - if you are on a contract, you are much less likely to churn. This is a European telco that exists in a competitive environment with many providers offering internet and data services. They would do well to check their pricing model and work to move their new users onto contracts and automated payment methods as quickly as possible (perhaps with some incentives).

Insurance Charges

Finally, let’s check an insurance company dataset of just over 1,300 records that track the cost charged to a health insurance plan given demographics like age, gender, BMI, etc. Training a model shows the stories we would expect - older fathers who are overweight and smoke are more likely to have high costs associated with healthcare, while young, healthy, non-smoking women have the lowest costs. Let’s look at the data story.

Charge Patterns:

Now It’s Your Turn

See how easy it is to identify and understand the drivers of your critical business outcomes by signing up for a free account at Akkio and training your first model. It only takes a few minutes to unlock new insights and understanding from your data. And from there, you can deploy your machine learning model with just a few clicks - as new data flows in, you’ll understand the predictions that flow out. Now you are on the path towards predictive analytics and automated data-driven decision-making.
