Nowadays, marketers work with a veritable cornucopia of tools, generating enormous amounts of data. You might use Mailchimp to send emails, Slack to communicate with your team, Typeform to collect user feedback, Google Analytics to track site data, HubSpot or Salesforce as your CRM, and more.
All these tools—and practically every other notable marketing tool not listed—let you easily export your data, so you can build models and make predictions with AI.
The most common export is as a CSV, or comma-separated values file. This is the standard for tabular data, and this is what you’ll upload to Akkio.
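If you want to sanity-check an export before uploading it, a few lines of Python will do. Here's a minimal sketch using the standard library's `csv` module; the filename and columns are hypothetical stand-ins for a real CRM export:

```python
import csv

# A tiny stand-in for a CRM export ("leads.csv" is a hypothetical filename).
with open("leads.csv", "w", newline="") as f:
    f.write("name,lead_source,converted\nAda,webinar,1\nGrace,organic,0\n")

# Peek at the header and row count before uploading the file to Akkio.
with open("leads.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)  # first row: column names
    rows = list(reader)    # remaining rows: one record per lead

print("Columns:", header)  # Columns: ['name', 'lead_source', 'converted']
print("Rows:", len(rows))  # Rows: 2
```

A quick peek like this catches the usual export gotchas, such as a missing header row or an empty column, before they confuse your model.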
Before you can go out and build an AI model, you need to know where your data will come from.
The easiest way to do this is to think of the Key Performance Indicators, or KPIs, that matter to your organization. If you’re measuring KPIs properly, then you’ll have data about them - data that can be exported and uploaded to Akkio.
Suppose you’re using HubSpot as your CRM, and you want to score your leads according to their likelihood to convert. Googling “export HubSpot data” and clicking the first result turns up HubSpot’s step-by-step instructions for exporting your records.
This is easy enough. The more data you have, the better, as machine learning is extremely data-hungry. Once you’ve figured out what your goal is, and where your data is, you can build a model and make predictions.
This next part (actually building an AI model and making predictions) is dead-simple.
First, join Akkio for free, if you haven’t already. When you’re logged in, hit “Create New Flow.” A flow is Akkio’s end-to-end pipeline, from uploading data, to making predictions, to deploying a model.
Then, upload your dataset - whatever format it’s in. It’ll most likely be a CSV, but you can upload Excel and JSON files as well, or even directly connect to Salesforce via an integration.
Finally, hit “Add Step” to add a step to the flow. Hit “Predict,” and then select the column you want to predict. This is the organizational KPI you’ve already decided on, whether it’s churn, attrition, conversion, or any other metric.
You’ll immediately find some interesting insights, such as the “Top Fields,” or the columns that were most important in making your prediction.
This is deeply significant, as it’s essentially calculating what the most important “levers” are for your business to optimize a KPI. Suppose you’re optimizing conversions from Hubspot data, and the “top field” is “customer support.” This would suggest that you should prioritize the quality of customer support to increase conversions.
Your top field might be something completely different, like customer income or lead source, but regardless of what it is, it’s an important insight into what drives your KPIs.
Another aspect of your AI model report is “prediction quality,” which tells you how good your AI model is at predicting your KPI.
The first variable you’ll see is “accuracy,” which is simply the percent of correct predictions your model made. Say you’re predicting churn with 80% accuracy. This means that if you fed 100 customers into your model, it would make correct predictions for about 80 of them. Nice!
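Accuracy is simple enough to compute by hand. Here's a quick sketch with made-up churn labels, where 1 means the customer churned and 0 means they stayed:

```python
# Hypothetical labels: what actually happened vs. what the model predicted.
actual    = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]
predicted = [1, 0, 1, 1, 0, 0, 0, 0, 0, 1]

# Accuracy is just the fraction of predictions that match reality.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)
print(f"Accuracy: {accuracy:.0%}")  # Accuracy: 80%
```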
You’ll also see more advanced metrics, like precision, recall, and F1. These are more complex values that can give you deeper insight into the performance of a model.
These are particularly useful when you have an “imbalanced dataset,” which refers to a dataset that has a lot more of one “class” than another. Take the example of predicting credit card fraud. In the vast majority of credit card transactions, there is no fraud. The goal is to find a tiny percentage of fraudulent transactions. If your AI model predicted “no fraud” every time, it would be extremely “accurate,” but also useless.
Wikipedia gives a handy primer on precision versus recall:
“Suppose a computer program for recognizing dogs (the relevant element) in photographs identifies eight dogs in a picture containing ten cats and twelve dogs, and of the eight it identifies as dogs, five actually are dogs (true positives), while the other three are cats (false positives). Seven dogs were missed (false negatives), and seven cats were correctly excluded (true negatives). The program's precision is then 5/8 (true positives / all positives) while its recall is 5/12 (true positives / relevant elements).”
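The arithmetic in that quote translates directly into code. This sketch just plugs in the counts from the dog example:

```python
# Counts from the Wikipedia dog-recognition example.
true_positives  = 5  # real dogs correctly identified as dogs
false_positives = 3  # cats mistakenly identified as dogs
false_negatives = 7  # dogs the program missed

# Precision: of everything flagged as a dog, how much really was a dog?
precision = true_positives / (true_positives + false_positives)
# Recall: of all the real dogs, how many did the program find?
recall = true_positives / (true_positives + false_negatives)

print(f"Precision: {precision:.3f}")  # Precision: 0.625  (5/8)
print(f"Recall:    {recall:.3f}")     # Recall:    0.417  (5/12)
```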
In the credit card fraud example, suppose you have 1 million transactions, 1,000 of which are fraudulent, and the model guesses “not fraudulent” every time. The accuracy is 99.9%, but the recall is 0% (0 of the 1,000 fraudulent transactions caught), which is the red flag that informs you the model is broken. (Precision isn’t even defined here, since the model never makes a positive prediction.)
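Working through those numbers makes the trap obvious. A minimal sketch of the always-say-no "model":

```python
total_transactions = 1_000_000
fraudulent = 1_000

# A "model" that predicts "not fraudulent" for every single transaction.
true_negatives  = total_transactions - fraudulent  # legit, correctly labeled
false_negatives = fraudulent                       # every fraud case missed
true_positives  = 0                                # no fraud ever caught

accuracy = true_negatives / total_transactions
recall = true_positives / (true_positives + false_negatives)

print(f"Accuracy: {accuracy:.1%}")  # Accuracy: 99.9%
print(f"Recall:   {recall:.1%}")    # Recall:   0.0%
```

A 99.9% accuracy alongside 0% recall is exactly the pattern that tells you the model is ignoring the rare class.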
F1 is the harmonic mean of precision and recall, used when you want a single number that balances the two.
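Concretely, the harmonic mean can be sketched in a few lines, here applied to the precision (5/8) and recall (5/12) from the dog example:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    F1 is high only when BOTH numbers are high; a single bad
    value drags the whole score down.
    """
    if precision + recall == 0:
        return 0.0  # degenerate case: no positives predicted or found
    return 2 * precision * recall / (precision + recall)

print(f"F1: {f1_score(5/8, 5/12):.2f}")  # F1: 0.50
```

Note that the harmonic mean (0.50) sits below the simple average of 5/8 and 5/12 (about 0.52); that penalty for imbalance is the whole point of using F1.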
If you’re predicting an everyday KPI like conversions or sentiment, then these metrics aren’t as important as in life-or-death applications like determining the right dose of medicine to give a patient.