For the past five years, some of the leading data scientists, engineers, and statisticians in the world have come together for the International Workshop on Automation in Machine Learning. Automated machine learning, or AutoML, allows companies to build and train their own AI models with less human intervention. The technology can be applied across many different verticals, from sales and marketing to healthcare or finance.
Traditionally, when companies wanted to build an AI model for a particular task, such as scoring leads or predicting churn rates based on user behavior, they needed to hire experts with specialized AI knowledge, who would spend months building and deploying models.
This process requires significant time and effort, both in hiring specialists and in managing infrastructure, and it often leads companies astray because non-technical domain experts can’t take part in the model-building process.
In contrast, AutoML makes it possible for companies with little or no prior experience in building AI models to develop these types of complex systems with ease, by automating steps such as data preparation, feature selection, algorithm selection, cross-validation, and model performance measurement.
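To make the "algorithm selection" and "cross-validation" steps concrete, here is a minimal sketch in plain Python. It is not how any particular AutoML product works; the synthetic dataset, the three candidate models, and all function names are assumptions chosen for illustration. The loop scores each candidate with 5-fold cross-validation and keeps the one with the best average accuracy.

```python
import random
from collections import Counter


def make_toy_data(n=120, seed=0):
    """Synthetic two-feature binary dataset (illustrative only, not real data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label else 0.0
        rows.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return rows


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fit_majority(train):
    """Baseline: always predict the most common training label."""
    top_label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: top_label


def fit_nearest_centroid(train):
    """Predict the class whose average feature vector is closest."""
    centroids = {}
    for cls in {y for _, y in train}:
        points = [x for x, y in train if y == cls]
        centroids[cls] = tuple(sum(col) / len(col) for col in zip(*points))
    return lambda x: min(centroids, key=lambda c: squared_distance(x, centroids[c]))


def fit_one_nearest_neighbor(train):
    """Predict the label of the single closest training example."""
    return lambda x: min(train, key=lambda row: squared_distance(x, row[0]))[1]


def cross_val_accuracy(fit, data, k=5):
    """Average accuracy over k folds, holding each fold out once."""
    fold_scores = []
    for fold in range(k):
        test = [row for i, row in enumerate(data) if i % k == fold]
        train = [row for i, row in enumerate(data) if i % k != fold]
        predict = fit(train)
        fold_scores.append(sum(predict(x) == y for x, y in test) / len(test))
    return sum(fold_scores) / k


# Hypothetical candidate pool; a real AutoML tool would search far more models.
CANDIDATES = {
    "majority_baseline": fit_majority,
    "nearest_centroid": fit_nearest_centroid,
    "one_nearest_neighbor": fit_one_nearest_neighbor,
}

data = make_toy_data()
scores = {name: cross_val_accuracy(fit, data) for name, fit in CANDIDATES.items()}
best_name = max(scores, key=scores.get)
print(scores, "->", best_name)
```

The baseline is included on purpose: if no candidate beats "always predict the majority class", the features probably carry little signal.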
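The Fifth International Workshop on Automation in Machine Learning was a tour de force of cutting-edge innovations in automated machine learning. Let’s look at takeaways across five areas: hyperparameter autotuning, neural architecture search, IoT and automation, automated fairness assessment, and automated fake news detection.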
These state-of-the-art innovations in AutoML solutions are already being used to solve a variety of real-world problems.
One of the biggest challenges in automating machine learning is figuring out what hyperparameters to use for a given problem. Hyperparameters are settings chosen before training that control how a model is built and trained, such as the number of trees in a random forest or the learning rate of a neural network. Choosing them can be especially difficult for algorithms like random forests or deep learning, where many such settings need to be tuned.
Recent research shows how we can automate hyperparameter optimization for these types of ML models, saving time and improving accuracy and robustness. These techniques are based on “meta-learning” principles, where learning algorithms are applied to metadata about machine learning experiments.
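As a rough illustration of what automated tuning looks like, the sketch below uses plain random search, which is a simple baseline and not the meta-learning approach described above. The toy data, the k-nearest-neighbor model, and the search ranges are all assumptions made for the example.

```python
import random


def make_toy_data(n, rng):
    """Synthetic 1-D binary data with overlapping classes (illustrative only)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(1.5 * label, 1.0), label))
    return data


def knn_predict(train, x, k, weighted):
    """k-nearest-neighbor vote; optionally weight votes by inverse distance."""
    neighbors = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = {0: 0.0, 1: 0.0}
    for xi, yi in neighbors:
        votes[yi] += 1.0 / (abs(xi - x) + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


def accuracy(train, valid, k, weighted):
    """Share of validation points classified correctly."""
    hits = sum(knn_predict(train, x, k, weighted) == y for x, y in valid)
    return hits / len(valid)


def random_search(train, valid, n_trials, rng):
    """Sample hyperparameter settings at random and keep the best one."""
    best_params, best_score = None, -1.0
    for _ in range(n_trials):
        # Hypothetical search space: neighborhood size and vote weighting.
        params = {"k": rng.randint(1, 25), "weighted": rng.choice([True, False])}
        score = accuracy(train, valid, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


rng = random.Random(42)
train, valid = make_toy_data(200, rng), make_toy_data(100, rng)
best_params, best_score = random_search(train, valid, n_trials=20, rng=rng)
print(best_params, round(best_score, 3))
```

More sophisticated tuners replace the random sampling step with a strategy that learns from earlier trials, but the outer loop (propose settings, evaluate on held-out data, keep the best) stays the same.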
In fact, hyperparameter tuning is already being used in industry. For instance, Facebook’s fastText library uses hyperparameter autotuning to build efficient text classifiers. Classifiers are used for a variety of NLP-related tasks, including sentiment analysis, language identification, spam detection, content tagging, topic classification, and more.
One day, autotuning is likely to become a common part of AutoML frameworks. Hyperparameter autotuning is also commonly discussed alongside automated feature engineering, which automatically creates and selects the most relevant features and is another advancement in state-of-the-art AutoML.
Neural Architecture Search, or NAS, describes ways to automatically design artificial neural network architectures, which can often out-perform architectures designed by human experts. In fact, NAS has been used to rival the best manually-designed architecture on a very popular dataset, CIFAR-10.
NAS is typically underpinned by reinforcement learning (RL), although other search strategies can be used as well, including evolutionary algorithms and hill-climbing. NAS works by trying many different possible network architectures until one performs well enough according to some metric (such as accuracy or speed) for a given task.
A common approach starts by randomly sampling hundreds or thousands of candidate designs, then uses the results of those trials to steer the search toward designs that perform better. In other words, unlike traditional “model selection” techniques in AutoML, NAS is a model generation technique.
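The search loop itself can be sketched with hill-climbing, one of the simpler strategies mentioned above. Everything here is hypothetical: the search space is tiny, and `evaluate` is a made-up scoring formula standing in for the expensive step of training each candidate network and measuring its validation accuracy.

```python
import random

# Hypothetical search space: an architecture is one choice per dimension.
SEARCH_SPACE = {
    "num_layers": [2, 4, 6, 8],
    "width": [16, 32, 64, 128],
    "activation": ["relu", "tanh"],
}


def evaluate(arch):
    """Stand-in for 'train this network and return validation accuracy'.

    A real NAS system would train and validate a model here. This made-up
    formula simply rewards mid-sized networks so the search has a target.
    """
    size_penalty = abs(arch["num_layers"] - 6) * 0.02 + abs(arch["width"] - 64) / 64 * 0.05
    bonus = 0.01 if arch["activation"] == "relu" else 0.0
    return 0.9 - size_penalty + bonus


def neighbors(arch):
    """All architectures that differ from `arch` in exactly one choice."""
    for key, options in SEARCH_SPACE.items():
        for option in options:
            if option != arch[key]:
                yield {**arch, key: option}


def hill_climb(rng, max_steps=50):
    """Start from a random architecture and move to the best neighbor until stuck."""
    current = {key: rng.choice(options) for key, options in SEARCH_SPACE.items()}
    current_score = evaluate(current)
    for _ in range(max_steps):
        best_neighbor = max(neighbors(current), key=evaluate)
        best_neighbor_score = evaluate(best_neighbor)
        if best_neighbor_score <= current_score:
            break  # local optimum: no single change improves the score
        current, current_score = best_neighbor, best_neighbor_score
    return current, current_score


best_arch, best_arch_score = hill_climb(random.Random(0))
print(best_arch, round(best_arch_score, 3))
```

In practice the evaluation step dominates the cost, which is why much NAS research focuses on making each trial cheaper or on needing fewer trials.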
Most machine learning tools have yet to implement NAS, but it’s a powerful tool that many data scientists are using to improve accuracy benchmarks.
IoT will become an important driver of demand for computing power over the next decade or so — especially as more devices connect to the internet and AI becomes more widely deployed across all devices.
There have been many fascinating discussions of applications of IoT and AutoML technology within industries such as healthcare (e.g. monitoring vital signs), manufacturing (e.g. smart sensors for factories), retailing (e.g. smart shelves that automatically restock products when inventory gets low), transportation (e.g. self-driving cars) and security & surveillance systems (e.g. smart video cameras).
What’s challenging about applying AI to IoT is that these devices “at the edge” are typically small and computationally weak, while AI is typically computation-intensive. New forms of AutoML are needed to make predictions on these devices, and we’re seeing the rise of low-power, low-latency, and lightweight machine learning inference capabilities.
It’s an incredibly exciting time for IoT startups, with investment in the space expected to grow nearly 27% annually, much of which is used to build more intelligent devices.
Automated assessment of fairness in predictive accuracy relates to questions like: Is the algorithm predicting what it should predict? How do we know? What metrics can be used for this purpose?
Fairness is one of those topics that has been studied extensively but often gets overlooked when people think about machine learning problems in industry. It is especially important when developing algorithms that are applied in real-world settings, such as criminal justice systems or self-driving cars, where lives may depend on decisions made by these algorithms (e.g. whether someone should be released from jail).
The first thing to take away is a simple truth: there is no single metric for a fair versus unfair prediction. So many papers talk about “fairness” without specifying which aspect they care about most for two reasons: (i) different researchers have different intuitions about what makes something fair or unfair; (ii) even if there were one agreed-upon definition, it would still be very difficult to measure fairness objectively because people perceive different aspects of fairness differently.
This means that there will always be multiple ways to define fairness, and the metrics associated with each definition will give us different insights into how fair a model is.
Automating assessments around these definitions will therefore require us to think carefully about which aspects of fairness we want our models to satisfy, and then to use the appropriate metrics accordingly.
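A small worked example shows why the choice of metric matters. The records below are invented for illustration, and the two metrics (demographic parity and equal opportunity) are just two common definitions among many. On this toy data the model looks perfectly fair by one measure and clearly unfair by the other.

```python
# Toy records (made up for illustration): (true_label, predicted_label) per group.
PREDICTIONS = {
    "group_a": [(1, 1), (1, 1), (0, 0), (0, 0)],
    "group_b": [(1, 1), (1, 0), (0, 1), (0, 0)],
}


def selection_rate(rows):
    """Share of people who received a positive prediction."""
    return sum(pred for _, pred in rows) / len(rows)


def true_positive_rate(rows):
    """Share of truly positive people who were predicted positive."""
    preds_for_positives = [pred for true, pred in rows if true == 1]
    return sum(preds_for_positives) / len(preds_for_positives)


def gap(metric):
    """Absolute difference in a metric between the two groups."""
    return abs(metric(PREDICTIONS["group_a"]) - metric(PREDICTIONS["group_b"]))


demographic_parity_gap = gap(selection_rate)      # do groups get positives equally often?
equal_opportunity_gap = gap(true_positive_rate)   # are qualified people treated equally?

print("demographic parity gap:", demographic_parity_gap)
print("equal opportunity gap:", equal_opportunity_gap)
```

Both groups receive positive predictions at the same rate, so the demographic parity gap is 0.0, yet qualified members of `group_b` are approved only half as often, giving an equal opportunity gap of 0.5.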
The problem of automated fake news detection has been around for a while now, but it has gained more attention recently as the spread of misinformation on social media platforms like Facebook and Twitter continues to grow. Indeed, nearly 80% of Americans are reported to have seen fake news on the coronavirus outbreak.
Automated fake news detection is about answering the pervasive question: “How do I know if what I’m reading is real or not?” There are many different types of fake news out there — from conspiracy theories to distorted truths about current events.
And then there are stories that are just flat-out wrong, like articles claiming that Hillary Clinton ran a criminal ring out of a Washington DC pizzeria or that Parkland survivors were paid by George Soros to act as agents provocateurs for anti-gun groups.
These types of stories are particularly dangerous because they can lead people to take actions against their own interests or against those who have done nothing wrong (like threatening innocent people on Twitter).
What can be done about these problems? The answer is not simple, but it starts with developing better tools for detecting fake news so that people can make informed decisions about which sources they trust and which ones they don’t. With AutoML, accurate systems to detect fake news can be built more efficiently than ever before.
Traditionally, businesses would need to hire data science professionals proficient in tools like Python to build machine learning pipelines from scratch. Now, non-technical teams can use no-code AI tools like Akkio to build ML pipelines themselves. Highly technical teams often use open source tools like TPOT or Auto-sklearn, or proprietary tools like DataRobot or H2O.ai.
Let's look more closely at how businesses can apply AutoML tools, across sales, marketing, customer support, HR, and finance use cases.
In particular, let's discuss how AutoML can be used for lead scoring, forecasting, churn reduction, sales funnel optimization, and fraud detection.
Every salesperson knows the importance of qualifying leads. In fact, it’s one of the most important aspects of any sales process. But what if you could automate this? What if you could use AI to automatically score leads based on their data attributes? You can with AutoML.
Let's say you're a mortgage lender looking to close more deals. You might be spending countless hours prospecting and qualifying leads, and even wasting time and resources on prospects who don't pan out as clients.
But what if you could run an algorithm on your existing database and get instant feedback on which leads are worth calling and following up with? With AutoML, you can simply connect a historical dataset of leads, select a column on whether or not those leads converted, and a predictive lead scoring model will be made automatically. Further, the model can be deployed via a live API to score incoming leads instantly.
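Under the hood, a lead scoring model boils down to something like the sketch below. This is a minimal illustration, not what any AutoML platform actually runs: the historical leads are invented, the two feature columns are hypothetical, and a hand-written logistic regression stands in for whatever model a tool would select.

```python
import math

# Toy historical leads (made-up values): (site_visits, requested_quote, converted)
HISTORY = [
    (1, 0, 0), (2, 0, 0), (1, 0, 0), (3, 0, 0), (2, 1, 0),
    (4, 1, 1), (5, 1, 1), (6, 0, 1), (7, 1, 1), (5, 1, 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, epochs=5000, lr=0.1):
    """Fit weights by batch gradient descent on the log loss.

    The last value in each row is the label (the 'converted' column).
    """
    n_features = len(rows[0]) - 1
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for *features, label in rows:
            error = sigmoid(bias + sum(w * x for w, x in zip(weights, features))) - label
            grad_b += error
            for i, x in enumerate(features):
                grad_w[i] += error * x
        bias -= lr * grad_b / len(rows)
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
    return weights, bias


def score_lead(model, features):
    """Return the predicted probability (0 to 1) that a lead converts."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


model = train_logistic(HISTORY)
hot_lead = score_lead(model, (6, 1))   # many visits, asked for a quote
cold_lead = score_lead(model, (1, 0))  # one visit, no quote request
print(round(hot_lead, 2), round(cold_lead, 2))
```

A deployed scoring API would wrap `score_lead` behind an endpoint so that each incoming lead gets a probability the sales team can sort by.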
Forecasting sales is crucial to maintaining efficiency, particularly in times like the holidays or during new product launches.
The last thing any business wants is surprises during the busy holiday shopping season, especially when it comes to sales volume. However, many businesses struggle to predict how much inventory they'll need to order in advance of the holidays.
The same goes for forecasting demand for new products or services before they launch—or even knowing which channels will be best-positioned for success once they do launch (i.e. online vs brick-and-mortar).
But there's an easier way: With AutoML, you can effortlessly build machine learning models that can forecast demand based on historical trends and other variables.
This allows you to plan inventory levels accordingly so that you’re better prepared for peak demand periods throughout the year.
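To show what "forecasting from historical trends" can mean at its simplest, here is a sketch of a seasonal forecast with a growth adjustment. It is a deliberately basic baseline rather than what an AutoML tool would produce, and the quarterly sales figures are invented for the example.

```python
def seasonal_forecast(history, season_length, horizon):
    """Forecast each future period as the same period last season,
    scaled by the growth between the last two full seasons."""
    if len(history) < 2 * season_length:
        raise ValueError("need at least two full seasons of history")
    last = history[-season_length:]
    previous = history[-2 * season_length:-season_length]
    growth = sum(last) / sum(previous)
    return [last[i % season_length] * growth for i in range(horizon)]


# Toy quarterly unit sales (made up): two years, with a holiday spike in Q4.
quarterly_sales = [100, 120, 110, 200,
                   110, 132, 121, 220]

forecast = seasonal_forecast(quarterly_sales, season_length=4, horizon=4)
print([round(value, 1) for value in forecast])
```

Because the second year is 10% above the first, next year's holiday quarter is projected at roughly 242 units. A learned model can improve on this by also using other variables, such as promotions or pricing.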
In order to close a sale, you first have to get someone on the phone or into an online chat — but not everyone who contacts you will become a customer. That’s where AutoML comes in: by using machine learning algorithms to analyze customer data and trends, it can help determine which leads are most likely to convert and when those leads might become customers.
For example, let’s say that I own a small business selling office supplies online. I want my site visitors to know exactly how much each package costs before they place their orders (so that they don’t end up paying too much for shipping). But I also want as many people as possible to buy from me.
Using AutoML, I could create a model based on past purchases — both successful and unsuccessful — and use this model to calculate shipping costs automatically before a visitor places an order.
If my model predicts that someone is likely going over budget on shipping costs, then I could contact them directly via email or text message with alternative pricing options before their order goes through.
It helps build brand affinity among existing customers while also increasing the likelihood of new purchases down the line; it's a win-win all around.
AutoML can be used to predict which customers are most likely to churn and when they might do so.
This is a critical step in the sales process: you need to identify customers at risk of leaving, so you can take action before they do. On the flip side, you also need to identify your most loyal customers and figure out what makes them tick so that you can engage with them in the right way at the right time.
Churn prediction is important because it allows us to focus our efforts where we need them most: where we're making money (by converting buyers into repeat buyers), and where we're losing money (by losing buyers altogether).
In the world of fraud prediction, AutoML is being used by financial institutions to identify new potential fraud cases before they turn into full-blown incidents. This can be done by leveraging AutoML on historical financial data, such as transaction patterns.
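As a rough picture of what "learning from transaction patterns" involves, the sketch below flags amounts that sit far outside an account's usual range. It uses a simple robust outlier rule (the modified z-score based on the median) rather than a trained model, and the transaction amounts are invented for illustration.

```python
from statistics import median


def flag_outliers(amounts, threshold=3.5):
    """Return indices of amounts whose modified z-score exceeds `threshold`.

    The score uses the median and the median absolute deviation (MAD), so
    one extreme value cannot hide itself by inflating the average.
    """
    center = median(amounts)
    mad = median(abs(a - center) for a in amounts)
    if mad == 0:
        return []  # all amounts are (nearly) identical; nothing stands out
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - center) / mad > threshold
    ]


# Toy card transactions for one account (made-up amounts).
transactions = [42.0, 38.5, 45.0, 40.0, 41.5, 39.0, 980.0, 43.0]
suspicious = flag_outliers(transactions)
print(suspicious)
```

Here only the 980.00 charge (index 6) is flagged. A production system would combine many more signals, such as merchant, location, and timing, which is where automated model building becomes useful.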
Amidst the work-from-home era and the post-pandemic world, fraudsters have had a field day. From phishing scams to synthetic identity fraud, there has been no shortage of ways for bad actors to take advantage of unsuspecting victims.
For instance, with the rise of bring-your-own-device (BYOD) policies, many employees are now using personal devices for work purposes. This creates a new attack vector for fraudsters, who can exploit vulnerabilities in these devices to gain access to corporate data.
Similarly, the proliferation of remote work has led to an increase in the use of unsecured Wi-Fi networks. Fraudsters can take advantage of these networks to intercept data or launch man-in-the-middle attacks.
When it comes to payment fraud, the rise of new FinTech solutions like buy now, pay later (BNPL) services has created new opportunities for fraudsters. In many cases, these services are used to make online purchases without the need for a credit card. This makes it difficult for banks to identify fraudulent transactions, as there is no way to confirm whether the purchase was authorized by the account holder.
Financial institutions are looking for ways to reduce their risk profile as much as possible, which means they need to find new ways of identifying fraudulent activity. Using AutoML allows them to do just that—automate the identification of fraudulent activity before it becomes an issue.
Automated fraud detection isn’t a new concept (there are patents in fraud detection dating to the early 1990s), but AutoML-based approaches are gaining serious traction today with financial institutions.
From August 14, 2022 to August 18, 2022, the Sixth International Workshop on Automation in Machine Learning will be held in Washington, DC. This will be a great opportunity to learn more about the latest advances in AutoML solutions and to network with experts in the field.
Similar to previous workshops, topics will include:
- Hyperparameter autotuning, which can be used to automatically optimize machine learning algorithms
- Neural architecture search (NAS), which is a technique for automatically designing neural networks
- IoT and automation, which are two areas where AutoML can be applied
- Automated bias and misuse detection, which is important for ensuring fairness in machine learning
- Automated fake news detection, which is a problem that has only become more prevalent in recent years
In the post-pandemic world, AutoML will play an even more important role in solving real-world problems. Be sure to mark your calendars for AutoML 2022.
The ability to leverage ML in a variety of ways is opening up new possibilities for businesses and consumers alike. This is especially true for businesses that are looking to stay competitive in an increasingly dynamic market landscape.
From sales and marketing to customer support and HR, AutoML is providing businesses with the ability to automate and scale their operations. See how easy it can be to build and deploy AI models with an Akkio free trial.