The Machine Learning Boom
In 2013, the U.S. became the world’s top oil and gas producer, having overtaken Russian production of natural gas in 2011 and petroleum in 2013 [1]. Rents in boom towns like Williston, ND, for a time rivaled those in New York and San Francisco as thousands of workers poured into the area to support the thriving industry, which then “busted” when supply shocks drove oil and gas prices down. Hydraulic fracturing, or “fracking,” technology is one of the primary drivers of this incredible growth in U.S. energy output, allowing access to huge energy reserves that were not previously economical to tap. In today’s forward-thinking digital organizations, another technology, Machine Learning, is starting to have a similar impact.
In the digital landscape, one of the hotter topics this year is artificial intelligence (AI). My colleague Christian Saylor’s thoughtful post on the topic cites several noteworthy examples of AI on the rise. Machine Learning, the preeminent field in the practical AI realm, focuses on “algorithms that can learn from and perform predictive analysis on data” [2]. At a high level, the idea is to feed a Machine Learning algorithm a set of “training” data and let the algorithm infer a model that accurately predicts future results. The smaller the difference, or error, between the model’s predictions and the actual data, the better the model. Different algorithms exist for different problems. For example, a linear regression algorithm, which models predictions along a continuous scale of possible values, might take as input a variety of product attributes, historical sales, and even data such as weather to develop a model that anticipates future sales. A logistic regression algorithm might be applied to predict values belonging to a discrete set of classes, e.g., whether a particular bitmap image represents the character “7”.
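To make the training idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the temperature and sales figures are made up, the single "dark pixel share" feature is a hypothetical stand-in for real bitmap data, and the function names are my own. It fits a one-feature linear regression, measures the error between predictions and actual data, and then trains a tiny logistic regression classifier by gradient descent.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def mean_squared_error(predictions, actuals):
    """Average squared difference between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, labels, learning_rate=1.0, steps=3000):
    """One-feature logistic regression trained by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, label in zip(xs, labels):
            p = sigmoid(w * x + b)        # predicted probability of class 1
            grad_w += (p - label) * x     # gradient of the log loss w.r.t. w
            grad_b += (p - label)         # gradient of the log loss w.r.t. b
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def predict_class(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Made-up training data: daily temperature vs. units sold.
temps = [10.0, 15.0, 20.0, 25.0, 30.0]
sales = [12.0, 17.0, 24.0, 27.0, 35.0]
slope, intercept = fit_line(temps, sales)
training_error = mean_squared_error(
    [slope * t + intercept for t in temps], sales
)
forecast = slope * 35.0 + intercept  # predicted sales on a 35-degree day

# Made-up classification data: a hypothetical "dark pixel share" feature,
# labeled 1 when the image is a "7" and 0 otherwise.
dark_pixel_share = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
is_seven = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(dark_pixel_share, is_seven)
```

Calling `predict_class(w, b, 0.8)` then classifies a new, unseen value. A real system would use many features and a library such as scikit-learn rather than hand-rolled loops, but the shape of the process is the same: training data in, model out, error as the yardstick.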
The mathematical approaches underlying Machine Learning (matrix multiplication, linear regression, logistic regression, gradient descent, etc.) are not new; many date back to the early 19th century or before. So why are we only seeing Machine Learning boom now? These techniques often require a lot of data and a lot of computation, and in recent years both storage and computational capacity have increased rapidly. Data processing frameworks such as Hadoop and Spark have been developed to harness clusters of computers and are now available as affordable cloud services. Another factor contributing to the Machine Learning boom is the availability of good tools and libraries, such as MATLAB, R, and scikit-learn, for applying a wide variety of algorithms to data and inferring predictive models. Similarly, fracking has been around since 1947 but only recently came to the forefront of U.S. energy production. The pieces needed to harness Machine Learning have been coming together in recent years, and we are already seeing it have a clear impact on many of today’s challenging problems.
Machine Learning is already a part of your life. Search results in your browser [3] and the Siri experience on your phone [4] are driven by Machine Learning predictions based on vast amounts of prior data. Spam filters in services such as Gmail rely heavily on Machine Learning techniques [5] and have become so effective that we rarely worry about spam anymore. Just as fracking technology started as a procedure to improve the performance of existing oil and gas wells, and later found applications in the economical extraction of tight gas and oil shale, Machine Learning techniques are very quickly finding applications in a wide variety of fascinating problems. Consider the following:
- Computer Vision in Driverless Cars - The combination of advanced Machine Learning algorithms with computer vision image processing techniques is improving the accuracy of detecting pedestrians in autonomous driving systems.
- Improving Preventative Health Care - With Obamacare’s emphasis on preventative health care, many startups, such as CareSkore, are building models to predict which patients are most likely to skip appointments or intentionally fail to take medications, and using those insights to improve preventative care.
- Diabetic Retinopathy Detection - Various studies have tackled detection of Diabetic Retinopathy, a leading cause of blindness, using Machine Learning algorithms applied to retinal images [6]. More recently, over 600 teams participated in a challenge hosted on Kaggle.com to build a model for detecting Diabetic Retinopathy from retinal images taken during the screening process.
- Predicting School Drop Outs - The Indian government has partnered with Microsoft in a project which, among other goals, seeks to predict which students are most likely to drop out of school.
- Reducing Data Center Energy Consumption - Google applied Machine Learning from DeepMind, its AI research group, to reduce energy consumption at its data centers.
During the 2016 Q2 earnings call, Google’s CEO Sundar Pichai stated: “Machine Learning is the engine that will drive our future, and it’s already making our products better and helping users every day” [7]. One such example is Google Allo, a forthcoming instant messaging app that uses Machine Learning to suggest message responses. There is almost an “arms race” of sorts among some of the biggest tech giants (Google, Amazon, Microsoft and Apple) to see whose digital assistant (Google Now, Alexa, Cortana and Siri, respectively) can extract the most meaning from human interactions and provide the best consumer value. Machine Learning, of course, underlies the “brains” of these assistants. Expect to see Machine Learning become even more pervasive and drive competitive advantage in the products and services you use each day.
Fracking technology is not without controversy (opponents cite concerns with water purity and increased seismic activity), and neither is Machine Learning. Particularly when applied to personal data such as patient information, Machine Learning has raised the ire of privacy advocates. John Foreman, author of Data Smart, captures the concern as follows:
“Via Machine Learning, a person’s future actions can be predicted at the individual level with a high degree of confidence. No longer are you viewed as a member of a cohort. Now you are known individually by a computer so that you may be targeted surgically.” [8]
What if your medical insurance premiums were based on an algorithm’s predictions of your likely claims using your genetic characteristics and inferences of your lifestyle based on posts and images shared via social media? What if you were questioned by law enforcement because an algorithm predicted you were likely involved with a crime (like the “pre-crime” unit in Minority Report)? Stephen Hawking suggested that “full artificial intelligence could spell the end of the human race” [9]. Without getting too gloomy, there’s enough concern around Machine Learning techniques that Google has established an AI ethics board; unfortunately, we don’t know much about it. Anthony Goldbloom, co-founder of Kaggle, predicts Machine Learning will eventually displace jobs that consist primarily of “frequent, high volume tasks” [10].
While Machine Learning has its critics, one cannot deny the positive value that it is bringing to digital products every day. It has progressed from research labs to the mainstream at a rapid pace and is driving a boom of innovation that is still in its infancy. Organizations in today’s digital landscape must carefully consider how Machine Learning can provide a competitive advantage within their respective verticals, as well as how it can help enable a more holistic experience strategy. Organizations that choose not to implement Machine Learning could find themselves at a competitive disadvantage far sooner than they might have supposed.
[4] Per Tim Cook in the Apple 2016 Q2 earnings call.
[5] See the official Gmail blog entry “The mail you want, not the spam you don’t”.
[6] See “Retinal Blood Vessel Segmentation Using Line Operators and Support Vector Classification” and “Automated Detection and Differentiation of Drusen, Exudates, and Cotton-Wool Spots in Digital Color Fundus Photographs for Diabetic Retinopathy Diagnosis”.
[10] See his TED talk “The jobs we’ll lose to machines – and the ones we won’t”.