## 10 Machine Learning Algorithms to Know in 2023

Machine learning (ML) can do everything from analyzing X-rays to predicting stock market prices to recommending binge-worthy television shows. With such a wide range of applications, it’s not surprising that the global machine learning market is projected to grow from \$21.7 billion in 2022 to \$209.91 billion by 2029, according to Fortune Business Insights.

At the core of machine learning are algorithms, which are trained to become the machine learning models used to power some of the most impactful innovations in the world today.

In this article, you’ll learn about 10 of the most popular machine learning algorithms that you’ll want to know, and explore the different learning styles used to turn machine learning algorithms into functioning machine learning models.

## 10 machine learning algorithms to know

In simple terms, a machine learning algorithm is like a recipe that allows computers to learn and make predictions from data. Instead of explicitly telling the computer what to do, we provide it with a large amount of data and let it discover patterns, relationships, and insights on its own.

From classification to regression, here are 10 algorithms you need to know in the field of machine learning:

### 1. Linear regression

Linear regression is a supervised learning algorithm used for predicting and forecasting values that fall within a continuous range, such as sales numbers or housing prices. It is a technique derived from statistics and is commonly used to establish a relationship between an input variable (X) and an output variable (Y) that can be represented by a straight line.

In simple terms, linear regression takes a set of data points with known input and output values and finds the line that best fits those points. This line, known as the “regression line,” serves as a predictive model. By using this line, we can estimate or predict the output value (Y) for a given input value (X).

Linear regression is primarily used for predictive modeling rather than categorization. It is useful when we want to understand how changes in the input variable affect the output variable. By analyzing the slope and intercept of the regression line, we can gain insights into the relationship between the variables and make predictions based on this understanding.
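
As a rough illustration of finding the best-fit line, here is a minimal sketch of ordinary least squares for a single input variable. The function name `fit_line` and the housing numbers are made up for this example and are not from any particular library or dataset.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size (hundreds of square feet) and price (thousands of dollars)
sizes = [10, 15, 20, 25, 30]
prices = [150, 200, 250, 300, 350]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 22 + intercept  # estimate Y for a new X
print(slope, intercept, predicted_price)
```

The slope says how much the output changes per unit of input, and the intercept is the predicted output when the input is zero, which are the two quantities the paragraph above refers to.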

### 2. Logistic regression

Logistic regression, also known as “logit regression,” is a supervised learning algorithm primarily used for binary classification tasks. It is commonly employed when we want to determine whether an input belongs to one class or another, such as deciding whether an image is a cat or not a cat.

Logistic regression predicts the probability that an input can be categorized into a single primary class. However, in practice, it is commonly used to group outputs into two categories: the primary class and not the primary class. To accomplish this, logistic regression creates a threshold or boundary for binary classification. For example, any output value between 0 and 0.49 might be classified as one group, while values between 0.50 and 1.00 would be classified as the other group.

Consequently, logistic regression is typically used for binary categorization rather than predictive modeling. It enables us to assign input data to one of two classes based on the probability estimate and a defined threshold. This makes logistic regression a powerful tool for tasks such as image recognition, spam email detection, or medical diagnosis where we need to categorize data into distinct classes.
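
The probability-plus-threshold idea can be sketched in a few lines. This is a minimal illustration that assumes a single input feature and plain gradient descent on the log-loss. The data, learning rate, and function names are hypothetical choices for the example.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit weight w and bias b by gradient descent on the average log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict_probability(x, w, b):
    return sigmoid(w * x + b)


def classify(x, w, b, threshold=0.5):
    """Apply the threshold: 1 for the primary class, 0 otherwise."""
    return 1 if predict_probability(x, w, b) >= threshold else 0


# Hypothetical data: hours studied and whether the student passed (1) or not (0)
hours = [1, 2, 3, 5, 6, 7]
passed = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(hours, passed)
print([classify(x, w, b) for x in hours])
```

The threshold of 0.5 mirrors the example in the text. In practice it can be moved up or down depending on which kind of misclassification is more costly.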

### 3. Naive Bayes

Naive Bayes is a set of supervised learning algorithms used to create predictive models for binary or multi-class classification tasks. It is based on Bayes’ Theorem and operates on conditional probabilities, which estimate the likelihood of a classification based on the combined factors while assuming independence between them.

Let’s consider a program that identifies plants using a Naive Bayes algorithm. The algorithm takes into account specific factors such as perceived size, color, and shape to categorize images of plants. Although each of these factors is considered independently, the algorithm combines them to assess the probability of an object being a particular plant.

Naive Bayes leverages the assumption of independence among the factors, which simplifies the calculations and allows the algorithm to work efficiently with large datasets. It is particularly well-suited for tasks like document classification, email spam filtering, sentiment analysis, and many other applications where the factors can be considered separately but still contribute to the overall classification.
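
To make the plant example concrete, here is a minimal sketch of a Naive Bayes classifier for categorical features. It assumes add-one (Laplace) smoothing, which the text above does not mention, and the plant records and function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count how often each feature value appears for each class."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i][value] += 1
            feature_values[i].add(value)
    return class_counts, feature_counts, feature_values


def nb_predict(model, row):
    """Return the class with the highest log-probability for this row."""
    class_counts, feature_counts, feature_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_prob = math.log(count / total)  # prior probability of the class
        for i, value in enumerate(row):
            # Each factor is treated independently; add-one smoothing avoids zeros
            numerator = feature_counts[label][i][value] + 1
            denominator = count + len(feature_values[i])
            log_prob += math.log(numerator / denominator)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Hypothetical observations: (size, color, shape) -> plant
rows = [
    ("small", "red", "round"),
    ("small", "red", "oval"),
    ("large", "green", "feathery"),
    ("large", "green", "oval"),
    ("small", "green", "feathery"),
]
labels = ["rose", "rose", "fern", "fern", "fern"]

model = train_naive_bayes(rows, labels)
print(nb_predict(model, ("small", "red", "round")))
print(nb_predict(model, ("large", "green", "feathery")))
```

Because each factor contributes its own independent term, training only requires counting, which is why the approach scales well to large datasets.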

### 4. Decision tree

A decision tree is a supervised learning algorithm used for classification and predictive modeling tasks. It resembles a flowchart, starting with a root node that asks a specific question about the data. Based on the answer, the data is directed down different branches to subsequent internal nodes, which ask further questions and guide the data to subsequent branches. This process continues until the data reaches an end node, also known as a leaf node, where no further branching occurs.

Decision tree algorithms are popular in machine learning because they can handle complex datasets with ease and simplicity. The algorithm’s structure makes it straightforward to understand and interpret the decision-making process. By asking a sequence of questions and following the corresponding branches, decision trees enable us to classify or predict outcomes based on the data’s characteristics.

This simplicity and interpretability make decision trees valuable for various applications in machine learning, especially when dealing with complex datasets.
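
The flowchart structure can be sketched as nested dictionaries. The tree below is written by hand purely for illustration. A real decision tree algorithm would learn which questions to ask from training data, which this sketch does not attempt, and the feature names and labels are hypothetical.

```python
# Internal nodes ask about one feature; leaf nodes hold the final prediction
tree = {
    "feature": "outlook",  # root node
    "branches": {
        "sunny": {
            "feature": "humidity",
            "branches": {
                "high": {"leaf": "stay in"},
                "normal": {"leaf": "play"},
            },
        },
        "overcast": {"leaf": "play"},
        "rainy": {
            "feature": "windy",
            "branches": {
                True: {"leaf": "stay in"},
                False: {"leaf": "play"},
            },
        },
    },
}


def tree_predict(node, sample):
    """Follow the branches that match the sample until a leaf is reached."""
    while "leaf" not in node:
        answer = sample[node["feature"]]
        node = node["branches"][answer]
    return node["leaf"]


print(tree_predict(tree, {"outlook": "sunny", "humidity": "high"}))
print(tree_predict(tree, {"outlook": "rainy", "windy": False}))
```

Reading a prediction back is just a matter of listing the questions along the path, which is where the interpretability of decision trees comes from.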

### 5. Random forest

A random forest algorithm is an ensemble of decision trees used for classification and predictive modeling. Instead of relying on a single decision tree, a random forest combines the predictions from multiple decision trees to make more accurate predictions.

In a random forest, numerous decision tree algorithms (sometimes hundreds or even thousands) are individually trained using different random samples from the training dataset. This sampling method is called “bagging.” Each decision tree is trained independently on its respective random sample.

Once trained, the random forest takes the same data and feeds it into each decision tree. Each tree produces a prediction, and the random forest tallies the results. The most common prediction among all the decision trees is then selected as the final prediction for the dataset.

Random forests address a common issue called “overfitting” that can occur with individual decision trees. Overfitting happens when a decision tree becomes too closely aligned with its training data, making it less accurate when presented with new data.
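
The bagging-and-voting procedure can be sketched as follows. To keep the example short, each "tree" is a one-question stump on a single numeric feature, and the random feature selection that full random forests typically also use is left out. The data, the fixed seed, and the function names are assumptions made for this illustration.

```python
import random
from collections import Counter


def train_stump(points):
    """Fit a one-question tree on (value, label) pairs: is value <= threshold?"""
    best = None
    for threshold in sorted({value for value, _ in points}):
        left = [label for value, label in points if value <= threshold]
        right = [label for value, label in points if value > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label for that branch
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(l != left_label for l in left) + sum(l != right_label for l in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return best[1:]


def stump_predict(stump, value):
    threshold, left_label, right_label = stump
    return left_label if value <= threshold else right_label


def train_forest(points, n_trees=25, seed=0):
    """Train each stump on its own bootstrap sample (bagging)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(points) for _ in points]  # sample with replacement
        forest.append(train_stump(sample))
    return forest


def forest_predict(forest, value):
    """Tally the votes and return the most common prediction."""
    votes = Counter(stump_predict(stump, value) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled data: (measurement, class)
points = [(1, "low"), (2, "low"), (3, "low"), (4, "low"),
          (7, "high"), (8, "high"), (9, "high"), (10, "high")]

forest = train_forest(points)
print(forest_predict(forest, 0.5), forest_predict(forest, 11))
```

Because every stump sees a slightly different sample, individual quirks tend to be outvoted, which is the intuition behind the reduced overfitting described above.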

### 6. K-nearest neighbor (KNN)

K-nearest neighbor (KNN) is a supervised learning algorithm commonly used for classification and predictive modeling tasks. The name “K-nearest neighbor” reflects the algorithm’s approach of classifying an output based on its proximity to other data points on a graph.

Let’s say we have a dataset with labeled points, some marked as blue and others as red. When we want to classify a new data point, KNN looks at its nearest neighbors in the graph. The “K” in KNN refers to the number of nearest neighbors considered. For example, if K is set to 5, the algorithm looks at the 5 closest points to the new data point.

Based on the majority of the labels among the K nearest neighbors, the algorithm assigns a classification to the new data point. For instance, if most of the nearest neighbors are blue points, the algorithm classifies the new point as belonging to the blue group.

Additionally, KNN can also be used for prediction tasks. Instead of assigning a class label, KNN can estimate the value of an unknown data point based on the average or median of its K nearest neighbors.
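
Both uses of KNN can be sketched with the same neighbor search. This minimal example assumes Euclidean distance (via `math.dist`, available in Python 3.8 and later). The points, values, and function names are hypothetical.

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """Return the k training items closest to the query point."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]


def knn_classify(train, query, k=5):
    """Assign the majority label among the k nearest neighbors."""
    labels = [label for _, label in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Estimate a value as the average of the k nearest neighbors' values."""
    values = [value for _, value in k_nearest(train, query, k)]
    return sum(values) / len(values)


# Hypothetical labeled points: blue cluster near (1, 1), red cluster near (5, 5)
colored = [((1, 1), "blue"), ((1, 2), "blue"), ((2, 1), "blue"), ((2, 2), "blue"),
           ((5, 5), "red"), ((5, 6), "red"), ((6, 5), "red"), ((6, 6), "red")]
print(knn_classify(colored, (1.5, 1.5), k=5))

# Hypothetical numeric targets for the prediction use case
valued = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]
print(knn_regress(valued, (2,), k=3))
```

Note that KNN has no separate training step in this sketch. All of the work happens at prediction time, when distances to the stored points are computed.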

### 7. K-means

K-means is an unsupervised learning algorithm commonly used for clustering and pattern recognition tasks. It aims to group data points based on their proximity to one another. Similar to K-nearest neighbor (KNN), K-means utilizes the concept of proximity to identify patterns or clusters in the data.

Each of the clusters is defined by a centroid, a real or imaginary center point for the cluster. K-means can falter when handling outliers, because a single distant point can pull a centroid away from the bulk of its cluster.

K-means is particularly useful for large datasets and can provide insights into the inherent structure of the data by grouping similar points together. It has applications in various fields such as customer segmentation, image compression, and anomaly detection.
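
The centroid idea can be sketched with the common assign-then-update loop (often called Lloyd's algorithm). This is a minimal illustration that assumes the starting centroids are supplied by hand. Real implementations usually pick them randomly or with a seeding heuristic, and the points below are invented.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # An empty cluster keeps its previous centroid
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 2-D data with two visible groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
initial_centroids = [(1, 1), (9, 8)]  # assumed starting guesses

centroids, clusters = k_means(points, initial_centroids)
print(centroids)
print(clusters)
```

The returned centroids are "imaginary" center points in the sense used above: they are averages and need not coincide with any actual data point.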

### 8. Support vector machine (SVM)

A support vector machine (SVM) is a supervised learning algorithm commonly used for classification and predictive modeling tasks. SVM algorithms are popular because they are reliable and can work well even with a small amount of data. SVM algorithms work by creating a decision boundary called a “hyperplane.” In two-dimensional space, this hyperplane is like a line that separates two sets of labeled data.

The goal of SVM is to find the best possible decision boundary by maximizing the margin between the two sets of labeled data. It looks for the widest gap or space between the classes. A new data point is then classified according to which side of the decision boundary it falls on.

It’s important to note that a hyperplane is a line only in two dimensions. In three-dimensional space it is a flat plane, and in higher-dimensional feature spaces it generalizes in the same way, which allows SVM to handle more complex patterns and relationships in the data.
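
The decision rule and the margin can be sketched without training anything. In this minimal example the hyperplane parameters `w` and `b` are picked by hand so that the closest point of each class sits exactly on the margin. A real SVM would learn them by solving an optimization problem, which is not shown here, and the data points are hypothetical.

```python
import math

# Hypothetical hyperplane w . x + b = 0, chosen by hand for illustration (not learned)
w = (1.0, 1.0)
b = -6.0


def decision_value(point):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b


def svm_classify(point):
    return "positive" if decision_value(point) >= 0 else "negative"


# Hypothetical training points; (2, 3) and (3, 4) are the closest pair across classes
negatives = [(1, 1), (2, 3)]
positives = [(3, 4), (5, 5)]

# With the scores of the closest points scaled to -1 and +1,
# the width of the gap between the classes is 2 / ||w||
margin_width = 2 / math.hypot(*w)

print([decision_value(p) for p in negatives + positives])
print(margin_width)
print(svm_classify((4, 4)), svm_classify((1, 2)))
```

The points whose scores are exactly -1 and +1 are the "support vectors". They alone determine where the boundary sits, which is one reason SVMs can work with relatively little data.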

### 9. Apriori

Apriori is an unsupervised learning algorithm used for predictive modeling, particularly in the field of association rule mining.

The Apriori algorithm was initially proposed in the early 1990s as a way to discover association rules between itemsets. It is commonly used in pattern recognition and prediction tasks, such as understanding a consumer’s likelihood of purchasing one product after buying another.

The Apriori algorithm works by examining transactional data stored in a relational database. It identifies frequent itemsets, which are combinations of items that often occur together in transactions. These itemsets are then used to generate association rules. For example, if customers frequently buy product A and product B together, an association rule can be generated to suggest that purchasing A increases the likelihood of buying B.

By applying the Apriori algorithm, analysts can uncover valuable insights from transactional data, enabling them to make predictions or recommendations based on observed patterns of itemset associations.
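
The frequent-itemset search and the rule it produces can be sketched as follows. This is a minimal, unoptimized illustration. The shopping baskets, the `min_support` value, and the function names are assumptions for the example, and "confidence" is the usual name for the conditional likelihood described above.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting the minimum support."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {s: support(s) for s in current}
        level = {s: sup for s, sup in level.items() if sup >= min_support}
        frequent.update(level)
        k += 1
        # Join frequent itemsets into size-k candidates, then prune any candidate
        # that has an infrequent subset (the Apriori principle)
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        current = {c for c in candidates
                   if all(frozenset(sub) in level for sub in combinations(c, k - 1))}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Estimated likelihood of the consequent given the antecedent."""
    return frequent[antecedent | consequent] / frequent[antecedent]


# Hypothetical shopping baskets
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

frequent = apriori(transactions, min_support=0.4)
rule_confidence = confidence(frequent, frozenset({"bread"}), frozenset({"milk"}))
print(sorted((sorted(s), sup) for s, sup in frequent.items()))
print(rule_confidence)
```

In this toy data, bread and milk appear together in three of five baskets, and three of the four bread baskets also contain milk, so the rule "bread implies milk" has a confidence of about 0.75.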

### 10. Gradient boosting

Gradient boosting algorithms employ an ensemble method, which means they create a series of “weak” models that are iteratively improved upon to form a strong predictive model. The iterative process gradually reduces the errors made by the models, leading to a more accurate final model.

The algorithm starts with a simple, naive model that may make basic assumptions, such as classifying data based on whether it is above or below the mean. This initial model serves as a starting point.

In each iteration, the algorithm builds a new model that focuses on correcting the mistakes made by the previous models. It identifies the patterns or relationships that the previous models struggled to capture and incorporates them into the new model.

Gradient boosting is effective in handling complex problems and large datasets. It can capture intricate patterns and dependencies that may be missed by a single model. By combining the predictions from multiple models, gradient boosting produces a powerful predictive model.
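
The iterative error-correction loop can be sketched for a regression task. This minimal example assumes squared-error loss, where the quantity each new model fits is simply the residual (actual minus predicted), and it uses one-split "stumps" as the weak models. The naive starting model here predicts the mean, and the data, learning rate, and function names are all hypothetical.

```python
def fit_stump(xs, targets):
    """Weak model: one threshold, predicting the mean target on each side."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the largest value so the right side is never empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    base = sum(ys) / len(ys)  # naive starting model: always predict the mean
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Each new stump is fitted to the mistakes of the model so far
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def boosted_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mean_squared_error(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)


# Hypothetical data with a jump between x = 3 and x = 4
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 3.2]

model = gradient_boost(xs, ys)
mse_baseline = mean_squared_error(ys, [sum(ys) / len(ys)] * len(ys))
mse_boosted = mean_squared_error(ys, [boosted_predict(model, x) for x in xs])
print(mse_baseline, mse_boosted)
```

The learning rate scales down each correction so that no single weak model dominates. Smaller values generally need more rounds but tend to be less prone to overfitting.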


## Machine Learning and Higher Education

“Software is eating the world,” so said Marc Andreessen in 2011.1 These days it seems that machine learning and its specialized algorithms are eating the software world.2 Is it thus a foregone conclusion that machine learning will play a significant role in disrupting technology and shaping our future?

Machine learning concerns teaching machines to learn about something without explicit programming. At the core of machine learning is the idea of modeling and extracting useful information out of data. Societal trends clearly point to data as the resource of the future. Colleges and universities are already swimming in data, and there is much more on the way. Imagine a future in which computers are everywhere and interconnected with everything from clothes to refrigerators, phones, vending machines, and more. Some people have even proposed equipping toilets with sensors that collect data.3 Storing those data will be very cheap.4 These interconnected devices will produce quantities of data that are too large for human analysis, requiring us to teach computers to look for patterns in the data, identify predictor variables, and even try to predict outcomes from those variables.

Organizations that adapt and adopt machine learning will have a bright future. Machine learning is a new tool in the box, and it is worth learning how to use.5 Colleges, universities, and other educational institutions often adopt disruptive technologies in novel ways and are therefore in a good position to use machine learning to improve higher education. Adopting a machine learning–centric data-science approach as a tool for administrators and faculty could be a game changer for higher education.

Before we discuss machine learning further, it is important to briefly discuss analytics and traditional statistics. Not all predictive analytics needs to be done with machine learning. The traditional methods here are statistical methods such as time series forecasting or various forms of regression, which have been used successfully in many fields for years. In this article, at a very high level, we use analytics to mean the predictive-analytics side of machine learning that relies on training algorithms with a labeled training set, otherwise known as supervised learning. A common example is weather.6 Suppose we are interested in predicting sunny days. We can do this by feeding the recorded conditions from our entire data set into an algorithm that looks at days that were sunny and days that were not. Once trained, the model can be fed new data and make guesses about whether a day will be sunny. For our purposes, we are interested in using supervised methods to make predictions and unsupervised methods such as clustering to find patterns in the data that we might not have seen.

It is important to discuss the potential benefits and recommendations for pursuing machine learning as a tool for educational experts. In addition, it is important to note potential limitations and ethical considerations. Although an in-depth discussion is beyond the scope of this article, our hope is to start a conversation among higher education administrators, faculty, and IT specialists regarding the potential of machine learning to help make more-informed and better decisions — in other words, get people interested in machine learning to try it and see how things go. We are practicing what we advocate in this article. Heath Yates is actively exploring new algorithmic approaches to machine learning, while Craig Chamberlain is applying machine learning to data in higher education.

## Potential Benefits of Machine Learning in Higher Education

Our interest in machine learning began by doing some very simple clustering analysis parallel to k-nearest neighbor (kNN). Such techniques as kNN can assist in finding patterns in larger data for analysts. During the 2016–17 year, Chamberlain was approached by his university to look at a question posed by a donor: “Can we identify a group of students who need an additional scholarship that would eventually lead to increased retention?” After spending time with several data sets and after a lot of research, Chamberlain and his team identified a group of students who needed additional money to remain enrolled. At the time, many believed that increasing retention for this group was a long shot. However, after awarding these students additional scholarships, retention rose from approximately 64% to about 90%. This effort has had two distinct benefits. The most important is that it contributed to the continued success of those students. The second is that it resulted in about \$200,000 in additional net tuition revenue from an investment of about \$50,000 in scholarships. By conducting basic machine learning to find patterns in the data and testing hypotheses, Chamberlain and his team were able to help students and the university. Although this use case is simple and nascent and relied on some traditional statistical inference, once machine learning and education begin interacting more often, this simple example can evolve into larger data sets with large solutions.

Although analytics is relatively widespread, we believe higher education has barely scratched the surface of the potential for machine learning. At the same time, we do not mean to suggest that no one is doing this kind of work. Rather, we believe there is room to grow in this area. Because Chamberlain works as an analyst in higher education, specifically in enrollment management, he has seen substantial market potential for data science and machine learning. From student recruitment and success to curricular modeling and student-to-faculty ratios, large quantities of data go unused. Across the country, only a few consultants are using data science to assess student recruitment and success, which often results in a one-size-fits-all approach to recruiting, awarding financial aid, and measuring student success. Each graduating high school senior has numerous data points to assess, including location, grades, and parent income. Machine learning can assess data for each student and determine the likelihood that the student will enroll. Once a student enrolls, even more data points can be assessed, such as the living situation, grade on the first calculus exam, and major. Using machine learning, universities can then home in on student retention and persistence and identify factors that influence student success.

Machine learning could potentially be used to look for patterns on a campus-wide level. Are there conditional probabilities or cluster analyses that suggest a pattern for passing a statistics course? Suppose, for example, that students who earn high marks in math classes are more likely to pass a statistics course. This seems obvious, but machine learning can provide a methodology to confirm or refute this belief.
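
As a rough sketch of what checking such a belief might look like, the snippet below compares conditional pass rates. The records are entirely made up for illustration and do not come from any institution. A real analysis would use far more data and a proper statistical test.

```python
# Hypothetical student records: (earned high marks in math, passed statistics)
records = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, True),
]


def pass_rate(records, high_math):
    """Estimate P(passed statistics | high math marks == high_math)."""
    outcomes = [passed for math_marks, passed in records if math_marks == high_math]
    return sum(outcomes) / len(outcomes)


rate_high = pass_rate(records, high_math=True)
rate_other = pass_rate(records, high_math=False)
print(rate_high, rate_other)
```

If the two rates were close, the "obvious" belief would deserve a second look. A large gap would support it, subject to sample size and confounding factors.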

How could university leadership use this information to increase retention and student success? Consider, for example, a correlation between taking particular courses that are not prerequisites for statistics and doing well in statistics. Using machine learning in exploratory data analysis might help find these kinds of patterns.

Kansas City uses machine learning to prevent potholes before they even form.7 Colleges and universities could consider using machine learning as a preventative tool as well. If an institution maintains detailed records on IT purchases and equipment, machine learning could be applied to IT equipment maintenance or maintenance in general.

For higher education, experts are going to need machine learning and people able to understand these algorithms to make better business decisions. Currently, many universities do not have a chief data scientist or a team of experts to apply machine learning in an official capacity. Therefore, many universities are missing opportunities that machine learning provides. We suspect that the institutions that are using machine learning are not talking about it much, and we encourage them to reach out to us and others to share their successes and challenges.

## Recommendations for Adopting Machine Learning

Getting started with machine learning is not as difficult as some might imagine or claim. Universities, colleges, and other educational institutions are in a good position to adopt, start, grow, and implement machine learning projects, given their access to faculty who have mathematical, statistical, and computer science backgrounds. We offer the following high-level recommendations on how to implement machine learning projects at the university level.

### Set Clear Expectations of Institutional Needs, Goals, and Requirements

Administrators and faculty should brainstorm about institutional needs that machine learning can help address. Start small with a very narrow question. For example, it might be useful to predict who is most likely to pass a certain difficult class. Are there discernable patterns that can help predict which students will pass calculus? Can machine learning predict enrollment in specific classes? Conversely, are there patterns in institutional data that can help predict which students are likely to earn degrees by using clustering analysis of some type? Also, be sure to have a goal for the potential findings. How can the university use these results to enhance students’ success, boost retention, and enhance student enrollment?

### Temper with Realism

Find out if a faculty member or other expert at your institution or nearby can offer an informed opinion about whether the questions being asked can be answered — can the problem be solved by machine learning? Some problems are easy and inexpensive to solve, and others are not. If not, consult with the expert and go back to the drawing board. Make sure you have individuals who can do the proposed work — typically someone with a mathematics, statistics, and programming background. Industry refers to individuals who possess this combination of skills as “magical unicorns.”8 This is where being in higher learning pays off — these talents are usually close by, if not in one person then definitely in a group of people. The challenge for administrators is to be the bridge for people to cross traditional boundaries and make sure people involved pursue this as an interdisciplinary approach on behalf of the institution.

### Consider Finances

Can your institution afford to hire a full-time data scientist? In many cases, this might not be an option, given the salaries that such individuals command.9 A reasonable alternative is to put together a diverse, interdisciplinary team of volunteers who agree to do this work. The cost, so to speak, is whatever commitment the institution is willing to make in time that members of that team would otherwise spend on other activities. It is also important to be aware that some investment in computing or storage may be required, although depending on how up-to-date your institution’s technology infrastructure is, this might not be necessary.

### Realize That This Work Takes Time and Can Be Complicated

Being realistic about the overhead upfront tempers expectations. Your institution may have loads of historical data, but they might be in legacy systems or there may be technical hurdles that make it difficult to access those data easily. The value is there, but it may take time to come up with a solution that is easy for everyone to use. Depending on questions administrators or faculty ask, it may also take some time to do proper data analysis. The key is to be patient and strategic. If you commit to doing machine learning, play the long game.

### Understand That Security and Privacy Are Paramount to Machine Learning

There are likely local, state, and federal guidelines and laws that educational institutions must adhere to in order to safeguard their data. Before moving further on machine learning, all data should be as secure as possible. In addition, privacy of individuals must be protected. Most industries process, clean, and store data so that no individually discernable information may be gleaned out of it.

### Do Machine Learning

The combination of imaginative, creative, and capable people means that new applications, innovations, and benefits are being found very quickly. If you do not have a group of individuals who can currently do machine learning, then find people interested and invest in them to do it. There are many resources in online learning and education to teach data science. Many of these resources are free, and some offer certifications at a reasonable price. The bar of entry is lower than you might think. Many of the programming and data science tools are free.

## Ethical Considerations and Limitations

While we believe in the future of machine learning, it always pays to be cautious when adopting new technology. Machine learning is powerful. As the saying goes, with great power comes great responsibility. Often, in the excitement, it can be easy to lose sight of the downsides of a new technology or tool. Machine learning provides analysts and decision makers with previously undreamed powers due to its ability to find patterns, make predictions, and draw inferences. The examples below can serve as cautionary tales and motivate questions regarding ethical considerations and the potential limitations of machine learning. In other words, just because we can do something does not imply or suggest we must. Any machine learning project should respect the institution’s policies and mission.

### Respect Privacy

One of the earliest examples of using machine learning in predictive analytics came about from an incident in which Target sent coupons to a woman it determined was likely to be pregnant. The story goes something like this: An indignant father went to Target and complained to management that it had sent his teenage daughter coupons with advertisements for maternity clothing and baby furniture. The store management apologized, but the father later contacted them and offered his own apology when he learned his daughter was indeed pregnant.10 How can machine learning techniques determine that a woman is pregnant? The short answer is that we are creatures of habit. Therefore, human behavior and patterns in data collected by companies can be used to identify emerging trends, such as pregnancy.11 The open question is whether we always should.

### Consider the Implications

More recently, machine-learning techniques were successfully used to infer the sexual orientation of individuals based on their facial features using data from dating websites. Specifically, two researchers from Stanford University trained an AI system to detect patterns in facial features and use them to identify the sexual orientation of a random man (with 81% accuracy) and of a random woman (with 71% accuracy).12 This is much higher than the reported capability of humans. This research has generated questions of whether such capabilities mean it is also possible to infer a person’s political orientation and IQ from their appearance.13

Currently, these are cutting-edge findings and research. Frankly, we are somewhat skeptical of the findings as a new form of pseudoscience. We present them as probable use cases where machine learning should not have even been applied to begin with. Machine learning should serve the mission of higher education to reduce bias and prejudice in human society, not potentially promote it.

### Insist on Appropriate Goals

Some controversial work has appeared in detecting criminality based on facial features. Researchers demonstrated that it was possible to infer criminality based solely on still face images using common machine learning techniques.14 Academia and media alike harshly criticized the findings as a new form of craniometrics and pseudoscience.15 One risk of data science is that it can produce hard-to-understand artificial intelligence systems built on questionable or pseudoscientific ideas.

These examples lead to one invariable and fundamental conclusion regarding the ethical implications of machine learning: We must be careful that machine learning is not abused, resulting in either intentional or unintentional biases or exclusionary analyses, predictions, and artificial intelligence systems. Some nascent research demonstrates that this is possible — research has shown that political leanings can influence how an artificial intelligence system might pick synonyms for political hashtags.16

These examples are not intended to create fear or dissuade readers from pursuing machine learning. Rather, we hope to generate a positive discussion about machine learning and how it can be carefully, responsibly, and maturely applied. Colleges, universities, and other educational institutions should define clear standards so that machine learning projects do not violate ethical standards and stay true to institutional goals and high standards. In fact, this is an opportunity for higher education to lead society by doing things the right way. One way to address many of these issues is to recruit a diverse, inclusive team of experts to analyze data carefully in an ethical and sound way. This is an easy and natural strength for universities, colleges, and other academic organizations.

## Conclusion

Machine learning shows great potential to disrupt how we process and consume data and use software. Serious ethical considerations and limitations must be considered. However, higher education is naturally and uniquely positioned to capitalize on the promise of machine learning by using it as a tool for social and moral good. Higher education has the opportunity not only to use machine learning to help transform itself to make better decisions but also to explore how it might apply machine learning as a force for good. How can machine learning relate to and benefit higher education? Considering the trend towards automation in technology as a guide, we believe that the answer, ultimately, is in everything.

## An Introduction to Machine Learning

### Introduction

Machine learning is a subfield of artificial intelligence (AI). The goal of machine learning generally is to understand the structure of data and fit that data into models that can be understood and utilized by people.

Although machine learning is a field within computer science, it differs from traditional computational approaches. In traditional computing, algorithms are sets of explicitly programmed instructions used by computers to calculate or solve problems. Machine learning algorithms instead allow computers to train on data inputs and use statistical analysis in order to output values that fall within a specific range. Because of this, machine learning helps computers build models from sample data in order to automate decision-making processes based on data inputs.

Any technology user today has benefitted from machine learning. Facial recognition technology allows social media platforms to help users tag and share photos of friends. Optical character recognition (OCR) technology converts images of text into movable type. Recommendation engines, powered by machine learning, suggest what movies or television shows to watch next based on user preferences. Self-driving cars that rely on machine learning to navigate may soon be available to consumers.

Machine learning is a continuously developing field. Because of this, there are some considerations to keep in mind as you work with machine learning methodologies, or analyze the impact of machine learning processes.

In this tutorial, we’ll look into the common machine learning methods of supervised and unsupervised learning, and common algorithmic approaches in machine learning, including the k-nearest neighbor algorithm, decision tree learning, and deep learning. We’ll explore which programming languages are most used in machine learning, providing you with some of the positive and negative attributes of each. Additionally, we’ll discuss biases that are perpetuated by machine learning algorithms, and consider what can be kept in mind to prevent these biases when building algorithms.

## Machine Learning Methods

In machine learning, tasks are generally classified into broad categories. These categories are based on how learning is received or how feedback on the learning is given to the system developed.

Two of the most widely adopted machine learning methods are supervised learning, which trains algorithms on example input and output data labeled by humans, and unsupervised learning, which provides the algorithm with no labeled data so that it can find structure within its input data. Let’s explore these methods in more detail.

### Supervised Learning

In supervised learning, the computer is provided with example inputs that are labeled with their desired outputs. The purpose of this method is for the algorithm to be able to “learn” by comparing its actual output with the “taught” outputs to find errors, and modify the model accordingly. Supervised learning therefore uses patterns to predict label values on additional unlabeled data.

For example, with supervised learning, an algorithm may be fed data with images of sharks labeled as `fish` and images of oceans labeled as `water`. By being trained on this data, the supervised learning algorithm should be able to later identify unlabeled shark images as `fish` and unlabeled ocean images as `water`.

A common use case of supervised learning is to use historical data to predict statistically likely future events. It may use historical stock market information to anticipate upcoming fluctuations, or be employed to filter out spam emails. In supervised learning, tagged photos of dogs can be used as input data to classify untagged photos of dogs.
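To make the "compare, find errors, modify the model" loop concrete, here is a minimal sketch of a perceptron, one of the simplest supervised learners. It is an illustration rather than a production spam filter: the feature choice (number of links and exclamation marks per email), the data values, and the function names are all assumptions made up for this example.

```python
def predict(weights, bias, features):
    """Return 1 if the weighted sum of the features is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Learn weights by correcting the model each time it mislabels an example."""
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # Compare the model's output with the "taught" label.
            error = label - predict(weights, bias, features)  # 0 when correct
            # Nudge the model in the direction that reduces the error.
            for i, x in enumerate(features):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    return weights, bias


# Hypothetical labeled data: (links, exclamation marks) -> 1 = spam, 0 = not spam
training_data = [
    ((5, 4), 1), ((6, 3), 1), ((4, 5), 1),
    ((0, 1), 0), ((1, 0), 0), ((1, 1), 0),
]

weights, bias = train_perceptron(training_data)

# Classify emails the model has never seen.
print(predict(weights, bias, (5, 5)))  # 1 (spam)
print(predict(weights, bias, (0, 0)))  # 0 (not spam)
```

The labels do all the teaching here: the weights only change when a prediction disagrees with the label attached to a training example.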

### Unsupervised Learning

In unsupervised learning, data is unlabeled, so the learning algorithm is left to find commonalities among its input data. As unlabeled data are more abundant than labeled data, machine learning methods that facilitate unsupervised learning are particularly valuable.

The goal of unsupervised learning may be as straightforward as discovering hidden patterns within a dataset, but it may also have a goal of feature learning, which allows the computational machine to automatically discover the representations that are needed to classify raw data.

Unsupervised learning is commonly used for transactional data. You may have a large dataset of customers and their purchases, but as a human you will likely not be able to make sense of what similar attributes can be drawn from customer profiles and their types of purchases. With this data fed into an unsupervised learning algorithm, it may be determined that women of a certain age range who buy unscented soaps are likely to be pregnant, and therefore a marketing campaign related to pregnancy and baby products can be targeted to this audience in order to increase their number of purchases.

Without being told a “correct” answer, unsupervised learning methods can look at complex data that is more expansive and seemingly unrelated in order to organize it in potentially meaningful ways. Unsupervised learning is often used for anomaly detection, including detecting fraudulent credit card purchases, and for recommender systems that suggest what products to buy next. In unsupervised learning, untagged photos of dogs can be used as input data for the algorithm to find likenesses and classify dog photos together.
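One common way to group unlabeled data is k-means clustering. This tutorial does not cover k-means itself, so treat the following as an illustrative sketch of "finding structure without labels" rather than the method behind any particular system. The data points and function names are hypothetical, and the centroids are initialized from the first `k` points purely to keep the example deterministic.

```python
import math


def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def update_centroids(points, assignments, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    updated = []
    for c, old in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == c]
        if members:
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            updated.append(old)  # keep an empty cluster's centroid where it was
    return updated


def k_means(points, k, max_iterations=100):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = list(points[:k])  # simplification: first k points as starting centroids
    for _ in range(max_iterations):
        assignments = assign_clusters(points, centroids)
        new_centroids = update_centroids(points, assignments, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return assignments, centroids


# Hypothetical unlabeled data: (visits per month, items per visit)
customers = [(1, 2), (9, 8), (2, 1), (8, 9), (1, 1), (9, 9)]

assignments, centroids = k_means(customers, k=2)
print(assignments)  # [0, 1, 0, 1, 0, 1]
```

No label was ever supplied, yet the algorithm separates the occasional shoppers from the frequent ones. Deciding what those groups mean is still up to a person.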

## Approaches

As a field, machine learning is closely related to computational statistics, so having a background knowledge in statistics is useful for understanding and leveraging machine learning algorithms.

For those who may not have studied statistics, it can be helpful to first define correlation and regression, as they are commonly used techniques for investigating the relationship among quantitative variables. Correlation is a measure of association between two variables that are not designated as either dependent or independent. Regression at a basic level is used to examine the relationship between one dependent and one independent variable. Because regression statistics can be used to anticipate the dependent variable when the independent variable is known, regression enables prediction capabilities.
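The two ideas can be sketched in a few lines of Python. The example below computes the Pearson correlation coefficient and fits a least-squares regression line; the data values and function names are made up for illustration.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson_correlation(xs, ys):
    """Measure of linear association between two variables, from -1 to 1."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x, mean_y = mean(xs), mean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: independent variable x and dependent variable y
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_correlation(xs, ys)
slope, intercept = fit_line(xs, ys)

# Regression enables prediction: estimate y for an x that was not observed.
predicted_y = slope * 6 + intercept
print(round(r, 3), round(slope, 3), round(intercept, 3), round(predicted_y, 3))
```

Correlation treats the two variables symmetrically, so swapping `xs` and `ys` gives the same coefficient. The regression line does not: it is fitted specifically to predict the dependent variable from the independent one.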

Approaches to machine learning are continuously being developed. For our purposes, we’ll go through a few of the popular approaches that are being used in machine learning at the time of writing.

### k-nearest neighbor

The k-nearest neighbor algorithm is a pattern recognition model that can be used for classification as well as regression. Often abbreviated as k-NN, the k in k-nearest neighbor is a positive integer, which is typically small. In either classification or regression, the input will consist of the k closest training examples within a space.

We will focus on k-NN classification. In this method, the output is class membership. This will assign a new object to the class most common among its k nearest neighbors. In the case of k = 1, the object is assigned to the class of the single nearest neighbor.

Let’s look at an example of k-nearest neighbor. In the diagram below, there are blue diamond objects and orange star objects. These belong to two separate classes: the diamond class and the star class.

When a new object is added to the space — in this case a green heart — we will want the machine learning algorithm to classify the heart to a certain class.

When we choose k = 3, the algorithm will find the three nearest neighbors of the green heart in order to classify it to either the diamond class or the star class.

In our diagram, the three nearest neighbors of the green heart are one diamond and two stars. Therefore, the algorithm will classify the heart with the star class.

Among the most basic of machine learning algorithms, k-nearest neighbor is considered to be a type of “lazy learning” as generalization beyond the training data does not occur until a query is made to the system.
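The classification walk-through above can be sketched as follows. The coordinates are hypothetical, chosen so that the heart's three nearest neighbors are one diamond and two stars as in the example; the function name is likewise an assumption.

```python
import math
from collections import Counter


def knn_classify(training_points, query, k=3):
    """Assign the query to the class most common among its k nearest neighbors."""
    # "Lazy learning": no model is built ahead of time. All the work happens
    # at query time by measuring the distance to every stored example.
    by_distance = sorted(training_points, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical positions of the labeled objects
training_points = [
    ((1.0, 1.0), "diamond"), ((1.5, 2.0), "diamond"), ((3.0, 3.5), "diamond"),
    ((5.0, 5.0), "star"), ((4.5, 4.0), "star"), ((6.0, 5.5), "star"),
]

heart = (4.0, 4.0)  # the new, unlabeled object
print(knn_classify(training_points, heart, k=3))  # star
```

Choosing an odd k for a two-class problem avoids tied votes. With an even k, this sketch breaks a tie in favor of the tied class whose member is nearest to the query.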

### Decision Tree Learning

For general use, decision trees are employed to visually represent decisions and show or inform decision making. When working with machine learning and data mining, decision trees are used as a predictive model. These models map observations about data to conclusions about the data’s target value.

The goal of decision tree learning is to create a model that will predict the value of a target based on input variables.

In the predictive model, the data’s attributes that are determined through observation are represented by the branches, while the conclusions about the data’s target value are represented in the leaves.

When “learning” a tree, the source data is divided into subsets based on an attribute value test, which is repeated on each of the derived subsets recursively. Once every example in the subset at a node has the same target value, the recursion is complete.

Let’s look at an example of various conditions that can determine whether or not someone should go fishing. This includes weather conditions as well as barometric pressure conditions.

In the simplified decision tree above, an example is classified by sorting it through the tree to the appropriate leaf node. This then returns the classification associated with the particular leaf, which in this case is either a `Yes` or a `No`. The tree classifies a day’s conditions based on whether or not it is suitable for going fishing.

A true classification tree data set would have a lot more features than what is outlined above, but relationships should be straightforward to determine. When working with decision tree learning, several determinations need to be made, including what features to choose, what conditions to use for splitting, and understanding when the decision tree has reached a clear ending.
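The recursive splitting described in this section can be sketched as below. This is a simplified illustration, not a full implementation of any named algorithm. The fishing records are hypothetical, and Gini impurity is only one possible condition for choosing a split (an assumption of this sketch).

```python
from collections import Counter


def gini(rows):
    """Gini impurity of the target labels: 0.0 means all labels agree."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())


def split(rows, attribute):
    """Group rows by the value they hold for the given attribute."""
    groups = {}
    for features, label in rows:
        groups.setdefault(features[attribute], []).append((features, label))
    return groups


def build_tree(rows, attributes):
    """Recursively split until a subset has one target value (a leaf)."""
    labels = [label for _, label in rows]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]

    def weighted_impurity(attribute):
        groups = split(rows, attribute).values()
        return sum(len(g) / len(rows) * gini(g) for g in groups)

    # Which feature to choose: the one whose split leaves the purest subsets.
    best = min(attributes, key=weighted_impurity)
    remaining = [a for a in attributes if a != best]
    return {
        "attribute": best,
        "branches": {
            value: build_tree(group, remaining)
            for value, group in split(rows, best).items()
        },
    }


def classify(tree, features):
    """Sort an example down the tree to a leaf (unseen values raise KeyError)."""
    while isinstance(tree, dict):
        tree = tree["branches"][features[tree["attribute"]]]
    return tree


# Hypothetical records: conditions -> whether to go fishing
records = [
    ({"weather": "sunny", "pressure": "high"}, "Yes"),
    ({"weather": "sunny", "pressure": "low"}, "No"),
    ({"weather": "overcast", "pressure": "high"}, "Yes"),
    ({"weather": "overcast", "pressure": "low"}, "Yes"),
    ({"weather": "rainy", "pressure": "high"}, "No"),
    ({"weather": "rainy", "pressure": "low"}, "No"),
]

tree = build_tree(records, ["weather", "pressure"])
print(classify(tree, {"weather": "sunny", "pressure": "low"}))  # No
```

With this data the tree tests `weather` first, because that split leaves two of the three subsets with a single target value. It only needs to test `pressure` on sunny days.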

### Deep Learning

Deep learning attempts to imitate how the human brain can process light and sound stimuli into vision and hearing. A deep learning architecture is inspired by biological neural networks and consists of multiple layers in an artificial neural network, which is typically run on specialized hardware such as GPUs.

Deep learning uses a cascade of nonlinear processing unit layers in order to extract or transform features (or representations) of the data. The output of one layer serves as the input of the successive layer. In deep learning, algorithms can be either supervised and serve to classify data, or unsupervised and perform pattern analysis.
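The "cascade of layers" idea can be sketched as a forward pass through a tiny network, where each layer's output becomes the next layer's input. The weights below are hypothetical fixed values chosen for illustration. A real network would learn them from data, and that training step is not shown here.

```python
import math


def relu(values):
    """Nonlinearity: negative values become zero."""
    return [max(0.0, v) for v in values]


def sigmoid(values):
    """Nonlinearity: squash each value into the range (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias for each unit."""
    return [
        sum(w * x for w, x in zip(unit_weights, inputs)) + b
        for unit_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activations = inputs
    for weights, biases, activation in layers:
        activations = activation(dense(activations, weights, biases))
    return activations


# Hypothetical, untrained weights: 2 inputs -> 3 hidden units -> 1 output
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1], relu),
    ([[0.7, -0.5, 0.2]], [0.05], sigmoid),
]

output = forward([1.0, 2.0], layers)
print(output)  # a single value between 0 and 1
```

The nonlinear functions between layers matter: without them, any stack of layers would collapse into a single linear transformation.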

Among the machine learning algorithms that are currently being used and developed, deep learning tends to consume the most data and has been able to beat humans in some cognitive tasks. Because of these attributes, deep learning has become an approach with significant potential in the artificial intelligence space.

Computer vision and speech recognition have both realized significant advances from deep learning approaches. IBM Watson is a well-known example of a system that leverages deep learning.

## Programming Languages

When choosing a language to specialize in with machine learning, you may want to consider the skills listed on current job advertisements as well as libraries available in various languages that can be used for machine learning processes.

Python is one of the most popular languages for working with machine learning due to the many available frameworks, including TensorFlow, PyTorch, and Keras. As a language that has readable syntax and the ability to be used as a scripting language, Python proves to be powerful and straightforward both for preprocessing data and working with data directly. The scikit-learn machine learning library is built on top of several existing Python packages that Python developers may already be familiar with, namely NumPy, SciPy, and Matplotlib.

To get started with Python, you can read our tutorial series on “How To Code in Python 3,” or read specifically on “How To Build a Machine Learning Classifier in Python with scikit-learn” or “How To Perform Neural Style Transfer with Python 3 and PyTorch.”

Java is widely used in enterprise programming, and is generally used by front-end desktop application developers who are also working on machine learning at the enterprise level. Usually it is not the first choice for those new to programming who want to learn about machine learning, but is favored by those with a background in Java development to apply to machine learning. In terms of machine learning applications in industry, Java tends to be used more than Python for network security, including in cyber attack and fraud detection use cases.

Among machine learning libraries for Java are Deeplearning4j, an open-source and distributed deep-learning library written for both Java and Scala; MALLET (MAchine Learning for LanguagE Toolkit), which allows for machine learning applications on text, including natural language processing, topic modeling, document classification, and clustering; and Weka, a collection of machine learning algorithms to use for data mining tasks.

C++ is the language of choice for machine learning and artificial intelligence in game or robot applications (including robot locomotion). Embedded computing hardware developers and electronics engineers are more likely to favor C++ or C in machine learning applications due to their proficiency and level of control in the language. Some machine learning libraries you can use with C++ include the scalable mlpack, Dlib offering wide-ranging machine learning algorithms, and the modular and open-source Shark.

## Human Biases

Although data and computational analysis may make us think that we are receiving objective information, this is not the case; being based on data does not mean that machine learning outputs are neutral. Human bias plays a role in how data is collected, organized, and ultimately in the algorithms that determine how machine learning will interact with that data.

If, for example, people are providing images for “fish” as data to train an algorithm, and these people overwhelmingly select images of goldfish, a computer may not classify a shark as a fish. This would create a bias against sharks as fish, and sharks would not be counted as fish.
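A toy sketch can show how this kind of sampling bias arises. All values here are hypothetical: each animal is reduced to a single feature (body length in centimeters), and the classifier simply copies the label of the closest training example.

```python
def nearest_label(training_data, length_cm):
    """Return the label of the training example closest in body length."""
    closest = min(training_data, key=lambda item: abs(item[0] - length_cm))
    return closest[1]


# Hypothetical, deliberately skewed data: every "fish" example is a goldfish
skewed_data = [
    (5, "fish"), (8, "fish"), (12, "fish"),
    (250, "not fish"), (300, "not fish"),
]

# The same data with larger fish added to the training examples
broader_data = skewed_data + [(350, "fish"), (450, "fish")]

shark_length_cm = 400
print(nearest_label(skewed_data, shark_length_cm))   # not fish
print(nearest_label(broader_data, shark_length_cm))  # fish
```

The classifier code is identical in both cases. Only the selection of training examples changed, and that alone decides whether a shark counts as a fish.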

When using historical photographs of scientists as training data, a computer may not properly classify scientists who are also people of color or women. In fact, recent peer-reviewed research has indicated that AI and machine learning programs exhibit human-like biases that include race and gender prejudices. See, for example, “Semantics derived automatically from language corpora contain human-like biases” and “Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints.”

As machine learning is increasingly leveraged in business, uncaught biases can perpetuate systemic issues that may prevent people from qualifying for loans, from being shown ads for high-paying job opportunities, or from receiving same-day delivery options.

Because human bias can negatively impact others, it is extremely important to be aware of it, and to also work towards eliminating it as much as possible. One way to work towards achieving this is by ensuring that there are diverse people working on a project and that diverse people are testing and reviewing it. Others have called for regulatory third parties to monitor and audit algorithms, for building alternative systems that can detect biases, and for ethics reviews as part of data science project planning. Raising awareness about biases, being mindful of our own unconscious biases, and structuring equity in our machine learning projects and pipelines can work to combat bias in this field.

## Conclusion

This tutorial reviewed some of the use cases of machine learning, common methods and popular approaches used in the field, suitable machine learning programming languages, and also covered some things to keep in mind in terms of unconscious biases being replicated in algorithms.

Because machine learning is a field that is continuously being innovated, it is important to keep in mind that algorithms, methods, and approaches will continue to change.

In addition to reading our tutorials on “How To Build a Machine Learning Classifier in Python with scikit-learn” or “How To Perform Neural Style Transfer with Python 3 and PyTorch,” you can learn more about working with data in the technology industry by reading our Data Analysis tutorials.