- Overview
- Categories of Machine Learning
- Artificial Intelligence (AI) vs. Machine Learning (ML)
- Methodologies for Building Machine Learning Systems
Machine Learning is a branch of artificial intelligence (AI) and computer science. It is a collection of algorithms and statistical techniques used to create computational systems that learn from data in order to make predictions, intelligent decisions, and inferences.
Teacher: This is a dog 🐶 and that is a cat 🐱.
Toddler: Ohh I see; thank you.
Teacher: {Provides picture of a dog and asks}: "What is this?"
Toddler: It's a dog 🐶.
Teacher: {Provides picture of a cat and asks}: "What is this?"
Toddler: It's a cat 🐱.
On the other hand, a machine learning model is a program that has been trained using a specific dataset to recognize certain types of similar patterns and make informed predictions. You provide a model with a data set and train it with some algorithms. When you introduce new data, the model will use learned knowledge to recognize the new data set.
Some example use cases:
- Object detection and face recognition security systems where the device recognizes your face as a human face (and not an object) and grants entry upon authorized access.
- Image recognition, medical image analysis, disease prediction, personalized treatment plans, drug discovery, etc.
- Recommendation systems like product recommendations (e.g., Amazon, Netflix) and content suggestions (e.g., Spotify, YouTube).
- Sentiment analysis, trend prediction, customer behavior analysis, chatbots, language translation, speech recognition, text summarization, etc.
- Identifying unusual patterns or outliers in data, such as fraud detection in finance or network security.
- Detecting emotions from facial expressions or speech (voice recognition).
- Converting handwritten text into digital format and further using that to solve math problems.
- Assessing creditworthiness and determining lending risk for individuals or businesses.
- And much more!
There are three paradigms used to develop machine learning models. Let's discuss them briefly.
Note
All data and illustrations described below are synthetic, hypothetical, and randomly generated. They are not real data and are only used for explanation purposes. I (Bolaji) will share more context during the workshop.
This is a type of ML where an algorithm is given a large input dataset with corresponding output or event/class, usually prepared in consultation with the subject matter domain expert ("supervisor"). Consider the example data in Table 1.0 below with some animal images and their corresponding labels. The table represents a collection of data points; each column represents a feature or attribute, and each row represents labelled data (cat or dog). The last column usually represents the target or label or output variable. The algorithm will use this data to learn the relationship between the features (animal images, in this case) and the target output. So, after training and testing the model, when you provide a new image, the model will use the learned knowledge to predict the label of the new image.
Table 1.0
Image | Label |
---|---|
Cat | |
Cat | |
Dog | |
Cat | |
Dog | |
Dog |
Image credits: the beautiful animal illustrations are from freepik.com.
Table 2.0 (Another example with multiple features)
Gender | Marital Status | Dependents | Education | Amount | Credit_Score | Property_Area | Loan Status |
---|---|---|---|---|---|---|---|
Female | Yes | 3 | Yes | 300 | 1.0 | Urban | Approved |
Male | Yes | 0 | No | 10000 | 0 | Rural | Rejected |
Female | No | 2 | Yes | 6500 | 0 | Urban | Approved |
Female | Yes | 1 | Yes | 6500 | 1.0 | Urban | Approved |
Male | No | 0 | No | 10000 | 1.0 | Rural | Rejected |
Male | No | 7 | No | 10000 | 1.0 | Urban | Rejected |
Female | Yes | 1 | Yes | 6500 | 0 | Rural | Approved |
Male | Yes | 0 | No | 10000 | 1.0 | Urban | Rejected |
Female | No | 2 | Yes | 6500 | 0 | Rural | Approved |
Male | No | 4 | Yes | 5000 | 0 | Urban | Approved |
Supervised learning is further classified into two main categories of algorithms:
This is a type of ML where an algorithm is given some input dataset without the desired output or event/class and subject matter domain expert consultation. Consider the example data in Table 3.0 below with some data of individuals who applied for a mortgage (you'd observe that there are no corresponding labels that will tell the algorithm what the result should be—approved or rejected). The table represents a collection of data points; each column represents a feature or attribute, and each row represents unlabelled data. The algorithm will use this data to learn the unique qualities, similarities, patterns, and differences between the features without labelled guidance. So, after training the model, the model will use the self-learned knowledge to predict the loan status of a new applicant.
Table 3.0
Gender | Marital Status | Dependents | Education | Amount | Credit_Score | Property_Area |
---|---|---|---|---|---|---|
Female | Yes | 3 | Yes | 300 | 1.0 | Urban |
Male | Yes | 0 | No | 10000 | 0 | Rural |
Female | No | 2 | Yes | 6500 | 0 | Urban |
Female | Yes | 1 | Yes | 6500 | 1.0 | Urban |
Male | No | 0 | No | 10000 | 1.0 | Rural |
Male | No | 7 | No | 10000 | 1.0 | Urban |
Female | Yes | 1 | Yes | 6500 | 0 | Rural |
Male | Yes | 0 | No | 10000 | 1.0 | Urban |
Female | No | 2 | Yes | 6500 | 0 | Rural |
Male | No | 4 | Yes | 5000 | 0 | Urban |
Unsupervised learning is further classified into two main categories of algorithms:
Category | Description | Example |
---|---|---|
Clustering | Grouping the data based on similarities and differences to identify those with similar properties. | ~Image credits: the periodic table is from Wikipedia. |
Association | Finding corelations or occurences in the data using some rules and mapping the data based on the observed dependency information. |
I won't go into full details for this one but will briefly define it and give some examples. This type of ML algorithm maps situations to actions that result in the highest possible reward (maximizing the reward and minimizing penalty to increase the total reward achieved). The program (mostly called agent) learns without intervention from a human but from its own experiences and interactions with the environment. The agent learns from its own mistakes and tries to improve its performance based on the consequences of its actions. The agent learns what to do and what not to do from the feedback given by the environment.
Consider the example of a dog barking towards a stranger. The dog will learn that barking at strangers is good and will continue to do so. However, suppose the dog barks at a family member and receives a negative response. In that case, it will learn that barking at family members is bad and will stop doing so. If the dog doesn't bark at a family member and receives a bone as a reward, it will stop barking towards family members. The more the reward and response, the more the dog adjusts. The dog therefore learns from its own experiences and interactions with the environment.
Some real-world examples include self-driving cars, robotics, games, personalized medical treatment plans/prosthetics, algorithmic trading, etc.
To conclude, I'll quote Wikipedia's succinct summary of how reinforcement learning works:
"Reinforcement learning differs from supervised learning in not needing labelled input/output pairs to be presented, and in not needing sub-optimal actions to be explicitly corrected. Instead the focus is on finding a balance between exploration (of uncharted territory) and exploitation (of current knowledge)."
In summary and to contrast:
Supervised Learning | Unsupervised Learning | Reinforcement Learning | |
---|---|---|---|
Number of Classes | Known | Unknown | Unknown |
Input | Labeled data | Unlabeled data | Data or Feedback |
Output | Predictions | Discover (hidden or interesting) patterns | Optimal actions |
Computational Complexity | Low | High | High |
Accuracy | High | Low | High |
Reliability | Yes | Yes | Yes |
Example(s) | Image classification, Speech-to-text, sentiment analysis, etc. | Anomaly detection, image recognition, recomendation engines, etc. | Self-driving cars, robotics, digital games, etc. |
Artificial Intelligence is a broad term that refers to the ability of a machine or system to perform tasks that require human intelligence. Machine Learning, on the other hand, is a subset of Artificial Intelligence that uses statistical techniques to enable machines to improve their experience at performing tasks. There are other subsets like Deep Learning that use neural networks to allow machines to improve their experience at performing tasks.
There are several data-mining methodologies for building machine learning systems. Some of the most popular ones are:
Knowledge Discovery Databases (KDD)
This refers to the overall iterative process (sequence of steps) of discovering useful and valid knowledge from data.
Sample, Explore, Modify, Model and Assess (SEMMA)
This refers to the sequential steps required to build machine learning models incorporated in 'SAS Enterprise Miner'—a product by SAS Institute Inc. (one of the largest producers of commercial statistical and business intelligence software).
Cross Industrial Standard Process for Data Mining (CRISP-DM)
This refers to a process model with six phases (consolidating the best practices followed by experts) that is not domain-dependent and describes the entire life cycle of a data mining project.
Thank you for coming this far; you've done well 👏🏾. Please open a new GitHub discussion using the links below and let me know your thoughts about this lesson or any issues you're experiencing.
<< previous lesson | next lesson >>