How does Machine Learning work?

After all, Machine Learning has settled into our daily lives without our really noticing. According to a study cited by HubSpot, 63% of people use technologies based on machine learning, such as Siri, Apple's voice assistant, or chatbots on Facebook or on e-commerce sites. It is also used to find new cures, translate languages, and many other things.

Any computer program that aims to learn needs one fundamental ingredient: data. For this reason, Machine Learning, a field within Artificial Intelligence, is frequently linked to Big Data.

How does a machine learn with Machine Learning?

First, to clearly understand how a machine learns, we have to understand the difference between structured data and unstructured data.

Structured data: This is the data most used by companies and the kind usually found in most databases. It is typically organized in rows and columns with headers, as in Excel or Access. This type of data is already ordered and is easily processed by any data mining tool.

In practical terms, it would be like a perfectly labeled filing cabinet. An example would be a customer CRM, organized by name, email, phone, and previous invoices.
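
To make the filing cabinet idea concrete, here is a minimal sketch in Python of how structured data can be queried. The CRM export, its column names, and its rows are invented for illustration and do not come from any real system.

```python
import csv
import io

# Hypothetical CRM export: the column names and rows are invented for illustration.
CRM_CSV = """name,email,phone,invoices
Ana Ruiz,ana@example.com,555-0101,3
Luis Mora,luis@example.com,555-0102,0
Eva Sanz,eva@example.com,555-0103,7
"""

def load_customers(text):
    """Parse CSV text into a list of dicts, one per row, keyed by column header."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["invoices"] = int(row["invoices"])  # CSV stores numbers as text
    return rows

customers = load_customers(CRM_CSV)
# Because every row shares the same labeled columns, filtering is a one-liner.
repeat_buyers = [c["name"] for c in customers if c["invoices"] > 2]
print(repeat_buyers)
```

Since every record has the same named fields, a question like "who has more than two invoices?" needs no interpretation of the content, only a comparison on a known column.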

Unstructured data: This is data that has no identifiable internal structure (free text, images, audio), that is, a massive and disorganized conglomerate of data that is of little use until it is processed and stored in an orderly manner.

Once this data has gone through a filtering process, it can be easily found and categorized (with greater or lesser accuracy) to obtain information. A practical case would be emails: if these are exported en masse with only their "subject" and their "message", a data mining program would not be able to sort the messages into categories directly, since it would need to process each of the words, compare them with a context, and then classify them according to previous patterns.
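
The email case can be sketched roughly as follows. This is a deliberately simple keyword counter, not a real text classifier: the categories and keyword lists are hypothetical stand-ins for the "previous patterns", which a real system would learn from labeled examples.

```python
import re
from collections import Counter

# Hypothetical keyword lists standing in for "previous patterns"; a real system
# would learn these from labeled examples rather than hard-code them.
CATEGORY_KEYWORDS = {
    "billing": {"invoice", "payment", "refund", "charge"},
    "support": {"error", "broken", "help", "password"},
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def categorize(subject, message):
    """Return the category whose keywords appear most often, or 'other'."""
    counts = Counter(tokenize(subject + " " + message))
    scores = {
        category: sum(counts[word] for word in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"

print(categorize("Refund request", "I was charged twice, please refund the payment."))
```

Even this toy version has to break the raw text into words before it can do anything, which is the extra processing step that structured data does not need.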

An often-cited estimate is that 90% of the data generated in the world comes from unstructured data sources, and only 10% from structured data sources. Hence the need to combine data science, for data extraction and processing, with Machine Learning, for the organization of data, especially unstructured information.

Types of Machine Learning

If I wanted to know how my clients group together naturally, I would let the machine choose the classification criteria for me. This would be unsupervised learning, since I have not specified the objective of the grouping. On the other hand, if we wanted to classify clients by their probability of unsubscribing as soon as the contract ends, there is a clearly defined objective, and we would call this a supervised learning model.

To determine which type of Machine Learning the algorithm will use, we have to be clear about the target variable, that is, the unknown we want to solve with our information system.

Supervised learning

This methodology requires a prior training phase, in which a dataset of many labeled examples is fed to the program. Imagine that you want a machine to tell cats from dogs in a photo. To do this, you would have to "teach" the program with thousands of images where it is clear "What is a cat?" and "What is a dog?" After this training phase, the program would be able to identify each of the animals in different circumstances. This method is called classification.
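
As a rough illustration of learning from labeled examples, the sketch below uses a 1-nearest-neighbor rule, one of the simplest classification methods. It is an assumption made for brevity: real image classifiers learn from pixels, whereas here each animal is reduced to two invented features (weight and height), and all the numbers are made up.

```python
import math

# Hypothetical labeled training data: (weight in kg, height in cm) -> label.
# Real image classifiers learn from pixels; two hand-picked features keep the sketch short.
TRAINING = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((3.5, 24.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
    ((12.0, 40.0), "dog"),
]

def classify(features, training=TRAINING):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training, key=lambda example: math.dist(features, example[0]))
    return nearest[1]

print(classify((4.5, 26.0)))   # a small animal, close to the cat examples
print(classify((25.0, 58.0)))  # a large animal, close to the dog examples
```

The labels in the training list play the role of the "What is a cat?" and "What is a dog?" answers: the program never receives a rule, only examples.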

Another type of supervised learning is regression, that is, predicting a continuous value. It is similar to the machine being able to continue a logical sequence: given the numerical series 2, 4, 6, the machine is able to continue it as 8, 10, 12. This is used especially for prediction.
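
The 2, 4, 6 series can be worked through with a minimal least-squares line fit. This is only a sketch: treating the position in the series (1, 2, 3, ...) as the input variable is an assumption made for the example, and the function name is illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

positions = [1, 2, 3]  # position of each term in the series (assumed input)
values = [2, 4, 6]     # the observed series
slope, intercept = fit_line(positions, values)

# Extend the fitted line to the next three positions.
predictions = [slope * x + intercept for x in [4, 5, 6]]
print(predictions)
```

The fitted line has slope 2 and intercept 0, so the next three positions give 8, 10 and 12.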

Unsupervised learning

In this procedure, no labeled examples are required; the machine has to find patterns directly in the information itself. An example would be grouping clients into homogeneous groups. If we gave the system unlabeled information on thousands of clients, it would be able to recognize the characteristics of the clients and segment them into profiles with similar criteria.

This problem is called clustering or data agglomeration. A related unsupervised technique, dimensionality reduction, is useful to reduce the total number of variables to 2 or 3 at most while losing as little information as possible, so that the data can be visualized and more easily understood.
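
Grouping clients into homogeneous groups can be sketched with k-means, one common clustering algorithm (the choice of k-means is an assumption here, since the text does not name a specific method). The customer data, given as (age, monthly spend), and the starting centroids are invented for illustration.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # New centroid = mean of its cluster; keep the old one if the cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers as (age, monthly spend); no labels are provided.
customers = [(22, 15), (25, 18), (24, 20), (58, 90), (61, 85), (60, 95)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])
print(clusters)
```

Nobody told the program what the two groups mean. It only found that the customers fall into two profiles, and interpreting those profiles is left to the analyst.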

Reinforcement learning

This type of learning is similar to how humans learn, since it works by operant conditioning taken to the extreme. It is based on a reward system: if the machine gives a positive result, it is "rewarded", but if it makes any kind of mistake, it is "punished". Thus, it learns to perform its task better through trial and error. It is one of the most promising Machine Learning techniques, since it does not require large amounts of labeled data up front; instead, it searches for a good solution by interacting with its environment.

This methodology is used for learning in autonomous cars or for decision making in manufacturing machinery. A good example would be how a machine learns to walk on its own when told only to get from point A to point B. The bot will make its own mistakes and improve until it finds a faster way to reach its destination.
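
The A-to-B idea can be sketched with tabular Q-learning, a standard reinforcement learning algorithm, on a tiny corridor. Everything in this example is an assumption made for illustration: the corridor of five positions, the reward of 1 at B, the penalty of 0.1 per step, and the learning parameters.

```python
import random

def train_corridor(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a corridor: start at state 0 (A), goal at the last state (B)."""
    rng = random.Random(seed)
    actions = (-1, +1)  # step left or step right
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action index]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore occasionally (or on ties), otherwise take the best-known action.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt = min(max(state + actions[a], 0), goal)  # walls at both ends
            reward = 1.0 if nxt == goal else -0.1  # reward at B, small penalty per step
            best_next = 0.0 if nxt == goal else max(q[nxt])
            # Move the estimate toward the reward plus the discounted future value.
            q[state][a] += alpha * (reward + gamma * best_next - q[state][a])
            state = nxt
    return q

q_table = train_corridor()
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that "right" is the correct direction. It wanders at first, and the per-step penalty is what pushes it toward the shortest route once it has found B.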

What can be done with Machine Learning?

Classification: Which Vodafone customers will be interested in this offer? It consists of trying to classify an individual based on what has been learned from other individuals. The program can thus label a customer "interested" or "not interested" according to a data history.

Regression: What will Juan's electricity consumption be this year? It is similar to classification (in fact, both belong to supervised learning), with the difference that it predicts a numerical value, such as future consumption, from past data.

Identifying similarities: "You may also be interested in these products..." The clearest example is Amazon, which suggests products according to the purchases you have made. It tries to find common patterns.

Clustering: What products should we develop? It is about grouping individuals by similarity, but without a specific purpose. It is usually used for data exploration in the preliminary phase.

Grouping co-occurrences: What products are usually bought together? This technique tries to find associations between entities based on their appearing together in the same transaction. For example, a person who buys a printer is likely to buy ink cartridges too. Such associations are not always easy to extract.
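
A minimal sketch of the printer and ink cartridge case: count how often pairs of products appear in the same basket, then measure how often one product accompanies another. The baskets are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, invented for illustration.
transactions = [
    {"printer", "ink cartridge", "paper"},
    {"printer", "ink cartridge"},
    {"laptop", "mouse"},
    {"printer", "paper"},
    {"ink cartridge", "paper"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes ("a", "b") and ("b", "a") the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

counts = pair_counts(transactions)
print(counts[("ink cartridge", "printer")])
print(confidence(transactions, "printer", "ink cartridge"))
```

In this toy data, two of the three baskets with a printer also contain ink cartridges. With thousands of products the number of possible pairs grows quickly, which is one reason these associations are hard to extract.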

Profiling: What is the typical mobile consumption of this customer segment? This technique tries to identify the typical behavior of an individual, group, or population.

Data reduction: Are all columns of data useful? If you were to create a customer database from different sources, you might run into a dirty data problem or information overload. This type of technique allows you to reduce the total volume of information and make it more useful.

Causal modeling: Did sales increase thanks to the campaign, or because of other marketing actions that were carried out? It tries to identify the influence that one event has on another.

Relationship prediction: We have 10 friends in common. Shouldn't we be friends? It tries to predict connections between elements. An applied example is LinkedIn, which recommends people you may know in the "My network" tab using similarity patterns such as studies, occupation, contacts, etc.
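
The "friends in common" idea can be sketched by counting shared neighbors in a small friendship graph. This is a simple heuristic chosen for illustration, not a description of how LinkedIn or any other network actually works, and the names and connections are invented.

```python
# Hypothetical friendship graph: each person maps to the set of their friends.
friends = {
    "ana": {"luis", "eva"},
    "luis": {"ana", "eva", "raul"},
    "eva": {"ana", "luis", "raul"},
    "raul": {"luis", "eva", "marta"},
    "marta": {"raul"},
}

def suggest_friends(graph, person):
    """Rank non-friends of `person` by number of friends in common (highest first)."""
    scores = {}
    for other, their_friends in graph.items():
        if other == person or other in graph[person]:
            continue  # skip the person themselves and existing friends
        common = len(graph[person] & their_friends)
        if common > 0:
            scores[other] = common
    # Sort by score descending, then by name for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

print(suggest_friends(friends, "ana"))
```

Here Ana and Raul are not connected but share two friends, so Raul is suggested, while Marta, who shares none, is not.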

Our Smart Panel platform can represent large, complex sets of structured and unstructured data, find patterns, and group your sources of income so that you can easily identify which income channels are bringing you the most benefit, while also forecasting your business figures throughout the year.