Types of Machine Learning

Machine learning methods are approaches to testing hypotheses and finding optimal solutions with the help of artificial intelligence.

There are three methods:

Supervised learning: An array of data on a specific task, including the known correct answers, is loaded into the analytical system together with a goal for the analysis. As a rule, you need to predict something or test a hypothesis.

For example, we have data on the income of an online store for six months of operation. We know how many products were sold, how much money was spent on attracting customers, the ROI, the average order value, the number of clicks, bounces and other metrics. The task of the machine is to analyze the entire data array and produce an income forecast for the upcoming period: a month, a quarter, six months or a year. This is a regression task.
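A regression task of this kind can be sketched in a few lines. This is a minimal illustration, assuming made-up monthly figures and a single input (advertising spend) fitted by ordinary least squares; the data and the name `fit_line` are hypothetical and not taken from a real store.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: ad spend and income for six months (arbitrary units).
# The points were chosen to lie on the line income = 3 * spend + 2.
ad_spend = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
income = [32.0, 38.0, 47.0, 56.0, 62.0, 77.0]

slope, intercept = fit_line(ad_spend, income)
forecast = slope * 30.0 + intercept  # forecast for a planned spend of 30
print(round(slope, 3), round(intercept, 3), round(forecast, 3))
```

A real forecast would use many more metrics and would be checked against months the model has not seen.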
Another example. Based on an array of data and selection criteria, the system has to determine whether an e-mail message is spam. Or, given data on the performance of schoolchildren in their subjects, their IQ test results, gender and age, you need to help graduates choose a career. The analytical engine looks for and checks common features, and compares and classifies test results, school grades and aptitudes. Based on the data, it makes a forecast. These are classification tasks.
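The spam example can be sketched with a very small naive Bayes classifier. This is one possible technique, not necessarily what a production spam filter uses; the four training messages and the function names are invented for illustration.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """examples: list of (text, label). Returns per-label word and document counts."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add 1 to every count so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled messages ("ham" means not spam).
training = [
    ("win money now", "spam"),
    ("cheap prize win", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached", "ham"),
]
word_counts, doc_counts = train_naive_bayes(training)
label = classify("win a cheap prize", word_counts, doc_counts)
print(label)
```

With so few examples the result only shows the mechanics: the labeled messages are the "known answers" that make this supervised learning.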

Unsupervised learning: Neither the person nor the program knows the correct answers in advance; there is only a certain amount of data. The analytical engine processes the information and looks for relationships on its own. The results are often non-obvious.

For example, we know the weight, height and body type of 10,000 potential buyers of jumpers of a certain style. We load this information into the machine in order to divide the customers into clusters according to the available data. As a result, we get several categories of people with similar characteristics, so that a jumper of the desired style can be produced for each of them. These are clustering tasks.

Another example. Describing some phenomenon takes 200-300 characteristics. Such data is extremely difficult to visualize and very hard to make sense of. The analytical system is tasked with processing the array of characteristics and combining similar ones, that is, compressing the data to somewhere between 2 and 10 characteristics. These are dimensionality reduction problems.
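The jumper example can be sketched with k-means, one common clustering algorithm. This is a minimal sketch, assuming six invented (height in cm, weight in kg) pairs and starting centroids picked by hand; real data would usually be normalized first so that no single measurement dominates the distance.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # New centroid = mean of the cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (height in cm, weight in kg).
customers = [(160, 55), (162, 58), (165, 60), (185, 90), (188, 95), (190, 92)]
centroids, clusters = kmeans(customers, centroids=[(160, 55), (190, 92)])
print(clusters)
```

No labels are supplied anywhere; the two groups emerge from the data alone, which is what distinguishes this from the supervised examples.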

Deep learning: Deep learning usually goes hand in hand with Big Data analysis, because the models need very large amounts of data, often more than one computer or one program can process in a reasonable time. Neural networks are used for this. The work is divided into small pieces arranged in a chain of layers: the first layer only collects the input for a task and passes it on, the following layers analyze what they receive and pass their results further, and the last layers in the chain produce the solution. The computation can also be distributed across several processors or devices.

For example, an object recognition system works on the principle of a neural network. First, the entire object is photographed (graphic information is obtained). The system then breaks the data down into points, finds lines among these points, builds simple shapes from the lines, and from those builds complex two-dimensional and then 3D objects.

For each of these methods, there are various algorithms for adjusting the parameters in order to achieve the best possible agreement with the known data. These algorithms are the actual learning processes in machine learning. Examples are gradient descent, backpropagation, and genetic algorithms.
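Gradient descent, the first algorithm named above, can be illustrated on the simplest possible model. The sketch below assumes a straight line y = w * x + b and mean squared error as the measure of "agreement with the known data"; the data points, learning rate and step count are chosen for illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = w * x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))
```

Backpropagation applies the same idea to neural networks by computing these gradients layer by layer.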

Some algorithms perform better or worse depending on the purpose of the application, and the data also plays a role. In many cases, very good results can be achieved using standard algorithms. In some special applications, however, it may be necessary to modify an algorithm or develop your own.

How Machine Learning Works

It is easy to look at machine learning as a magical black box into which you feed data and out of which come predictions. But there is nothing magical about machine learning, writes IDG News. In fact, to get better results it is important to understand how the different parts of machine learning work. So, join us on a tour.

As in many other IT contexts, such as DevOps, the term “pipeline” is used in machine learning. It is a visual metaphor for how data flows through a solution. The pipeline can be roughly divided into four parts:

  1. Collect data, which is called “ingesting”.
  2. Prepare data, for example by cleaning and, if needed, normalizing it. Normalization here should not be confused with normalization of relational databases; it means adapting different value scales to each other.
  3. Model training.
  4. Provide predictions.
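The four steps above can be sketched as four functions chained together. Everything here is a placeholder: the hard-coded rows, the function names and the deliberately trivial model (a line through the origin) are assumptions made to show the flow, not a recommended design.

```python
def ingest():
    """Step 1: collect raw data (here a hard-coded, made-up sample)."""
    return [(1.0, 12.0), (2.0, 19.0), (None, 7.0), (3.0, 31.0), (4.0, 42.0)]

def prepare(rows):
    """Step 2: clean the data, here by dropping rows with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def train(rows):
    """Step 3: fit y = w * x by least squares through the origin."""
    return sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)

def predict(w, x):
    """Step 4: use the trained model on new input."""
    return w * x

model = train(prepare(ingest()))
prediction = predict(model, 5.0)
print(round(model, 3), round(prediction, 3))
```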

Here are more detailed descriptions of the four phases:

Decide on data

Two things are needed to get started with machine learning: data to train a model and algorithms that control the training. Data can come from different sources. Often it is data that a business process is already collecting, either continuously or in archived form.

In some cases, you have to work with streaming data. Then you can choose between processing the stream directly and first storing the data in a database. If you process the stream directly, there is a further choice between two options: either you use new data to fine-tune an existing model, or you build new models from time to time and train them with new data.

These decisions affect the choice of algorithms. Some algorithms are suitable for fine-tuning models, others are not. In the latter case, you may have to start over and train a new model on the new data.
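The difference between fine-tuning and rebuilding can be shown with a deliberately trivial "model", a mean value. The class `RunningMean` and the function `retrain` are hypothetical names for this sketch; real models differ in whether they support this kind of incremental update at all.

```python
class RunningMean:
    """A model that can be fine-tuned: each new value updates the estimate in place."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        self.count += 1
        # Move the estimate toward the new value without storing old data.
        self.mean += (value - self.mean) / self.count

def retrain(values):
    """The alternative: rebuild the model from scratch on all stored data."""
    return sum(values) / len(values)

stream = [10.0, 12.0, 11.0, 13.0]  # made-up streaming values

online = RunningMean()
for value in stream:
    online.update(value)  # option 1: fine-tune as data arrives

batch = retrain(stream)   # option 2: store everything, then train a new model
print(online.mean, batch)
```

The first option needs no database of past values; the second needs the stored data but works with any algorithm.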

Data cleaning is often about scales

Data taken from many different sources can be quite inconsistent. One thing that often needs to be done is to normalize the data, i.e. to convert different data values to the same scale.

A simple example: 2.45 meters in the high jump can be considered worth as much as 8.95 meters in the long jump, as both are world records. To show that the two values are equally valuable, they need to be converted (normalized), for example to 1.0 in both cases.

But in some cases normalization is not appropriate. It depends on whether the actual scale matters. If you want to compare female and male high jumpers, it may be appropriate to normalize so that 2.45 meters for men has the same value as 2.09 meters for women, as both are world records. But if you want to compare high jumpers regardless of gender, you should not normalize the values.
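The jumping examples can be written out as a small calculation. This sketch assumes normalization by dividing each result by a reference mark for its group; the reference marks are the figures quoted in the text, and the names `RECORDS` and `normalize` are made up for illustration.

```python
# Reference marks in meters, as quoted in the text above.
RECORDS = {
    ("high jump", "men"): 2.45,
    ("high jump", "women"): 2.09,
    ("long jump", "men"): 8.95,
}

def normalize(result, discipline, group):
    """Scale a result so that the reference mark for its group maps to 1.0."""
    return result / RECORDS[(discipline, group)]

# Different disciplines become comparable after normalization.
high = normalize(2.45, "high jump", "men")
long_ = normalize(8.95, "long jump", "men")

# Per-gender normalization puts both high jump marks at the same value.
women = normalize(2.09, "high jump", "women")

# Comparing regardless of gender: keep the raw meters, where the gap remains.
raw_gap = 2.45 - 2.09
print(high, long_, women, round(raw_gap, 2))
```

Whether to divide by a per-group reference or keep raw values is exactly the decision described above: it depends on whether the original scale carries meaning for the question being asked.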

During the data preparation phase, it is also important to analyze how bias can affect models. Bias can enter, for example, through the choice of which data to use or through the way the data is normalized.

Time for hard training

The next phase is the actual training of a model. It involves using data to generate a model from which predictions can be made. A key activity during training is choosing settings, known as hyperparameter tuning.

A hyperparameter is a setting that controls how a model is created based on an algorithm. A very simple example is dividing a number of words into categories. In that case, one hyperparameter can be the number of categories you want. One way to arrive at good hyperparameters is simply to try out different values. In some cases, these settings can also be optimized automatically.
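"Trying them out" can be sketched as a loop over candidate values, each scored on data held back from training. The example below assumes a k-nearest-neighbor classifier, where the hyperparameter is k (the number of neighbors consulted); the data points, including one deliberately mislabeled training point, are invented.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Predict the majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, validation, k):
    """Share of held-out points that are classified correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == label for x, label in validation)
    return hits / len(validation)

# Hypothetical one-dimensional data; (2.4, "large") is a noisy label.
train = [(1.0, "small"), (1.5, "small"), (2.0, "small"), (2.4, "large"),
         (4.0, "large"), (4.5, "large"), (5.0, "large")]
validation = [(2.3, "small"), (1.2, "small"), (4.2, "large"), (3.4, "large")]

# Try each candidate value and keep the one with the best validation score.
scores = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Automatic tuning tools follow the same pattern; they mainly differ in how they choose which candidate values to try next.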

Sometimes training can be run in parallel on several processors, which of course provides performance benefits. The units of work do not have to be separate processors, which is why one speaks of “workers”. Workers are simply copies of a program that run at the same time in different places.

Parallelization can mainly be done in two ways: different “workers” can work with different parts of a data set, or different “workers” can work with different parts of the model.
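The first variant, workers handling different parts of the data set, can be sketched as follows. Threads stand in for workers here purely for illustration; in a real system the workers would typically be separate processes or machines. The model (y = w * x), the data and the function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_gradient(args):
    """One worker: summed squared-error gradient for y = w * x on its own shard."""
    w, shard = args
    return sum(2 * (w * x - y) * x for x, y in shard), len(shard)

def parallel_gradient(w, data, workers=2):
    """Split the data, let each worker handle one shard, then combine the results."""
    shards = [data[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(partial_gradient, [(w, s) for s in shards]))
    total = sum(g for g, _ in results)
    count = sum(n for _, n in results)
    return total / count  # same value as the gradient over the full data set

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # generated from y = 2x
w = 0.0
for _ in range(200):
    w -= 0.05 * parallel_gradient(w, data)
print(round(w, 6))
```

The second variant, splitting the model itself across workers, needs the workers to exchange intermediate results during each step and is not shown here.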

Time for delivery

The final phase is to use the trained model, which can be called the “predict and deliver” phase. Now you run the model on new data to generate a prediction. For example, in face recognition the incoming data is a digital image of a face. Based on its training with other images of faces, the model can now make new predictions.

How you handle the activities in the different phases, or the different parts of the pipeline, varies. Using cloud services increases the chance of handling multiple parts in the same place, such as training data, pre-trained models, and so on.
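The hand-over from training to the "predict and deliver" phase can be sketched as storing a trained model and loading it again where predictions are needed. JSON is used here only as a simple stand-in for whatever storage format a real system uses; the model and data are made up.

```python
import json

def train(rows):
    """Training phase: fit y = w * x by least squares through the origin."""
    w = sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)
    return {"w": w}

def predict(model, x):
    """Delivery phase: apply the stored parameters to new input."""
    return model["w"] * x

# Train once and store the result (here as a string instead of a file).
stored = json.dumps(train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]))

# Later, possibly on another machine: load the model and predict on new data.
model = json.loads(stored)
prediction = predict(model, 10.0)
print(prediction)
```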

In some cases, you have to decide whether the different parts should run on servers or on client devices. One advantage of running processing on a client, such as a smartphone, is increased availability for the user. One potential disadvantage is lower prediction quality, because fewer hardware resources are available; another is poorer performance, meaning it takes longer to generate a prediction.

Iterative working method

Illustrating the whole flow of machine learning as a pipeline is a bit misleading. The work is often iterative, that is, certain phases are repeated and refined. The typical example is a model that is fine-tuned with new data.

The advantage of thinking of a pipeline with delimited parts is that it becomes easy to focus on each part as a separate area that works in its own way.

A general observation is that machine learning could just as well be called data analysis, or even math, as AI. The reason machine learning is called AI may be that it is a technology that makes it possible to draw conclusions that humans, at least in most cases, cannot.