Monday, February 22, 2021

Machine Learning - Support Vector Machines (SVM) Method

Introduction
This method is popular. The explanation is as follows:
Support Vector Machines (SVM) are one of the most popular supervised learning methods in Machine Learning (ML). Many researchers have reported superior results compared with older ML techniques.

SVM can be applied to regression problems as well as classification problems, ....
SVM can be used for both Linear Applications and Nonlinear Applications.

Linear Regression means finding a linear equation, i.e. the equation of a line, by looking at the points at hand.
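
To make both uses concrete, here is a minimal sketch, assuming scikit-learn (the post does not name a library), that applies SVM to a classification problem and to a regression problem:

from sklearn.datasets import make_classification, make_regression
from sklearn.svm import SVC, SVR

# Classification: SVC learns a separating hyperplane between the classes
X_clf, y_clf = make_classification(n_samples=200, n_features=4, random_state=0)
clf = SVC(kernel="linear").fit(X_clf, y_clf)
print("classification accuracy:", clf.score(X_clf, y_clf))

# Regression: SVR fits a function within an epsilon-tube around the targets
X_reg, y_reg = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)
reg = SVR(kernel="linear").fit(X_reg, y_reg)
print("regression R^2:", reg.score(X_reg, y_reg))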

SVM Linear Applications
The explanation is as follows:
A popular classifier for linear applications, because SVMs have yielded excellent generalization performance on many statistical problems with minimal prior knowledge, and also when the dimension of the input space (features) is very high.
SVM Nonlinear Applications
For nonlinear applications, SVM relies on the Kernel Trick described below: the data is mapped to a higher dimension in which the classes become linearly separable.
Pictorially it looks like this:
Maximum margin hyperplane
The explanation is as follows:
The objective is to find the line that passes as far as possible from the closest points of both classes; this distance is called the margin.
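
A minimal sketch, again assuming scikit-learn, showing that the fitted classifier exposes those closest points (the support vectors) and the resulting margin width:

import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters in 2D
X = np.array([[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # a very large C approximates a hard margin

print(clf.support_vectors_)                  # the closest points of the two classes
w = clf.coef_[0]
print("margin width:", 2 / np.linalg.norm(w))  # distance between the two margin lines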
Kernel Trick
The explanation is as follows:
SVM uses a Kernel trick to transform to a higher nonlinear dimension where an optimal hyperplane can more easily be defined.
The kernel types are as follows (a code sketch follows the list):
- Linear Kernel
- Polynomial Kernel
- RBF (Radial Basis Function) Kernel
- Gaussian Kernel (the RBF kernel is itself a Gaussian kernel)
- Hyperbolic Tangent (Sigmoid) Kernel
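
In scikit-learn terms (an assumption; there the Gaussian kernel is called "rbf" and the hyperbolic tangent kernel "sigmoid"), switching kernels is a one-parameter change. A minimal sketch:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Same data, different kernels; only the kernel parameter changes
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(f"{kernel:8s} training accuracy: {clf.score(X, y):.2f}")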
Neural Networks vs SVMs
SVM's training problem is convex, which makes it easier to optimize reliably. The explanation is as follows:
One important argument is that SVM is convex but NN generally is not. Having a convex problem is desirable because we have more tools to solve it more reliably.

If we know our data, we can pick a model that fits it better. For example, suppose we have donut-shaped data, like this:
Choosing the right kernel also matters. The explanation is as follows:
Using SVM with the right kernel is better than using a NN, and a NN may overfit the data in this case.
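
A minimal sketch of this case, assuming scikit-learn (make_circles stands in for the donut-shaped data): the RBF kernel separates the rings while a linear kernel performs at chance level:

from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Donut-shaped data: one class is a ring enclosing the other
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(f"{kernel} kernel accuracy: {clf.score(X, y):.2f}")
# linear stays near 0.5 (chance level); rbf separates the rings almost perfectly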
Neural Networks are actually older than SVMs. The explanation is as follows.
Historically, neural networks are older than SVMs and SVMs were initially developed as a method of efficiently training neural networks. So, when SVMs matured in the 1990s, there was a reason why people switched from neural networks to SVMs. Later, as data sets grew larger and more complex, so that feature selection became a (even bigger) problem, while, at the same time, computational power rose, people switched back again.

This development already suggests that both have their strengths and weaknesses and that there is, as Haitao says, no free lunch.

Essentially, both methods do some kind of data transformation to "send" them into a higher dimensional space. What the kernel function does for the SVMs, the hidden layers do for neural networks. The last, output layer in the network also performs a linear separation of the so transformed data. So this is not the core difference.
Both methods increase the dimension of the data. The explanation is as follows.
As you can see below, a two-layer neural network, with 5 neurons in the hidden layer, can perfectly separate the two classes. The blue class can be fully enclosed in a pentagon (pale blue) area. Each neuron in the hidden layer determines a linear boundary---a side of the pentagon, producing, say, +1 when its input is a point on the "blue" side of the line and -1 otherwise (it could also produce 0, it doesn't really matter).

I have used different colours to highlight which neuron is responsible for which boundary. The output neuron (black) simply checks (performs a logical AND, which is again a linearly separable function) whether all hidden neurons give the same, "positive" answer. Observe that this last neuron has five inputs. I.e. its input is a 5-dimensional vector. So the hidden layers have transformed 2D data into 5D data.
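
A hand-built sketch of this construction in plain NumPy (the weights are chosen by hand for illustration rather than learned) makes the 2D-to-5D transformation concrete:

import numpy as np

# Five hidden neurons, each implementing one side of the pentagon as a
# linear boundary. The inward normals point at five evenly spaced angles.
angles = np.linspace(0, 2 * np.pi, 5, endpoint=False)
W = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (5, 2)
b = np.ones(5)  # each side sits at distance 1 from the origin

def hidden(x):
    # +1 on the "blue" side of each line, -1 otherwise: 2D input -> 5D output
    return np.sign(b - W @ x)

def output(h):
    # Logical AND: inside the pentagon only if all five neurons answer +1
    return 1 if np.all(h == 1) else -1

print(output(hidden(np.array([0.0, 0.0]))))  #  1 -> inside the pentagon
print(output(hidden(np.array([2.0, 0.0]))))  # -1 -> outside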

The following explains how each method draws its boundary, and why SVM struggles as the data set grows:
Notice, however, that the boundaries drawn by the neural network are somewhat arbitrary. You can shift and rotate them slightly without really affecting the result. How the network draws the boundary is somewhat random; it depends on the initialisation of the weights and on the order you present the training set to it. This is where SVMs differ: They are guaranteed to draw the boundary mid-way between the closest points of the two classes! It can be (has been) shown that this boundary is the optimal one. Finding the boundary is a convex (quadratic) optimisation problem for which fast algorithms exist. Also, the kernel trick has the computational advantage that it's usually much faster to compute a single non-linear function than to pass the vector through many hidden layers.
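
This determinism is easy to observe in practice. A minimal sketch, assuming scikit-learn (the models and data are my choice): the linear SVM's coefficients come out identical on repeated fits, while the network's learned weights change with the random seed.

from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# SVM: the max-margin boundary is the unique solution of a convex QP,
# so repeated fits yield the same coefficients.
for _ in range(2):
    print(SVC(kernel="linear").fit(X, y).coef_)

# NN: the boundary depends on the random weight initialisation.
for seed in (0, 1):
    mlp = MLPClassifier(hidden_layer_sizes=(5,), random_state=seed,
                        max_iter=2000).fit(X, y)
    print(mlp.coefs_[0][:, 0])  # the first hidden unit's weights differ per seed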

However, since SVMs never compute the boundary explicitly, but through the weighted sum of the kernel functions over the pairs of the input data, the computational effort scales quadratically with the data set size. For large data sets this quickly becomes impractical.

Also, when the data are high-dimensional (think of images, with millions of pixels) the SVMs might become overwhelmed by the curse of dimensionality: It becomes too easy to draw a good boundary on the training set, but which has poor generalisation properties. Convolutional neural networks, on the other hand, are capable of learning the relevant features from the data.
So the basic advice is as follows:
In summary, my suggestion is to use SVMs for low-dimensional, small data sets and neural networks for high-dimensional large data sets.


