Monday, February 22, 2021

Machine Learning - Support Vector Machines (SVM) Method

Introduction
This method is popular. The explanation is as follows:
Support Vector Machines (SVM) are one of the most popular supervised learning methods in Machine Learning (ML). Many researchers have reported superior results compared with older ML techniques.

SVM can be applied to regression problems as well as classification problems, ....
SVM can be used for both Linear Applications and Nonlinear Applications.

Linear Regression means finding a linear equation, i.e. the equation of a line, by looking at the points at hand.
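
To make both uses concrete, here is a minimal sketch, assuming scikit-learn (the post does not name a library), that applies SVM to a classification problem and to a regression problem:

from sklearn.datasets import make_classification, make_regression
from sklearn.svm import SVC, SVR

# Classification: SVC learns a separating hyperplane between the classes
X_clf, y_clf = make_classification(n_samples=200, n_features=4, random_state=0)
clf = SVC(kernel="linear").fit(X_clf, y_clf)
print("classification accuracy:", clf.score(X_clf, y_clf))

# Regression: SVR fits a function within an epsilon-tube around the targets
X_reg, y_reg = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)
reg = SVR(kernel="linear").fit(X_reg, y_reg)
print("regression R^2:", reg.score(X_reg, y_reg))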

SVM Linear Applications
The explanation is as follows:
A popular classifier for linear applications, because SVMs have yielded excellent generalization performance on many statistical problems with minimal prior knowledge, and also when the dimension of the input space (features) is very high.
SVM Nonlinear Applications
For nonlinear applications, SVM relies on the Kernel Trick described below: the data is mapped to a higher dimension in which the classes become linearly separable.
Pictorially it looks like this:
Maximum margin hyperplane
The explanation is as follows:
The objective is to find the line that passes as far as possible from the closest points of both classes; this distance is called the margin.
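
A minimal sketch, again assuming scikit-learn, showing that the fitted classifier exposes those closest points (the support vectors) and the resulting margin width:

import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters in 2D
X = np.array([[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # a very large C approximates a hard margin

print(clf.support_vectors_)                  # the closest points of the two classes
w = clf.coef_[0]
print("margin width:", 2 / np.linalg.norm(w))  # distance between the two margin lines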
Kernel Trick
The explanation is as follows:
SVM uses a Kernel trick to transform to a higher nonlinear dimension where an optimal hyperplane can more easily be defined.
The kernel types are as follows (a code sketch follows the list):
- Linear Kernel
- Polynomial Kernel
- RBF (Radial Basis Function) Kernel
- Gaussian Kernel (the RBF kernel is itself a Gaussian kernel)
- Hyperbolic Tangent (Sigmoid) Kernel
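
In scikit-learn terms (an assumption; there the Gaussian kernel is called "rbf" and the hyperbolic tangent kernel "sigmoid"), switching kernels is a one-parameter change. A minimal sketch:

from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Same data, different kernels; only the kernel parameter changes
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(f"{kernel:8s} training accuracy: {clf.score(X, y):.2f}")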
Neural Networks vs SVMs
SVM's training problem is convex, which makes it easier to optimize reliably. The explanation is as follows:
One important argument is that SVM is convex but NN generally is not. Having a convex problem is desirable because we have more tools to solve it more reliably.

If we know our data, we can pick a model that fits it better. For example, suppose we have donut-shaped data, like this:
Choosing the right kernel also matters. The explanation is as follows:
Using SVM with the right kernel is better than using a NN, and a NN may overfit the data in this case.
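
A minimal sketch of this case, assuming scikit-learn (make_circles stands in for the donut-shaped data): the RBF kernel separates the rings while a linear kernel performs at chance level:

from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Donut-shaped data: one class is a ring enclosing the other
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(f"{kernel} kernel accuracy: {clf.score(X, y):.2f}")
# linear stays near 0.5 (chance level); rbf separates the rings almost perfectly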
Neural Networks are actually older than SVMs. The explanation is as follows.
Historically, neural networks are older than SVMs and SVMs were initially developed as a method of efficiently training neural networks. So, when SVMs matured in the 1990s, there was a reason why people switched from neural networks to SVMs. Later, as data sets grew larger and more complex, so that feature selection became a (even bigger) problem, while, at the same time, computational power rose, people switched back again.

This development already suggests that both have their strengths and weaknesses and that there is, as Haitao says, no free lunch.

Essentially, both methods do some kind of data transformation to "send" them into a higher dimensional space. What the kernel function does for the SVMs, the hidden layers do for neural networks. The last, output layer in the network also performs a linear separation of the so transformed data. So this is not the core difference.
Both methods increase the dimension of the data. The explanation is as follows.
As you can see below, a two-layer neural network, with 5 neurons in the hidden layer, can perfectly separate the two classes. The blue class can be fully enclosed in a pentagon (pale blue) area. Each neuron in the hidden layer determines a linear boundary---a side of the pentagon, producing, say, +1 when its input is a point on the "blue" side of the line and -1 otherwise (it could also produce 0, it doesn't really matter).

I have used different colours to highlight which neuron is responsible for which boundary. The output neuron (black) simply checks (performs a logical AND, which is again a linearly separable function) whether all hidden neurons give the same, "positive" answer. Observe that this last neuron has five inputs. I.e. its input is a 5-dimensional vector. So the hidden layers have transformed 2D data into 5D data.
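
A hand-built sketch of this construction in plain NumPy (the weights are chosen by hand for illustration rather than learned) makes the 2D-to-5D transformation concrete:

import numpy as np

# Five hidden neurons, each implementing one side of the pentagon as a
# linear boundary. The inward normals point at five evenly spaced angles.
angles = np.linspace(0, 2 * np.pi, 5, endpoint=False)
W = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (5, 2)
b = np.ones(5)  # each side sits at distance 1 from the origin

def hidden(x):
    # +1 on the "blue" side of each line, -1 otherwise: 2D input -> 5D output
    return np.sign(b - W @ x)

def output(h):
    # Logical AND: inside the pentagon only if all five neurons answer +1
    return 1 if np.all(h == 1) else -1

print(output(hidden(np.array([0.0, 0.0]))))  #  1 -> inside the pentagon
print(output(hidden(np.array([2.0, 0.0]))))  # -1 -> outside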

The following explains how each method draws its boundary, and why SVM struggles as the data set grows:
Notice, however, that the boundaries drawn by the neural network are somewhat arbitrary. You can shift and rotate them slightly without really affecting the result. How the network draws the boundary is somewhat random; it depends on the initialisation of the weights and on the order you present the training set to it. This is where SVMs differ: They are guaranteed to draw the boundary mid-way between the closest points of the two classes! It can be (has been) shown that this boundary is the optimal one. Finding the boundary is a convex (quadratic) optimisation problem for which fast algorithms exist. Also, the kernel trick has the computational advantage that it's usually much faster to compute a single non-linear function than to pass the vector through many hidden layers.
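
This determinism is easy to observe in practice. A minimal sketch, assuming scikit-learn (the models and data are my choice): the linear SVM's coefficients come out identical on repeated fits, while the network's learned weights change with the random seed.

from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)

# SVM: the max-margin boundary is the unique solution of a convex QP,
# so repeated fits yield the same coefficients.
for _ in range(2):
    print(SVC(kernel="linear").fit(X, y).coef_)

# NN: the boundary depends on the random weight initialisation.
for seed in (0, 1):
    mlp = MLPClassifier(hidden_layer_sizes=(5,), random_state=seed,
                        max_iter=2000).fit(X, y)
    print(mlp.coefs_[0][:, 0])  # the first hidden unit's weights differ per seed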

However, since SVMs never compute the boundary explicitly, but through the weighted sum of the kernel functions over the pairs of the input data, the computational effort scales quadratically with the data set size. For large data sets this quickly becomes impractical.

Also, when the data are high-dimensional (think of images, with millions of pixels) the SVMs might become overwhelmed by the curse of dimensionality: It becomes too easy to draw a good boundary on the training set, but which has poor generalisation properties. Convolutional neural networks, on the other hand, are capable of learning the relevant features from the data.
So the basic advice is as follows:
In summary, my suggestion is to use SVMs for low-dimensional, small data sets and neural networks for high-dimensional large data sets.


