What are the disadvantages of SVM algorithms
It seems that, apart from the sigmoid kernel, all of the kernel functions perform fairly well. In fact, rbf and poly each have their own drawbacks. We'll use the breast cancer dataset to demonstrate:
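A minimal sketch of that demonstration, assuming scikit-learn's `SVC` and the built-in breast cancer dataset. The `max_iter` cap is our own addition so the snippet always terminates; without it, the poly kernel can appear to hang on this raw data, which is exactly the behavior described next.

```python
from time import time

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
Xtrain, Xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.3, random_state=420)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    t0 = time()
    # max_iter caps libsvm's optimizer so the loop always finishes;
    # without a cap, "poly" can run more or less indefinitely here
    clf = SVC(kernel=kernel, gamma="auto", degree=3, cache_size=1000,
              max_iter=500_000).fit(Xtrain, ytrain)
    scores[kernel] = clf.score(Xtest, ytest)
    print(f"{kernel:>8}: accuracy {scores[kernel]:.3f}, {time() - t0:.2f}s")
```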
Then we find that the program never finishes, no matter how long we wait: after the linear kernel runs, no further results are printed. This shows that the polynomial kernel takes an enormous amount of time on this data and the computation is extremely slow. Remove the polynomial kernel from the loop and try again to get the result:
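The rerun might look like this (our sketch, same split as before, scikit-learn assumed); with "poly" dropped, the remaining three kernels finish quickly even on the raw data:

```python
from time import time

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
Xtrain, Xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.3, random_state=420)

scores = {}
for kernel in ("linear", "rbf", "sigmoid"):  # "poly" removed from the loop
    t0 = time()
    clf = SVC(kernel=kernel, gamma="auto", cache_size=1000).fit(Xtrain, ytrain)
    scores[kernel] = clf.score(Xtest, ytest)
    print(f"{kernel:>8}: accuracy {scores[kernel]:.3f}, {time() - t0:.2f}s")
```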
We can make two discoveries. First, the breast cancer dataset is a linear dataset, and the linear kernel works well on it, while the two kernels suited to non-linearity, rbf and sigmoid, perform so poorly here as to be unusable. Second, the linear kernel runs far more slowly than the two nonlinear kernels.
If the data is linear, then setting the degree parameter to 1 should let the polynomial kernel achieve good results as well:
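For instance (our sketch, same split as before), with `degree=1` all four kernels can go back into the loop:

```python
from time import time

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
Xtrain, Xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.3, random_state=420)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    t0 = time()
    # degree only affects the poly kernel; with degree=1 it is essentially linear
    clf = SVC(kernel=kernel, gamma="auto", degree=1,
              cache_size=1000).fit(Xtrain, ytrain)
    scores[kernel] = clf.score(Xtest, ytest)
    print(f"{kernel:>8}: accuracy {scores[kernel]:.3f}, {time() - t0:.2f}s")
```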
The polynomial kernel now runs dramatically faster, and its accuracy also rises to a level close to the linear kernel's, which is pleasing. However, in earlier experiments we learned that rbf can also work very well on linear data. So why are its results so poor here?
In fact, the real problem here is the scale of the features. Do you remember how we solve for the decision boundary, and how we judge which side of the boundary a point lies on? Both depend on computing distances. While we cannot say that SVM is purely a distance-based model, it is seriously affected by the scale of the data. Let's examine the feature scales of the breast cancer dataset:
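For example, comparing the per-feature means (a numpy sketch):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

X = load_breast_cancer().data
means = X.mean(axis=0)
# the features live on wildly different scales: some hover around 0.05,
# others around several hundred
print("smallest feature mean:", means.min())
print("largest feature mean:", means.max())
print("ratio:", means.max() / means.min())
```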
At a glance, it turns out that the features are on wildly different scales. Let's use the standardization class from the data-preprocessing module to standardize the data:
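With scikit-learn that is `StandardScaler`, which rescales every feature to zero mean and unit variance:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

X = load_breast_cancer().data
X_std = StandardScaler().fit_transform(X)
# every feature now has mean ~0 and standard deviation ~1
print(X_std.mean(axis=0).round(6).max(), X_std.std(axis=0).round(6).max())
```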
After standardizing, let SVC run through the kernel functions again. This time we again set degree to 1 and observe how each kernel performs on the rescaled data:
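Our sketch of that rerun; note that for simplicity we scale the full dataset before splitting, though in practice the scaler should be fit on the training split only to avoid leaking test-set statistics:

```python
from time import time

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)  # remove the scale differences first
Xtrain, Xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.3, random_state=420)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    t0 = time()
    clf = SVC(kernel=kernel, gamma="auto", degree=1,
              cache_size=1000).fit(Xtrain, ytrain)
    scores[kernel] = clf.score(Xtest, ytest)
    print(f"{kernel:>8}: accuracy {scores[kernel]:.3f}, {time() - t0:.2f}s")
```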
Once the feature scales are unified, we can observe that the computation time of every kernel drops sharply; the linear kernel speeds up especially, and the polynomial kernel becomes the fastest of all. Second, rbf now shows very good results. From this exploration we can draw two conclusions:
1. The linear kernel, and especially the polynomial kernel at high degrees, can be very slow to compute.
2. Neither the rbf kernel nor the polynomial kernel handles data well when the features are on very different scales.
Fortunately, rescaling the data corrects both of these shortcomings. Therefore, before running an SVM, it is strongly recommended that you first standardize the data!