3 Practical SVM Examples to Boost Your Machine Learning Skills

In the previous guide on Support Vector Machines, we covered the basic implementation and working of this machine learning classifier.

In this post, I’ll walk you through three practical examples where we’ll use SVM for both classification and regression tasks. We start with a simple linear classification example, move on to Support Vector Regression on a noisy sine wave, and finish with a non-linear classification problem where we also fine-tune the hyperparameters to optimize the model. Each example comes with Python code and explanations to help you understand what we’re doing and why.

1. Binary Classification on Linearly Separable Data

The first and most basic use of SVM is for binary classification on linearly separable data. Here, we create simple 2D data where two classes can be separated by a straight line. In this example, we will learn the basics of linear SVMs and how to visualize decision boundaries.

# Importing libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Generate linearly separable data
X, y = datasets.make_classification(n_features=2, n_classes=2, n_redundant=0,
                                    n_clusters_per_class=1, n_samples=100, random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit SVM classifier
clf = SVC(kernel='linear')
clf.fit(X_train, y_train)

# Predictions and accuracy
y_pred = clf.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")

# Plot the data and decision boundary
def plot_decision_boundary(X, y, model):
    xx, yy = np.meshgrid(np.linspace(X[:,0].min()-1, X[:,0].max()+1, 100),
                         np.linspace(X[:,1].min()-1, X[:,1].max()+1, 100))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, alpha=0.3)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
    plt.show()

plot_decision_boundary(X, y, clf)


Explanation of key code lines:

  • clf = SVC(kernel='linear'): Since the data is linearly separable, we create an instance of an SVM model with a linear kernel.
  • clf.fit(X_train, y_train): We then train the linear SVM model on the training dataset. Once fitted, the learned hyperplane can be inspected directly, as in the sketch after this list.
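Because the kernel is linear, the separating hyperplane can be read straight off the fitted model. A minimal sketch, reusing the clf trained above:

# Inspect the learned hyperplane w1*x1 + w2*x2 + b = 0
w = clf.coef_[0]        # weights of the linear decision boundary
b = clf.intercept_[0]   # bias term
print(f"Boundary: {w[0]:.2f}*x1 + {w[1]:.2f}*x2 + {b:.2f} = 0")
print(f"Support vectors per class: {clf.n_support_}")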

2. Support Vector Regression (SVR) on a Sine Wave

SVM isn’t just for classification—it can also be used for regression tasks. In this example, we use Support Vector Regression (SVR) to fit a sine wave with added noise.

Here we pair SVR with an RBF (Radial Basis Function) kernel, which lets the model capture relationships that are not linear in the original feature space. Through the kernel trick, the inputs are implicitly mapped into a higher-dimensional space where a linear fit (or, in classification, a linear decision boundary) becomes possible.
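To make the kernel trick a little less abstract, here is a minimal sketch of what the RBF kernel actually computes for a pair of points: a similarity score that decays with their squared distance, scaled by gamma. The rbf_kernel helper is scikit-learn's own implementation; the manual line is just the formula written out:

from sklearn.metrics.pairwise import rbf_kernel

x1 = np.array([[0.0, 0.0]])
x2 = np.array([[1.0, 2.0]])
gamma = 0.1

# K(x, x') = exp(-gamma * ||x - x'||^2)
manual = np.exp(-gamma * np.sum((x1 - x2) ** 2))
print(manual)                            # ~0.61 for these two points
print(rbf_kernel(x1, x2, gamma=gamma))   # same value from scikit-learn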

This example also introduces the epsilon-tube used in Support Vector Regression, where the goal is to predict continuous values. SVR defines a margin of tolerance (the epsilon-tube) around the predictions: deviations that fall inside this tube are ignored during training, so the model is penalized only for errors larger than epsilon.
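In loss terms, this is the epsilon-insensitive loss. A minimal sketch of how absolute errors turn into penalties (the error values are made up for illustration):

# Epsilon-insensitive loss: errors smaller than epsilon cost nothing
epsilon = 0.1
errors = np.array([0.05, 0.10, 0.30])    # |y_true - y_pred| for three points
loss = np.maximum(0, errors - epsilon)   # -> [0.0, 0.0, 0.2]
print(loss)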

# Import libraries
from sklearn.svm import SVR

# Generate sine wave data
np.random.seed(42)
X = np.sort(5 * np.random.rand(100, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

# Fit SVR with RBF kernel
svr_rbf = SVR(kernel='rbf', C=100, gamma=0.1, epsilon=0.1)
svr_rbf.fit(X, y)

# Predict and plot
X_pred = np.linspace(0, 5, 100).reshape(-1, 1)
y_pred = svr_rbf.predict(X_pred)

plt.scatter(X, y, color='darkorange', label='Data')
plt.plot(X_pred, y_pred, color='navy', lw=2, label='SVR RBF Fit')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Support Vector Regression')
plt.legend()
plt.show()


Key Code Lines Explained

  • SVR(kernel='rbf'): Specifies the regression model with an RBF kernel for non-linear fitting.
  • epsilon=0.1: Defines a margin of tolerance where deviations are not penalized. Smaller values create tighter fits, while larger values allow for more flexibility. The effect is easy to see by counting support vectors, as in the sketch below.
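Only the points that end up on or outside the epsilon-tube become support vectors, so a larger epsilon should leave fewer of them. A minimal sketch, reusing the sine-wave X and y generated above:

for eps in [0.01, 0.1, 0.5]:
    model = SVR(kernel='rbf', C=100, gamma=0.1, epsilon=eps).fit(X, y)
    # Points strictly inside the tube are not support vectors
    print(f"epsilon={eps}: {len(model.support_)} of {len(X)} points are support vectors")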

3. Non-linear Data with RBF Kernel and Hyperparameter Tuning

After starting with a linearly separable dataset, let us now try a slightly more challenging example: training our model on a non-linear classification dataset. In many real-world datasets, the classes are not linearly separable. Here again, we use the Radial Basis Function (RBF) kernel to handle non-linear relationships.

Moreover, we will understand the important hyperparameters for Support Vector Machines and tune them to optimize the model performance.


# Import libraries
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV

# Generate non-linear data
X, y = make_moons(n_samples=200, noise=0.1, random_state=42)

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Hyperparameter tuning with GridSearchCV
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.01, 0.1, 0.5, 1],
    'kernel': ['rbf']
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Best parameters and model
print(f"Best parameters: {grid_search.best_params_}")
best_model = grid_search.best_estimator_

# Predictions and accuracy
y_pred = best_model.predict(X_test)
print(f"Accuracy with RBF Kernel and GridSearch: {accuracy_score(y_test, y_pred):.2f}")

# Plot decision boundary
plot_decision_boundary(X, y, best_model)


Explanation of key code lines:

  • GridSearchCV: Automates hyperparameter tuning by testing every combination of C and gamma with cross-validation. C, the regularization parameter, balances maximizing the margin against minimizing classification error: a smaller value makes the decision boundary smoother, while a larger value focuses on correctly classifying every training point. Gamma, the kernel coefficient, determines the influence of a single data point: higher values lead to more localized decision boundaries, while lower values create smoother boundaries. The full grid of scores can be inspected as shown in the sketch after this list.
  • param_grid: Defines the grid of hyperparameters to search over.
  • best_model = grid_search.best_estimator_: Retrieves the model refitted with the best hyperparameter combination.
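Beyond the single best combination, it is often useful to see how every C/gamma pair scored during cross-validation. A minimal sketch, assuming pandas is available, that reads the scores out of grid_search.cv_results_:

import pandas as pd

# Mean cross-validation accuracy for every C/gamma combination tried
results = pd.DataFrame(grid_search.cv_results_)
print(results[['param_C', 'param_gamma', 'mean_test_score']]
      .sort_values('mean_test_score', ascending=False)
      .head())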

Summary

In this blog, we’ve curated three engaging examples to help us master Support Vector Machines (SVM). Here’s what we’ve learned from each example:

We started with binary classification using a linear SVM, where we trained the model on a linearly separable dataset. To make things a little more interesting, we then built a Support Vector Regression (SVR) model on a sine wave to understand the regression variant of Support Vector Machines. Finally, to optimize model performance on non-linear data, we tuned the hyperparameters using scikit-learn's GridSearchCV.

It is always fun to build things in Python like these machine learning models. But it gets even more interesting when the model is evaluated using some metric. One such metric for classifiers is the Confusion Matrix that analyses the correct and the incorrect classifications made by the model. How does it work? Let us have a look together -> Confusion Matrix 101: Understanding Precision and Recall for Machine Learning Beginners

If you enjoyed this blog and found it helpful, support me by following me on social media.
