This Rmarkdown documents contains the code and 2D simulations of the error propagation and classification uncertainty estimation using Monte Carlo simulations for two generic classes with polynomial separation classified with Polynomial SVM.
windowsFonts(A = windowsFont("DejaVu Sans"))
palette("Tableau10")
library(caret) #classification using train()
library(kernlab)
library(MASS) #generate the samples and calculate the augmented noise
library(dplyr) #data handling
library(knitr) #knit options for the Rmarkdown report
library(kableExtra) #table options
source("predict_with_uncertainty_parallel_v4.R")
source("plot_functions.R")
set.seed(123)
n_points <- 10
n_replicates <- 5
noise_sd <- 0.1
rho <- 0
cov_matrix <- matrix(c(noise_sd^2, rho * noise_sd^2,
rho * noise_sd^2, noise_sd^2), nrow = 2)
# Define curve generator with vertical offset
curve_points <- function(n_points, curvature = 2, offset = 0, x_range = c(-0.8, 0.8)) {
x_vals <- seq(x_range[1], x_range[2], length.out = n_points)
y_vals <- (x_vals)^curvature + offset
return(data.frame(x = x_vals, y = y_vals))
}
# Replicated point generation
generate_class_curve <- function(n_points, n_replicates, centers, cov_matrix) {
points <- matrix(0, nrow = n_points * n_replicates, ncol = 2)
for (i in 1:n_points) {
base_point <- c(centers$x[i] + rnorm(1, 0, 0.1), centers$y[i] + rnorm(1, 0, 0.1))
for (j in 1:n_replicates) {
noise <- mvrnorm(1, mu = c(0, 0), Sigma = cov_matrix)
points[(i - 1) * n_replicates + j, ] <- base_point + noise
}
}
return(points)
}
# Class 1 and Class 2 on same curve, offset vertically
curve1 <- curve_points(n_points, curvature = 2, offset = 0)
curve2 <- curve_points(n_points, curvature = 2, offset = 0.5) # class 2 is shifted up
class1 <- generate_class_curve(n_points, n_replicates, curve1, cov_matrix)
class2 <- generate_class_curve(n_points, n_replicates, curve2, cov_matrix)
generate_class_curve <- function(n_points, n_replicates, centers, cov_matrix) {
points <- matrix(0, nrow = n_points * n_replicates, ncol = 2)
for (i in 1:n_points) {
base_point <- c(centers$x[i] + rnorm(1, 0, 0.1), centers$y[i] + rnorm(1, 0, 0.1))
for (j in 1:n_replicates) {
noise <- mvrnorm(1, mu = c(0, 0), Sigma = cov_matrix)
points[(i - 1) * n_replicates + j, ] <- base_point + noise
}
}
return(points)
}
# Generate classes
class1 <- generate_class_curve(n_points, n_replicates, curve1, cov_matrix)
class2 <- generate_class_curve(n_points, n_replicates, curve2, cov_matrix)
data <- data.frame(
Class = factor(rep(c("Class 1", "Class 2"), each = n_points * n_replicates)),
Sample_ID = rep(1:(2 * n_points), each = n_replicates),
X1 = c(class1[, 1], class2[, 1]),
X2 = c(class1[, 2], class2[, 2])
)
# Mean center the data
data$X1 <- data$X1 - mean(data$X1)
data$X2 <- data$X2 - mean(data$X2)
# Calculate the mean and standard deviation of each sample
mean_data <- aggregate(cbind(X1, X2) ~ Class + Sample_ID, data = data, FUN = mean)
sd_data <- aggregate(cbind(X1, X2) ~ Class + Sample_ID, data = data, FUN = sd)
rm(class1, class2)
svm_model <- train(y = mean_data$Class,
x = mean_data[,3:4],
method = "svmPoly")
svm_model
## Support Vector Machines with Polynomial Kernel
##
## 20 samples
## 2 predictor
## 2 classes: 'Class 1', 'Class 2'
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 20, 20, 20, 20, 20, 20, ...
## Resampling results across tuning parameters:
##
## degree scale C Accuracy Kappa
## 1 0.001 0.25 0.4413968 0.09962096
## 1 0.001 0.50 0.4413968 0.09962096
## 1 0.001 1.00 0.4413968 0.09962096
## 1 0.010 0.25 0.4413968 0.09962096
## 1 0.010 0.50 0.4413968 0.09962096
## 1 0.010 1.00 0.4413968 0.09962096
## 1 0.100 0.25 0.4515556 0.11578257
## 1 0.100 0.50 0.5377937 0.23299353
## 1 0.100 1.00 0.6264762 0.32890026
## 2 0.001 0.25 0.4413968 0.09962096
## 2 0.001 0.50 0.4413968 0.09962096
## 2 0.001 1.00 0.4413968 0.09962096
## 2 0.010 0.25 0.4413968 0.09962096
## 2 0.010 0.50 0.4413968 0.09962096
## 2 0.010 1.00 0.4413968 0.09962096
## 2 0.100 0.25 0.5377937 0.23299353
## 2 0.100 0.50 0.6368413 0.34194193
## 2 0.100 1.00 0.7106667 0.46895787
## 3 0.001 0.25 0.4413968 0.09962096
## 3 0.001 0.50 0.4413968 0.09962096
## 3 0.001 1.00 0.4413968 0.09962096
## 3 0.010 0.25 0.4413968 0.09962096
## 3 0.010 0.50 0.4413968 0.09962096
## 3 0.010 1.00 0.4610000 0.11983843
## 3 0.100 0.25 0.5613651 0.23825407
## 3 0.100 0.50 0.6676349 0.38951448
## 3 0.100 1.00 0.7264603 0.50677692
##
## Accuracy was used to select the optimal model using the largest value.
## The final values used for the model were degree = 3, scale = 0.1 and C = 1.
The predict_with_uncertainty_parallel()
function is used
to estimate the uncertainty of each sample.
loc_pred <- predict_with_uncertainty(data[,-1],
svm_model,
n_sim = 500,
return_simulations = T)
This function returns the mean and standard deviation of the \(\hat{y}\) of each sample based on the Monte
Carlo simulation. It also returns the area above and bellow the decision
boundary (in this case, zero) of each sample. The classification output
indicates if the sample is being classified with certainty or not. For
samples where the distribution crosses the decision boundary
(considering the 95% confidence level), the sample will be assigned as
uncertain. Once return_simulations = T
, the augmented noise
will be saved. The results for this 2D simulated data is displayed in
Table 1.
class | sample | y_mean | y_sd | area_above_0 | area_below_0 | classification |
---|---|---|---|---|---|---|
Class 1 | 1 | 0.4193 | 0.1841 | 0.989 | 0.011 | Class 1 |
Class 1 | 2 | 0.5893 | 0.1187 | 1.000 | 0.000 | Class 1 |
Class 1 | 3 | 0.3278 | 0.0735 | 1.000 | 0.000 | Class 1 |
Class 1 | 4 | 1.3338 | 0.1170 | 1.000 | 0.000 | Class 1 |
Class 1 | 5 | 0.9997 | 0.1134 | 1.000 | 0.000 | Class 1 |
Class 1 | 6 | 1.6352 | 0.1797 | 1.000 | 0.000 | Class 1 |
Class 1 | 7 | 1.3401 | 0.1353 | 1.000 | 0.000 | Class 1 |
Class 1 | 8 | 0.4900 | 0.0984 | 1.000 | 0.000 | Class 1 |
Class 1 | 9 | 0.1572 | 0.0957 | 0.950 | 0.050 | Uncertain |
Class 1 | 10 | -0.1727 | 0.1301 | 0.092 | 0.908 | Uncertain |
Class 2 | 1 | -1.8278 | 0.1588 | 0.000 | 1.000 | Class 2 |
Class 2 | 2 | -0.9948 | 0.1009 | 0.000 | 1.000 | Class 2 |
Class 2 | 3 | -0.8295 | 0.1174 | 0.000 | 1.000 | Class 2 |
Class 2 | 4 | -0.2670 | 0.1090 | 0.007 | 0.993 | Class 2 |
Class 2 | 5 | -0.1796 | 0.0493 | 0.000 | 1.000 | Class 2 |
Class 2 | 6 | -0.3763 | 0.1620 | 0.010 | 0.990 | Class 2 |
Class 2 | 7 | 0.1155 | 0.1615 | 0.763 | 0.237 | Uncertain |
Class 2 | 8 | -0.7400 | 0.1782 | 0.000 | 1.000 | Class 2 |
Class 2 | 9 | -0.9937 | 0.0957 | 0.000 | 1.000 | Class 2 |
Class 2 | 10 | -2.1843 | 0.1791 | 0.000 | 1.000 | Class 2 |
These results can be visualized, as showed in Figure 1.
Figure 1. Uncertainty of the classification outputs for the Polynomial SVM model. (a) 2D data simulation. Circles represents the replicates, while the triangles represents the mean of replicates. (b) Augmented noise generated through eigen decomposition of the original covariance matrix for each sample. 50 points are showed for each sample. (c) Estimated uncertainty for each sample.