CodeToLive

Statistical Analysis in R

R is widely used for statistical analysis. This page covers descriptive statistics, hypothesis testing, linear regression, ANOVA, correlation analysis, non-parametric tests, and data visualization.

Descriptive Statistics

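Use mean(), median(), and sd() to compute the mean, median, and standard deviation of a numeric vector. Note that sd() returns the sample standard deviation, which divides by n - 1 rather than n.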


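# Sample data as a numeric vector
data <- c(10, 20, 30, 40, 50)
mean_value <- mean(data)       # arithmetic mean
median_value <- median(data)   # middle value
std_dev <- sd(data)            # sample standard deviation (n - 1 denominator)

# paste() joins the label and the number into one string
print(paste("Mean:", mean_value))
print(paste("Median:", median_value))
print(paste("Standard Deviation:", std_dev))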
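
Real data often contains missing values, and the functions above return NA when they meet one. The sketch below shows one way to handle that with the na.rm argument, along with quantile() and summary() for a quick overview. The scores vector is a made-up example, not data from this page.

```r
# Hypothetical sample with one missing value (illustration only)
scores <- c(10, 20, 30, NA, 50)

# Without na.rm, the missing value propagates to the result
mean_with_na <- mean(scores)

# na.rm = TRUE drops missing values before computing
mean_clean <- mean(scores, na.rm = TRUE)
sd_clean <- sd(scores, na.rm = TRUE)

# Quartiles of the non-missing values
quartiles <- quantile(scores, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)

print(mean_with_na)
print(mean_clean)
print(sd_clean)
print(quartiles)

# summary() reports min, quartiles, mean, max, and the NA count in one call
print(summary(scores))
```

Whether dropping missing values is appropriate depends on why they are missing, so treat na.rm = TRUE as a convenience rather than a default.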
      

Hypothesis Testing

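Use t.test() to test whether two samples have the same mean. Called with two numeric vectors, it runs Welch's two-sample t-test, which does not assume equal variances in the two groups.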


# Two independent samples
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(20, 22, 24, 26, 28)

# Welch two-sample t-test (the default when given two vectors)
t_test_result <- t.test(group1, group2)
print(t_test_result)
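
The printed report is convenient to read, but the result is also a list whose parts can be used in later code. The following sketch pulls out the individual components and compares the p-value against a significance level. The 0.05 threshold and the variable names are assumptions chosen for illustration.

```r
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(20, 22, 24, 26, 28)

res <- t.test(group1, group2)

t_stat <- unname(res$statistic)   # t statistic
deg_free <- unname(res$parameter) # Welch-adjusted degrees of freedom
p_val <- res$p.value              # two-sided p-value
ci <- res$conf.int                # 95% confidence interval for the mean difference
means <- unname(res$estimate)     # the two sample means

print(means)
print(ci)

# Assumed significance level (a common convention, not a rule)
alpha <- 0.05
if (p_val < alpha) {
    print("Reject the null hypothesis of equal means")
} else {
    print("Fail to reject the null hypothesis of equal means")
}
```

For measurements taken on the same subjects, t.test() accepts paired = TRUE, and var.equal = TRUE switches from Welch's test to the classic pooled-variance test.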
      

Linear Regression

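Use the lm() function to fit a linear regression model. The formula y ~ x means "model y as a linear function of x". The example fits a straight line to data generated as y = x^2, so the model captures the upward trend but not the curvature.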


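# Example data: y is the square of x
data <- data.frame(x = 1:10, y = (1:10)^2)
# Fit y as a linear function of x
model <- lm(y ~ x, data = data)
# Coefficients, standard errors, p-values, and R-squared
summary(model)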
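
Beyond the summary, a fitted model can be queried for its coefficients, its R-squared, and predictions for new inputs. The sketch below does that for the same data, and also fits a model with a squared term for comparison. The names fit_line, fit_quad, and new_points are hypothetical.

```r
df <- data.frame(x = 1:10, y = (1:10)^2)

# Straight-line fit, as in the example above
fit_line <- lm(y ~ x, data = df)
line_coefs <- coef(fit_line)               # intercept and slope
r_squared <- summary(fit_line)$r.squared   # share of variance explained

# Predictions for x values outside the observed range
new_points <- data.frame(x = c(11, 12))
predicted <- predict(fit_line, newdata = new_points)

# Adding a squared term; I() protects ^ from formula interpretation
fit_quad <- lm(y ~ x + I(x^2), data = df)
quad_coefs <- coef(fit_quad)

print(line_coefs)
print(r_squared)
print(predicted)
print(quad_coefs)
```

The straight line underestimates y at x = 11 and x = 12 because the true relationship is curved, while the quadratic model recovers a coefficient of about 1 on the squared term.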
      

ANOVA

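Use the aov() function to perform a one-way Analysis of Variance (ANOVA), which tests whether the means of several groups are equal. The grouping variable should be a factor.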


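# Three groups (A, B, C) with five observations each
data <- data.frame(
    group = factor(c(rep("A", 5), rep("B", 5), rep("C", 5))),
    value = c(23, 25, 28, 30, 32, 20, 22, 24, 26, 28, 18, 19, 21, 23, 25)
)

# One-way ANOVA: does the mean of value differ across groups?
anova_result <- aov(value ~ group, data = data)
# ANOVA table with F statistic and p-value
summary(anova_result)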
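
ANOVA only says whether at least one group mean differs, not which ones. A common follow-up is Tukey's Honest Significant Difference test, available in base R as TukeyHSD(). The sketch below applies it to the same data. The variable names are chosen for illustration.

```r
df <- data.frame(
    group = factor(c(rep("A", 5), rep("B", 5), rep("C", 5))),
    value = c(23, 25, 28, 30, 32, 20, 22, 24, 26, 28, 18, 19, 21, 23, 25)
)

fit <- aov(value ~ group, data = df)

# Overall p-value from the ANOVA table (first row is the group effect)
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]
print(anova_p)

# Pairwise comparisons with adjusted p-values
tukey <- TukeyHSD(fit)
print(tukey)

# The comparisons are stored as a matrix named after the factor
diff_b_a <- tukey$group["B-A", "diff"]   # mean of B minus mean of A
print(diff_b_a)
```

Each row of the Tukey table compares one pair of groups and gives the difference in means, a confidence interval, and a p-value adjusted for multiple comparisons.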
      

Correlation Analysis

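Use the cor() function to calculate the correlation between two variables. By default it returns Pearson's correlation coefficient, which measures the strength of the linear relationship and ranges from -1 to 1.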


data <- data.frame(x = 1:10, y = (1:10)^2)
# Pearson correlation between the two columns
correlation <- cor(data$x, data$y)
print(paste("Correlation:", correlation))
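
Because y = x^2 is curved, Pearson's coefficient here is high but below 1. The sketch below compares it with Spearman's rank correlation, which only asks whether the relationship is monotonic, and uses cor.test() to get a p-value and confidence interval. The variable names are illustrative.

```r
df <- data.frame(x = 1:10, y = (1:10)^2)

# Pearson: strength of the linear relationship
pearson <- cor(df$x, df$y)

# Spearman: correlation of the ranks, so any increasing curve scores 1
spearman <- cor(df$x, df$y, method = "spearman")

print(pearson)
print(spearman)

# cor.test() adds a significance test and a confidence interval
ct <- cor.test(df$x, df$y)
print(ct$p.value)
print(ct$conf.int)
```

A Spearman value of 1 alongside a Pearson value below 1 is a hint that the relationship is monotonic but not linear.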
      

Non-Parametric Tests

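Use non-parametric tests such as the Wilcoxon rank-sum test when the data do not meet the assumptions of parametric tests, for example normality. In R, wilcox.test() with two vectors runs the rank-sum test (also known as the Mann-Whitney U test).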


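group1 <- c(23, 25, 28, 30, 32)
group2 <- c(20, 22, 24, 26, 28)

# The value 28 appears in both groups, so an exact p-value cannot be computed.
# exact = FALSE requests the normal approximation and avoids the ties warning.
wilcox_test_result <- wilcox.test(group1, group2, exact = FALSE)
print(wilcox_test_result)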
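
When there are more than two groups, the Kruskal-Wallis test plays the role for ANOVA that the rank-sum test plays for the t-test. The sketch below is one possible illustration using kruskal.test() from base R on the three-group data from the ANOVA section. The variable names are assumptions.

```r
df <- data.frame(
    group = factor(c(rep("A", 5), rep("B", 5), rep("C", 5))),
    value = c(23, 25, 28, 30, 32, 20, 22, 24, 26, 28, 18, 19, 21, 23, 25)
)

# Rank-based test of whether the groups come from the same distribution
kw <- kruskal.test(value ~ group, data = df)
print(kw)

kw_stat <- unname(kw$statistic)   # Kruskal-Wallis chi-squared statistic
kw_df <- unname(kw$parameter)     # degrees of freedom: number of groups minus 1
kw_p <- kw$p.value

print(kw_p)
```

Rank-based tests trade some statistical power for robustness, so they are most useful when outliers or skewed data make the parametric assumptions doubtful.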
      

Data Visualization

Use the ggplot2 package to build plots layer by layer. It is not part of base R, so install it once with install.packages("ggplot2") before loading it.


library(ggplot2)
data <- data.frame(x = 1:10, y = (1:10)^2)

# Scatter plot of the points with a fitted straight line and its confidence band
ggplot(data, aes(x = x, y = y)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, color = "blue") +
    ggtitle("Scatter Plot with Linear Regression Line")
      