Statistical Analysis in R
R is widely used for statistical analysis. This section covers descriptive statistics, hypothesis testing, linear regression, ANOVA, correlation, non-parametric tests, and data visualization.
Descriptive Statistics
Use functions like mean(), median(), and sd() to calculate descriptive statistics.
data <- c(10, 20, 30, 40, 50)
mean_value <- mean(data)
median_value <- median(data)
std_dev <- sd(data)
print(paste("Mean:", mean_value))
print(paste("Median:", median_value))
print(paste("Standard Deviation:", std_dev))
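Base R also provides summary(), var(), IQR(), and quantile() for a fuller picture of a sample. The following is a minimal sketch using the same five values; the variable names (scores, five_num, and so on) are chosen here for illustration only.

```r
# Same sample as above, stored under an illustrative name
scores <- c(10, 20, 30, 40, 50)

# Min, quartiles, median, mean, and max in one call
five_num <- summary(scores)
print(five_num)

# Sample variance (the square of sd()) and interquartile range
variance <- var(scores)
spread <- IQR(scores)

# Selected quantiles; R's default interpolation method (type 7) is assumed
quartiles <- quantile(scores, probs = c(0.25, 0.5, 0.75))

print(paste("Variance:", variance))
print(paste("IQR:", spread))
print(quartiles)
```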
Hypothesis Testing
Perform hypothesis testing using functions like t.test().
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(20, 22, 24, 26, 28)
t_test_result <- t.test(group1, group2)
print(t_test_result)
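When called with two samples, t.test() runs Welch's t-test by default, which does not assume equal variances. The sketch below shows how the individual results can be pulled out of the returned object, and how var.equal = TRUE requests the classic pooled-variance test instead. The object names welch and pooled are illustrative.

```r
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(20, 22, 24, 26, 28)

# Default: Welch's two-sample t-test (var.equal = FALSE)
welch <- t.test(group1, group2)

# Student's t-test, which assumes both groups share one variance
pooled <- t.test(group1, group2, var.equal = TRUE)

# The result is a list; individual pieces can be accessed by name
print(welch$statistic)   # t statistic
print(welch$parameter)   # degrees of freedom (fractional for Welch)
print(welch$p.value)     # two-sided p-value
print(welch$conf.int)    # confidence interval for the difference in means

# The pooled test uses n1 + n2 - 2 degrees of freedom
print(pooled$parameter)
```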
Linear Regression
Use the lm() function to perform linear regression.
data <- data.frame(x = 1:10, y = (1:10)^2)
model <- lm(y ~ x, data = data)
summary(model)
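Because y is the square of x in this example, a straight line fits the data only approximately. The sketch below extracts the fitted coefficients and R-squared, predicts a new value, and then, as one possible refinement, adds a quadratic term with I(x^2). The names df, linear_model, and quad_model are illustrative.

```r
df <- data.frame(x = 1:10, y = (1:10)^2)

# Straight-line fit, as in the example above
linear_model <- lm(y ~ x, data = df)
print(coef(linear_model))                    # intercept and slope
print(summary(linear_model)$r.squared)       # proportion of variance explained

# Prediction for an unseen x; the true value of 11^2 is 121
pred <- predict(linear_model, newdata = data.frame(x = 11))
print(pred)

# Adding a squared term lets the model capture the curvature
quad_model <- lm(y ~ x + I(x^2), data = df)
print(coef(quad_model))
```

The linear fit underestimates at x = 11, while the coefficient on the squared term in the second model comes out at essentially 1, matching how the data were generated.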
ANOVA
Use the aov() function to perform Analysis of Variance (ANOVA).
data <- data.frame(
  group = factor(c(rep("A", 5), rep("B", 5), rep("C", 5))),
  value = c(23, 25, 28, 30, 32, 20, 22, 24, 26, 28, 18, 19, 21, 23, 25)
)
anova_result <- aov(value ~ group, data = data)
summary(anova_result)
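ANOVA indicates whether any group means differ, but not which ones. A common follow-up is Tukey's honest significant difference test, available in base R as TukeyHSD(). The sketch below reuses the same data under illustrative names (groups_df, fit, tukey).

```r
groups_df <- data.frame(
  group = factor(c(rep("A", 5), rep("B", 5), rep("C", 5))),
  value = c(23, 25, 28, 30, 32, 20, 22, 24, 26, 28, 18, 19, 21, 23, 25)
)

fit <- aov(value ~ group, data = groups_df)

# summary() returns a list whose first element is the ANOVA table
anova_table <- summary(fit)[[1]]
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]
print(paste("F:", round(f_value, 3), "p:", round(p_value, 4)))

# Pairwise comparisons with family-wise error control
tukey <- TukeyHSD(fit)
print(tukey)

# Each row is one pair of groups: difference, interval bounds, adjusted p
print(tukey$group[, "diff"])
```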
Correlation Analysis
Use the cor() function to calculate the correlation between variables.
data <- data.frame(x = 1:10, y = (1:10)^2)
correlation <- cor(data$x, data$y)
print(paste("Correlation:", correlation))
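cor() computes the Pearson coefficient by default, which measures linear association. For a monotonic but curved relationship like this one, the rank-based Spearman coefficient can be more informative, and cor.test() adds a significance test. A minimal sketch, with illustrative variable names:

```r
x <- 1:10
y <- x^2

# Pearson (default) measures how close the points are to a straight line
pearson <- cor(x, y)

# Spearman works on ranks, so any strictly increasing relationship scores 1
spearman <- cor(x, y, method = "spearman")

# cor.test() reports the estimate together with a p-value and interval
pearson_test <- cor.test(x, y)

print(paste("Pearson:", round(pearson, 4)))
print(paste("Spearman:", spearman))
print(pearson_test$p.value)
print(pearson_test$conf.int)
```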
Non-Parametric Tests
Use non-parametric tests, such as the Wilcoxon rank-sum test, when the data do not meet the assumptions of parametric tests (for example, normality).
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(20, 22, 24, 26, 28)
wilcox_test_result <- wilcox.test(group1, group2)
print(wilcox_test_result)
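The two groups in this example share the value 28. With tied values, wilcox.test() cannot compute an exact p-value, so it warns and falls back to a normal approximation. One way to make that choice explicit, sketched below, is to pass exact = FALSE; the object name approx_result is illustrative.

```r
group1 <- c(23, 25, 28, 30, 32)
group2 <- c(20, 22, 24, 26, 28)

# Request the normal approximation directly, since ties rule out the exact test
approx_result <- wilcox.test(group1, group2, exact = FALSE)

print(approx_result$statistic)  # W statistic
print(approx_result$p.value)    # two-sided p-value (continuity-corrected)
```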
Data Visualization
Use the ggplot2 package for advanced data visualization.
library(ggplot2)
data <- data.frame(x = 1:10, y = (1:10)^2)
ggplot(data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", col = "blue") +
  ggtitle("Scatter Plot with Linear Regression Line")