Statistics in R

R was designed for statistical computing and analysis. Base R ships with built-in functions for descriptive statistics, probability distributions, and hypothesis testing, so the examples below need no extra packages.

1. Descriptive Statistics

Use these functions to get summary statistics of your data:

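data <- c(10, 20, 15, 25, 30, 20, 40)

mean(data)        # Arithmetic mean
median(data)      # Middle value of the sorted data

# Base R has no function for the statistical mode; base::mode() returns
# an object's storage type, so use a different name to avoid masking it.
stat_mode <- function(x) {
  ux <- unique(x)                          # Distinct values
  ux[which.max(tabulate(match(x, ux)))]    # Most frequent value (first one on ties)
}
stat_mode(data)   # 20

range(data)       # Minimum and maximum
var(data)         # Sample variance (divides by n - 1)
sd(data)          # Sample standard deviation
summary(data)     # Minimum, quartiles, median, mean, and maximum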
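
To see what var() and sd() compute, the following sketch recomputes them by hand for the same data vector. The names n, deviations, manual_var, and manual_sd are illustrative and are not part of the example above.

```r
data <- c(10, 20, 15, 25, 30, 20, 40)

n <- length(data)
deviations <- data - mean(data)            # Distance of each value from the mean

manual_var <- sum(deviations^2) / (n - 1)  # Sample variance divides by n - 1, not n
manual_sd  <- sqrt(manual_var)             # Standard deviation is its square root

all.equal(manual_var, var(data))   # TRUE
all.equal(manual_sd, sd(data))     # TRUE
```

Dividing by n - 1 rather than n is what makes these sample statistics rather than population statistics.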

2. Probability Distributions

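Each common probability distribution has a family of functions in R. The d prefix gives the density (for discrete distributions, the probability of an exact value), and the p prefix gives the cumulative probability of a value less than or equal to the argument:

dnorm(), pnorm() – Normal distribution
dbinom(), pbinom() – Binomial distribution
dpois(), ppois() – Poisson distribution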

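# Normal density at 0 (a density value, not a probability)
dnorm(0, mean = 0, sd = 1)

# Binomial: probability of exactly 3 successes in 5 trials
dbinom(3, size = 5, prob = 0.5)

# Poisson: probability of exactly 2 events when the mean rate is 3
dpois(2, lambda = 3)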
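
The cumulative functions listed above are not demonstrated in the example, so here is a minimal sketch of how pbinom() and pnorm() relate to their d counterparts. The variable names are illustrative.

```r
# Cumulative binomial: P(X <= 3) for 5 trials with success probability 0.5
p_at_most_3 <- pbinom(3, size = 5, prob = 0.5)

# The same value as the sum of the point probabilities for 0, 1, 2, and 3
p_summed <- sum(dbinom(0:3, size = 5, prob = 0.5))
all.equal(p_at_most_3, p_summed)   # TRUE

# Normal: probability that a standard normal value falls within 1 sd of the mean
p_within_1sd <- pnorm(1) - pnorm(-1)   # about 0.68
```

Subtracting two cumulative probabilities, as in the last line, is the usual way to get the probability of an interval.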

3. Correlation and Covariance

Correlation and covariance measure how two variables vary together:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

cor(x, y)   # Pearson correlation (1 here, because y = 2 * x exactly)
cov(x, y)   # Sample covariance (5 here)
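
As a rough illustration of how the two measures are connected, the sketch below rebuilds the correlation from the covariance and shows that only the covariance depends on the units of the data. The name manual_cor is illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Correlation is covariance divided by the product of the standard deviations
manual_cor <- cov(x, y) / (sd(x) * sd(y))
all.equal(manual_cor, cor(x, y))   # TRUE

# Rescaling y changes the covariance but not the correlation
cov(x, 10 * y)   # 10 times cov(x, y)
cor(x, 10 * y)   # Same as cor(x, y)
```

This is why correlation, which always lies between -1 and 1, is easier to compare across datasets than covariance.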

4. Hypothesis Testing

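These basic tests compare a sample against a reference value, against another sample, or against expected proportions:

# One-sample t-test: does the mean of data differ from 20?
t.test(data, mu = 20)

# Two-sample t-test (Welch's test by default, which does not assume equal variances)
group1 <- c(10, 12, 14)
group2 <- c(15, 18, 20)
t.test(group1, group2)

# Chi-square goodness-of-fit test: do the observed counts match the expected proportions?
observed <- c(25, 30, 45)
expected <- c(30, 30, 40)
chisq.test(x = observed, p = expected / sum(expected))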
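
The test functions print a report, but they also return an object whose parts can be read directly. A minimal sketch for the one-sample t-test follows; the names result and alpha, and the 0.05 threshold, are assumptions chosen for illustration.

```r
data <- c(10, 20, 15, 25, 30, 20, 40)

result <- t.test(data, mu = 20)   # Returns a list of class "htest"

result$statistic   # t statistic
result$parameter   # Degrees of freedom
result$p.value     # p-value for the two-sided test
result$conf.int    # 95% confidence interval for the mean

# A common (but arbitrary) convention compares the p-value with 0.05
alpha <- 0.05
if (result$p.value < alpha) {
  message("Reject the null hypothesis that the mean is 20")
} else {
  message("Fail to reject the null hypothesis that the mean is 20")
}
```

The objects returned by the two-sample t.test() and by chisq.test() have the same class, so their p-values can be read with $p.value in the same way.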

5. Linear Regression

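Fit a linear model with lm() to describe how y depends on x and to predict new values:

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

model <- lm(y ~ x)   # Formula syntax: response ~ predictor
summary(model)       # Coefficients, R-squared, p-values (warns here: the fit is exact)

# Plot the data and overlay the fitted regression line
plot(x, y)
abline(model, col = "red")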
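
The section mentions prediction without showing it, so here is a short sketch that reads the fitted coefficients and predicts y for unseen x values. The names new_points and predicted, and the x values 6 and 7, are illustrative.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)
model <- lm(y ~ x)

coef(model)        # Intercept and slope of the fitted line
residuals(model)   # Observed minus fitted values (essentially zero here)

# Predict y for new x values; the column name must match the predictor, x
new_points <- data.frame(x = c(6, 7))
predicted <- predict(model, newdata = new_points)
```

Because the example data lie exactly on the line y = 2 * x, the predictions follow that line; real data would leave nonzero residuals.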

Tips for Students

Use summary() on datasets to explore distributions.
Know the difference between descriptive and inferential statistics.
Use help(function_name) or ?function_name to read a function's documentation.
Practice interpreting statistical outputs like p-values and coefficients.

Practice Questions

Calculate mean, median, and standard deviation for a sample dataset.
Use cor() and cov() to analyze relationships.
Perform a t-test to compare two groups.
Fit a linear regression model and interpret the summary.
