R Programming:
The Language of Data.
R is a programming language made for numbers, charts, and facts. If you want to understand information, R is one of the best helpers you can get.
What Exactly is R?
The Simple Answer
Think of R as a super-powered calculator that can read giant spreadsheets and draw beautiful pictures.
Instead of clicking buttons with a mouse (like you do in Excel), you type simple commands. You tell R what to do, and it does it quickly, even if you have millions of rows of data.
- It is 100% Free. You do not have to pay to use it. Anyone can download it right now.
- It is Made for Data. While other languages build websites or games, R was built purely to understand numbers.
R vs. Normal Spreadsheets
| Feature | Normal Spreadsheets | R Programming |
|---|---|---|
| Data Size | Slows down with lots of rows. | Can handle millions of rows easily. |
| Repeating Work | You have to click the same buttons every week. | Write code once, press play forever. |
| Making Errors | Easy to delete a cell by mistake and not know. | Code keeps a perfect record of everything you did. |
The Heart of R: Data Frames
In R, we store information in something called a Data Frame. It looks just like a table. Every column is a type of fact. Every row is a single item. Let us look at a Data Frame of fruits.
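As a minimal sketch, a small fruit table could be built like this. The fruit names, colors, and prices are made-up values for illustration only; they do not come from a real data set.

```r
# A small Data Frame of fruits (all values are made up for illustration).
# Each column is one type of fact; each row is one fruit.
fruits <- data.frame(
  name  = c("Apple", "Banana", "Cherry"),
  color = c("Red", "Yellow", "Red"),
  price = c(0.50, 0.25, 3.00)
)

print(fruits)        # show the whole table
print(nrow(fruits))  # count the rows (items)
print(fruits$price)  # the $ sign picks out one column
```

The data.frame() function is part of R itself, so nothing extra needs to be installed to try this.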
Add-ons: The Power of Packages
By itself, R is smart. But its real power comes from Packages.
Packages are bundles of extra code that other smart people wrote. You can download them for free. Think of them like apps on a smartphone.
ggplot2
One of the most popular tools in the world for drawing beautiful charts and graphs.
dplyr
A tool to slice, filter, and sort your data incredibly fast.
tidyr
Cleans up messy data so it is neat, tidy, and ready to use.
shiny
Turns your data into a real website that other people can click and play with.
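A rough sketch of how packages are usually added: you install a package once, then load it each time you open R. The install and load lines are commented out here because installing needs an internet connection; the check in the middle runs either way.

```r
# The four packages named above.
packages <- c("ggplot2", "dplyr", "tidyr", "shiny")

# Step 1 (once per computer): download the packages from CRAN.
# install.packages(packages)

# Step 2: check which ones are already on this computer.
# requireNamespace() returns TRUE when a package is installed.
is_installed <- sapply(packages, requireNamespace, quietly = TRUE)
print(is_installed)

# Step 3 (every time you open R): load a package before using it.
# library(dplyr)
```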
Reading R Code
R code is designed to look like simple math and English. In R, we use an arrow symbol <- to put data into a name. Let us look at a simple example.
# 1. We put five ages into a name called ages
ages <- c(25, 30, 22, 40, 28)
# 2. We ask R to find the average (mean) age
average_age <- mean(ages)
# 3. We ask R to print the answer
print(average_age)
Drawing Pictures: Data Visualization
If you have a thousand numbers, your brain cannot understand them. But if you turn those numbers into a picture, your brain understands it instantly.
R is famous around the world because its pictures (charts and graphs) look highly professional. Newspapers and scientific magazines use R to draw their charts.
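A minimal sketch of a chart, assuming the ggplot2 package is already installed. The fruit prices are made-up numbers, used only to have something to draw.

```r
library(ggplot2)  # assumes install.packages("ggplot2") was run earlier

# Made-up example data: three fruits and their prices.
fruits <- data.frame(
  name  = c("Apple", "Banana", "Cherry"),
  price = c(0.50, 0.25, 3.00)
)

# Build a bar chart: fruit names along the bottom, prices as bar heights.
chart <- ggplot(fruits, aes(x = name, y = price)) +
  geom_col() +
  labs(title = "Fruit prices", x = "Fruit", y = "Price")

print(chart)  # draws the chart
```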
"A picture is worth a thousand numbers."
Who Uses R for Work?
Doctors & Biologists
They use R to track diseases and understand how medicines work on the human body.
Finance Experts
Banks use R to guess if the stock market will go up or down based on old numbers.
Shop Owners
Big stores use R to find out what items people buy together (like milk and cookies).
Weather Forecasters
They load temperature data into R to predict if it will rain next week.
How to Start Using R Today
Download Base R
First, you need the brain. Go to the CRAN website and download the R software for your computer (Windows, Mac, or Linux).
Download RStudio
Next, you need a nice face for the brain. RStudio is a free program that makes typing R code much easier and prettier to look at.
Type Your First Code
Open RStudio, type print("Hello World") and press Enter. You are now a programmer!
Simple R Dictionary
Variable
A nickname you give to a piece of data so you can remember it later.
Vector
A simple line of items that are all the same kind. Like a shopping list.
Function
An action word. It tells R to DO something (like add numbers, or draw a chart).
Machine Learning
Teaching the computer to find patterns in old data so it can guess what will happen in the future.
Bringing Data Into R
Before R can read your data, you have to bring it inside the program. Most of the time, data lives inside a file on your computer. R has simple commands to open almost any file type.
CSV Files
A very basic spreadsheet file. This is the most common format.
read.csv("file.csv")
Excel Files
Files made directly in Microsoft Excel. You need a package called readxl for this.
read_excel("data.xlsx")
Web Data
You can paste a web link, and R will download the data right from the internet.
read.csv("http...")
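A minimal sketch of reading a CSV file. So that it runs on any computer, it first writes a tiny made-up file into a temporary folder; with real data you would skip that step and give read.csv() the location of your own file.

```r
# Create a tiny example CSV file (made-up data) in a temporary folder.
csv_file <- file.path(tempdir(), "fruits.csv")
writeLines(c("name,price", "Apple,0.50", "Banana,0.25"), csv_file)

# Read the file into a Data Frame.
fruits <- read.csv(csv_file)

print(fruits)         # the table R just read
print(names(fruits))  # the column names taken from the first line of the file
```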
The 3 Flavors of Data
R needs to know exactly what kind of facts it is looking at. You cannot do math on a word, and you cannot spell with a number. There are three main types you must know:
Numbers
Also known as "Numeric". Used for math, age, height, and money.
price <- 19.99
Words
Also known as "Characters" or "Strings". Always put them inside quote marks.
city <- "Tokyo"
True / False
Also known as "Logical". Used to answer yes or no questions.
is_open <- FALSE
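If you are not sure which flavor a value is, you can ask R. The sketch below uses class(), which is part of R itself; the name text_number is only an example.

```r
price   <- 19.99     # a number
city    <- "Tokyo"   # a word
is_open <- FALSE     # true / false

# class() tells you which flavor a value is.
print(class(price))    # "numeric"
print(class(city))     # "character"
print(class(is_open))  # "logical"

# A number inside quote marks is a word, so convert it before doing math.
text_number <- "5"
print(as.numeric(text_number) + 1)  # 6
```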
Cleaning Messy Data
In the real world, data is never perfect. People spell things wrong, or they forget to fill out forms. When a box is empty, R calls it NA (Not Available). You have to clean this up before doing math.
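A small sketch of what NA does to math, and two simple ways to deal with it. The ages are made-up values with one empty box.

```r
# Made-up ages; NA marks a person who left the box empty.
ages <- c(25, NA, 22, 40, 28)

print(mean(ages))        # NA: one missing value spoils the answer
print(is.na(ages))       # TRUE marks the empty box
print(sum(is.na(ages)))  # how many boxes are empty: 1

# Option 1: tell mean() to skip the missing values.
average_age <- mean(ages, na.rm = TRUE)
print(average_age)       # 28.75

# Option 2: keep only the values that are not missing.
clean_ages <- ages[!is.na(ages)]
print(clean_ages)
```

Skipping missing values is not always the right choice. Sometimes an empty box means something, so it is worth counting the NAs first.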
Grouping Things Together
Imagine a pile of mixed coins. To count them, you would first separate the pennies, nickels, and dimes into their own groups. Then, you count each group.
R does exactly this with a tool called group_by(), which comes from the dplyr package. You can ask R to group all sales by "City", and then find the average money made in each city. It does this instantly.
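A minimal sketch of grouping, assuming the dplyr package is installed. The sales table, its column names (city, amount), and its numbers are made up for illustration.

```r
library(dplyr)  # assumes install.packages("dplyr") was run earlier

# Made-up sales data for illustration.
sales <- data.frame(
  city   = c("Tokyo", "Paris", "Tokyo", "Paris"),
  amount = c(100, 80, 140, 120)
)

# Group the rows by city, then find the average amount in each group.
# The %>% symbol passes the result of one step into the next step.
city_averages <- sales %>%
  group_by(city) %>%
  summarise(average_amount = mean(amount))

print(city_averages)  # Paris averages 100, Tokyo averages 120
```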
Making Reports: R Markdown
Nobody wants to read raw computer code. Your boss just wants the answers and the charts.
R Markdown is a magic tool. It lets you write normal English words, and put your R code right next to it. When you press a button called "Knit", R squishes everything together and creates a beautiful PDF or Word Document.
- Code runs automatically.
- Charts update by themselves.
- It looks incredibly professional.
Sales Report
Here is the chart showing our growth this year. As you can see, profits are up.
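As a rough sketch, an R Markdown file is plain text with three parts: a header, normal sentences, and code chunks. The code below writes such a file into a temporary folder. The file name and the profit numbers are made up. The last line is commented out because it needs the rmarkdown package, plus helper software that RStudio normally provides.

```r
# Three backticks mark the start and end of a code chunk in R Markdown.
chunk_mark <- strrep("`", 3)

report_lines <- c(
  "---",
  "title: \"Sales Report\"",
  "output: word_document",
  "---",
  "",
  "Here is the chart showing our growth this year.",
  "",
  paste0(chunk_mark, "{r}"),
  "profits <- c(10, 12, 15, 19)  # made-up numbers",
  "plot(profits, type = \"b\")",
  chunk_mark
)

# Save the report source as an .Rmd file.
report_file <- file.path(tempdir(), "sales_report.Rmd")
writeLines(report_lines, report_file)

# This does the same job as pressing the Knit button:
# rmarkdown::render(report_file)
```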
The Big Fight: R vs. Python
R
The Specialist
- Built by statisticians for data.
- The best charts out of the box.
- Incredible for academic research.
Python
The All-Rounder
- Built for any type of programming.
- Great for making websites or apps.
- The favorite for deep learning and Artificial Intelligence.
Conclusion: Both are amazing. Pick R for pure data work, pick Python to build software.
Common Beginner Mistakes
Computers are very strict. If you make a tiny typo, R will stop and show you an error message. Here are the 3 most common traps beginners fall into.
R cares about big and small letters. Age is completely different from age. If you mix them up, R will not find what you are asking for.
If you open a door, you have to close it. If you type an open bracket (, you MUST type a closing bracket ) at the end.
When giving R a list of numbers, you must put a comma between every single item. c(1, 2, 3) works. c(1 2 3) gives an error.
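The three traps can be sketched side by side. Each broken line is commented out so the example still runs; remove the # in front of one to see the error for yourself.

```r
# 1. Big and small letters matter.
age <- 30
print(age)      # works
# print(Age)    # error: object 'Age' not found

# 2. Every open bracket needs a closing bracket.
total <- sum(c(1, 2, 3))     # two ( and two )
# total <- sum(c(1, 2, 3)    # one ) is missing, so R cannot run this line

# 3. Items need commas between them.
numbers <- c(1, 2, 3)        # works
# numbers <- c(1 2 3)        # error: unexpected numeric constant
```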
You Are Not Alone: The Community
Learning to code can be hard. The good news is that R has one of the nicest, most helpful groups of users on the internet.
If your code is broken and you do not know why, someone has probably already had the exact same problem, and someone else has posted the answer online.
Saving Your Results
Once you clean your data and make a chart, you need to get it OUT of R to share it with your boss or team.
Save as CSV
Push your clean data back into a spreadsheet file everyone can open.
Save as Image
Download your charts as high-quality pictures to put in presentations.
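A minimal sketch of both kinds of saving, using only functions that come with R: write.csv(), png(), barplot(), and dev.off(). The data and the file names are made up, and the files go into a temporary folder so nothing on your computer is overwritten.

```r
# Made-up clean data for illustration.
fruits <- data.frame(
  name  = c("Apple", "Banana", "Cherry"),
  price = c(0.50, 0.25, 3.00)
)

# Save as CSV. row.names = FALSE leaves out the extra column of row numbers.
csv_file <- file.path(tempdir(), "clean_fruits.csv")
write.csv(fruits, csv_file, row.names = FALSE)

# Save as image: open a PNG file, draw the chart into it, then close the file.
image_file <- file.path(tempdir(), "fruit_prices.png")
png(image_file, width = 800, height = 600)
barplot(fruits$price, names.arg = fruits$name, main = "Fruit prices")
dev.off()

print(file.exists(csv_file))    # TRUE when the CSV was written
print(file.exists(image_file))  # TRUE when the picture was written
```

To keep the files, replace the temporary folder with a folder of your own.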
The Secret: Cheat Sheets
You Do Not Have to Memorize Anything.
Even experts look up commands every single day. Nobody remembers all of them.
The creators of RStudio make free Cheat Sheets. These are short PDF documents (usually one or two pages) that show you exactly what to type to get things done.
If you want to make a chart, you just print out the "Data Visualization Cheat Sheet" and keep it next to your keyboard. It is that simple!