The “>” in the R console window is called the command prompt. This is where you can type commands. R can work like a calculator and has the following arithmetic operators built in:
| + | addition |
| - | subtraction |
| * | multiplication |
| / | division |
| ^ | exponent |
The following code and output demonstrate how R can be used like a calculator:
Multiply 2 times 3 and print the answer
2*3
##  6
Divide 27 by 9
27/9
##  3
Another way to divide 27 by 9
##  3
Notice that anything after the # symbol is considered a comment and is ignored by R
27/(3+3+3) #yet another way to divide 27 by 9
##  3
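As a further illustration of the operators listed above, the following sketch shows the exponent operator and the order in which R evaluates a mixed expression. The numbers are chosen only for illustration and do not come from the original examples.

```r
2^3              # exponent: 2 * 2 * 2 gives 8
1 + 2 * 3^2      # 3^2 first, then 2 * 9, then 1 + 18 gives 19
(1 + 2) * 3^2    # parentheses are evaluated first: 3 * 9 gives 27
```

As in ordinary arithmetic, exponents are evaluated before multiplication and division, which are evaluated before addition and subtraction.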
Data Structures and Types
R supports several data types and several data structures. Its basic data structures can be categorized by their dimensions (1, 2, or more than 2 dimensions) and by whether they are homogeneous/atomic (all elements are of the same data type) or heterogeneous (elements may be of more than one data type).
The most common are categorized in the table below.
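A data frame is a collection of atomic vectors of equal length, where each vector (column) may have a different type. In this way, a data frame is a special kind of list whose elements are vectors, rather than a ‘list of lists’.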
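The sketch below builds one example of each kind of structure described here. All object names (`numVec`, `numMat`, `mixedList`, `scores`) are invented for illustration, and the `<-` symbol used to store each result under a name is explained in the text that follows.

```r
# Homogeneous (atomic) structures: every element has the same type
numVec <- c(1.5, 2, 3)                      # 1 dimension: atomic vector
numMat <- matrix(1:6, nrow = 2, ncol = 3)   # 2 dimensions: matrix

# Heterogeneous structures: elements may differ in type
mixedList <- list(1, "a", TRUE)             # 1 dimension: list
scores <- data.frame(id = 1:3,
                     name = c("a", "b", "c"),
                     passed = c(TRUE, FALSE, TRUE))  # 2 dimensions: data frame

# Mixing types in an atomic vector coerces everything to a single type
class(c(1, "a"))   # "character"

# A data frame is stored as a list whose elements are its columns
is.list(scores)    # TRUE
dim(scores)        # 3 rows, 3 columns
```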
To use R beyond its calculator capability, give everything a name. That is, define an object. To define an object, pick a name and use the assignment operator “<-” followed by its definition. The following line of code multiplies 2 times 4, but stores the result as A rather than printing it.
A <- 2*4
To see what A is, simply ask R to tell you by typing A into the console.
A
##  8
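As a small illustration of why naming results is useful, the sketch below stores two values and reuses them in a later calculation. The object names `width`, `height`, and `area` are made up for this example.

```r
width  <- 2 * 4            # stored, not printed
height <- 3
area   <- width * height   # objects can be used wherever a number could
area                       # typing the name prints the value: 24

# Reassigning width does not update area; area keeps the value it was given
width <- 10
area                       # still 24
```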
To create a vector of numbers, use the combine function. This function is simply the letter ‘c’.
myFirstVector <- c(1,2,3,4)
myFirstVector
##  1 2 3 4
Another way to create a vector with the numbers 1 through 4 is to use the ‘:’ operator, which stands in for the word ‘through’.
vec2 <- 1:4
The sequence function, seq(), is another handy way to create a vector. The following line of code creates a sequence from 1 to 10 by 0.1.
seqVec <- seq(1,10,0.1) #seq(from, to, by)
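The three approaches can be compared side by side. This is a minimal sketch; the object names `viaC`, `viaColon`, and `viaSeq` are hypothetical.

```r
viaC     <- c(1, 2, 3, 4)   # combine individual values
viaColon <- 1:4             # every integer from 1 through 4
viaSeq   <- seq(1, 4, 1)    # from 1 to 4 in steps of 1

# All three hold the same values
all(viaC == viaColon)       # TRUE
all(viaC == viaSeq)         # TRUE

# The ':' operator produces integers, while c() with plain numbers produces doubles
class(viaColon)             # "integer"
class(viaC)                 # "numeric"

# A step of 0.1 from 1 to 10 gives 91 values
length(seq(1, 10, 0.1))     # 91
```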
R has many built-in functions beyond c() and seq() that make manipulating and performing calculations on data simple.
A function has arguments. To learn about a function and its arguments, type ?functionName or help(functionName). Depending on how R is set up, the help file opens in a web browser tab or in a help pane. Reading a help file can seem intimidating at first, but being able to understand one is an important skill.
The following is a portion of the help file generated by typing ?seq
seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
length.out = NULL, along.with = NULL, …)
The first argument is ‘from’, the second is ‘to’, and the third is ‘by’. The default values are also displayed. If ‘from’ is not specified, the sequence will start at 1, since the help file says ‘from = 1’.
Notice that if you type the arguments in the order they are found in the help file, you don’t have to specify the argument names.
seq(2,10,2)
##  2 4 6 8 10
seq(from=2, to=10, by=2)
##  2 4 6 8 10
seq(to=10, from=2, by=2)
##  2 4 6 8 10
All three of the above function calls are equivalent. Notice that in the third we had to give the argument names, since we supplied the arguments in a different order than the help file specifies.
Typing seq(10,2,2) will produce an error, but changing the sign in the ‘by’ argument works.
seq(10,2,-2) #count down from 10 to 2 by 2
##  10 8 6 4 2
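The following sketch shows two further points about seq(): the length.out argument from the help file can replace ‘by’, and a wrong-signed ‘by’ stops with an error. The object names are hypothetical, and tryCatch() is used here only so that the example keeps running after the error; it is not part of the original material.

```r
byTwo   <- seq(2, 10, by = 2)           # fix the step size
fivePts <- seq(2, 10, length.out = 5)   # fix the number of values instead
all(byTwo == fivePts)                   # TRUE

# Counting down with a positive step is an error; catch it and return a label
attempt <- tryCatch(seq(10, 2, 2), error = function(e) "error")
attempt                                 # "error"

# With a negative step the same call works
seq(10, 2, -2)
```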
The mean() function is one of the most frequently used functions. Consider the following vector with a missing value, which is always coded as ‘NA’ in R:
vec2 <- c(1,2,3,NA) #redefine vec2 so that it contains a missing value
vec2
##  1 2 3 NA
If we try to calculate the average, we get ‘NA’ as the result.
mean(vec2) #This won't work because of the missing data
##  NA
A quick look at the help file shows:
mean(x, trim = 0, na.rm = FALSE, …)
na.rm: a logical value indicating whether NA values should be stripped before the computation proceeds.
Also note that R will accept an abbreviated argument name as long as the abbreviation is not ambiguous. The following 3 lines of code are all equivalent:
##  2
##  2
##  2
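A minimal sketch of the na.rm argument and of abbreviated argument names is given below. The vector name `withMissing` is hypothetical, and these three calls are one plausible set of equivalent forms, not necessarily the ones used in the original example.

```r
withMissing <- c(1, 2, 3, NA)

mean(withMissing)                 # NA, because na.rm defaults to FALSE
mean(withMissing, na.rm = TRUE)   # 2, the NA is dropped first
mean(withMissing, na.rm = T)      # 2, T is shorthand for TRUE
mean(withMissing, na = TRUE)      # 2, 'na' is matched to 'na.rm'
```

Spelling out na.rm = TRUE is the safest habit: T is an ordinary object that can be overwritten, and an abbreviation stops working if it ever matches more than one argument.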
There are too many functions to list, but here are some that are commonly used.
Table of Common Functions
| sqrt() | square root |
| log() | natural logarithm |
| exp(a) | e^a |
| mean() | average |
| sd() | sample standard deviation |
| var() | sample variance |
| length() | length of vector (including NAs) |
| dim() | dimension of data frame - rows, columns |
| factor(x) | makes R recognize x as categorical |
| cbind() | bind (combine) objects by columns |
| rbind() | bind objects by rows |
| sort() | sort from smallest to largest |
| order() | returns the ordering used to sort a data frame by specified column(s) |
| round(x,a) | round x to a decimal places |
| rep(x,n) | repeat x n times |
| which() | returns the position numbers where the argument is TRUE |
| ifelse(cond,a,b) | if condition is met, return a, otherwise return b |
| sample() | take a random sample from a dataset |
| paste() | concatenate text |
| help(function) or ?function | pulls up help file for function |
| objects() or ls() | lists the names of all stored objects |
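A few of the functions in the table are sketched below on made-up data. The objects `squares` and `grades` and their contents are hypothetical.

```r
squares <- c(4, 9, 16, 25)
sqrt(squares)             # 2 3 4 5
mean(squares)             # 13.5
length(squares)           # 4
rep(0, 3)                 # 0 0 0
round(exp(1), 2)          # 2.72
sort(c(3, 1, 2))          # 1 2 3
paste("a", "b")           # "a b"

# order() returns row positions, which are then used to rearrange a data frame
grades <- data.frame(id = c(3, 1, 2), score = c(70, 90, 80))
sortedGrades <- grades[order(grades$id), ]
sortedGrades$score        # 90 80 70

# cbind() and rbind() differ in the shape of the result
dim(cbind(1:3, 4:6))      # 3 2
dim(rbind(1:3, 4:6))      # 2 3
```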
An important type of vector in R is a logical vector. There are many logical and other miscellaneous operators built into R.
| == | equal to |
| != | not equal to |
| < | less than |
| <= | less than or equal to |
| > | greater than |
| >= | greater than or equal to |
| | | or |
| & | and |
| a%in%b | is 'a' contained in 'b' |
| %*% | matrix multiplication |
Consider a vector of 4 values.
We can ask R, “Check each element of vec3 to see if it equals 10.”
vec3==10
##  TRUE FALSE FALSE FALSE
The logical vector containing TRUE FALSE FALSE FALSE is returned since the first element of vec3 is 10 and the others are not. Operators can be combined to check several conditions. The following example says “Check each element of vec3 to see if it is greater than 15 or equal to 10.”
vec3>15 | vec3==10
##  TRUE TRUE FALSE TRUE
The %in% operator allows us to check whether the elements of one vector are contained in another.
vec4<-c(5,10,17,20,30,40,5,17)
vec3%in%vec4 #is each element of vec3 in vec4
##  TRUE TRUE TRUE TRUE
vec4%in%vec3 #is each element of vec4 in vec3
##  TRUE TRUE TRUE TRUE FALSE FALSE TRUE TRUE
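The comparisons above can be tried on any numeric vector. The sketch below uses a hypothetical vector `vals`, whose values were chosen to be consistent with the outputs shown above but are not taken from the original.

```r
vals <- c(10, 17, 5, 20)       # hypothetical values

vals == 10                     # TRUE FALSE FALSE FALSE
vals > 15 | vals == 10         # TRUE TRUE FALSE TRUE
vals >= 5 & vals != 17         # TRUE FALSE TRUE TRUE
vals %in% c(5, 10)             # TRUE FALSE TRUE FALSE

# TRUE counts as 1 and FALSE as 0, so sum() counts how many elements match
sum(vals > 15)                 # 2
```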
Logical operators that return a logical vector are a powerful tool, and they can be used inside many functions in R. One useful example is the ifelse() function. The following example says “if the element of vec3 is 10, return 10, otherwise return 0.”
ifelse(vec3==10,10,0)
##  10 0 0 0
The which() function returns the positions at which a logical vector is TRUE. This function will be useful when we want to pick out only certain elements of an object. The following example asks R to return the position of any element of vec3 that is 10. We know only the first entry of vec3 is 10, so it should return the position ‘1’.
which(vec3==10)
##  1
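A short sketch of ifelse() and which() on a hypothetical vector `readings` follows. It also shows square brackets being used to pick out elements by position; the labels "high" and "low" are invented for the example.

```r
readings <- c(10, 17, 5, 20)              # hypothetical values

ifelse(readings == 10, 10, 0)             # 10 0 0 0
ifelse(readings > 15, "high", "low")      # "low" "high" "low" "high"

which(readings == 10)                     # 1
which(readings > 15)                      # 2 4

# Positions returned by which() can be used to extract the matching elements
readings[which(readings > 15)]            # 17 20
```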
The following example demonstrates another way to use logicals inside other functions.
vec5<-c(1,2,3,NA) #create vector with one missing value
length(vec5) #length of vec5
##  4
is.na(vec5) #logical indicating if each element is an NA
##  FALSE FALSE FALSE TRUE
length(vec5[!is.na(vec5)]) #length of vector, not counting NAs (length of vector such that it isn't an NA)
##  3
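The same pattern of placing a logical vector inside square brackets can be reused for other calculations. The sketch below uses a hypothetical vector `obs` and is only one way of handling missing values.

```r
obs <- c(1, 2, 3, NA)      # hypothetical data with one missing value

sum(is.na(obs))            # 1, the number of missing values
sum(!is.na(obs))           # 3, the number of non-missing values

kept <- obs[!is.na(obs)]   # keep only the elements that are not NA
kept                       # 1 2 3
mean(kept)                 # 2, same as mean(obs, na.rm = TRUE)
```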
- Start object names with a letter — R will not allow an object name to start with a number.
- R is case sensitive — objects need to be referred to with the exact spelling.
- If you get a + instead of the command prompt after hitting Enter, you haven’t closed a quote, bracket, or parenthesis. Simply close it on the next line and hit Enter to get the command prompt again.
- Use the UP and DOWN arrows on your keyboard to scroll through previously executed lines of code. This is an efficient way to make changes.
- It is not recommended to copy and paste code from Microsoft Word. It will re-format quotes and other characters and then R will give an error because it doesn’t recognize the character.
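The first two tips can be checked directly. In this sketch the object name `myTotal` is hypothetical, and it is assumed that no object called `MyTotal` already exists in the session.

```r
myTotal <- 5               # hypothetical object name

exists("myTotal")          # TRUE
exists("MyTotal")          # FALSE: different capitalization is a different name

# make.names() shows how R would turn an invalid name into a valid one
make.names("2ndTry")       # "X2ndTry"
```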