LZN's Blog CodePlayer

【Coursera】R Language



Background Material


get the current working directory
> getwd()  # same on Mac
read a CSV (comma-separated values) file
> read.csv("xx.csv")
list the files and directories under the current directory
> dir()
change the working directory
> setwd("F:/workspace/")
list all the objects (variables and functions) defined in the current workspace
> ls()
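Here is a small sketch of these commands working together. It runs inside a temporary directory, and the file name demo.csv and its contents are made up for illustration.

```r
# Work inside a temporary directory so no real files are touched
old_wd <- getwd()
setwd(tempdir())

# Write a tiny CSV (hypothetical file name) and read it back
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)),
          "demo.csv", row.names = FALSE)
has_file <- "demo.csv" %in% dir()   # the new file shows up in the listing
scores <- read.csv("demo.csv")      # read.csv returns a data frame
str(scores)

setwd(old_wd)                       # restore the original working directory
```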



assignment
> x <- 5
comments
# hi, this is a comment


5 basic "atomic" classes
  • character
  • numeric (real numbers)
  • integer
  • complex
  • logical (TRUE/FALSE)
basic object: vector
vector function: vector(type_of, length_of)
By default R treats every number as a real (numeric) value; if you want the integer 1, type 1L.
Infinity: Inf
Not a number: 0/0 = NaN (also treated as a missing value)
R objects can have attributes:
  • names, dimnames
  • dimensions
  • class (e.g. numeric)
  • length
function to access or modify attributes: attributes()
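A quick way to check the classes and special values above is to ask R directly with class(). This is only a sketch; the variable names v and m are arbitrary.

```r
# class() reports the atomic class of a value
class("a")       # "character"
class(1)         # "numeric": plain numbers are stored as reals
class(1L)        # "integer": the L suffix asks for an integer
class(1 + 2i)    # "complex"
class(TRUE)      # "logical"

# Special numeric values
1 / 0            # Inf
0 / 0            # NaN

# Attributes live on the object itself
v <- c(a = 1, b = 2)
attributes(v)    # $names
m <- matrix(1:6, nrow = 2)
attributes(m)    # $dim
```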


The c() function creates a vector:
> x <- c(0.5, 0.6)
> x <- c(TRUE, FALSE)
The vector() function works similarly; a numeric vector is initialized with 0.
> x <- vector("numeric", length = 10)
mixed classes
> x <- c(TRUE, 3)  # will be numeric
Explicit Coercion
> x <- 0:2
> class(x)
[1] "integer"
> as.logical(x)
[1] FALSE  TRUE  TRUE
We use list() to create a list.
> y <- list("a", 1, TRUE)
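To see the coercion rules in action, here is a short sketch. The values are arbitrary; the point is which class wins when classes are mixed.

```r
x <- c(TRUE, 3)      # logical is promoted to numeric: 1 3
y <- c("a", TRUE)    # everything becomes character: "a" "TRUE"
z <- c(1.7, "a")     # character wins again: "1.7" "a"

# Explicit coercion that cannot succeed gives NA (with a warning)
w <- suppressWarnings(as.numeric(c("1", "2", "oops")))

# A list keeps the class of each element
l <- list("a", 1, TRUE)
sapply(l, class)     # "character" "numeric" "logical"
```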


create matrix
> m <- matrix(nrow = 2, ncol = 3)
> attributes(m)
$dim
[1] 2 3
A matrix is filled column-wise.
change vector to matrix
> m <- 1:10
> dim(m) <- c(2, 5)
or use cbind or rbind
> x <- 1:3
> y <- 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x, y)  # similar, but the vectors become rows
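The column-wise filling is easiest to see with actual values. A minimal sketch follows; the byrow argument is a standard option of matrix() that the notes above do not mention.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column
m[, 1]                                  # first column: 1 2
dim(m)                                  # 2 3

# byrow = TRUE fills row by row instead
byrow <- matrix(1:6, nrow = 2, byrow = TRUE)
byrow[1, ]                              # first row: 1 2 3

x <- 1:3
y <- 10:12
r <- rbind(x, y)                        # 2 x 3, one row per vector
dim(r)
```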


Factors are ordered or unordered. Like the keys of a PHP array, a factor can be treated as an integer vector with labels.
An example:
> x <- factor(c("yes", "yes", "no"))
> x
[1] yes yes no
Levels: no yes
> table(x)
x
 no yes
  1   2
> unclass(x)
[1] 2 2 1
attr(,"levels")
[1] "no"  "yes"
That is how the factor x is represented in R underneath!
The first level is called the baseline level. By default it is determined by alphabetical order, but you can change the order:
> x <- factor(c("yes", "yes", "no"), levels = c("yes", "no"))
and yes will be in the first place.
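A small sketch comparing the default level order with an explicit one. The integer codes change because the codes follow the level order.

```r
x <- factor(c("yes", "yes", "no"))
levels(x)          # "no" "yes" (alphabetical)
as.integer(x)      # 2 2 1

x2 <- factor(c("yes", "yes", "no"), levels = c("yes", "no"))
levels(x2)         # "yes" "no"
as.integer(x2)     # 1 1 2
counts <- table(x2)  # counts are listed in level order: yes first
```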

L9-Missing Values

is.na() tests for NA, is.nan() tests for NaN.
A NaN value is also treated as NA, but the converse is not true.
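The asymmetry between NA and NaN shows up as soon as both appear in one vector. A minimal sketch:

```r
x <- c(1, NA, NaN, 4)
na_flags  <- is.na(x)    # FALSE  TRUE  TRUE FALSE (NaN counts as NA)
nan_flags <- is.nan(x)   # FALSE FALSE  TRUE FALSE (NA is not NaN)
```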

L10-Data Frames

tabular (in table form)
Data frames are to matrices what lists are to vectors: unlike a matrix, each column of a data frame can have a different class.
Special attribute: row.names
created by read.table() or read.csv()
convert to a matrix with data.matrix()
an example:
> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> x
  foo   bar
1   1  TRUE
2   2  TRUE
3   3 FALSE
4   4 FALSE
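To check that each column really keeps its own class, and what data.matrix() does to a non-numeric column, here is a short sketch using the same data frame.

```r
df <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))
nrow(df)               # 4
ncol(df)               # 2
sapply(df, class)      # foo is "integer", bar is "logical"

# data.matrix() forces everything to numbers: TRUE/FALSE become 1/0
mat <- data.matrix(df)
mat[, "bar"]
```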

L10-Names Attribute

> x <- 1:3
> names(x) <- c("foo", "bar", "norf")
> x
 foo  bar norf
   1    2    3
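Names are not limited to vectors. As a sketch, lists take names the same way, and matrices use dimnames (one character vector for rows, one for columns). The labels here are arbitrary.

```r
l <- list(a = 1, b = 2, c = 3)
names(l)                       # "a" "b" "c"

m <- matrix(1:4, nrow = 2, ncol = 2)
dimnames(m) <- list(c("a", "b"), c("c", "d"))
m["a", "d"]                    # row "a", column "d": 3
```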

L12-Reading Tabular Data

Functions for reading data into R:
  • read.table, read.csv
  • readLines
  • source, for reading in R code files (inverse of dump)
  • dget, same as above, but for deparsed R objects (inverse of dput)
  • load, for reading in saved workspaces
  • unserialize, for reading single R objects in binary form
The analogous functions for writing data:
  • write.table
  • writeLines
  • dump
  • dput
  • save
  • serialize
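As an illustration of one read/write pair, here is a sketch of a dput/dget round trip through a temporary file. The data frame is made up.

```r
y <- data.frame(a = 1, b = "a")

f <- tempfile(fileext = ".R")
dput(y, file = f)        # write the deparsed R code for y
readLines(f)             # the file is plain, human-readable text

new_y <- dget(f)         # parse the file back into an object
all.equal(y, new_y)
```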


Important arguments of read.table:
  • file, the name of a file, or a connection
  • header, logical indicating if the file has a header line
  • sep, a string indicating how the columns are separated
  • colClasses, a character vector indicating the class of each column in the dataset
  • nrows, the number of rows in the dataset
  • comment.char, a character string indicating the comment character
  • skip, the number of lines to skip from the beginning
  • stringsAsFactors, should character variables be coded as factors?
Passing only the file name is fine; the result is a data frame. The default separator of read.table is whitespace (read.csv uses a comma). Be sure to read the documentation of read.table (?read.table).
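A sketch of a few of these arguments in use. The file is created on the fly in a temporary location, and its contents (a title line followed by comma-separated data) are invented for the example.

```r
# Hypothetical input: one junk line, then a header, then the data
f <- tempfile(fileext = ".txt")
writeLines(c("exported by demo", "name,age", "ann,31", "bob,45"), f)

people <- read.table(f,
                     header = TRUE,             # first kept line holds column names
                     sep = ",",                 # columns are comma-separated
                     skip = 1,                  # ignore the junk line at the top
                     stringsAsFactors = FALSE)  # keep names as character
str(people)
```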

L12-Reading Large Tables

Set the arguments explicitly instead of letting R guess them! If all columns are numeric, a single value is enough: colClasses = "numeric"
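One common way to get colClasses without typing them by hand is to read a few rows first and reuse the detected classes. A sketch, using a generated file with made-up columns id and value:

```r
# Build a sample file to read (hypothetical data)
f <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:1000, value = rnorm(1000)),
            f, row.names = FALSE)

# Step 1: read a small piece and let R detect the classes
initial <- read.table(f, header = TRUE, nrows = 100)
classes <- sapply(initial, class)

# Step 2: read the whole file with the classes fixed in advance
full <- read.table(f, header = TRUE, colClasses = classes,
                   nrows = 1000, comment.char = "")
```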

L13-Textual Data Format


> x <- c("a", "b", "c")
> x[1]
[1] "a"
> x[1:3]
> x[x > "a"]
> u <- x > "a"
> u
[1] FALSE  TRUE  TRUE


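> x <- list(foo = 1:4, bar = 0.6)
> x[1]
$foo
[1] 1 2 3 4
we got a list!
> x[[1]]
[1] 1 2 3 4
we got the sequence itself!
> x$bar
[1] 0.6
> x[["bar"]]  # this is equivalent
> x["bar"]    # we got a list
> x[c(1, 2)]  # several elements at once, always returned as a list
> name = "foo"
> x[[name]]   # this is useful: [[ ]] accepts a computed name, $ does not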
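The [[ ]] operator can also reach into nested lists when given a vector of indices. A sketch with a made-up nested list:

```r
x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))

x[[c(1, 3)]]     # third element of the first element: 14
x[[1]][[3]]      # the same thing, written step by step
x[[c(2, 1)]]     # first element of the second element: 3.14

# Single brackets always give back a list
class(x["b"])    # "list"
class(x[["b"]])  # "numeric"
```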


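For a matrix x:
> x[1, ]    # a missing index is fine: it selects the whole row
Subsetting drops dimensions by default; to keep the result as a matrix:
> x[1, 2, drop = FALSE]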
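What drop = FALSE changes is easiest to see from the dimensions of the result. A sketch with a small 2 x 3 matrix:

```r
x <- matrix(1:6, 2, 3)

x[1, 2]                         # 3, a plain vector of length 1
dim(x[1, 2])                    # NULL: the dimensions were dropped
dim(x[1, 2, drop = FALSE])      # 1 1: still a matrix

x[1, ]                          # 1 3 5, a plain vector
dim(x[1, , drop = FALSE])       # 1 3: a one-row matrix
```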

L18-partial matching

> x <- list(aardvark = 1:5)
> x$a
[1] 1 2 3 4 5
> x[["a"]]
NULL
> x[["a", exact = FALSE]]
[1] 1 2 3 4 5

L19-Removing NA Values

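> x <- c(1, 2, NA, 4, NA, 5)
> bad <- is.na(x)
> x[!bad]
[1] 1 2 4 5
With a second vector y of the same length, complete.cases() marks the positions where neither value is missing:
> good <- complete.cases(x, y)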
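complete.cases() also works on a whole data frame, which is probably its most common use. A sketch with invented data:

```r
df <- data.frame(a = c(1, NA, 3, 4),
                 b = c("x", "y", NA, "z"))

ok <- complete.cases(df)   # TRUE only for rows with no NA in any column
clean <- df[ok, ]          # keep the complete rows (rows 1 and 4)
```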

L20-Vectorized Operations

For matrices x and y:
> x * y    # element-wise multiplication
> x %*% y  # true matrix multiplication
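A sketch of both kinds of multiplication on small matrices, plus vectorized arithmetic on plain vectors. The values are arbitrary.

```r
# Vectors: operations apply element by element, no loop needed
x <- 1:4
y <- 6:9
x + y                # 7 9 11 13
x > 2                # FALSE FALSE TRUE TRUE

# Matrices
a <- matrix(1:4, 2, 2)          # rows: (1 3) and (2 4)
b <- matrix(rep(10, 4), 2, 2)   # all tens
elementwise <- a * b            # rows: (10 30) and (20 40)
product     <- a %*% b          # rows: (40 40) and (60 60)
```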


L2 if-else

if (x > 3) {
    y <- 10
} else {
    y <- 0
}
also valid:
y <- if (x > 3) {
    10
} else {
    0
}

L2 For loops

for (i in 1:10) {
}

x <- c("a", "b", "c", "d")
for (i in seq_along(x)) {
    print(x[i])
}
for (letter in x) {
    print(letter)
}
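One reason to prefer seq_along(x) over 1:length(x) is the empty vector: 1:0 counts down, so the loop body would run twice. A sketch that shows the difference (variable names are arbitrary):

```r
x <- c("a", "b", "c", "d")
out <- character(0)
for (i in seq_along(x)) {
    out <- c(out, paste(i, x[i]))   # "1 a", "2 b", ...
}

empty <- character(0)
n <- 0
for (i in seq_along(empty)) {
    n <- n + 1                      # never runs: seq_along(empty) has length 0
}

length(1:length(empty))             # 2, because 1:0 is c(1, 0)
```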

L4 Functions

set default value:
abc = function(a = 10) {
}

columnmean <- function(y, removeNA = TRUE) {
    nc <- ncol(y)
    means <- numeric(nc)
    for (i in 1:nc) {
        means[i] <- mean(y[, i], na.rm = removeNA)
    }
    means  # the last expression is the return value
}
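To try the function out, here is a self-contained sketch: a version of columnmean that returns its result, called on a small made-up matrix with one missing value.

```r
columnmean <- function(y, removeNA = TRUE) {
    nc <- ncol(y)
    means <- numeric(nc)
    for (i in seq_len(nc)) {
        means[i] <- mean(y[, i], na.rm = removeNA)
    }
    means
}

m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)   # columns: (1 2 NA) and (4 5 6)
with_default <- columnmean(m)                  # NA removed: 1.5 5.0
keep_na      <- columnmean(m, removeNA = FALSE)  # first mean is NA
```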

L6 Functions

The ... argument indicates a variable number of arguments that are usually passed on to other functions.
myplot <- function(x, y, type = "l", ...) {
    plot(x, y, type = type, ...)
}
Any argument that comes after ... in the argument list must be named explicitly and in full; it cannot be matched by position or partially.
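A sketch of the "named in full" rule, using a hypothetical wrapper paste_all around paste(). Because sep comes after ..., a shortened name like se is not matched to it and is swallowed by ... instead.

```r
# Hypothetical helper: forwards everything in ... to paste()
paste_all <- function(..., sep = "-") {
    paste(..., sep = sep)
}

paste_all("a", "b")              # "a-b"
paste_all("a", "b", sep = "+")   # "a+b": full name, so it sets sep
paste_all("a", "b", se = "+")    # "a-b-+": se is just one more string
```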

L7 Functions could be made dynamically!

lexical vs. dynamic scoping
make.power <- function(n) {
    pow <- function(x) {
        x^n
    }
    pow
}

> cube <- make.power(3)  # note cube is a function
> cube(3)
[1] 27

> ls(environment(cube))
[1] "n"   "pow"
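Each call to make.power creates a new environment with its own n, which is why two generated functions do not interfere. A self-contained sketch:

```r
make.power <- function(n) {
    pow <- function(x) {
        x^n          # n is looked up where pow was defined
    }
    pow
}

cube   <- make.power(3)
square <- make.power(2)

cube(3)                          # 27
square(3)                        # 9
get("n", environment(cube))      # 3
get("n", environment(square))    # 2
```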

L8 code style

indenting

L10 Date and times

x <- as.Date("1970-01-01")  # a Date object, stored as days since 1970-01-01
x <- Sys.time()             # the current time, as a POSIXct object
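Under the hood a Date is just a number of days, which makes date arithmetic straightforward. A sketch with arbitrary dates (the time zone "UTC" is chosen here only to keep the example reproducible):

```r
x <- as.Date("1970-01-01")
unclass(x)                              # 0: day zero
unclass(as.Date("1970-01-02"))          # 1

# Subtracting dates gives a time difference; 2012 is a leap year
d <- as.Date("2012-03-01") - as.Date("2012-02-28")
as.numeric(d)                           # 2

# POSIXlt stores a time as a list of components
p <- as.POSIXlt("2012-01-10 10:00:00", tz = "UTC")
p$hour                                  # 10
```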


here, we can see the style is very like windows/dos
cute assignment sign
same comment sign as bash
complex! Like fortran
interesting...
Attributes! Like NCL!
be careful with c(), not like other languages
in fact, I think this is really convenient
merge vectors to matrix, this is really user-friendly!
Impressive! like PHP array, but much easier to understand
Like NCL or MATLAB
See it? Plenty of data types, very user-friendly. You can imagine how simple it is to process Excel-type files with data frames.
Brilliant!!!
Very like $$ in ncl
To drop dimensions or not, that is the question