In this notebook, we will learn how to handle groups of data or, in R terminology, groups of objects.

R provides several ways to group objects:
- Vector: Collection of objects with the same data type
- Matrix: Table consisting of objects with the same data type
- List: Collection of objects with possibly different data types

Vector

A vector is a collection of objects with the same data type.

To create a vector, use the c() function:

amino_acids <- c("methionine", "leucine", "alanine", "valine", "glutamine", "threonine")
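
Because a vector holds a single data type, c() converts mixed inputs to a common type rather than raising an error. The sketch below illustrates this with a hypothetical vector named mixed, which is not part of the original example:

```r
# Hypothetical vector mixing a number, a string, and a logical value
mixed <- c(1, "leucine", TRUE)

mixed         # Every item is now a string: "1" "leucine" "TRUE"
class(mixed)  # "character"
```

If the items must keep their own types, a list (covered later in this notebook) is the better fit.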

To access an item in a vector, put its position inside square brackets. Positions in R start at 1:

amino_acids[1] # Access the first element
[1] "methionine"
amino_acids[2] # Access the second element
[1] "leucine"
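
Square brackets also accept more than one position at a time. The following is a minimal sketch of a few common patterns, reusing the amino_acids vector defined earlier:

```r
amino_acids <- c("methionine", "leucine", "alanine", "valine", "glutamine", "threonine")

n_items <- length(amino_acids)       # Number of items in the vector: 6
middle  <- amino_acids[2:4]          # Items 2 through 4
ends    <- amino_acids[c(1, 6)]      # First and sixth items
rest    <- amino_acids[-1]           # Every item except the first

middle  # "leucine" "alanine" "valine"
ends    # "methionine" "threonine"
```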

We can also combine two vectors into a single vector using the c() function:

addl_amino_acids <- c("glycine", "cysteine", "serine")
many_amino_acids <- c(amino_acids, addl_amino_acids)
many_amino_acids
[1] "methionine" "leucine"    "alanine"    "valine"     "glutamine"  "threonine"  "glycine"    "cysteine"   "serine"    

Matrix

A matrix is a table consisting of objects with the same data type.

To create a matrix, use the matrix() function. By default, R fills the matrix column by column:

amino_acids_matrix <- matrix(c("methionine", "leucine", "alanine", "valine", "glutamine", "threonine"), nrow=3, ncol=2)
amino_acids_matrix
     [,1]         [,2]       
[1,] "methionine" "valine"   
[2,] "leucine"    "glutamine"
[3,] "alanine"    "threonine"
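
The output above shows that matrix() fills the first column before moving on to the second. If the data is listed row by row instead, the byrow argument of matrix() changes the fill order. A minimal sketch, using a hypothetical variable named by_row:

```r
# Same six items, but filled across each row instead of down each column
by_row <- matrix(c("methionine", "leucine", "alanine", "valine", "glutamine", "threonine"),
                 nrow=3, ncol=2, byrow=TRUE)

by_row[1, 2]  # "leucine" (it was "valine" with the default column-wise fill)
dim(by_row)   # 3 2 (rows, then columns)
```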

We can add rows and columns using rbind() and cbind(), respectively:

amino_acids_matrix <- rbind(amino_acids_matrix, c("proline", "arginine"))
amino_acids_matrix
     [,1]         [,2]       
[1,] "methionine" "valine"   
[2,] "leucine"    "glutamine"
[3,] "alanine"    "threonine"
[4,] "proline"    "arginine" 
amino_acids_matrix <- cbind(amino_acids_matrix, c("histidine", "phenylalanine", "tryptophan", "selenocysteine"))
amino_acids_matrix
     [,1]         [,2]        [,3]            
[1,] "methionine" "valine"    "histidine"     
[2,] "leucine"    "glutamine" "phenylalanine" 
[3,] "alanine"    "threonine" "tryptophan"    
[4,] "proline"    "arginine"  "selenocysteine"
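
When binding, the new row should have as many items as the matrix has columns, and the new column as many items as the matrix has rows. The sketch below tracks this with dim() on a small hypothetical matrix named m:

```r
m <- matrix(c("methionine", "leucine", "alanine", "valine"), nrow=2, ncol=2)
dim(m)  # 2 2

m <- rbind(m, c("proline", "arginine"))  # New row: 2 items, one per column
dim(m)  # 3 2

m <- cbind(m, c("histidine", "phenylalanine", "tryptophan"))  # New column: 3 items, one per row
dim(m)  # 3 3
```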

Accessing items in a matrix can be done like so:

amino_acids_matrix[1, 2] # Accesses the first row, second column
[1] "valine"
amino_acids_matrix[4, 3] # Accesses the fourth row, third column
[1] "selenocysteine"
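
Leaving the row or the column position empty selects an entire column or row. A short sketch on a hypothetical copy of the original 3-by-2 matrix, named m here:

```r
m <- matrix(c("methionine", "leucine", "alanine", "valine", "glutamine", "threonine"), nrow=3, ncol=2)

first_row  <- m[1, ]  # Whole first row
second_col <- m[, 2]  # Whole second column

first_row   # "methionine" "valine"
second_col  # "valine" "glutamine" "threonine"
```

Note that both results are plain vectors, not matrices.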

We can also check if the items in a matrix satisfy a given condition:
- any() checks if at least one of the items satisfies the condition
- all() checks if all of the items satisfy the condition

numbers <- matrix(c(2, 4, 6, 8, 10, 12, 14, 16, 18), nrow=3, ncol=3)
any(numbers < 6)
[1] TRUE
any(numbers < 1)
[1] FALSE
all(numbers < 20)
[1] TRUE
all(numbers < 5)
[1] FALSE
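
Both functions work on the result of the comparison, which is itself a matrix of TRUE and FALSE values. The sketch below stores that intermediate result in a hypothetical variable named below_six, then uses it to count and extract the matching items:

```r
numbers <- matrix(c(2, 4, 6, 8, 10, 12, 14, 16, 18), nrow=3, ncol=3)

below_six <- numbers < 6       # Logical matrix with the same shape as numbers
n_below   <- sum(below_six)    # TRUE counts as 1, so this counts the matches
matches   <- numbers[below_six]  # Items that satisfy the condition

n_below  # 2
matches  # 2 4
```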

It is possible to name the rows and columns (also called the dimensions) of a matrix using rownames() and colnames(), respectively.

Named dimensions make a matrix easier to read, especially when we later analyze it.

colnames(numbers) <- c("Treatment 1", "Treatment 2", "Treatment 3")
rownames(numbers) <- c("Patient 1", "Patient 2", "Patient 3")
numbers
          Treatment 1 Treatment 2 Treatment 3
Patient 1           2           8          14
Patient 2           4          10          16
Patient 3           6          12          18
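
Once the dimensions are named, the names can be used in place of numeric positions. A minimal sketch, reusing the same numbers matrix:

```r
numbers <- matrix(c(2, 4, 6, 8, 10, 12, 14, 16, 18), nrow=3, ncol=3)
colnames(numbers) <- c("Treatment 1", "Treatment 2", "Treatment 3")
rownames(numbers) <- c("Patient 1", "Patient 2", "Patient 3")

numbers["Patient 2", "Treatment 3"]  # Same item as numbers[2, 3]: 16
numbers["Patient 1", ]               # All treatments for Patient 1: 2 8 14
numbers[, "Treatment 2"]             # All patients for Treatment 2: 8 10 12
```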

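Finally, we demonstrate some matrix operations. The arithmetic operators +, -, *, and / work item by item, so both matrices must have the same dimensions: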

numbers1 <- matrix(c(2, 4, 6, 8, 10, 12, 14, 16, 18), nrow=3, ncol=3)
numbers2 <- matrix(c(12, 14, 16, 18, 20, 22, 24, 26, 28), nrow=3, ncol=3)

numbers1 + numbers2
     [,1] [,2] [,3]
[1,]   14   26   38
[2,]   18   30   42
[3,]   22   34   46
numbers1 - numbers2
     [,1] [,2] [,3]
[1,]  -10  -10  -10
[2,]  -10  -10  -10
[3,]  -10  -10  -10
numbers1 * numbers2
     [,1] [,2] [,3]
[1,]   24  144  336
[2,]   56  200  416
[3,]   96  264  504
numbers1 / numbers2
          [,1]      [,2]      [,3]
[1,] 0.1666667 0.4444444 0.5833333
[2,] 0.2857143 0.5000000 0.6153846
[3,] 0.3750000 0.5454545 0.6428571
100 * numbers1 # Scalar multiplication
     [,1] [,2] [,3]
[1,]  200  800 1400
[2,]  400 1000 1600
[3,]  600 1200 1800
t(numbers1) # Matrix transpose
     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    8   10   12
[3,]   14   16   18
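
Note that * multiplies matrices item by item, as the output above shows (the first cell is 2 * 12 = 24). The matrix product from linear algebra uses the %*% operator instead. The sketch below contrasts the two on hypothetical 2-by-2 matrices, a and b, chosen so that the arithmetic is easy to check by hand:

```r
a <- matrix(c(1, 2, 3, 4), nrow=2, ncol=2)  # Rows: (1, 3) and (2, 4)
b <- matrix(c(5, 6, 7, 8), nrow=2, ncol=2)  # Rows: (5, 7) and (6, 8)

elementwise <- a * b    # Rows: (5, 21) and (12, 32)
product     <- a %*% b  # Rows: (23, 31) and (34, 46)

product[1, 1]  # 1*5 + 3*6 = 23
```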

List

A list is a collection of objects with possibly different data types.

To create a list, use the list() function:

assorted_list <- list("proline", "methionine", 1, 2)
assorted_list
[[1]]
[1] "proline"

[[2]]
[1] "methionine"

[[3]]
[1] 1

[[4]]
[1] 2

Accessing list items can be a bit tricky, though.

Observe how assorted_list[1] does not return the item "proline" itself. It returns a list containing "proline".

assorted_list[1]
[[1]]
[1] "proline"

To get the item "proline" itself, we have to use double brackets:

assorted_list[[1]]
[1] "proline"
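
The difference between single and double brackets can be confirmed with class(). List items can also be given names, which allows access with the $ operator. The sketch below shows both; the named list residue and its values are hypothetical and serve only as an illustration:

```r
assorted_list <- list("proline", "methionine", 1, 2)

class(assorted_list[1])    # "list": single brackets keep the list wrapper
class(assorted_list[[1]])  # "character": double brackets return the item itself

# Hypothetical named list
residue <- list(name = "proline", abbreviation = "Pro", count = 3)

residue$name         # "proline"
residue[["count"]]   # 3
```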

If we want a summary of the list elements, we can use the summary() function, which reports the length, class, and mode of each element.

summary(assorted_list)
     Length Class  Mode     
[1,] 1      -none- character
[2,] 1      -none- character
[3,] 1      -none- numeric  
[4,] 1      -none- numeric  
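
When only the data types are of interest, one alternative is to apply class() to every item with sapply(). This is a sketch of that approach, not a replacement for summary():

```r
assorted_list <- list("proline", "methionine", 1, 2)

n_items    <- length(assorted_list)         # Number of items in the list: 4
item_types <- sapply(assorted_list, class)  # Class of each item, as a character vector

item_types  # "character" "character" "numeric" "numeric"
```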

  1. De La Salle University, Manila, Philippines
