Exercises - Vectors
Basics of R for Data Science
Fundamentals of creating vectors
- Create a vector named `v0` composed of the numbers \(9, 15, 2, 121, 4, 8, 7, 11\)
- Create a vector named `v1` composed of all integers from \(300\) down to \(107\), using both the `seq()` function and the `:` operator
- Use the `seq()` function to create a vector named `v2` composed of all numbers between \(4\) and \(5\) with an increment of \(0.05\) (i.e., \(4.05\), \(4.10\), \(4.15\), and so on…)
- Use the `seq()` function to create a vector named `v3` composed of exactly \(12\) equally spaced numbers between \(4\) and \(5\)
- Use the `sample()` function to randomly draw \(5\) numbers between \(1000\) and \(1500\) (before doing this, have a look at the help page with `?sample`)
- Use the `sample()` function to simulate \(20\) rolls of a \(6\)-sided die (note that, to do this, you must set the argument `replace = TRUE` in the `sample()` function; make sure you understand why this is necessary)
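The functions named above can be sketched as follows. This is only a minimal illustration: the values and the object names (`a`, `b`, `b2`, `c_by`, `c_len`, `draws`, `coin`) are assumptions chosen to differ from the exercises, so the block shows the mechanics rather than the solutions.

```r
# Illustrative values only; they deliberately differ from the exercises.
a <- c(3, 10, 7)                                # combine individual values
b <- 10:5                                       # descending sequence with `:`
b2 <- seq(from = 10, to = 5, by = -1)           # same sequence with seq()
c_by <- seq(from = 0, to = 1, by = 0.25)        # fixed increment
c_len <- seq(from = 0, to = 1, length.out = 5)  # fixed number of values

set.seed(1)  # fix the random seed so the draws are reproducible
draws <- sample(1:10, size = 3)                 # default: without replacement
coin <- sample(c("H", "T"), size = 10, replace = TRUE)  # with replacement
```

With the default `replace = FALSE`, each value can be drawn at most once, so `draws` never contains repeats, whereas `coin` needs `replace = TRUE` because it asks for more draws than there are distinct values.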
Indexing vectors
- Select the 2nd element from the previously created vector `v0`, using indexing with `[]`
- Select the 4th and the 6th elements from the previously created vector `v0`, using indexing with `[]`
- Select the last element from the previously created vector `v0` (assume you don’t know its length in advance, so use the `length()` function to determine it)
- Select all numbers greater than \(4.40\) from the previously created vector `v2` (you need to use indexing and relational operators)
- Select all numbers between \(4.40\) and \(4.80\) from the previously created vector `v2` (you need to use indexing, and relational and logical operators)
- Select all numbers smaller than \(4.20\) or greater than \(4.90\) from the previously created vector `v2`
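The indexing patterns listed above can be sketched on a separate, illustrative vector. The vector `v`, the result names, and the thresholds are assumptions made for this example and are not part of the exercises.

```r
v <- seq(from = 0, to = 1, by = 0.1)  # illustrative vector, not one of v0-v3

third <- v[3]                     # single element by position (R counts from 1)
pair <- v[c(1, 5)]                # several elements by position
last <- v[length(v)]              # last element without hard-coding the length
upper <- v[v > 0.65]              # relational operator gives a logical index
middle <- v[v > 0.25 & v < 0.65]  # `&` keeps elements meeting both conditions
tails <- v[v < 0.15 | v > 0.85]   # `|` keeps elements meeting either condition
```

The thresholds here fall between grid points on purpose: values produced by `seq()` with a fractional step are floating-point numbers, so a comparison at an exact boundary may not behave as the printed values suggest.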
Like a data scientist
- Use the `rnorm()` function to create a vector `y0` containing \(1,000,000\) normally distributed numbers on an IQ scale (i.e., with \(\mu = 100\), \(\sigma = 15\)) (remember that \(1,000,000\) can be written as `1e6` in R)
- Display the first few values of the `y0` vector using both the `head()` function and indexing with `[]`
- Round all values of `y0` to the nearest integer using the `round()` function, then display the first few values once again to make sure it worked
- Index the vector `y0` with `[]` and use the `mean()` function to calculate the average of the IQ values in the range between +1 SD and +2 SD from the mean (i.e., between \(115\) and \(130\))
- Use the `sd()` function and indexing on the vector `y0` to find the standard deviation of the IQ values that are below the mean (i.e., where IQ < \(100\))
- Estimate the standard deviation of a variable created by adding two normally distributed variables `z0` and `z1`, both with a standard deviation of \(1\) (do this using `rnorm()` for simulating values, and `sd()` for computing the standard deviation)
- Repeat the previous exercise, but this time add a large constant value to one variable and subtract another large constant value from the other before adding them together. Verify that this does not affect the final standard deviation
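The simulate, subset, and summarise workflow used in these exercises can be sketched as follows. The scale (mean 50, SD 10), the sample size, the object names, and the standard deviations of `p` and `q` are all assumptions chosen for illustration and differ from the values in the exercises.

```r
set.seed(42)                              # reproducible simulation
s <- rnorm(n = 1e5, mean = 50, sd = 10)   # illustrative scale, not the IQ scale

head(s)                 # first six values
s[1:6]                  # the same values via indexing
s_round <- round(s)     # round to the nearest integer

band_mean <- mean(s[s > 60 & s < 70])  # mean within a band (logical indexing)
upper_sd <- sd(s[s > 50])              # spread of values above the nominal mean

# Adding a constant shifts the mean but leaves the spread unchanged.
shifted <- s + 1000
c(sd(s), sd(shifted))

# For independent variables the variances add, so sd(p + q) should be
# close to sqrt(3^2 + 4^2) = 5.
p <- rnorm(n = 1e5, mean = 0, sd = 3)
q <- rnorm(n = 1e5, mean = 0, sd = 4)
sum_sd <- sd(p + q)
```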
- Use `rnorm()` to create a vector `x0` containing a large number of values simulated from a standard normal distribution (i.e., with \(\mu = 0\), \(\sigma = 1\)); then, create a second vector `x1` by applying a linear transformation to `x0`, such as `(x0 + 6) / 11`, and observe how the mean and standard deviation have changed from `x0` to `x1`
- Use the `cor()` function to verify that the previously created vectors `x0` and `x1`, being linear transformations of each other, have a correlation of \(r = 1\)
- Repeat the previous two points, but now add a random “error term” to `x1`, e.g., compute `x1 = 2*x0 + 0.5 + rnorm(n = length(x0), mean = 0, sd = 0.3)`, and check that the correlation between `x0` and `x1` is now smaller than \(1\). Also, see how increasing the `sd` of the “error term” decreases the correlation between `x1` and `x0`
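The relationship between linear transformations, noise, and correlation can be sketched as below. The transformation `3 * u - 2`, the noise levels, and the helper `noisy_cor()` are hypothetical choices made for this illustration and are not taken from the exercises.

```r
set.seed(7)                 # reproducible simulation
u <- rnorm(n = 1e4)         # standard normal draws (mean 0, sd 1)
w <- 3 * u - 2              # illustrative linear transformation of u

c(mean(w), sd(w))           # mean is 3 * mean(u) - 2; sd is 3 * sd(u)
r_exact <- cor(u, w)        # 1 up to floating-point error

# Hypothetical helper: correlation between u and w after adding
# normally distributed noise with the given standard deviation.
noisy_cor <- function(noise_sd) {
  cor(u, w + rnorm(n = length(u), mean = 0, sd = noise_sd))
}
r_noisy <- sapply(c(0.5, 2, 8), noisy_cor)  # shrinks as noise_sd grows
```

Wrapping the noisy correlation in a small function makes it easy to try several noise levels in one call to `sapply()`.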