Here are some practice problems

Subset data

P1. Subset the built-in iris dataset to keep only the setosa species. Call this subset set.iris.

Answer P1

set.iris = subset(iris, iris$Species=="setosa")

P2. Subset the built-in iris dataset to keep only the setosa species and only individuals with a Petal.Length over 1.5. Call this subset set.iris.long.

Answer P2

set.iris.long = subset(iris, iris$Species=="setosa" & iris$Petal.Length>1.5)
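If it helps to check a subset before moving on, here is a minimal sketch. The names check.rows and set.iris.clean are hypothetical ones introduced for illustration; everything else is base R.

```r
# Build the two subsets from P1 and P2.
# Inside subset(), column names can be used directly, without iris$.
set.iris = subset(iris, Species == "setosa")
set.iris.long = subset(iris, Species == "setosa" & Petal.Length > 1.5)

# Count rows per species. The other species still appear, with 0 rows,
# because Species is a factor and subset() keeps unused levels.
table(set.iris$Species)

# Confirm the second filter: the smallest Petal.Length kept should be above 1.5
min(set.iris.long$Petal.Length)

# The same rows can be selected with logical indexing (hypothetical name check.rows)
check.rows = iris[iris$Species == "setosa" & iris$Petal.Length > 1.5, ]
all.equal(set.iris.long, check.rows)

# Optional: drop the empty factor levels (hypothetical name set.iris.clean)
set.iris.clean = droplevels(set.iris)
levels(set.iris.clean$Species)
```

Looking at the row counts and the minimum value is a quick way to confirm that both conditions were applied.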

Use online resources

Learning how to use online resources is the skill that will allow you to code independently. Even if you never memorize any code, you will be able to do what you want, because the internet has already “memorized” the code for you. Tip: I usually start with the examples in whatever documentation I’m reading, since they show what the code you are going to run actually looks like. Information sheets and tutorials often include related but not immediately relevant information, which can get confusing.
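As one way to act on that tip inside R itself, the sketch below pulls up the examples that ship with a help page. It uses cor only as a stand-in; any documented function should work the same way.

```r
# Open the help page for a function (these two lines do the same thing)
?cor
help("cor")

# Run the code from the Examples section of the help page,
# printing each line of code along with its output
example("cor")

# Show just the argument list without opening the full page
args(cor)
```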

R documentation

These documents are created by the developers. They are the most accurate and detailed options available to you.

Built-in R documentation (help)

P3. Use the command ?cor to pull up the R documentation for computing a correlation. Correlate Petal.Length and Petal.Width in the iris dataset.

Answer P3

cor(iris$Petal.Length, iris$Petal.Width)
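The ?cor page also describes options beyond passing two vectors. A minimal sketch of two of them, with iris.num as a hypothetical name for the numeric columns:

```r
# The default is Pearson; the method argument switches to a rank-based correlation
cor(iris$Petal.Length, iris$Petal.Width, method = "spearman")

# Passing a data frame of numeric columns returns a full correlation matrix
iris.num = iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
cor(iris.num)
```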

Online R documentation

P4. Go to this website, which I found through Google with the search query cor R. Compare its information with what you got from ?cor in R. What do you notice?

Google

P5. Do a Google search for how to run a correlation test in R. I’ve listed my search query and some websites that I would recommend from my search results in the answer.

Answer P5

run correlation test in R

STHDA https://www.sthda.com/english/wiki/correlation-test-between-two-variables-in-r

Data Camp https://www.datacamp.com/doc/r/correlations

There are other websites that have good information as well, especially for more niche analyses, but STHDA and Data Camp are generally reliable.
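Searches like this one usually lead to the base R function cor.test(), which reports a p-value and confidence interval in addition to the coefficient that cor() returns. A minimal sketch, where r and ct are hypothetical variable names:

```r
# Correlation coefficient only (Pearson by default)
r = cor(iris$Petal.Length, iris$Petal.Width)
r

# Full correlation test: coefficient, test statistic, p-value, confidence interval
ct = cor.test(iris$Petal.Length, iris$Petal.Width)
ct

# Individual pieces of the test result can be pulled out by name
ct$estimate
ct$p.value
ct$conf.int
```

The coefficient stored in ct$estimate should match the value from cor(), which is a quick check that both calls describe the same relationship.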

P6. Run an ANOVA in R. Use a Google search to figure out how to do this. Compare the mean Petal.Width between the three Species in the iris dataset. My code and output are in the answer section. If you get different code, that’s okay; the important thing is getting the same numeric output.

Answer P6

aov(Petal.Width~Species, data=iris)

P7. You’ll notice that the ANOVA output from P6 is not very useful. Sometimes you need an additional function, summary(), to see the summary of the statistics. Try to figure out how to do this; the answer is below.

Answer P7

a1=aov(Petal.Width~Species, data=iris)

summary(a1)
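P6 notes that different code is fine as long as the numbers match. As an illustration, the sketch below fits the same model with lm() and asks for its ANOVA table with anova(). The names a1.table, m1, and m1.table are hypothetical.

```r
# ANOVA from the answer above
a1 = aov(Petal.Width ~ Species, data = iris)
a1.table = summary(a1)[[1]]   # the ANOVA table stored inside the summary

# The same model fitted as a linear model, then summarised as an ANOVA table
m1 = lm(Petal.Width ~ Species, data = iris)
m1.table = anova(m1)

# Compare the F statistic for Species from both routes
a1.table[["F value"]][1]
m1.table[["F value"]][1]
```

If the two F values agree, the two pieces of code are doing the same analysis even though they look different.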

ChatGPT

P8. Ask ChatGPT for help running a linear regression model comparing Petal.Width between the three Species in the iris dataset.

Answer P8

Note: being able to use AI tools will be as important in the workplace as email and Microsoft Office tools like Word (and arguably already is). This is often the most efficient route to getting a framework for code. You can ask for very specific things, which makes the answer customized to your data.

However, there is a major issue: the AI does not know whether its answer is correct. You can check by looking up the documentation for the function it suggests yourself (as in the first few exercises), and/or you can turn on web search (circled in orange) and then click on sources (circled in green) at the bottom of the output. It will pull up relevant websites, and you can decide whether you think those websites are trustworthy.
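For reference, the kind of code an AI tool might suggest for P8 looks like the sketch below. This is an assumption about a typical answer, not the output of any particular tool, and fit is a hypothetical name. Whatever you receive, checking it against ?lm is the same habit as in the earlier exercises.

```r
# Linear regression of Petal.Width on Species.
# Species is a factor, so R creates indicator variables automatically,
# with the first level (setosa) as the reference group.
fit = lm(Petal.Width ~ Species, data = iris)

# Coefficients, standard errors, p-values, and R-squared
summary(fit)

# The intercept is the setosa mean; the other coefficients are
# differences from setosa. Compare with the group means:
coef(fit)
tapply(iris$Petal.Width, iris$Species, mean)
```

Comparing the coefficients with the group means is one simple way to confirm that the suggested model does what you asked for.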