Descriptive statistics (*handout*, *visualizations*)

- Describe and give examples of data types that can be analysed statistically.
- What kind of information is provided by the mean, standard deviation and variance?
- How does the median differ from the mean? Which of the two is more robust to outliers?
- Assume that we analyse a continuous variable X. How would you interpret the fact that the 25th percentile for this variable is 100?
- What is a skewed distribution, and how does it compare to a normal distribution?
- Describe (or depict) what kind of information can be presented in a histogram.
- Explain how a box-and-whisker plot is constructed (e.g., what delimits the box, what the line inside the box indicates, etc.).
- How can you determine whether a variable has a normal distribution? (Think of visual and quantitative/numerical ways to do that.)
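
Several of the questions above can be explored numerically. The following is a minimal sketch in Python using only the standard library (the course materials themselves use R). The duration values are invented for illustration and are not taken from the course data sets, and the 1.5 × IQR rule for flagging outliers is one common box plot convention, assumed here.

```python
import statistics

# Hypothetical syllable durations in ms (invented for illustration).
durations = [120, 135, 150, 160, 175, 180, 200]
with_outlier = durations + [900]  # same sample plus one extreme value

# Centre: the mean uses every value, the median only the middle position.
mean_clean = statistics.mean(durations)
median_clean = statistics.median(durations)
mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)

# Spread: sample variance (n - 1 denominator) and its square root.
variance = statistics.variance(durations)
sd = statistics.stdev(durations)

# Quartiles: q1 is the 25th percentile, q2 the median, q3 the 75th percentile.
q1, q2, q3 = statistics.quantiles(with_outlier, n=4, method="inclusive")
iqr = q3 - q1  # height of the box in a box plot

# Assumed convention: points beyond 1.5 * IQR from the box count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [d for d in with_outlier if d < lower_fence or d > upper_fence]

print("mean:", mean_clean, "->", mean_out)
print("median:", median_clean, "->", median_out)
print("variance:", round(variance, 2), "sd:", round(sd, 2))
print("quartiles:", q1, q2, q3, "outliers:", outliers)
```

In this sketch the single extreme value moves the mean from 160 to 252.5 but the median only from 160 to 167.5, which is the sense in which the median is the more robust of the two.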

Distributions and probabilities (*handout*)

- Describe the characteristic features of a binomial variable and give an example.
- Provide the mean, variance and standard deviation of a binomial random variable with n = 18 and p = 0.4.
- Explain what the Empirical Rule (68-95-99.7) says.
- What is the purpose of z-score standardization? Explain it and give the formula.
- Give an example of finding probabilities with the z-table.
- What is the relation between the t-distribution and the Z-distribution? When is the t-distribution used instead of the Z-distribution?
- What is the meaning and role of the standard error (why do we need it at all)?
- Give examples of two variables and compare them: one that follows a normal distribution and one that has a sampling distribution.
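
The formulas behind several of these questions are short enough to check by hand or in code. Below is a minimal Python sketch (standard library only; the course itself works in R) that applies the usual textbook formulas: mean np and variance np(1 - p) for a binomial variable, z = (x - μ) / σ for standardization, and SE = s / √n for the standard error of the mean. The helper names and the example values passed to them (130, 100, 15, 1.5) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

# Binomial variable with n = 18 trials and success probability p = 0.4.
n, p = 18, 0.4
binom_mean = n * p                 # np
binom_var = n * p * (1 - p)        # np(1 - p)
binom_sd = math.sqrt(binom_var)    # square root of the variance


def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma


std_normal = NormalDist()  # standard normal: mean 0, sd 1


def prob_within(k):
    """P(-k <= Z <= k) for a standard normal variable Z."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def standard_error(sd, sample_size):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(sample_size)


# The cdf plays the role of a z-table lookup: P(Z <= 1.5).
p_below = std_normal.cdf(1.5)

print("binomial mean, var, sd:", binom_mean, binom_var, round(binom_sd, 3))
print("z for x=130, mu=100, sigma=15:", z_score(130, 100, 15))
print("within 1, 2, 3 sd:", [round(prob_within(k), 4) for k in (1, 2, 3)])
print("P(Z <= 1.5):", round(p_below, 4))
```

The three values printed for `prob_within` come out at roughly 0.68, 0.95 and 0.997, which is where the name of the Empirical Rule comes from.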

Correlation and regression (*handout*)

- Give the range of possible values that a correlation coefficient can take and explain how different values are interpreted (e.g. what does a correlation coefficient *r* = -0.6 indicate?).
- Explain the terms dependent/response variable and independent/explanatory variable in the context of a simple linear regression model.
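
As a rough illustration of both questions, the sketch below computes Pearson's *r* and a least-squares line from their definitions in plain Python (the course materials use R, where `cor()` and `lm()` do the same job). The tempo and duration values are invented for this example and do not come from tempo.xlsx or stats_class_1.xlsx; the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient; always lies between -1 and 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for the model y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: speech tempo (syllables per second) as the
# independent/explanatory variable, mean syllable duration (ms) as the
# dependent/response variable.
tempo = [3, 4, 5, 6, 7]
duration = [300, 260, 210, 180, 150]

r = pearson_r(tempo, duration)
slope, intercept = fit_line(tempo, duration)

print("r:", round(r, 3))
print("duration ~", intercept, "+", slope, "* tempo")
```

Here *r* is close to -1, a strong negative linear relationship: faster tempo goes with shorter syllables. A value such as *r* = -0.6 would point in the same direction but with considerably more scatter around the line.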

*Data:*

- tempo (tempo.xlsx)

- syllable durations (stats_class_1.xlsx)

- materials and data for linear models in R (18.06.2018)

Exercises from the DataCamp platform (*Foundations of Probability in R*)

*Literature:*

1. Elliott, A. C., & Woodward, W. A. (2007). *Statistical analysis quick reference guidebook: With SPSS examples*. Sage.

2. Baayen, R. H. (2008). *Analyzing linguistic data: A practical introduction to statistics using R*. Cambridge University Press. (see: *baayenCUPstats.pdf*)

3. Eddington, D. (2016). *Statistics for linguists: A step-by-step guide for novices*. Cambridge Scholars Publishing.

4. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). *An introduction to statistical learning* (Vol. 112). New York: Springer.

5. Litosseliti, L. (2017). *Research methods in linguistics*. Bloomsbury Publishing.

*Tutorials:*

1. Data-scientist-with-r (DataCamp)

2. Data science and machine learning with R (Udemy)

*Other resources:*

For Dummies (handbook)

1001 statistical problems (exercises for the exam; check the contents here)