# Interviews

The interview is an oral exam in which the interviewers assess each candidate’s knowledge and skills, particularly in informatics and statistics, as well as the candidate’s motivation for studying Data Science.

Please find below a catalogue of questions for both areas, as well as some reading recommendations. You will be asked some of these questions during the interview.

## Statistics

1. Explain quantities like expectation, median, variance, quantile, etc. for a random variable X.
2. For a continuous random variable X, explain how the density is related to the distribution function.
3. Assume that X is normally distributed with mean m and variance s². Explain what this means, and state some properties of the normal distribution. How can we calculate P(X < x)?
4. Explain the principal ideas of Maximum Likelihood. Use Maximum Likelihood to estimate the parameter p of a Bernoulli(p) distribution when given n i.i.d. 0/1 observations.
5. Explain the meaning of a confidence interval.
6. What is the meaning of covariance and correlation? Give a mathematical definition. What is the difference between the theoretical and the empirical covariance?
7. What is the purpose of a regression model? Give an example.
8. Explain how a regression model is used for forecasting.
9. Give an example of a 2×2 table. How is the odds ratio defined?
10. What is the aim of a cluster analysis?
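
As an illustration of the kind of reasoning question 4 asks for, the following is a minimal sketch, not a model answer. It assumes the standard result that the log-likelihood of n i.i.d. 0/1 observations with k ones is k·log(p) + (n − k)·log(1 − p), which is maximised at p = k/n. The function names and the sample data are hypothetical and chosen only for this example; the grid search serves merely as a numerical cross-check of the closed form.

```python
import math


def bernoulli_log_likelihood(p, observations):
    """Log-likelihood of Bernoulli(p) for a list of 0/1 observations."""
    k = sum(observations)          # number of ones
    n = len(observations)          # sample size
    return k * math.log(p) + (n - k) * math.log(1 - p)


def bernoulli_mle(observations):
    """Closed-form maximum likelihood estimate: the sample mean k/n."""
    return sum(observations) / len(observations)


# Hypothetical sample, used only for illustration.
sample = [1, 0, 1, 1, 0, 1, 0, 1]

p_hat = bernoulli_mle(sample)

# Numerical cross-check: evaluate the log-likelihood on a grid over (0, 1)
# and pick the grid point with the largest value.
grid = [i / 1000 for i in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, sample))

print(p_hat, p_grid)
```

Both values should agree up to the grid resolution, which reflects that the sample mean is the maximiser of the likelihood.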

### Literature

• F.M. Dekking, C. Kraaikamp, H.P. Lopuhaä, L.E. Meester. A modern introduction to probability and statistics. Springer.
• G. Grimmett, D. Stirzaker. Probability and random processes. Oxford Univ. Pr.
• A.M. Mood, F.A. Graybill, D.C. Boes. Introduction to the theory of statistics. McGraw-Hill.

## Informatics

1. Name elementary data types and how they are represented within computer memory.
2. What is an algorithm and which types of control structures are generally used to implement algorithms?
3. What is a function and how can it be used for implementing recursion?
4. What are arrays, compound data types (aka records) and references (aka pointers)?
5. What is the difference between compiled and interpreted programming languages?
6. What are the time complexity and the memory complexity of an algorithm?
7. Name at least two commonly used sorting algorithms and explain their computational complexity.
8. Name two approaches to representing a graph in a data structure. Name a method for finding out whether a graph is connected, based on one of these structures.
9. Name the advantages of a database system compared to storing data in plain text files.
10. What are a relation, a tuple and a primary key in the relational data model?
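
To make question 8 more concrete, here is a minimal sketch under stated assumptions: the graph is undirected, its vertices are numbered 0 to n − 1, and the empty graph is treated as connected by convention. It shows two common representations (adjacency list and adjacency matrix) and a breadth-first search on the adjacency list as one possible connectivity test. All function names are hypothetical; other approaches, such as depth-first search, would serve equally well.

```python
from collections import deque


def to_adjacency_list(num_vertices, edges):
    """Build an adjacency list for an undirected graph."""
    adjacency = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency


def to_adjacency_matrix(num_vertices, edges):
    """Build an adjacency matrix (list of lists of 0/1) for the same graph."""
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1
    return matrix


def is_connected(adjacency):
    """Breadth-first search from vertex 0; connected if all vertices are reached."""
    if not adjacency:
        return True  # assumption: the empty graph counts as connected
    visited = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return len(visited) == len(adjacency)


path_edges = [(0, 1), (1, 2), (2, 3)]   # a path on four vertices
split_edges = [(0, 1), (2, 3)]          # two separate components

print(is_connected(to_adjacency_list(4, path_edges)))
print(is_connected(to_adjacency_list(4, split_edges)))
```

With an adjacency list the search visits every vertex and edge at most once, so it runs in time proportional to the number of vertices plus edges; the adjacency matrix needs memory quadratic in the number of vertices but answers "is there an edge between u and v?" in constant time.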

### Literature

• T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein. Introduction to algorithms, 3rd ed. MIT Press, 2009.
• David J. Eck. Introduction to programming using Java, 7th ed. Version 7.0, August 2014 (Version 7.0.2, with mostly typographical corrections, December 2016). Web: http://math.hws.edu/javanotes/
• Cay S. Horstmann. Computing concepts with Java essentials, 3rd ed. Wiley, 2003.
• R. Ramakrishnan, J. Gehrke. Database management systems, 3rd ed. McGraw Hill, 2002.

Last update: 27 July 2018