R Code Master Thesis
Data Scientists spend a lot of their time (up to 80%) to bring the data into the right form. This crucial step is absolutely necessary, otherwise a data analysis cannot be performed. The Swiss Household Panel, which is the dataset I used for my empirical analysis in my Master thesis, is a good example to illustrate my experience with data processing. For instance, I had to:
- re-name & select the variables (derived from the theoretical models that I used),
- transform the data from wide- into long-format (and vice versa) in order to distinguish the different family members of a household,
- deal with missing values, but also
- create new variables based on already existing variables.
As these points illustrate, a considerable part of my Master thesis consisted - among other things - putting this uncleaned dataset into a form that was adapted for the use of the quantitative methods I had selected.
DOWNLOAD MY CODE