Have you ever read a massive report with lots of figures and wondered how the authors chose what to analyze? (For an example, see Education For All 2000-2015: achievements and challenges.) I admit I had never pondered this question; when a report is over 500 pages, I assume that everything that could be analyzed was analyzed. My thinking on this topic changed when I began to analyze the results of a 380-question survey. (Don’t worry, no survey respondent answered all those questions; they were filtered based on previous answers.) The survey data analysis is part of an evaluation of the Nal’ibali reading supplements, which I discussed in my last post. Quick recap: Nal’ibali’s mission is to spark children’s potential through storytelling and reading. It produces multilingual reading materials to encourage reading in all of South Africa’s national languages.
When reviewing the results of the survey, I quickly learned that answering one question leads to more questions and further analysis. For example, all the demographic information we collected had to be compared to national census data to determine how different or similar respondents were to the general population of South Africa. We also received feedback from Nal’ibali about which parts of our preliminary findings they wanted us to investigate further. For example, survey respondents were asked to rate their agreement with the statement, “It is more important to learn to read in English than my home language.” The results were not what the funder expected. (Sorry, I can’t tell you what the results were; that’s for the funder to know. If you really want to know, you will have to pay JET Education Services, where I intern, to conduct another survey for you!) As a result, Nal’ibali asked us to analyze the question further, looking at how responses differed by respondents’ home language (first language/mother tongue) and province. This deeper analysis will produce valuable information for Nal’ibali, but it also means that one question out of the 380 in the survey will have been analyzed three different ways. If we did that for every question, we could easily produce a report hundreds of pages long.
However, the current page limit for the report is 100 pages, and the survey wasn’t the only part of the evaluation. We also conducted interviews with reading supplement distributors and focus groups with reading supplement users, and those results need to be integrated into the findings of the report. Therefore, we are either going to (1) make some tough choices about what to include in the report or (2) include a whole lot of annexures. Option two is more likely, as Nal’ibali seems incredibly interested in all the minutiae of the data. On Friday, other members of JET and I presented preliminary findings at Nal’ibali’s office for two hours, followed by an hour of discussion, and everyone at the meeting (over a dozen Nal’ibali staff) was highly engaged and inquisitive throughout the three hours. As one of the primary report writers, it was great to see the funder so engaged with our work, but it also means I have a lot more work to complete!
Data crunching has been my sole focus at work for the past two weeks and will continue to consume all my working hours through the end of this week, when our draft report is due to Nal’ibali. I am incredibly grateful for my previous work experience using Excel and Stata; without it, my progress on this assignment would be a whole lot slower. To future IEDP students: learning computer software programs may not seem cool, but the skills will definitely help to impress your bosses!
As a reward for getting through this post about data analysis, here are some pictures from the hikes I have been taking while not crunching data.



