Analysis involves finding patterns and themes in the data you have collected for your evaluation to make sense of it. Analysing your data will help you report on it effectively and use it to make decisions.
You may have started your evaluation with questions you wanted to answer – for example, have we achieved our intended outcomes, or have we reached the individuals and organisations that we expected to? Analysis will help you to answer these questions.
Quantitative data is numerical – for example, responses to multiple choice or rating scale questions in a questionnaire. Analysing quantitative data will help you generate findings on how much change has occurred as a result of your work and who has experienced change.
Things you'll need
- your quantitative data
- your evaluation framework or theory of change
- your evaluation questions
- some form of software to manage the data.
Prepare your data
Start by making sure your data is in a format you can analyse. If you have paper forms or questionnaires, you will need to enter these into a spreadsheet or database. Microsoft Excel or equivalent is good enough for most purposes. Some survey tools such as SurveyMonkey and Google Forms export easily into Excel or CSV formats.
Next, clean your data. You need to:
- remove any completely blank responses
- remove any duplicates
- remove any obvious errors – for example, someone ticking two boxes when they were asked to tick one.
You should also make sure that each of your variables is in the right number format, especially if you are using Excel or statistical software. Make sure that dates are formatted as dates, numbers as numbers, amounts of money as currency and so on.
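If you prefer to clean your data with a script rather than by hand, the steps above could look something like the following Python sketch. The rows, column names (`id`, `course`, `rating`) and the 1-5 rating scale are made up for illustration; adapt them to your own export.

```python
# Hypothetical survey rows, as they might look after export to CSV.
rows = [
    {"id": "1", "course": "Good data analysis", "rating": "4"},
    {"id": "2", "course": "", "rating": ""},                        # blank response
    {"id": "1", "course": "Good data analysis", "rating": "4"},     # duplicate
    {"id": "3", "course": "Good data reporting", "rating": "4,5"},  # two boxes ticked
    {"id": "4", "course": "Defining your outcomes", "rating": "5"},
]

VALID_RATINGS = {"1", "2", "3", "4", "5"}  # assumed rating scale


def clean(rows):
    """Drop blank, duplicate and invalid responses; store ratings as numbers."""
    seen = set()
    cleaned = []
    for row in rows:
        course = row["course"].strip()
        rating = row["rating"].strip()
        if not course and not rating:
            continue  # completely blank response
        if row["id"] in seen:
            continue  # duplicate respondent
        if rating not in VALID_RATINGS:
            continue  # obvious error, for example two boxes ticked
        seen.add(row["id"])
        cleaned.append({"id": row["id"], "course": course, "rating": int(rating)})
    return cleaned


cleaned = clean(rows)
for row in cleaned:
    print(row)
```

This sketch drops the whole response when one answer is invalid. Depending on your survey, you may prefer to blank only the faulty answer and keep the rest.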
Decide what statistics to use
Statistics help to organise and understand numerical data so you can present it clearly.
Your evaluation framework will help you decide which statistics to use. For example, you may want to report on the proportion of people who have experienced an outcome (percentage) or the type of people who have benefitted most and least from your work (cross-tabulation).
Descriptive statistics help to illustrate and summarise data. Inferential statistics, such as regression analysis, help to understand connections between variables, decide whether something could have happened by chance or generalise beyond your sample. We concentrate on descriptive statistics here.
Frequencies and percentages
Frequency is how often something happens. You might use frequency tables to demonstrate how often your services have been accessed or how many campaigning activities you have delivered.
You can often present frequencies clearly by expressing them as percentages of a total. Here is a simple frequency table showing attendance at training courses, with numbers and percentages.
Which of the following courses did you attend?

Answer options | Response % | Response count |
---|---|---|
Defining your outcomes | 29.7 | 52 |
Collecting good outcome data | 22.3 | 39 |
Good data analysis | 23.4 | 41 |
Good data reporting | 24.6 | 43 |
Answered question | | 175 |
Tips for using percentages
- Don’t use percentages when presenting data from small samples as it’s easy to be misleading. As a rule of thumb, avoid percentages for samples of fewer than 50. You can use them for samples of 50-100 but don’t draw firm conclusions based on small differences in your data. Learn more about sampling
- Make sure you refer to the correct number of respondents when calculating percentages. For example, if you’re analysing data from a survey, use the number of people who have responded to a specific question to calculate percentages in the data for that question (not the number of people responding to the whole survey).
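As a rough illustration, here is one way to build a frequency table with percentages in Python. The counts are taken from the course attendance table above; the variable names are illustrative. The percentage base is the number of people who answered this question, not the number who took the survey.

```python
from collections import Counter

# One entry per response, using the counts from the course attendance table.
responses = (
    ["Defining your outcomes"] * 52
    + ["Collecting good outcome data"] * 39
    + ["Good data analysis"] * 41
    + ["Good data reporting"] * 43
)

counts = Counter(responses)
base = sum(counts.values())  # number of people who answered this question

# Map each answer option to (count, percentage of the base).
table = {
    option: (count, round(100 * count / base, 1))
    for option, count in counts.items()
}

for option, (count, pct) in table.items():
    print(f"{option} | {pct} | {count}")
print(f"Answered question | | {base}")
```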
Measuring how much change there has been
If you have asked the same questions before and after your intervention, you can compare responses to find out how much change individuals or organisations have experienced. For each respondent, subtract their ‘before’ score from their ‘after’ score. Then put data from all your respondents into a frequency table. You can then work out the average change for your whole group or for sub-groups, or what percentage of respondents experienced positive or negative change.
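The calculation described above can be sketched in Python as follows. The respondents and their before and after ratings are invented for illustration, and the sample is kept tiny so the arithmetic is easy to follow.

```python
from collections import Counter
from statistics import mean

# Hypothetical before/after confidence ratings (1-5) for six respondents.
before = {"A": 2, "B": 3, "C": 4, "D": 1, "E": 3, "F": 5}
after = {"A": 4, "B": 3, "C": 3, "D": 3, "E": 4, "F": 5}

# Only respondents with both a 'before' and an 'after' score can be compared.
matched = sorted(before.keys() & after.keys())
change = {person: after[person] - before[person] for person in matched}

frequency = Counter(change.values())  # how many people changed by each amount
average_change = mean(change.values())
improved = sum(1 for c in change.values() if c > 0)
declined = sum(1 for c in change.values() if c < 0)

for amount in sorted(frequency):
    print(f"change of {amount:+d}: {frequency[amount]} respondent(s)")
print(f"average change: {average_change:.2f}")
print(f"improved: {improved} of {len(matched)}, declined: {declined} of {len(matched)}")
```

With a sample this small you would report counts ('3 of 6 improved') rather than percentages.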
Comparing who has experienced change
Cross-tabulation is a way of comparing results for different types of people or organisation you have worked with. For example, if you want to know if your intervention is more effective for people who are unemployed or in employment, or for different ethnic groups, you could use cross-tabulation to compare their experiences. You could also use it to compare how people rated different interventions or different aspects of an intervention. You can do this using pivot tables in Microsoft Excel. Here is an example of a cross-tabulation:
How would you rate the course?

Role in organisation | Excellent | Good | Fair | Poor | Response count |
---|---|---|---|---|---|
Frontline worker | 35% | 52% | 9% | 4% | 170 |
Manager | 31% | 49% | 15% | 5% | 170 |
Total | 33% | 50% | 12% | 5% | 340 |
Answered question | | | | | 340 |
From this, you can see that frontline workers rated the course more positively than managers: 87% of frontline workers rated it excellent or good, compared with 80% of managers.
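If you are not using pivot tables, a cross-tabulation can also be built with a short script. The Python sketch below is one possible approach. The table above only gives percentages, so the underlying counts here are assumptions, chosen so that they round to the same percentages.

```python
from collections import Counter

# Assumed counts per cell: not from the source, chosen to match its percentages.
assumed_counts = {
    "Frontline worker": {"Excellent": 60, "Good": 88, "Fair": 15, "Poor": 7},
    "Manager": {"Excellent": 53, "Good": 83, "Fair": 25, "Poor": 9},
}
ratings = ["Excellent", "Good", "Fair", "Poor"]

# Expand into one (role, rating) pair per respondent, as raw survey data would be.
responses = [
    (role, rating)
    for role, cells in assumed_counts.items()
    for rating, n in cells.items()
    for _ in range(n)
]

cell_counts = Counter(responses)                     # count per (role, rating)
row_totals = Counter(role for role, _ in responses)  # response count per role
overall = Counter(rating for _, rating in responses) # count per rating, all roles

# Row percentages: each cell as a share of that role's responses.
crosstab = {
    role: {r: round(100 * cell_counts[(role, r)] / total) for r in ratings}
    for role, total in row_totals.items()
}
total_row = {r: round(100 * overall[r] / len(responses)) for r in ratings}

for role, row in crosstab.items():
    print(role, "|", " | ".join(f"{row[r]}%" for r in ratings), "|", row_totals[role])
print("Total |", " | ".join(f"{total_row[r]}%" for r in ratings), "|", len(responses))
```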
Averages
Averages are used to summarise a whole data set in a single number which represents the middle of the distribution. They can be used to report on the average experience of the individuals or organisations you have worked with. There are three main types of average.
Mean
The mean is what we normally mean when we say ‘average’. If you have used a rating scale with ratings of 1-5 or 1-10 (for example, to understand levels of wellbeing or confidence), the mean is the most useful average to use.
To find the mean, add up all the values and divide by the number of responses. So, in the string of numbers 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5, the mean would be 44 divided by 15, which is 2.9.
The mean is less helpful if your data is skewed (if the top or bottom values have a higher frequency than the middle value). It’s particularly unhelpful if your data has outliers (values far above or below the bulk of values in the data set). Imagine what would happen to the mean if our list of numbers was 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 670.
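To see the effect of an outlier for yourself, here is a small Python sketch using the two lists of numbers from the text. The variable names are illustrative.

```python
from statistics import mean

ratings = [1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5]
with_outlier = ratings[:-1] + [670]  # same list, but the last 5 becomes 670

print(round(mean(ratings), 1))       # 2.9  (44 divided by 15)
print(round(mean(with_outlier), 1))  # 47.3 (709 divided by 15)
```

A single extreme value moves the mean well above every other value in the list.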
Median
The median is the value in the middle of a data set arranged from smallest to largest. It may be more helpful than the mean if your data is skewed. The median is commonly used when reporting income or wealth as the data tends to be highly skewed, with a few very high salaries at the top.
In our string of numbers 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5, the median is 3. Even if the top values are massive outliers, the median is still 3. If you have an even number of values, add the two middle values together and divide by two.
If your data is skewed, you may also want to report quartiles or percentiles. For the bottom quartile, you use the value that is one quarter of the way through the data set, arranged smallest to largest. For the top quartile, you use the value three quarters of the way through. Percentiles work in the same way.
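The following Python sketch works out the median and quartiles for the same list of numbers. It uses `statistics.quantiles`, which needs Python 3.8 or later. Spreadsheets and statistical packages use slightly different conventions for quartiles, so treat the exact quartile values as one reasonable answer rather than the only one.

```python
from statistics import median, quantiles

ratings = [1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5]
with_outlier = ratings[:-1] + [670]  # same list with one extreme value

print(median(ratings))       # 3
print(median(with_outlier))  # 3, the outlier does not move the median
print(median([1, 2, 3, 4]))  # 2.5, the two middle values added and halved

# n=4 splits the data into quarters and returns the three cut points.
bottom_quartile, middle, top_quartile = quantiles(ratings, n=4)
print(bottom_quartile, middle, top_quartile)
```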
Mode
The mode is the value in a data set that occurs most frequently – it’s useful to report on what most individuals or organisations you worked with have experienced. So in the group of numbers 1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5 the mode would be 1. The main disadvantage of using the mode is that there might be two modes in the same data set.
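In Python, the standard library can find the mode and also tell you when there is more than one. This is a minimal sketch; the second list is made up to show a data set with two modes. `statistics.multimode` needs Python 3.8 or later.

```python
from statistics import mode, multimode

ratings = [1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5]
print(mode(ratings))       # 1
print(multimode(ratings))  # [1], a single mode

two_modes = [1, 1, 2, 2, 3]  # hypothetical data with two equally common values
print(multimode(two_modes))  # [1, 2]
```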
Measures of dispersion
These are single numbers that tell you how much variation there is in your data set. They are usually used alongside an average to give a summary of the data set.
Range
The range is simply the difference between the smallest and largest value in your data set. You subtract the smallest number from the largest to get the range.
You can also use an inter-quartile range to tell you about the distribution of the middle 50% of values in your data set. This is where you subtract the bottom quartile from the top quartile.
Standard deviation
The standard deviation indicates roughly how far, on average, the values in the data set are from the mean. It shows how well the mean represents a data set: the higher the standard deviation, the more dispersed the data set is and the less well the mean summarises it. Excel and statistical software packages will calculate standard deviation for you.
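As an illustration, the measures of dispersion described above can be calculated in Python like this, using the same list of numbers as in the averages section. Two versions of the standard deviation are shown because software usually offers both: one for a sample (`stdev`, like Excel's STDEV.S) and one for a whole population (`pstdev`, like Excel's STDEV.P). Which one fits depends on whether your respondents are a sample of a wider group.

```python
from statistics import pstdev, quantiles, stdev

ratings = [1, 1, 1, 1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5]

# Range: largest value minus smallest value.
value_range = max(ratings) - min(ratings)

# Inter-quartile range: top quartile minus bottom quartile.
bottom_quartile, _, top_quartile = quantiles(ratings, n=4)
inter_quartile_range = top_quartile - bottom_quartile

sample_sd = stdev(ratings)       # treats the data as a sample
population_sd = pstdev(ratings)  # treats the data as the whole population

print(value_range, inter_quartile_range)
print(round(sample_sd, 2), round(population_sd, 2))
```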
Decide how to present your data
Present your quantitative data clearly and succinctly to make it easy to understand.
Try combining categories
For some data sets, you can combine categories for simplicity. For example, if you have used an agree/disagree rating scale, you may want to combine ‘strongly agree’ and ‘agree’ into a single category (unless this would lose important detail).
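One way to combine categories in a script is to map each original answer to a broader group before counting. The responses and category labels in this Python sketch are hypothetical.

```python
from collections import Counter

# Hypothetical responses on a five-point agree/disagree scale.
responses = [
    "Strongly agree", "Agree", "Agree", "Neither agree nor disagree",
    "Disagree", "Strongly disagree", "Agree", "Strongly agree",
]

# Map each original category to the combined category used for reporting.
combine = {
    "Strongly agree": "Agree",
    "Agree": "Agree",
    "Neither agree nor disagree": "Neither",
    "Disagree": "Disagree",
    "Strongly disagree": "Disagree",
}

original = Counter(responses)
combined = Counter(combine[answer] for answer in responses)

print(dict(original))
print(dict(combined))
```

Keeping the original counts alongside the combined ones makes it easy to check whether combining has hidden an important detail.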
Consider tables and charts
Presenting your data in a table or chart emphasises its importance, so use tables or charts for the data that’s most important for people using your evaluation to understand. Tables and charts also help readers understand complex findings.
It’s important to choose the right chart for your data. Here are some initial suggestions.
What do you want to present? | Which chart to use |
---|---|
Comparison of two or more categories | Bar chart, circle chart |
Binary data (eg yes/no responses) | Pie chart, bar chart |
Change over time | Line graph, area graph |
Correlation between two variables | Scatter graph |
Frequency | Bar chart |
Percentages | Bar chart, pie chart |
Dispersion (how spread out your data set is) | Box and whisker plot, bar chart |
Ann K Emery’s Essentials website compares types of chart and is helpful for deciding which to use.
Be clear and transparent
Always report your sample base. This is the number of respondents that answered a particular question or the number of people in your sample (sometimes called n). Whenever you report a percentage, it’s good practice to report the sample base to make clear to the reader how many people you are talking about. For example, ‘80% of participants who completed the survey (n=250) said that they were more confident after the training course’.
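If you generate findings with a script, you could build the sample base into every statement. The helper below is a hypothetical Python sketch, not a standard function. It also applies the rule of thumb from earlier in this guide by reporting counts instead of percentages when the base is under 50.

```python
def describe(count, base, finding, small_sample=50):
    """Report a finding with its sample base; use counts for small samples."""
    if base <= 0:
        raise ValueError("the sample base must be greater than zero")
    if base < small_sample:
        return f"{count} of {base} participants {finding}"
    percentage = round(100 * count / base)
    return f"{percentage}% of participants (n={base}) {finding}"


large = describe(200, 250, "said that they were more confident")
small = describe(12, 20, "said that they were more confident")
print(large)
print(small)
```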
Report any limitations. If your sample is small or biased in any way, or if you weren’t able to reach particular target groups, it’s important to report this. It is a strength of your analysis, rather than a weakness.
Think critically about your data
Once you have decided what statistics to use and have done the calculations, look again at your data to draw out key findings – don’t assume the data speaks for itself! Here are some things to consider:
- Is 80% (for example) good or bad? How do you know? You may be able to decide on this by comparing your data to the previous year’s data, or to other similar interventions.
- Are there any other patterns, themes or trends? For example, does one group consistently achieve more, or less, than other groups?
- Can you explain some of the less common responses? You may need some qualitative analysis to help you here.
- Is there anything in the data that has surprised you?
- Do you know anything about why some of the results are as they are? For example, can you link your percentages to qualitative data that explains why some people achieved an outcome while others did not?
Use your data analysis
Now you’re ready to bring together your data analysis into a report or other presentation format. Read our guides on writing an evaluation report and using creative reporting formats for evaluation.
Further information
This how-to was contributed by NCVO Charities Evaluation Services.
Contributors
- Callum Metcalfe
- Mark Barratt