Indicator RLC1: Social capital
| Definition | Levels of social capital |
| Dimension | Root causes |
| Sector | Local conditions (intermediate) |
| Components | RLC1_1: Community stability; RLC1_2: Can people be trusted? |
| Source | Various - see component details |
Component RLC1_1: Community stability
| Definition | Measure of the size of out-migration from an area |
| Source Numerator | 2001, 2001 Ethnic, 2003, 2005: Moved out of the area, 2001 Census |
| Source Denominator | 2001, 2001 Ethnic, 2003, 2005: Total Population, 2001 Census |
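As a minimal sketch, the component can be read as the ratio of the two Census counts above. The function name and the choice to return a plain proportion are assumptions; the source does not say whether the published value is a proportion, a percentage or a transformed score.

```python
def out_migration_rate(moved_out: int, total_population: int) -> float:
    """Share of the area's population that moved out (numerator / denominator).

    Hypothetical helper: the exact scaling used by the indicator is not
    stated in the source.
    """
    if total_population <= 0:
        raise ValueError("total_population must be positive")
    return moved_out / total_population


# Illustrative counts only, not taken from any real area.
example_rate = out_migration_rate(moved_out=50, total_population=1000)
```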
Component RLC1_2: Can people be trusted?
| Definition | Modelled estimate of proportion who trust their neighbours |
| Source | 2001, 2001 Ethnic: Health Survey for England 1998-2001. |
| | 2003: Health Survey for England 2001-2003. |
| | 2005: Health Survey for England 2003-2005. |
Additional details
In the absence of any suitable administrative or Census data, survey data were the only source of information available to construct an indicator of trust in neighbours. However, there are a number of problems associated with using survey data to produce Local Authority District (LAD) estimates: small or non-existent samples in some areas lead to large variances and unstable estimates, and particular sampling strategies introduce biases.
A great deal of work, particularly in the last twenty years, has gone into addressing these issues. Although a number of different approaches have been used, all the methods tend to fall somewhere on a continuum between using direct estimates, suitably weighted for sample design, and a modelling approach using local area covariates to estimate the indicator of interest. Some are based on only one or other of the methods. However, each method has its own particular problems. Direct estimates, weighted as necessary, are unbiased but may have large variances; modelled estimates, on the other hand, will have small variances but will be biased. Hence many estimates attempt to combine information from both in order to solve the common problem of minimising the Mean Square Error of the final estimate.
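The trade-off described above can be illustrated with a textbook composite estimator. This is a sketch only: the source does not say which combination rule any particular method uses, and the function names, the independence assumption and the example numbers are all assumptions made for illustration.

```python
def mean_square_error(bias: float, variance: float) -> float:
    """MSE of an estimator: squared bias plus variance."""
    return bias ** 2 + variance


def optimal_weight(direct_variance: float, model_mse: float) -> float:
    """Weight on the direct estimate that minimises the MSE of the blend.

    Assumes the direct estimate is unbiased and that the direct and
    modelled estimates are independent (illustrative assumptions).
    """
    return model_mse / (direct_variance + model_mse)


def composite_estimate(direct: float, model: float, weight: float) -> float:
    """Weighted blend of a direct and a modelled estimate."""
    return weight * direct + (1.0 - weight) * model


def composite_mse(direct_variance: float, model_mse: float, weight: float) -> float:
    """MSE of the blend under the same assumptions as optimal_weight."""
    return weight ** 2 * direct_variance + (1.0 - weight) ** 2 * model_mse


# Hypothetical numbers: a noisy unbiased direct estimate and a stable
# but biased modelled estimate.
direct_var = 0.04
model_error = mean_square_error(bias=0.1, variance=0.0)
w = optimal_weight(direct_var, model_error)
blend_error = composite_mse(direct_var, model_error, w)
```

Under these assumptions the blended estimate has a lower Mean Square Error than either input on its own, which is the motivation for combining the two sources.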
The method used in the HPI required that a well-fitted micro level model could be identified. It also assumed that the important ways in which a group may have been over-sampled in a survey sample can be captured by covariates available in the survey and at a small area level. It involved combining all surveys available for the required year with the necessary dependent and independent variables (e.g. socio-economic status, age, gender and ethnicity).
One year of the Health Survey for England was used and the indicator was based on the question 'Can people be trusted?'
Step 1
Using combined survey data, with LAD geocoding, a multi-level, variable intercepts, logistic model was run, with level one being the individual i, level two the primary sampling unit j and level three the LAD k. Covariates from within the survey, shown in lower case, and LAD level data, shown in upper case, were used to predict the individual level behaviour.
$$\operatorname{logit}(P_{ijk}) = X_{ijk} B + U_{jk} + V_k + E_{ijk}$$
Where P is a vector of probabilities associated with individual i in Primary Sampling Unit (PSU) j within LAD k, B is a vector of regression coefficients, X is a matrix of covariates associated with the individual and measured within the survey, U is a random vector of area effects associated with the PSU, V is a random vector of area effects associated with the LAD, and E is a vector of independent random 'noise' elements. The matrix of covariates included PSU area measures, based on aggregated individual level survey counts within the PSU. These covariates are given in the table below:
2001, 2001 Ethnic and 2003 - Proportion of the population who feel that most people can be trusted.
| | Covariate | Coefficient |
| | Constant | -0.455 |
| | Bangladeshi | 0.486 |
| | Black African | 0.606 |
| | Black Caribbean | 0.773 |
| | Chinese | 0.193 |
| | Indian | 0.436 |
| | Pakistani | 0.436 |
| Individual effects (x) | 20-24 years | 0.296 |
| | 25-29 years | 0.302 |
| | 30-34 years | 0.069 |
| | 35-39 years | 0.03 |
| | 40-44 years | 0.095 |
| | 45-49 years | 0.022 |
| | 50-54 years | 0.061 |
| | 55-59 years | 0.025 |
| | 60-64 years | -0.095 |
| | 65-69 years | -0.128 |
| | 70-74 years | 0.055 |
| | 75+ years | -0.008 |
| | Male | 0.074 |
| LAD Area Effects | Social class I, II and IIIA | 0.274 |
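To show how the coefficients in the table above feed into the model, the sketch below computes the fixed-effects probability for one hypothetical individual. Several things here are assumptions rather than facts from the source: that the ethnicity, age and sex covariates are 0/1 indicators with the omitted categories as the reference group, that the social class covariate is an area proportion between 0 and 1, and that the random effects are set to zero. Only a subset of the coefficients is copied in to keep the example short.

```python
import math
from typing import Mapping

# Subset of the fixed-effect coefficients from the table above.
COEFFICIENTS = {
    "Constant": -0.455,
    "Black Caribbean": 0.773,
    "20-24 years": 0.296,
    "65-69 years": -0.128,
    "Male": 0.074,
    "Social class I, II and IIIA": 0.274,
}


def inv_logit(eta: float) -> float:
    """Anti-logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def linear_predictor(covariates: Mapping[str, float]) -> float:
    """Constant plus the sum of coefficient * covariate value."""
    eta = COEFFICIENTS["Constant"]
    for name, value in covariates.items():
        eta += COEFFICIENTS[name] * value
    return eta


# Hypothetical individual: male, aged 20-24, living in an LAD where
# half the population is assumed to be in social class I, II or IIIA.
person = {"Male": 1.0, "20-24 years": 1.0, "Social class I, II and IIIA": 0.5}
eta = linear_predictor(person)
probability = inv_logit(eta)
```

With these assumed inputs the linear predictor is -0.455 + 0.074 + 0.296 + 0.274 x 0.5 = 0.052, which the anti-logit turns into a probability of roughly 0.51.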
Step 2
The fixed effects part of the model was then taken and applied to the matrix of small area covariates X held by SDRC for 100% of individuals and LADs across England, the random LAD area effect was added (where it was available for an LAD), and the anti-logit was applied. The probabilities were then summed and averaged over the LAD to produce a vector of synthetic LAD level estimates:
$$Y_k = \frac{1}{N_k} \sum_{i,j} \operatorname{anti-logit}\left(X_{ijk} B + V_k\right)$$
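The averaging in Step 2 can be sketched as follows. The function and argument names are hypothetical, the covariate rows are toy values, and treating a missing LAD random effect as zero is an assumption based on the phrase "where it was available".

```python
import math
from typing import Optional, Sequence


def inv_logit(eta: float) -> float:
    """Anti-logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def synthetic_lad_estimate(
    covariate_rows: Sequence[Sequence[float]],
    beta: Sequence[float],
    lad_effect: Optional[float] = None,
) -> float:
    """Average predicted probability over all individuals in one LAD.

    covariate_rows: one row of covariates per individual in the LAD.
    beta: fixed-effect coefficients, in the same order as each row.
    lad_effect: random LAD effect, or None if not available
        (treated as zero here, which is an assumption).
    """
    if not covariate_rows:
        raise ValueError("the LAD must contain at least one individual")
    offset = 0.0 if lad_effect is None else lad_effect
    total = 0.0
    for row in covariate_rows:
        eta = sum(x * b for x, b in zip(row, beta)) + offset
        total += inv_logit(eta)
    return total / len(covariate_rows)


# Toy LAD with two individuals: columns are (constant, male).
rows = [[1.0, 0.0], [1.0, 1.0]]
estimate = synthetic_lad_estimate(rows, beta=[-0.455, 0.074])
```

Because the anti-logit is applied before averaging, the result is the mean of individual probabilities rather than the anti-logit of a mean linear predictor; the two generally differ.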
This method does not use weighting to remove bias in the parameter estimators introduced by unequal selection probabilities in the survey sampling schemes. Instead, important characteristics of the sample are included in the model as covariates. The sample indicator variable S will therefore be unrelated to Y conditional on these covariates. In this case the sample can be viewed as uninformative and ignorable. There is little conflict in including these covariates because they are, by definition, predictors of Y and so should be included in the model. If they were not, the sample design would not bias the standard estimators of the parameters.
Included in our models are measures of non-manual social classes and a 'level' for the primary sampling unit. Together these will capture, to a great extent, the unequal selection probabilities associated with the sample design. Other variables such as age will ensure that where a question or measure was taken of only a particular age group in a specific survey year, the estimates will not be biased.


