The quality of administrative data for census variables: Strengths, limitations, and opportunities
This paper summarises the census transformation programme research to date on the quality of census attribute information derived from administrative (admin) data sources.
Summary of key points
Stats NZ’s census transformation programme has undertaken a series of investigations based on the 2013 Census looking at the ability of admin data sources to provide census-type information. These investigations aimed to assess the potential for linked admin data sources to meet information requirements for social and economic characteristics (attributes) measured in the census. This is important in the context of moving towards a greater use of admin data in the census. The results of this programme of work have already provided benefits as they were integral to the use of admin data to fill in missing data due to non-response in the 2018 Census.
This report summarises the attribute investigations to date. We collate the key findings from comparisons of admin data and census responses, where the assessments were based on a quality framework approach. Results are summarised in terms of accuracy, assessed by representativeness (does admin data include the right people or dwellings?) and errors of measurement (are the right things being measured?). The results are described at a high level, and general themes that have emerged are discussed. We found a broad spectrum of results across the variables considered.
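As a rough illustration of the two accuracy dimensions described above, the sketch below computes a coverage rate and an agreement rate for a single variable across linked records. It is not the method used in the report: the function name, the record layout, and the sample values are all hypothetical, and the report's quality framework covers considerably more than these two ratios.

```python
from typing import Dict, Optional, Tuple


def coverage_and_agreement(
    census: Dict[str, str],
    admin: Dict[str, Optional[str]],
) -> Tuple[float, float]:
    """Compare admin values with census responses for one variable.

    census: person id -> census response (treated as the benchmark here).
    admin:  person id -> admin-derived value, or None when no value exists.
    Returns (coverage, agreement), both as proportions between 0 and 1.
    """
    # Representativeness: share of census people who have an admin value.
    covered = [pid for pid in census if admin.get(pid) is not None]
    coverage = len(covered) / len(census) if census else 0.0

    # Measurement: among covered people, share whose admin value
    # matches the census response.
    matches = sum(1 for pid in covered if admin[pid] == census[pid])
    agreement = matches / len(covered) if covered else 0.0
    return coverage, agreement


# Hypothetical linked records for a labour force status variable.
census = {
    "p1": "employed",
    "p2": "unemployed",
    "p3": "employed",
    "p4": "not in labour force",
}
admin = {"p1": "employed", "p2": "employed", "p3": "employed"}

coverage, agreement = coverage_and_agreement(census, admin)
print(f"coverage={coverage:.2f}, agreement={agreement:.2f}")
```

In this toy example, three of the four census people have an admin value, and two of those three values match the census response. Treating the census response as the benchmark is itself a simplifying assumption, since census responses also contain error.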
The level of coverage for many of the variables that can be obtained from admin sources (sometimes combined with previous census data) reaches a similar level to that achieved by the full field enumeration census. Several of the admin-derived variables investigated were highly accurate. A number of other variables showed good potential for providing census-type information, subject to caveats in either coverage or measurement error. At the other extreme, around one-third of variables have limited, if any, admin data potential and, in the absence of new data sources, will continue to rely on survey collection.
This variation highlights the need to consider each census variable on its own merits, and to ensure the detail in each admin source is understood well enough to apply the data in the census context. Admin error structures are quite different from those of a field collection and may affect particular variable categories, specific age groups, or populations such as new migrants. Imputation or statistical models that combine admin and survey data will be needed to provide unbiased estimates.
The reasons for quality concerns are varied, and there may be potential for improving quality. Critical areas of focus for improvements in admin sources are the place of usual residence (particularly for young adults), family and household data, and iwi affiliation. The new Data and Statistics Bill, once passed into law, will facilitate the collection and use of administrative data for the benefit of the wider statistical system.
There is now a significant body of census-type information that can be collated from admin data for most of the New Zealand population. Next steps include the release of an experimental ‘Admin Population Census’ as an annual time series, progressively adding more census variables derived from admin sources to the admin-based resident population dataset. Our aim is to demonstrate the breadth of information available and to provide a focus for discussion with customers about the quality issues and benefits associated with an admin-based census.
ISBN 978-1-99-003232-5 (online)