Data integration guidelines

Stats NZ staff, secondees, and contractors use the Data integration guidelines to apply the Information privacy, security, and confidentiality policy to data integration processes.  

Context

Data integration is when data from separate data sources (designed and collected primarily without the intention of being used together) are linked together. Stats NZ is a world leader in integration of administrative data to enable analytics, while protecting the identities of individual people and organisations. This creates a rich resource for research to answer complex questions to improve outcomes for New Zealanders.

When two or more datasets are integrated, the risks of identification and adverse public reaction are likely to be greater, especially when the different data sources were designed and collected without the intention of being used together. This is because the data (which can be about individual people, households, and organisations) would generally have been collected for different purposes. Combining the data can create a very detailed picture, and people may be concerned about the extent and nature of information used for a research or statistical purpose they were unaware of when they supplied the data.

Stats NZ has two large integrated research databases of microdata. The Integrated Data Infrastructure (IDI) contains microdata about people and households, and the Longitudinal Business Database (LBD) complements the IDI with microdata about businesses.

Public willingness to provide information is central to achieving our goals and is enabled by the high level of trust and confidence in the way we secure this information. We must be mindful of public expectations about information privacy, security, and confidentiality in order to maintain this critical level of trust and confidence. We are committed to ensuring our policies and processes for the collection, use, storage, security and disposal of personal and other confidential information, and the technology we use to support our processes, not only comply with all relevant legislation and statistical principles and protocols, but also meet public expectations, and are effectively implemented.

At the same time we must aim to get maximum benefit for New Zealand from the data we manage. Integrating data is a way we achieve this.

You might also be interested in our policy and guidelines aimed at protecting against the potential harm posed by the most at-risk data we hold; see Related documents.

Principles for data integration

We will consider integrating data when all four of the following principles are met: 

  1. The public benefits of integration outweigh both privacy concerns about the use of data and risks to the integrity of the statistics system, the original source data, and/or other government activities. 
  2. Integrated data will only be used for statistical or research purposes. 
  3. Data integration will be conducted in an open and transparent manner. 
  4. Data will not be integrated when an explicit commitment has been made to respondents that prevents such action.
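
As an illustration only, the four principles can be read as an all-or-nothing checklist. The Python sketch below is not a Stats NZ tool: the class, field, and function names are hypothetical, and in practice each answer comes from a documented assessment, not from a simple flag.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationProposal:
    """Hypothetical record of how a proposal fares against the four principles."""
    benefits_outweigh_risks: bool        # principle 1
    statistical_or_research_only: bool   # principle 2
    open_and_transparent: bool           # principle 3
    no_conflicting_commitment: bool      # principle 4


def unmet_principles(proposal: IntegrationProposal) -> list[int]:
    """Return the numbers (1 to 4) of the principles the proposal does not meet."""
    checks = [
        proposal.benefits_outweigh_risks,
        proposal.statistical_or_research_only,
        proposal.open_and_transparent,
        proposal.no_conflicting_commitment,
    ]
    return [number for number, met in enumerate(checks, start=1) if not met]


def may_consider(proposal: IntegrationProposal) -> bool:
    """Integration is considered only when all four principles are met."""
    return not unmet_principles(proposal)
```

Returning the list of unmet principles, and not just a yes or no, keeps the reason for declining a proposal visible, which fits the principle of working in an open and transparent manner.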

Data integration process

Follow the process outlined below to ensure you comply with the data integration principles and the Information privacy, security, and confidentiality (IPSaC) policy. This includes data integration for the purpose of producing aggregated statistics, other statistical products, or microdata research.

Identify risks and benefits before beginning data integration

Identify benefits and risks before you start any integration project. You might need some carefully controlled exploratory integration in order to properly explore the value that may be derived. This should be clearly separate from any subsequent full-scale integration, which should be authorised separately.

Steps to take before you start the data integration:

1. Identify the potential benefit to New Zealand from the integration

Data integration can generate many benefits, but may also be perceived negatively, and can potentially generate real privacy issues (given the person supplying the information is likely to be unaware of the integration and statistical/research use or that such a detailed picture may have been created about them). Because of this, it is important to clearly identify, describe, and communicate both the benefits and risks that are associated with a proposed data integration.

Avoid general statements that cannot be used as a basis for evaluating the benefits of the proposed integration if it goes ahead.

2. Prepare a risk assessment, and confirm compliance with all relevant legislative and regulatory obligations for managing data integration risk, as part of the business case for the integration

Do a thorough assessment of the risks of identifying individuals, households, or organisations from the statistical or research process and the information released, and the procedures designed to protect against such a disclosure.

Note that all data collected directly by Stats NZ has been explicitly collected for statistical and research purposes defined in the Statistics Act 1975. Integrated data created by combining survey data to fulfil the same core statistical purpose is likely to create less risk than integrating data collected for different purposes. When one or more of the datasets we integrate comes from another organisation and is an administrative, survey, or commercial dataset, risk is likely to be greater because of the volume and nature of the data, and the fact it was collected for a different purpose than the proposed research or statistical purpose.

In order to protect against unauthorised disclosure of confidential information by any reasonably foreseeable means, consider whether: 

  • confidential information can be adequately protected from unauthorised disclosure 
  • there could be harm to people, households, iwi, or organisations, even if they are not able to be identified from the disclosed information 
  • adding different types of data could create public concerns about the extent and nature of all personal information, creating a picture of a person’s life that over time is too intrusive 
  • adding different types of data could increase the risk of re-identification of a person, household, iwi, or organisation 
  • trust and confidence in Stats NZ, agencies providing datasets, and government as a whole, could be diminished.

Focus your analysis of proposed data integration activities on the benefits to be gained, and identify all reasonably foreseeable harm and ways to protect against harm. Include your analysis in the overall proposal for the data integration. This will ensure that potential benefits and risks are considered at the outset.

Use the following templates and documents (see Guidelines and procedures to access them):

  • Brief privacy and confidentiality impact analysis template – a streamlined version of a full privacy and confidentiality impact assessment.
  • Privacy and confidentiality impact assessment guidance 
  • Full Privacy and Confidentiality Impact Assessment Template – use this if your brief analysis identifies risks that may not be able to be adequately mitigated.

Use confidential information only for authorised purposes

Ensure confidential information is used only for authorised purposes.

3. Ensure integrated data is used only for statistical or research purposes

Only data that is for analytical, statistical, and research purposes should be integrated.

Ensure organisations that collected the data understand this, and that individuals, households, and organisations will not be able to be identified. Outputs may be used to inform policy or operational decisions, but must not be used to target individual people or organisations (for example, they can’t be used for case management), unless these individuals have given consent.

Document criteria for access, and any special considerations specific to the proposed data integration. Consider if some people, or groups of people, may object to the way confidential information is shared or used in the data integration. Consider how concerns might affect Stats NZ’s reputation and that of the source data provider.

4. Apply normal release practices and microdata access practices to integrated data

Datasets created through integration are not exempt from normal practices covering the release of aggregate statistical outputs and microdata access (see Microdata access guidelines).

Be transparent and inform people on how their data will be used and provide access to their data if possible

5. Ensure data integration is conducted in an open and transparent manner

Be transparent about data integration activities and decisions, and proactive about telling people why and how we integrate data. The minimum requirement is to publish information about each data integration on our website, including information about our risk assessment.

We provide access to people who request their own information where it is readily retrievable and where we can be certain the information belongs to them. The nature of data integration may mean we cannot retrieve the information and/or be certain that the information belongs to them. In these cases we direct the person to request access to their information from the source agencies, where the information is likely to be more easily accessed.

6. Find out how the original information was collected, and what people were told about how it would be used

Consider the context in which the information was originally collected and what people were told at the time about how information was intended to be used (for both our own datasets and those of other organisations).

When confidential information is provided to us by other organisations, we should ask those organisations to inform people their information may be shared with us, and that it will only be used for analytical, statistical, or research purposes.

Often we propose to use data for a purpose that would not have been foreseen at the time of collection. This should not preclude us from undertaking the data integration, provided we carefully consider how people may view the proposed use.

Data integration approval

Follow this process to give approval for integrating datasets either into or outside the IDI or LBD.

7. Approval for integrating datasets INTO the IDI or LBD

We prioritise all proposals to integrate new datasets into the IDI or LBD, and then ensure they are supported by a business case and a privacy and confidentiality impact assessment.

The assessments for each proposed addition to the IDI or LBD are reviewed by the senior advisor, strategy, performance and privacy (who may consult the Office of the Privacy Commissioner) before the proposal is put to the Government Statistician or the person delegated to approve.

Ensure the Microdata access guidelines are followed and applied for access to the integrated data.

8. Approval for integrating datasets OUTSIDE the IDI or LBD

Ensure all proposals to integrate datasets outside the IDI or LBD include the Brief Privacy and Confidentiality Impact Assessment, a full Privacy and Confidentiality Impact Assessment, or both.

Send the assessment to the senior advisor, Strategy, Performance and Privacy (who may consult with the Office of the Privacy Commissioner) for review before the proposal is put to the Government Statistician or delegate to approve.

Ensure the Microdata access guidelines are followed if microdata access is requested as part of the integration.

Stakeholder consultation

9. Consultation on the benefits and risks

Ensure consultation on the benefits and risks of the proposed data integration occurs and the results and conclusions are documented. Consultation may include the following stakeholders:

  • the general public (for example by providing information on the Stats NZ website) 
  • external data providers and their relationship managers (when integration involves administrative, commercial or survey data from external providers)
  • appropriate subject matter areas and data custodians responsible for data collected directly by Stats NZ
  • related or affected subject matter areas
  • the appropriate Deputy Government Statisticians and senior managers
  • senior advisor, Strategy, Performance and Privacy
  • Office of the Privacy Commissioner
  • Stats NZ's respondent advocate
  • Statistical Methods (to determine whether appropriate confidentialisation and integration techniques can be implemented).


Definitions

anonymized
Term most commonly used to refer to data from which direct identifiers have been removed (de-identified data) but is sometimes used to refer to confidentialised data. It is not a term used in these guidelines.

availability
Ensuring authorised users, including staff, contractors, and researchers, can access data and information for authorised purposes at the time they need to do so.

confidential information
Data and information about a person, household, iwi, or organisation that we should not disclose to people who are not authorised to have access to it. Confidential information may be obtained from respondents, other organisations, customers, staff, or other people we deal with. Confidential information also includes embargoed releases and Stats NZ operational information that is not already publicly available.

Note: ‘confidential’ is a classification used by the New Zealand Government in its classification system for information pertaining to national security. Stats NZ does not hold or store any information classified confidential or any other information pertaining to national security, therefore we use the common English definition of confidential. For further information about the government information classification system, see Protective Security Requirements.

confidentialisation
The statistical methods used to protect against confidential information being disclosed to people who are not authorised to have access to it, in a way that could identify an individual, household or organisation. The statistical methods used provide a level of protection against identification that cannot be obtained from de-identification.

confidentiality
The protection of information provided by people and organisations to us and ensuring it is not disclosed or made available to people or organisations who are not authorised to access it. Authorisation should ideally be given by the person providing the information, but may also be through legislation.

data integration
The linking of data about the same person or organisation (or unit) from two or more unit record datasets, originally collected for different purposes.

de-identification
The process of removing information from microdata to reduce risk of spontaneous recognition. It typically includes removing names, exact dates of birth or death, and exact addresses.

information security
The measures put in place to protect against data and information being disclosed to unauthorised people or organisations, and to ensure appropriate availability and integrity of information.

Integrated Data Infrastructure (IDI)
Database containing de-identified people-centred microdata from a range of government agencies, Stats NZ surveys and non-government organisations.

integrity
Assurance about the accuracy and consistency of data and information and that it is authentic and complete. It includes assurance that data and information has been properly created and has not been tampered with, damaged, or subject to accidental or unauthorised changes.

Longitudinal Business Database (LBD)
Database containing microdata about businesses from Stats NZ surveys and a range of administrative data sources.

microdata
Data about individual people, organisations, households, or other units in a population.

personal information
Data and information about a person that we should not disclose to people who are not authorised to have access to it. It is a subset of confidential information.

privacy
The individual’s rights relating to control of the provision, use, and disclosure of information about themselves, commonly called their personal information.
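
To make the definitions of data integration and de-identification above concrete, here is a minimal Python sketch. It is illustrative only and does not describe how the IDI or LBD is built: the field names, the `link_id` key, and the toy records are all made up, and the linking shown is a simple exact match on a shared key. Removing direct identifiers in this way is de-identification only; as the definition of confidentialisation notes, it does not provide the level of protection that confidentialisation methods do.

```python
# Fields treated as direct identifiers, following the typical list in the
# de-identification definition (names, exact dates of birth or death, exact
# addresses). The field names themselves are hypothetical.
DIRECT_IDENTIFIERS = frozenset({"name", "date_of_birth", "date_of_death", "address"})


def link_records(left, right, key):
    """Link unit records from two datasets that share the same value of `key`.

    Exact-match linking only; records without a match in `right` are dropped.
    """
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            linked.append({**row, **match})
    return linked


def de_identify(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with direct identifier fields removed."""
    return [
        {field: value for field, value in row.items() if field not in identifiers}
        for row in records
    ]


# Made-up toy records, for illustration only.
survey = [
    {"link_id": 1, "name": "Person A", "date_of_birth": "1980-01-01", "region": "North"},
    {"link_id": 2, "name": "Person B", "date_of_birth": "1975-06-30", "region": "South"},
]
admin = [
    {"link_id": 1, "address": "1 Example Street", "income_band": "B"},
]

research_view = de_identify(link_records(survey, admin, "link_id"))
print(research_view)
```

The order matters in this sketch: the records are linked first, while the fields needed for matching are still present, and the direct identifiers are removed before anything is passed on for analysis.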


Responsibilities

Here is a summary of who is responsible for what under the data integration guidelines.

All Stats NZ staff, secondees, and contractors

  • Understand the principles, policies, and procedures relating to the security and management of confidential information.
  • Apply these as appropriate to their role.
  • Report breaches, incidents, and near misses to the security and privacy teams.

Chief digital officer

  • Fulfil the role of chief information security officer as defined in the New Zealand Information Security Manual (GCSB, 2016).
  • Develop a security strategy and security risk management programme. 
  • Maintain appropriate security measures to protect the information gathered, stored, and transmitted by Stats NZ. 
  • Manage and maintain organisation-wide information security policies. 
  • Manage and maintain certification and accreditation processes. 
  • Act as an escalation point on security-related matters.

Chief methodologist

  • Manage and maintain policies and standards relating to statistical confidentialisation.
  • Approve confidentialisation and/or de-identification procedures before information is released by subject matter areas.
  • Assist in managing confidentialisation-related breaches. 
  • Assess data integration proposals to ensure there are no major methodological concerns with the analysis proposed, and that confidentiality risks can be adequately mitigated.
  • Provide advice and training to subject matter areas on confidentialisation methods and practice.
  • Provide confidentialisation advice to partner organisations.

Chief privacy officer

  • Maintain and manage the information privacy, security, and confidentiality policy, and any other related policies. 
  • Act as final escalation point on privacy and other confidentiality-related matters.

Chief security officer

  • Act as final escalation point on security-related matters.

Deputy government statistician

  • Approve data integration that only uses data collected directly by Stats NZ.

General manager, customer support and development

  • Consider any wider, strategic implications for inclusion in the application and assessment advice for new datasets to be added into the IDI or LBD. 
  • Endorse that data integration applications for the IDI or LBD are ready for the Government Statistician’s approval.

Government statistician

  • Approve data integration proposals and escalated microdata access applications. 
  • Approve use of any exemptions under clauses 37A to 37F of the Statistics Act 1975, or delegate approval authority.

Information Privacy, Security, and Confidentiality (IPSaC) Governance Group

  • Provide governance oversight of privacy, security, and confidentiality policies. 
  • Agree policy implementation work programmes. 
  • Drive implementation of the work programmes.

Manager and data custodian responsible for releasing data

  • Undertake risk assessment, specify risks to be mitigated, and collaborate with Statistical Methods and data specialists to determine appropriate confidentialisation and de-identification techniques. Gain the approval of the chief methodologist for application of those techniques. 
  • Ensure analysts and researchers in their area are trained in how to apply the approved confidentialisation and/or de-identification procedures and that those procedures are applied to information prior to release.

Manager, information management

  • Advise and provide education about correct management, retention, and disposal of confidential information in accordance with the Public Records Act 2005 and approved disposal authorisations.

Manager, Integrated Data Infrastructure (IDI) System

  • Develop and apply guidelines and processes for data integration in the Integrated Data Infrastructure (IDI) system, and for assessing IDI integrations for approval.

Manager, microdata access

  • Develop and apply processes for assessing research and researchers to determine whether researchers and projects should be recommended for approval. 
  • Ensure only approved researchers working on approved projects have access to microdata, and only to the microdata approved for their project.
  • Ensure approved researchers undertake confidentiality training and sign all relevant documentation before being granted access to microdata.
  • Ensure all outputs from microdata access projects are confidentiality checked before they are released.

Respondent advocate

  • Provide a respondent perspective when policies and procedures relating to privacy and confidentiality are developed and implemented.

Security manager

  • Fulfil the role of information technology security manager (ITSM) as defined in the New Zealand Information Security Manual (GCSB, 2016).
  • Provide leadership, advice, and consultation on security related issues. 
  • Manage the implementation of security measures.
  • Lead the management of security breaches and incidents.
  • Lead security education and awareness activities.

Senior advisor, strategy, performance and privacy

  • Design and carry out approaches for implementing the information privacy, security, and confidentiality policy, including education and awareness activities.
  • Lead management of privacy-related breaches and incidents.
  • Lead management of confidentiality-related breaches and incidents.
  • Provide leadership, advice, and consultation on privacy and confidentiality related issues, including privacy and confidentiality impact assessments.
  • Consult with the Office of the Privacy Commissioner when required.

Senior manager, integrated data

  • Approve ad-hoc loads to the IDI, where:
    • the additional data is already covered under an existing business case and Privacy and Confidentiality Impact Assessment, or
    • the load is a one-off integration for feasibility purposes, with access limited to those undertaking the feasibility assessment (a business case and Privacy and Confidentiality Impact Assessment would be required to move beyond the feasibility stage).
  • Confirm the assessment advice for other new datasets to be added into the IDI or LBD is ready for executive level approval.

Senior managers and general managers

  • Consider any wider, strategic implications for their areas of data integrations outside IDI or LBD. 
  • Endorse that data integrations outside the IDI and LBD are ready for the Government Statistician’s approval.

The Confidentiality Network

  • Provide support and advice, and build capability, across Statistical Methods, Stats NZ, and the Official Statistics System in confidentiality methodologies and practices.


Related documents

Guidelines and procedures

Statistics NZ (2016). Brief privacy and confidentiality impact analysis template. Available from senior advisor, strategy, performance and privacy, email: info@stats.govt.nz.

Statistics NZ (2016). Full privacy and confidentiality impact assessment template. Available from senior advisor, strategy, performance and privacy, email: info@stats.govt.nz.

Statistics NZ (2016). Privacy and confidentiality impact assessment guidance. Available from senior advisor, strategy, performance and privacy, email: info@stats.govt.nz.

Statistics NZ (2016). Privacy, security, and confidentiality incident procedures. Available from security and privacy teams, email: info@stats.govt.nz.

Stats NZ (2017). Microdata access guidelines. Available from www.stats.govt.nz.

Stats NZ (2017). Privacy and confidentiality guidelines. Available from www.stats.govt.nz.

Other documents

Government Communications Security Bureau (2016). New Zealand information security manual (NZISM). Available from www.gcsb.govt.nz.

Protective security requirements. Available from www.protectivesecurity.govt.nz.

Statistics NZ (nd). Our privacy commitment (poster). Available from Stats NZ, email: info@stats.govt.nz.

Statistics NZ (nd). Security policies and standards. Available from Stats NZ, email: info@stats.govt.nz.

Statistics NZ (2007). Principles and protocols for producers of Tier 1 Statistics. Available from www.stats.govt.nz.

Statistics NZ (2013). Information and data management policy. Available from Stats NZ, email: info@stats.govt.nz.

Stats NZ (2017). Information privacy, security, and confidentiality policy. Available from www.stats.govt.nz.

United Nations (2014). UN fundamental principles for official statistics (Principle 6). Available from https://unstats.un.org.

Legislation

Official Information Act 1982. Available from www.legislation.govt.nz.

Privacy Act 1993. Available from www.legislation.govt.nz.

Public Records Act 2005. Available from www.legislation.govt.nz.

Statistics Act 1975. Available from www.legislation.govt.nz. 

Owner and review

The general manager of customer support and development is the owner of Data integration guidelines. The 2017 guidelines resulted from a review in 2016 and replace the 2012 Data integration policy. The guidelines will be reviewed annually.

Citation
Stats NZ (2017). Data integration guidelines. Retrieved from www.stats.govt.nz.   

ISBN 978-0-908350-99-5 (online)
Published 9 May 2017 
