MetroLab

Definitions and Data Classifications

06.20.23

This is a section of the Model Data Governance Policy & Practice Guide for Cities and Counties. Learn more about the report and find the other sections here.

Section Notes

Purposes. Agreed-upon definitions are key to any legal or policy regime. Definitions allow practitioners to classify technologies and standardize operations. A core set of definitions reflecting municipal uses of Data will be vital to standardizing practices across departments and jurisdictions. This Section seeks to establish definitions and Data classifications to standardize language and approaches to interdepartmental, inter-jurisdictional, and other external data sharing.

Prominent Challenges Addressed. The initial working group that led to the MetroLab Data Governance Task Force identified several scenarios, challenges, and considerations regarding “definitions” and “data classifications,” including:

Definitions

For purposes of this Policy, the following terms shall have the following respective meanings: 

Applicable Third Party: An individual or organization, other than a Jurisdiction employee, engaged by contract or otherwise working with or for the Jurisdiction in any one or more aspects of Data Handling. 

Chief Data Officer: A Jurisdiction employee designated by the Controlling Authority to perform the functions of a “Chief Data Officer” set forth in Section 5.

Community Advisory Board (sometimes herein referred to as the “CAB”): The group established and maintained to provide well-informed, timely, and independent advice to the Jurisdiction on significant Data Handling matters in accordance with Section 5 of this Policy.

Community End User Testing Group (sometimes herein referred to as the “CEUTG”): The group responsible for providing feedback regarding the use and accessibility of the Data resources, websites, applications, and other citizen interfaces, through an Open Data Program or otherwise, as described in Section 5.[1]

Controlling Authority: The individual(s), body, or other entity with the legal authority to make a decision on behalf of the Jurisdiction with regard to adopting a policy, designating an individual, body, or other entity to serve a function, or other significant matter described in this Guide.[2]

Convener: The person or institution designated to lead the administration of the Community Advisory Board as provided in Section 5.

Data: A subset of information, whether quantitative or qualitative, that is regularly used by, maintained by, created by or on behalf of, and possessed, owned, or licensed by the Jurisdiction in non-narrative, alphanumeric, or geospatial formats. Data are an asset independent of the systems or formats in which they reside.[3]

Data Governance: The policies, practices, and mechanisms adopted by a Jurisdiction to manage its Data Handling.

Data Governance Oversight Committee: The committee established and maintained as such in accordance with Section 5.

Data Governance Principles: The principles set forth in Section 2 of this Guide and such other principles regarding governance of Data Handling that the Jurisdiction adopts.

Data Governance System: The processes and procedures set forth as such in Section 5. 

Data Handling: The collection, creation, storage, use, transfer, dissemination, and disposal of Data; the use of Data Platforms; and related security, risk mitigation, and breach damage containment measures.

Data Intermediary: An individual or organization, other than an employee or unit of the Jurisdiction, that assists the Jurisdiction in collecting, storing, disseminating, communicating, analyzing, or disposing of Data sought for use or sharing by the Jurisdiction.[4]

Data Security Policy: The “Data Security Policy” described in Section 3.B.2.

Dataset: A collection of Data organized or formatted in a specific or prescribed way. Typically, a Dataset consists of one or more tables and is stored in a database or spreadsheet. Files of the following types are not Datasets: text documents, emails, messages, videos, recordings, image files such as designs, diagrams, drawings, photographs, and scans, and hard-copy records.

Data Platform: The methods, machinery, software, and related tools and systems utilized by the Jurisdiction or Applicable Third Parties to collect, store, use, or make public any Dataset, including, without limitation, those utilized in any Open Data Program.

De-Identify: To remove all Personally Identifiable Information from Data.[5]

Encrypted: Any Data format with content designed to be protected and accessible only by private parties specifically intended as an audience.

Machine-Readable: Any Data format in which a computer can read and process information.

Open Data: Data made open and freely available to all online in a Machine-Readable, open format that can be easily retrieved, downloaded, and reused utilizing readily available and free Web search applications and software.[6]

Open Data Program: A Jurisdiction program dedicated to making specific Datasets available as Open Data to the public, including, without limitation, programs that engage civic technologists, the research community, and other partners to make use of such Datasets in support of the program’s goals.[7]

Open Data Programs Manager: The Jurisdiction employee designated by the Controlling Authority to manage the Jurisdiction’s Open Data Programs and to perform the functions pertaining thereto described in Section 5.

Payment Card Industry (PCI) Data Security Standard: Standards adopted by the Payment Card Industry Security Standards Council to protect payment information for safe financial transactions.[8]

Personally Identifiable Information (“PII”): Information that identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular individual (sometimes shortened to “personal information”). Examples include but are not limited to:

Note: Cybersecurity insurance or other policies may have different definitions of PII that could impact policies and processes. 

Principal Data Handling Administrator: The Chief Data Officer or other individual designated by the Controlling Authority to be primarily responsible for oversight of adherence to this Policy.

Privacy Laws: All laws containing provisions for the protection of a person’s privacy by regulation of the collection, storage, use, and/or release of any PII of such person.

Public Disclosure Law(s): All open meetings, open records, public records, freedom of information, or similar laws pertaining to disclosure, notice, or other transparency requirements to which any Data Handling activities of the Jurisdiction are subject.

Re-Identify: To convert anonymized or De-Identified Data into PII.

Sensitive Data: Information that the Jurisdiction determines should be safeguarded and protected against unwarranted disclosure for legal or ethical reasons, for reasons pertaining to personal privacy, or for proprietary considerations, and includes, without limitation, PII.[9]

Unit Data Steward: The Jurisdiction employee designated by the Chief Data Officer as the person in a Jurisdiction agency or department responsible for performing the functions of a “Unit Data Steward” described in Section 5.
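As an illustration of the De-Identify definition above, the following is a minimal Python sketch that drops fields treated as PII from a single record. The field names, the `PII_FIELDS` set, and the `de_identify` function are hypothetical and are not part of this Guide. Removing direct identifiers alone may not satisfy the definition, because the remaining fields could still be linked back to an individual (see Re-Identify), so a Jurisdiction would need its own review of what counts as PII in each Dataset.

```python
from typing import Any

# Hypothetical field names treated as PII for this illustration only.
PII_FIELDS = frozenset({"name", "email", "ssn", "drivers_license"})


def de_identify(record: dict[str, Any],
                pii_fields: frozenset[str] = PII_FIELDS) -> dict[str, Any]:
    """Return a copy of the record without the listed PII fields.

    The original record is left unchanged.
    """
    return {key: value for key, value in record.items() if key not in pii_fields}


# Example record with made-up values.
record = {
    "name": "A. Resident",
    "email": "resident@example.org",
    "permit_type": "building",
    "ward": 4,
}
cleaned = de_identify(record)
print(cleaned)
```

In this sketch only the `permit_type` and `ward` fields remain in `cleaned`; whether those fields are safe to release is a separate classification question.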

Data Classifications Recommendations

Note: The Data classification recommendations in this subsection assume that the Jurisdiction’s Data Handling experience is fairly mature. An alternative set of recommendations for Jurisdictions with less mature Data Handling experience is presented in “Alternative Data Classifications for Less Mature Data Handling Systems.”

If Data is not already classified by a third party, cities and counties should establish Data classifications by level of sensitivity. Sensitivity levels inform data collection, retention, storage, dissemination, and disposal. Classifying data protects privacy, limits data misuse, maximizes data usage, and facilitates sharing of open data sets. The following suggested classifications could be established by rule or practice and incorporated into training, security measures, and data-related decision-making. Data classifications should be reviewed regularly and updated as necessary. These classifications will also inform the parameters of a local government’s Data Security Policy.

Level 0—Open  

Any Dataset regularly published in Machine-Readable format by the Jurisdiction or its Units on the Jurisdiction’s website, or otherwise treated as Open Data, is considered “Level 0—Open” unless the Jurisdiction or a Unit makes a proactive determination to raise the classification.

Level 1—Public, Not Proactively Released 

Data available for public access or release, not subject to any restrictions under any Public Disclosure Law or Privacy Law. 

Level 2—For Internal Government Use 

A Dataset that the Jurisdiction determines is subject to one or more Public Disclosure Law exemptions but is not highly sensitive, and that may be distributed within the Jurisdiction government without restriction by law, regulation, or contract. This level covers normal operating information that is not proactively released to the public. Viewing and use are intended for employees; the Data could be made available Jurisdiction-wide or to specific employees in a department, division, or business unit. Certain Data may be made available to external parties upon their request.

Level 3—Sensitive  

Data intended for release on a need-to-know basis. Data regulated by privacy laws or regulations or restricted by a regulatory agency or contract, grant, or other agreement terms and conditions. 

Level 4—Protected 

Data that triggers a requirement for notification to affected parties or public authorities in case of a security breach. 

Level 5—Restricted 

Data that poses a direct threat to human life or a risk of catastrophic loss of major assets and critical infrastructure (e.g., by triggering lengthy outages of critical processes or services for residents). Before classifying Data as Level 5 Restricted, consult leadership in your Unit and the Jurisdiction’s Chief Data Officer.
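The six levels above can be read as a decision procedure that checks the most restrictive criterion first. The following Python sketch shows one way to encode that reading. It is an illustration under stated assumptions, not the Guide's flowchart: the `DatasetFacts` fields, the `classify` function, and the rule that the highest matching level wins are all hypothetical simplifications of the level descriptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity levels named in the classification list above."""
    OPEN = 0        # Level 0: Open
    PUBLIC = 1      # Level 1: Public, Not Proactively Released
    INTERNAL = 2    # Level 2: For Internal Government Use
    SENSITIVE = 3   # Level 3: Sensitive
    PROTECTED = 4   # Level 4: Protected
    RESTRICTED = 5  # Level 5: Restricted


@dataclass
class DatasetFacts:
    """Hypothetical yes/no answers recorded for one Dataset."""
    published_as_open_data: bool = False
    disclosure_exemption_applies: bool = False
    regulated_or_contract_restricted: bool = False
    breach_notification_required: bool = False
    threat_to_life_or_critical_infrastructure: bool = False


def classify(facts: DatasetFacts) -> Level:
    """Return the highest level whose criterion is met.

    Checking from most to least restrictive is an assumption of this
    sketch; the source text does not prescribe an evaluation order.
    """
    if facts.threat_to_life_or_critical_infrastructure:
        return Level.RESTRICTED
    if facts.breach_notification_required:
        return Level.PROTECTED
    if facts.regulated_or_contract_restricted:
        return Level.SENSITIVE
    if facts.disclosure_exemption_applies:
        return Level.INTERNAL
    if facts.published_as_open_data:
        return Level.OPEN
    # No restriction applies and the Dataset is not proactively published.
    return Level.PUBLIC


example = classify(DatasetFacts(breach_notification_required=True))
print(example.name)
```

Using an `IntEnum` keeps the levels ordered, so a Dataset that combines several sources could take the maximum of their levels. A real implementation would still need the human review the text calls for, particularly before assigning Level 5.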

A Data classification flowchart and examples of each category of Data follow below.[10]

Data Classification Examples

Level 0 Open
• Open Data
• Public websites
• Press releases
• Job announcements
• Public reports
• Bid/contract/RFP announcements

Level 1 Public, Not Proactively Released
• Certain financial data and reports
• Health or building inspection information
• Notices about future construction projects
• Organizational charts
• Internal memos

Level 2 Internal City Government Use
• Employee phone directory
• Draft reports, memos, and meeting minutes
• Internal project documents
• Intranet
• Fuel consumption/fleet management data
• Learning management data
• Some financial data
• Some audio and video recordings
• License plate numbers

Level 3 Sensitive
• Personnel records (including employee name + employee number, performance appraisals)
• Personally identifiable information (PII) not triggering statutory notification requirements
• Certain public safety/criminal record data
• Sensitive Security Information (SSI)
• Physical security access logs
• Investigative data (e.g., related to citations, legal proceedings)
• Trade secrets/proprietary/commercially sensitive data
• Internal risk management and mitigation data
• Central property management information
• Browser history
• Privileged communications
• Biometric information

Level 4 Protected
• Social security number
• Driver’s license number
• State ID number
• Payment Card Industry (PCI) data and other customer financial information
• Protected health information (PHI)
• Password and PIN numbers
• Student records (FERPA)
• Federal tax information
• Some criminal justice information

Level 5 Restricted
• Certain network/infrastructure information
• Certain water infrastructure
• Some emergency response information
• Some data obtained from federal government

[1] Inspired by the Chicago Tech Collaborative’s Civic Design & User Testing initiative (“CUTGroup”); see https://www.citytech.org/resident-engagement.
[2] A jurisdiction may want to add to such a definition a provision for the possibility of duly authorized “designees.” For example, if the City determined that the primary authority for a decision or action normally assigned to the Controlling Authority should be the Mayor, the City Manager, or the City Council or similar body, but such Controlling Authority has discretion to delegate such authority, the definition could include language along the lines of “or the designee to which such authority duly assigned responsibility for the particular decision or action in question.”
[3] Based largely on the corresponding definition in the District of Columbia Data Policy, available at https://octo.dc.gov/page/district-columbia-data-policy.
[4] There are many examples of definitions of the term “Data Intermediary” in various contexts. See, e.g., Civic Switchboard Guide, “Defining a data intermediary,” at https://civic-switchboard.gitbook.io/guide/context-and-concepts/defining-a-data-intermediary; “How to know you are a ‘data intermediary’ under the Data Governance Act,” posted April 27, 2021 on the International Association of Privacy Professionals (IAPP) website at https://iapp.org/news/a/how-to-know-you-are-a-data-intermediary-under-the-data-governance-act/ (in the context of a then-proposed European Union regulation). The definition included in this Guide is for purposes of describing a role to be taken into account in the Data Governance recommendations offered herein.
[5] Some jurisdictions may want to adopt a more robust definition, such as the following from the California Consumer Privacy Act (“CCPA”): “Deidentified” means information that cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer, provided that a business that uses de-identified information: 1. Has implemented technical safeguards that prohibit reidentification of the consumer to whom the information may pertain. 2. Has implemented business processes that specifically prohibit reidentification of the information. 3. Has implemented business processes to prevent inadvertent release of de-identified information. 4. Makes no attempt to re-identify the information.
[6] Based on current Kansas City policy, Section 2-2130 KC, in Chapter 2 of its Code of Ordinances at https://library.municode.com/mo/kansas_city/ordinances/code_of_ordinances?nodeId=740307 (“KCMO Open Data Policy”).
[7] Based in part on the definition of “City of Seattle Data” in Seattle’s Open Data Policy V1.0 (Feb. 16, 2016), available at https://www.seattle.gov/Documents/Departments/SeattleGovPortals/CityServices/OpenDataPolicyV1.pdf (hereinafter “Seattle Open Data Policy”).
[8] See https://www.pcisecuritystandards.org/standards/.
[9] Based largely on the definition of Sensitive Data in University of North Carolina University Libraries, Data Security: Policies and Regulations Impacting Research Data: Definition, at https://guides.lib.unc.edu/datasecurity/definition#:~:text=Sensitive%20data%20are%20defined%20as,be%20protected%20against%20unwarranted%20disclosure
[10] Modeled after the Washington, D.C. approach at https://opendata.dc.gov/pages/data-policy#definitions and the San Francisco approach at https://sf.gov/sites/default/files/2021-05/DataClassificationStandard_FINAL_0.pdf