On a recent trip to Vegas I could not stop thinking about the Nevada city’s popular tourism slogan – ‘What happens in Vegas stays in Vegas’. If you think about it, this is really one rule in a data confidentiality classification scheme, and one that visitors should consider before posting to social networking sites. This got me thinking about how enterprises should revisit or establish such schemes.
With more enterprises adopting cloud and ingesting data from multiple private and public sources, I have seen a growing need for a classification scheme that addresses data availability. In the past I have always used a classification scheme for data integrity and confidentiality. But as cloud vendors offer more and more secure and fault-tolerant environments, I thought it would be good to illustrate an example classification scheme that includes not only integrity and confidentiality but also availability.
In terms of security requirements, not all data are equal. Some are freely available in the public domain, while others have varying degrees of sensitivity. An important aspect of data security is the evaluation of data to determine their appropriate security requirements. In order to make the process manageable, organizations often adopt a simple classification scheme.
Confidentiality
One means of classification addresses confidentiality, which can range from public to top secret. Let’s look at an example with some of my personal observations:
Classification | Description |
Top Secret | I am still seeing organizations leave these documents on-premises rather than deploying them to services such as Office 365 or a cloud vendor’s document management system. While trust in cloud security is growing, it does seem to have its limits. This category covers highly sensitive internal documents (e.g. pending mergers or acquisitions, investment strategies, plans or designs) that could seriously damage the organization if they were lost or made public. Information classified as Top Secret has very restricted distribution and must be protected at all times. |
Highly Confidential | Some of this information has made its way to the cloud, but businesses cannot delegate their responsibility for adhering to industry regulations (e.g. HIPAA, GDPR). This category covers information that, if made public or even shared around the organization, could seriously impede the organization’s operations and is considered critical to its ongoing operations. Examples include accounting information, business plans, sensitive customer information held by banks, solicitors and accountants, patients’ medical records and similar highly sensitive data. Such information should not be copied or removed from the organization’s operational control without specific authority. |
Proprietary | This category covers information of a proprietary nature: procedures, operational work routines, project plans, designs and specifications that define the way in which the organization operates. Many SaaS services support this category of information with appropriate encryption, access control and authorization, and they have therefore been widely adopted for it. |
Internal Use Only | Information not approved for general circulation outside the organization, where its loss would inconvenience the organization or management but where disclosure is unlikely to result in financial loss or serious damage to credibility. Examples include internal memos, minutes of meetings and internal project reports. Security at this level is controlled but normal. For this reason these documents are primarily in the cloud these days. |
Public | Information in the public domain that has been approved for public use, such as annual reports and press statements. This category of information has all but moved to the cloud as more and more enterprises move their marketing and web sites there. Because this information is in the public domain, security at this level is minimal. |
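The levels in the table above form an ordered scale, so one way to make them actionable is to encode them as an ordered enumeration and compare against a policy threshold. The following is a minimal sketch, not a definitive implementation: the names `Confidentiality`, `MAX_CLOUD_LEVEL` and `may_store_in_cloud` are hypothetical, and the threshold simply mirrors my observation that Top Secret documents tend to stay on-premises. Your own policy may draw the line elsewhere.

```python
from enum import IntEnum


class Confidentiality(IntEnum):
    """Confidentiality levels from the table, least to most sensitive."""
    PUBLIC = 0
    INTERNAL_USE_ONLY = 1
    PROPRIETARY = 2
    HIGHLY_CONFIDENTIAL = 3
    TOP_SECRET = 4


# Hypothetical policy threshold (an assumption for illustration): the most
# sensitive level this organization allows in cloud services.
MAX_CLOUD_LEVEL = Confidentiality.HIGHLY_CONFIDENTIAL


def may_store_in_cloud(level: Confidentiality,
                       max_cloud_level: Confidentiality = MAX_CLOUD_LEVEL) -> bool:
    """Return True if data at this level may be placed in the cloud."""
    return level <= max_cloud_level


if __name__ == "__main__":
    for level in Confidentiality:
        print(f"{level.name}: cloud allowed = {may_store_in_cloud(level)}")
```

Using an `IntEnum` keeps the ordering explicit, so a change in policy is a one-line change to the threshold rather than a rewrite of the checks.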
Integrity
In addition to confidentiality, data should also be classified in terms of integrity and availability. For integrity, one must consider the degree to which data is guaranteed to be accurate and correct. This includes the integrity of data in all three states (at rest, in motion, and while being processed).
Here is an example data integrity classification scheme:
Classification | Description |
Guaranteed | Used for data with non-repudiation requirements. Data is guaranteed to be correct and the identity of its creator, and that of any modifier, cannot be refuted. Complete change history is available and is also guaranteed. Trust relationships are at a personal / individual level. This classification is used when the integrity of information must meet the strictest qualifications, including litigation. Because of these functional requirements, information in this category tends to be housed in a document/content management system, whether on-premises or hosted by a cloud vendor. |
Verified | Information is verified to be correct in a manner that detects any corruption during transmission, processing, and storage. Access to information is controlled, and changes are audited. Audit logs are protected from tampering. Cloud vendors offer object storage services, which are a good fit here: they are secure and strongly consistent, and they maintain a high level of integrity because information is actively monitored using checksums and corrupt data is detected and automatically repaired. |
Audited | Write access is managed and all changes to persisted information are recorded. Changes will be detected, and can be corrected, unless audit logs are also compromised. Data integrity is therefore a function of the integrity of the systems, personnel, and networks that process and transmit data. Cloud vendors offer a variety of audit logs on which alerts can be defined. In addition, many cloud vendors have obtained certifications such as ISO/IEC 27001. |
Managed | Information updates are protected by access control. Data integrity is directly related to the integrity of the users given access and the systems and networks that process and transmit data. Data from departmental applications or data not directly tied to revenue generally fall into this category. Many cloud vendors have obtained certifications such as ISO/IEC 27001, but the business remains responsible for making sure only the relevant employees have access to the information. |
As-Is | Information open to anonymous creation and use. No guarantee of authenticity, accuracy, or correctness. Examples include public postings such as comments, open wikis, etc. This is a perfect fit for the cloud, as vendors offer highly available elastic storage as well as easy access (e.g. via REST APIs). |
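The checksum monitoring mentioned under the Verified level can be illustrated with a small sketch. It assumes SHA-256 as the digest (cloud object stores may use other algorithms) and uses hypothetical helper names `sha256_hex` and `is_intact`. The idea is simply to record a digest when data is written and to recompute and compare it when data is read back.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Compute the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_hex: str) -> bool:
    """Return True if the data still matches the digest recorded earlier."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_hex(data), expected_hex)


if __name__ == "__main__":
    original = b"minutes of the quarterly review"
    recorded = sha256_hex(original)           # stored alongside the object
    corrupted = original + b"!"               # simulate corruption in transit
    print(is_intact(original, recorded))      # unchanged data
    print(is_intact(corrupted, recorded))     # altered data
```

A checksum on its own only detects corruption. Automatic repair, as offered by some storage services, additionally requires redundant copies to restore from.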
Availability
Availability classifications address the ability to avoid loss of data via either malicious or accidental means. Availability is often applied to the system as a whole, including all hardware, software, networks, data, etc. required to ensure proper operations. These sample classifications have been worded to address data more specifically.
Classification | Description |
Fail-Safe | Systems are fault-tolerant across multiple sites to support geographic redundancy. Availability is ensured even in the event of a regional catastrophic event such as a natural disaster or terrorist attack. Data redundancy across multiple sites and offsite backup storage. Mechanisms actively detect and thwart DoS attacks. Many enterprises had neither the skill set nor the budget to implement this level of availability. In the cloud, however, the architecture to support these requirements is relatively affordable, with out-of-the-box DDoS protection, backup and archival services, and batch and real-time replication of data across regions. |
Fault-Tolerant | No single point of failure exists within the system environment. Redundant copies of data are maintained online. Mechanisms are in place to support rapid rollback of both intentional and accidental data changes. Stringent access restrictions all but eliminate the threat of malicious or accidental data loss. Mechanisms actively detect and thwart DoS attacks. Many cloud vendors today have multiple data centers in a single region, enabling businesses to build highly available applications and data stores. |
Managed | Service level agreements in place. Systems have been designed with the appropriate level of redundancy and resiliency to meet availability requirements and are managed accordingly. Access is controlled to minimize the likelihood of unavailability due to malicious or accidental activity. Some cloud vendors publish service level agreements for some of their services. Businesses will need to check their requirements against what the cloud vendor offers. |
Minimal | Data are managed in a controlled environment and regularly backed up. No SLAs established and no redundancy built into the system. While some cloud services offer high availability and redundancy, it is still the responsibility of the business to make sure backups are being performed. |
None | No guarantees or service level agreements in place. No steps taken to manage data or maximize availability. Backups may exist but are seldom verified. |
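Taken together, the three schemes give every data set a label with three dimensions. As a rough sketch of how such a label could be compared with what a cloud service offers, the example below models each scheme as an ordered enumeration. All names here (`DataClassification`, `service_satisfies`, and the sample values) are hypothetical, and the rule that a service must match or exceed each required level is my assumption about how one might apply the schemes.

```python
from dataclasses import dataclass
from enum import IntEnum


class Confidentiality(IntEnum):
    PUBLIC = 0
    INTERNAL_USE_ONLY = 1
    PROPRIETARY = 2
    HIGHLY_CONFIDENTIAL = 3
    TOP_SECRET = 4


class Integrity(IntEnum):
    AS_IS = 0
    MANAGED = 1
    AUDITED = 2
    VERIFIED = 3
    GUARANTEED = 4


class Availability(IntEnum):
    NONE = 0
    MINIMAL = 1
    MANAGED = 2
    FAULT_TOLERANT = 3
    FAIL_SAFE = 4


@dataclass(frozen=True)
class DataClassification:
    """A three-dimensional label: what the data needs, or what a service offers."""
    confidentiality: Confidentiality
    integrity: Integrity
    availability: Availability


def service_satisfies(required: DataClassification,
                      offered: DataClassification) -> bool:
    """Return True if the service meets or exceeds every required level."""
    return (offered.confidentiality >= required.confidentiality
            and offered.integrity >= required.integrity
            and offered.availability >= required.availability)


if __name__ == "__main__":
    # Hypothetical requirement and service profile, for illustration only.
    patient_records = DataClassification(
        Confidentiality.HIGHLY_CONFIDENTIAL, Integrity.VERIFIED,
        Availability.FAULT_TOLERANT)
    basic_file_share = DataClassification(
        Confidentiality.INTERNAL_USE_ONLY, Integrity.MANAGED,
        Availability.MINIMAL)
    print(service_satisfies(patient_records, basic_file_share))
```

Keeping the three dimensions separate matters because they vary independently: a public press release needs little confidentiality but may still need high availability.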
As businesses continue to adopt cloud services, they gain access to functionality that was previously either cost-prohibitive (e.g. DR data centers in multiple regions) or beyond the skill set they had in-house (e.g. Oracle RAC DB). Businesses need to revamp their data classifications to take advantage of this newly available functionality.
I have worked with many businesses over the years with respect to defining, adopting and executing a cloud strategy and the two key areas that need extra attention with respect to data classification schemes are:
- Security and compliance – these are a shared responsibility between the cloud vendor and the business. Read your contract and SLA to identify your responsibilities and your role.
- Multi-cloud complexities – as more and more businesses adopt a multi-cloud environment it is imperative that the data classification scheme is looked at from a holistic perspective.
Good luck. Now go architect and classify that data!