• Jim Seaman

Business: Identify Yourself


The introduction of more stringent data privacy regulations has increased the pressure on businesses to do the right thing by their consumers or employees, in regard to the correct use and safeguarding of their personally identifiable information (PII) or personal data (PD).

Unfortunately, numerous components can be considered to be PII or PD, and the value of these data assets aggregates as they are connected to one another.

As a business, you need to ensure that all these data assets are being used in a legal or business justified manner and that they are adequately protected across their data life-cycles.

Understanding what constitutes PII or PD is essential before you can start to educate your personnel and to identify where each piece of PII or PD is being stored, processed or transmitted.


First of all we need to look at the definitions to start to understand what may be considered as PII or PD.


The NIST SP800-122 (Guide To Protecting The Confidentiality of PII) provides the following definition:

  • Any information about an individual maintained by an agency, including:

  1. Any information that can be used to distinguish or trace an individual's identity, such as name, social security number, date and place of birth, mother's maiden name, or biometric records; and

  2. Any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information.


Under article 4 of the European Union's General Data Protection Regulation (EU GDPR), PD is defined as:

  • Any information relating to an identified or identifiable natural person (‘data subject’);

  • An identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.

As a consequence of these broad definitions, it is increasingly difficult for businesses to appreciate the value of their PII/PD data processes so that they can apply appropriate safeguards.

However, the easiest way to look at your PII/PD data processes is to think of them as jigsaw puzzles of various sizes, each with a different number of pieces. As an organisation, you are responsible for ensuring that these jigsaw puzzles are played with correctly, and that the pieces are handled safely, sealed back in the box when not in use, and returned to the play cupboard when no longer needed.

Ultimately, whether you are safeguarding the pieces of your 'Waldo' or your 'Wally' jigsaw puzzles, the principles are much the same: to prevent harm.

Piecing together the jigsaw

Much like a jigsaw, the different data types have differing values and the more pieces you are able to connect together, the greater their potential value.

Anyone who has ever put together a jigsaw will understand the different values of the pieces:

  • Corner pieces.

  • Straight-edged pieces.

  • Identifiable pieces (e.g. sky pieces: blue with clouds).

  • Inter-connecting/related pieces.

This is exactly the same principle that you should apply to your data protection efforts, in order to ensure that you are appropriately protecting your personal data jigsaw pieces.

For example, let's take a look at the components of a customer record.


  • Post Code. On its own not really an issue, as this is an area code that could relate to hundreds of people.

  • Town. Even less value than the Post Code, as this covers a much greater area.

  • Street Name. This now increases the value, as this starts to narrow down the number of individuals that may be associated with this street.

  • Property Name/Number. Yet again, the value is increased as the number of individuals associated with a single property is extremely limited.


  • Surname. This is a relatively low value, as there may be numerous people with the same Surname.

  • Forename. Much the same as the Surname, there may be multiple persons with the same Forename. However, once associated with the Surname this starts to increase the value by narrowing the number of persons that could have the same Surname and Forename.

  • Personal Title. On its own, this has an extremely limited value, but when linked to a Surname & Forename the value increases, as the title can indicate gender and potential relationship status.

  • Year of Birth. As a singular record, this has a limited value, as there are multiple people born in a particular year.

  • Month of Birth. The same applies to this data set, as there are still multiple people associated with being born in the same month.

  • Day of Birth. Now the value of the data set starts to increase: once combined with the month and year, this narrows down the number of persons to those born on one particular date.

The greater the number of linked data sets you can identify, the greater the value and potential impact for the associated individual.
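The narrowing effect described above can be illustrated with a minimal Python sketch. It assumes a handful of invented customer records and two hypothetical helper functions (`candidates` and `narrowing`); it simply counts how many records remain consistent with a target individual as each additional piece becomes known.

```python
# Hypothetical customer records; every value is invented for illustration.
RECORDS = [
    {"post_code": "AB1 2CD", "surname": "Smith", "forename": "Anna", "year_of_birth": 1980},
    {"post_code": "AB1 2CD", "surname": "Smith", "forename": "Anna", "year_of_birth": 1975},
    {"post_code": "AB1 2CD", "surname": "Smith", "forename": "John", "year_of_birth": 1980},
    {"post_code": "AB1 2CD", "surname": "Jones", "forename": "Anna", "year_of_birth": 1980},
    {"post_code": "EF3 4GH", "surname": "Smith", "forename": "Anna", "year_of_birth": 1980},
    {"post_code": "EF3 4GH", "surname": "Jones", "forename": "John", "year_of_birth": 1975},
]


def candidates(records, known):
    """Count the records that are consistent with the known pieces (field -> value)."""
    return sum(all(r[f] == v for f, v in known.items()) for r in records)


def narrowing(records, target, field_order):
    """Reveal the target's fields one at a time and record how many
    records could still be the target after each piece is added."""
    known = {}
    steps = []
    for field in field_order:
        known[field] = target[field]
        steps.append((field, candidates(records, known)))
    return steps


ORDER = ["post_code", "surname", "forename", "year_of_birth"]
for field, remaining in narrowing(RECORDS, RECORDS[0], ORDER):
    print(f"after adding {field}: {remaining} possible individual(s)")
```

In this invented data set, each extra piece shrinks the group of possible individuals until only one remains, which is the point at which the connected pieces identify a person.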

In addition to these standard data sets, you also have the higher impacting data sets that are deemed to directly identify or impact an individual:

  • Payment Card Number.

  • National Insurance/Social Security Number.

  • Medical history.

  • Financial account information.

All these data sets need to be identified, categorised and risk assessed based upon the potential impact to the individual. Consequently, it is essential that higher risk PII/PD business processes are assessed for their potential impact through a Privacy Impact Assessment (PIA) or Data Protection Impact Assessment (DPIA).
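As a rough sketch of what categorising a data set might look like, the Python below sorts field names into the two groups discussed in this article and assigns a tier. The field names, the `assess` function and the tiering rule (any high-impact field means "high"; three or more linkable fields means "medium") are illustrative assumptions only, not rules taken from NIST SP 800-122 or the EU GDPR.

```python
# Illustrative groupings based on the examples in this article.
# The names and the tiering rule are assumptions, not a regulatory standard.
HIGH_IMPACT = {
    "payment_card_number",
    "national_insurance_number",
    "social_security_number",
    "medical_history",
    "financial_account",
}
LINKABLE = {
    "post_code", "town", "street_name", "property_number",
    "surname", "forename", "personal_title",
    "year_of_birth", "month_of_birth", "day_of_birth",
}


def assess(fields):
    """Categorise the fields held by one business process and suggest a tier."""
    fields = set(fields)
    high = sorted(fields & HIGH_IMPACT)
    linkable = sorted(fields & LINKABLE)
    unclassified = sorted(fields - HIGH_IMPACT - LINKABLE)

    if high:
        tier = "high"
    elif len(linkable) >= 3:  # assumed threshold for aggregated value
        tier = "medium"
    else:
        tier = "low"

    return {
        "tier": tier,
        "high_impact": high,
        "linkable": linkable,
        "unclassified": unclassified,  # fields that still need a human decision
        "consider_impact_assessment": tier == "high",
    }


print(assess(["surname", "forename", "post_code", "payment_card_number"]))
```

A real assessment would weigh context, volume and purpose as well; the value of a sketch like this is mainly in forcing every field to be named and placed in a category.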


For many years, businesses have freely used PII and PD with little consideration for legal/legitimate use or the potential impact on individuals, in the event of these data sets being compromised and used for criminal purposes.

It is important to respect the fact that consumers and employees are entrusting their PII/PD to businesses, and it has never been more important for businesses to honour that trust. Failure to do so provides an opportunity for criminals to profit from the use of such data and increases the levels of distrust.

Consequently, the success or failure of any data privacy and security programme must start with the identification, classification and risk assessment of your jigsaw pieces, to ensure that you are able to appreciate the true value of your jigsaw puzzles.

  • Identify your PII/PD flows.

  • Identify all your jigsaw puzzles.

  • Understand the types of jigsaw puzzles.

  • Identify the value of your jigsaw pieces.

  • Identify the systems that support the processing, transmission or storage of the jigsaw pieces/puzzles.

  • Ensure that high value jigsaw pieces and supporting systems are appropriately safeguarded.

  • Identify the aggregated values of your jigsaw pieces.

  • Identify all the locations of your jigsaw pieces.

  • Ensure that all pieces of the jigsaw are retained in their boxes.

  • Identify the personnel authorised to handle PII/PD.

  • Ensure that access is restricted to authorised personnel only, that these personnel are educated on the value of the data, and that the data is handled appropriately.

Consider the use of periodic automated data discovery scanning, as well as team engagement, to identify the locations of the pieces of the jigsaw. Without this component, it is extremely difficult for you to truly appreciate the extent of the data operations, the potential associations/linkages and ultimately the true risks/impacts of these business processes.
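To give a flavour of what automated discovery involves, here is a minimal Python sketch that looks for likely payment card numbers in free text. It assumes a simple pattern of 13 to 19 digits (optionally separated by spaces or hyphens) and filters the candidates with the Luhn checksum; the function names are hypothetical, and commercial discovery tools cover far more data types and storage formats than this.

```python
import re

# Candidate pattern: 13 to 19 digits, optionally separated by a space or hyphen.
CANDIDATE = re.compile(r"(?<!\d)\d(?:[ -]?\d){12,18}(?!\d)")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return masked versions of the likely card numbers found in the text.

    Only the last four digits are kept, so that the discovery report does
    not itself become another copy of the data."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append("*" * (len(digits) - 4) + digits[-4:])
    return hits


def scan(sources):
    """Scan a mapping of location name -> text and report locations with hits."""
    report = {}
    for location, text in sources.items():
        found = find_card_numbers(text)
        if found:
            report[location] = found
    return report
```

For example, `scan({"notes.txt": "Paid with 4111 1111 1111 1111"})` flags `notes.txt`, using a well-known test card number. A Luhn match only indicates a likely card number, so findings still need review by the teams that own the data.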


Cases of identity theft and cyber attack continue to rise, and you cannot appropriately protect your jigsaw pieces unless you understand where they are located, how they interconnect and how these associations increase their value.

Consequently, unless businesses actively self-identify they will not be in a position to appropriately safeguard their jigsaw pieces/puzzles and will increase the risk of compromise or misuse.
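One simple way to picture self-identification is as a comparison between where jigsaw pieces were actually found and where the business believes they are kept. The Python sketch below assumes two hypothetical lists of system names (all invented) and a hypothetical `unprotected_locations` helper; anything discovered but missing from the documented inventory is a piece lying outside its box.

```python
def unprotected_locations(discovered, documented):
    """Return locations where PII/PD was found but which are absent
    from the documented (and therefore safeguarded) inventory."""
    return sorted(set(discovered) - set(documented))


# Hypothetical system names, invented for illustration.
discovered = ["crm_db", "billing_db", "phone_orders_db", "hr_share"]
documented = ["crm_db", "billing_db", "hr_share"]

for location in unprotected_locations(discovered, documented):
    print(f"PII/PD found outside the documented inventory: {location}")
```

The comparison itself is trivial; the effort lies in keeping both lists accurate through regular discovery and review.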

The importance of self-identifying is clearly evident, year on year, through the Ponemon Institute's research into the average costs of a data breach:

  • USA - $8.19 Million

  • Middle East - $5.97 Million

  • UK - $4.88 Million

  • Germany - $4.78 Million

  • Japan - $3.75 Million

In addition, the failure to identify, categorise and risk assess data assets was clearly seen as a causal factor in the Equifax data breach (estimated costs: $1.14 Billion), where it led to a payment card database not being protected by the PCI DSS controls:

"Canadian credit card information of individuals who purchased certain direct-to-consumer products or fraud alerts by phone was, at the time, held by Equifax Inc. in a database that had not been included in the scope of Equifax Inc.’s annual Payment Card Industry Data Security Standard (PCI-DSS) certification".

©2018 by IS Centurion.