Does data anonymization really matter?

In a previous article, I talked about how to drive modern personalization initiatives through machine learning by using customer data from different channels. Phy-gital (physical + digital) shoppers share personal data as a form of currency in exchange for free services and better experiences. This means organizations are facing the challenge of striking the right balance between tailored, personalized experiences and the inherent risks associated with managing large amounts of customer data. Organizations must be more transparent about how they use this data by giving customers more control over what is used—and they should go beyond current regulations to ensure the highest data security.

To continue the conversation, today we’ll discuss the types of anonymization approaches that will help companies reach their personalization objectives without compromising on customer privacy. Businesses collect a wide range of customer data: personal, interaction and transactional. Also known as “personally identifiable information” (PII), personal data is data that can be used to identify an individual in context. In the anonymization approach, this data is segregated from interaction and transactional data and treated differently. Common anonymization techniques include:

  • Conversion of identity data into nonidentifiable data: Some identity fields are converted into nonidentifiable attributes, such as changing birthdate to age group, address to postal code, or phone number to area code. The remaining identity attributes are removed from the data. This technique gives data scientists access to data for exploration, segmentation analysis and building machine learning models, which are then used to offer a personalized experience to target customers through cookies or to identify similar customers through look-alike modeling.
  • Data encryption or masking: Identity data is encrypted or masked with nonidentifiable data to ensure that it’s not readable. This technique applies when the identity data is necessary to reach the customer. An example would be where the campaign management application needs an email address to send an email promotion or a postal address to send a coupon book in direct mail. Users can create audience segments for their campaigns through nonidentifiable attributes without accessing the encrypted identity attributes, while the campaign management application decrypts the identity data to send the promotion or coupon to the customer.
  • Segregation of identity data: Identity data is removed from the overall customer data and kept in separate protected storage. The data is secured in such a way that access to it is programmatic, with only minimal direct access for troubleshooting. A customer service application or e-commerce application would be able to access identity data primarily for verification or solicitation purposes. In this case, the identity data store also holds a link or identifier that allows connection to the nonidentifiable data. The link works in one direction only: the application can reach nonidentifiable data from identity data but cannot reach identity data from nonidentifiable data.
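To make the first technique concrete, here is a minimal Python sketch of converting identity fields into nonidentifiable attributes. The field names, the age bands and the assumption of a 10-digit North American phone number are all illustrative choices, not part of any particular product or standard.

```python
from datetime import date

# Hypothetical age bands; a real project would choose bands that suit its use case.
AGE_BANDS = [(0, 17, "under 18"), (18, 34, "18-34"), (35, 54, "35-54"), (55, 200, "55+")]


def age_group(birthdate: date, today: date) -> str:
    """Replace an exact birthdate with a coarse age band."""
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    age = today.year - birthdate.year - (0 if had_birthday else 1)
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return "unknown"


def area_code(phone: str) -> str:
    """Keep only the area code (assumes a 10-digit North American number)."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    return digits[:3] if len(digits) == 10 else "unknown"


def to_nonidentifiable(record: dict, today: date) -> dict:
    """Build the analytics view: generalized attributes only.

    Name, email, street address, full phone number and birthdate are
    deliberately not copied into the result.
    """
    return {
        "age_group": age_group(record["birthdate"], today),
        "postal_code": record["address"]["postal_code"],
        "area_code": area_code(record["phone"]),
        "purchases": record["purchases"],
    }


customer = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "birthdate": date(1990, 6, 15),
    "phone": "+1 (425) 555-0100",
    "address": {"street": "1 Main St", "postal_code": "98052"},
    "purchases": 7,
}
analytics_row = to_nonidentifiable(customer, today=date(2024, 1, 1))
```

Building the output from an explicit list of allowed fields, rather than deleting known identity fields, means a newly added identity attribute does not leak into the analytics view by default.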

These anonymization techniques are not mutually exclusive and may be combined. They’re generally applied in the context of a specific data store. For example, analytical data stores typically rely on conversion to nonidentifiable data, while campaign management stores use data encryption or masking.
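As a rough illustration of how segregation and masking might be combined, the Python sketch below keeps identity data in a separate store that holds a one-way link to the nonidentifiable profile. The class and function names (`IdentityVault`, `split_customer`) are hypothetical, and an in-memory dictionary stands in for what would be protected storage with access controls in practice.

```python
import secrets


class IdentityVault:
    """Hypothetical stand-in for a separate, protected identity store."""

    def __init__(self):
        self._records = {}

    def store(self, customer_id, identity, profile_key):
        # The vault keeps the link to the nonidentifiable profile.
        self._records[customer_id] = {
            "identity": dict(identity),
            "profile_key": profile_key,
        }

    def profile_key_for(self, customer_id):
        """Identity side to nonidentifiable side: the only allowed direction."""
        return self._records[customer_id]["profile_key"]

    def masked_email(self, customer_id):
        """Return a masked email for display, e.g. during troubleshooting."""
        email = self._records[customer_id]["identity"]["email"]
        local, _, domain = email.partition("@")
        return local[:1] + "***@" + domain


def split_customer(customer_id, record, vault, profiles):
    """Move identity fields into the vault and the rest into the profile store."""
    identity_fields = ("name", "email")  # assumed identity attributes
    # Random key, so the profile store carries nothing derived from the identity.
    profile_key = secrets.token_hex(8)
    vault.store(customer_id, {f: record[f] for f in identity_fields}, profile_key)
    profiles[profile_key] = {
        k: v for k, v in record.items() if k not in identity_fields
    }
    return profile_key


vault = IdentityVault()
profiles = {}
key = split_customer(
    "c-1",
    {"name": "Jane Doe", "email": "jane@example.com", "age_group": "18-34"},
    vault,
    profiles,
)
```

Because the profile store is keyed by a random token and holds no customer identifier, someone with access only to `profiles` has no path back to the identity data, while an application with vault access can still look up a customer's profile.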

Customer trust is an essential building block for an omnichannel personalization framework. Consumers must believe that your business has a code of honor for their data, and that this code is upheld. Anonymization is a key component of data privacy, security and governance offerings, and businesses must include it in the data initiatives that drive personalization. To learn more, read about the four steps to personalization in our latest e-book.

Contact Mindtree to schedule a conversation about aligning your data analytics initiatives with data privacy, security and governance standards.