Data Loss Prevention (DLP), per Gartner, may be defined as technologies which perform both content inspection and contextual analysis of data sent via messaging applications such as email and instant messaging, in motion over the network, in use on a managed endpoint device, and at rest in on-premises file servers or in cloud applications and cloud storage. These solutions execute responses based on policy and rules defined to address the risk of inadvertent or accidental leaks or exposure of sensitive data outside authorized channels.
DLP technologies are broadly divided into two categories – Enterprise DLP and Integrated DLP. While Enterprise DLP solutions are comprehensive and packaged in agent software for desktops and servers, physical and virtual appliances for monitoring networks and email traffic, or soft appliances for data discovery, Integrated DLP is limited to secure web gateways (SWGs), secure email gateways (SEGs), email encryption products, enterprise content management (ECM) platforms, data classification tools, data discovery tools, and cloud access security brokers (CASBs).
Understanding the differences between content awareness and contextual analysis is essential to comprehend any DLP solution in its entirety. A useful way to think of the difference is if content is a letter, context is the envelope. While content awareness involves capturing the envelope and peering inside it to analyze the content, context includes external factors such as header, size, format, etc., anything that doesn’t include the content of the letter. The idea behind content awareness is that although we want to use the context to gain more intelligence on the content, we don’t want to be restricted to a single context.
Once the envelope is opened and the content processed, there are multiple content analysis techniques which can be used to trigger policy violations, including:
There are myriad techniques in the market today that deliver different types of content inspection. One thing to consider is that while many DLP vendors have developed their own content engines, some employ third-party technology that is not designed for DLP. For example, rather than building pattern matching for credit card numbers, a DLP vendor may license technology from a search engine provider to pattern match credit card numbers. When evaluating DLP solutions, pay close attention to the types of patterns detected by each solution against a real corpus of sensitive data to confirm the accuracy of its content engine.
Best practices in DLP combine technology, process controls, knowledgeable staff, and employee awareness. Below are recommended guidelines for developing an effective DLP program: