Lawyers handle tremendous amounts of sensitive information every day: their clients’ personal data, including both personally identifiable information (PII) and protected health information (PHI), intellectual property, trade secrets, financial information, and much more. At the same time, lawyers are often required to provide information to opposing counsel, the courts, regulatory agencies, and, under some circumstances, citizens making requests for personal data or governmental records. The trick is to share everything you’re supposed to and nothing you’re not.
Redaction—obscuring or hiding text—is the means by which legal teams remove confidential information from otherwise disclosable records. There are two major challenges around redaction: efficiently identifying the pieces of confidential information that may be hiding within reams of disclosable data and thoroughly redacting that information prior to production.
In this blog post, we’ll start by defining a few key terms: confidential information, classified documents, inadvertent disclosure, and redaction. We’ll then discuss the risks involved with redaction and review five best practices for completing redactions quickly and effectively without manually redacting the same information over and over again:
- Don’t rely on forms to locate confidential information
- Use technology to identify confidential information
- Include a reason code for each redaction
- Ensure that confidential information is removed, not just covered
- Remove confidential information from text files and metadata
What is confidential information?
Confidential information is information that should be protected from view because it is private, confidential, privileged, or otherwise secret—which means that whether information is confidential depends on the audience to whom it will be disclosed. Generally speaking, confidential information may be:
- personal data about an individual, which may be personally identifiable information (PII), like a name, date of birth, or Social Security number, or protected health information (PHI), such as a medical diagnosis;
- privileged information that is protected under the attorney-client privilege, as attorney work product, or via another type of privilege; or
- confidential information that involves internal organizational strategy, intellectual property, trade secrets, or other protected information.
Confidential information often occurs within documents that should (or even must) be disclosed. That disclosure may occur in the context of eDiscovery, court filings, or elsewhere in the course of litigation. However, legal teams may also need to compile information for disclosure pursuant to the federal Freedom of Information Act (FOIA) or its counterparts in state law, variously known as sunshine laws, open records laws, or public records laws. Under federal and state open records laws, citizens are entitled to obtain information about how their government operates.
While both litigation and open records laws carry a presumption that responsive information should be disclosed, there are exceptions for confidential information. What happens when information that should have been protected slips through the cracks and is accidentally disclosed? We’ll turn to that question after a brief look at a special category of protected information: classified documents.
What are classified documents?
Classified documents are materials that a government body has determined must be protected against unauthorized disclosure in the interests of national security.
What are the three levels of classified information?
In the United States, there are three levels of security classification: Confidential, Secret, and Top Secret.
Confidential applies to information whose unauthorized disclosure could reasonably be expected to cause damage to national security. This classification level is distinct from the everyday sense of confidential information used elsewhere in this post, such as insurance policy numbers, driver’s license numbers, and credit card numbers.
Secret applies to information whose unauthorized disclosure could reasonably be expected to cause serious damage to national security.
Top Secret applies to information whose unauthorized disclosure could reasonably be expected to cause exceptionally grave damage to national security.
Why is information classified?
The purpose of classifying information is to protect it from unauthorized disclosure. The higher the classification level, the greater the potential damage to national security if the information lands in the wrong hands. Under Executive Order 12958, as amended (and since superseded by Executive Order 13526), the authority to classify information in the US may be exercised by:
- the President & the Vice President;
- the agency heads and officials designated by the President in the Federal Register;
- U.S. Government officials delegated this authority.
How long can documents be classified?
Under section 1.5(d) of Executive Order 13526, no information may remain classified indefinitely. Information that would clearly be expected to reveal the identity of a confidential human source or a human intelligence source (marked 50X1-HUM), or key design concepts of weapons of mass destruction (marked 50X2-WMD), may remain classified for up to 75 years.
In other cases, a document’s classification can be extended for up to 25 years from the date of the document’s creation, after which it is generally subject to automatic declassification.
What is inadvertent disclosure of confidential information?
Inadvertent disclosure occurs when confidential information that should have been withheld for privacy, confidentiality, or other reasons is accidentally included within a disclosure. Generally speaking, inadvertent disclosures are accidental oversights—mistakes that occur during either the identification of sensitive information or the application of a redaction.
We’ll consider the risks of inadvertent disclosure in just a moment, but before we do, let’s define that last term.
What is redaction?
Redaction is the process by which sensitive information is fully removed from disclosed records, whether those records are being disclosed in eDiscovery, in a court filing, in response to an open records law request, or otherwise. Whenever a recipient is entitled to receive records that also include information they are not permitted to see, those records should be redacted to protect the sensitive information within them. Redactions generally appear as heavy black boxes over individual words or numbers or, in the case of more extensive redactions, bars concealing lines of text.
Redaction can be a time-consuming, aggravating, and error-prone task. It is the quintessential needle-in-a-haystack data problem: legal teams must parse through pages of non-sensitive information to detect the small pieces of sensitive information—names, dollar values, identifying numbers, and more—that may be hidden within.
But finding sensitive information is only the first challenge involved in redaction: legal teams must also thoroughly sanitize that information so that it cannot be uncovered by any means. You increase your risk of inadvertently disclosing sensitive information when you don’t give sufficient attention to both pieces of the redaction puzzle. Using inefficient manual processes or outdated technologies to locate sensitive information can cause you to miss pieces of information entirely, such that you fail to apply a redaction. Similarly, using improper and ineffective redaction methods can cause you to produce information that you’ve identified and that you thought you redacted, but that can still be seen. Both errors create risks—which we’ll discuss next.
What are the risks involved with inadvertent disclosure of confidential information?
When confidential information is either not identified within a disclosure or incompletely redacted so that it can be revealed through various means, that information becomes accessible to parties who should not have received it. These errors can lead to data protection claims, waive the attorney-client privilege, provide the basis for a malpractice lawsuit or professional discipline, undercut arguments in a case, and more. The damages caused by an inadvertent disclosure of confidential information are, in some cases, limited only by the injured party’s imagination.
You’ve likely heard of some of these blunders, such as when Paul Manafort’s lawyers improperly redacted information in a court filing. Reporters who received redacted copies of the filings were able to discern the “redacted” confidential information by simply copying and pasting the text. Not only are such errors embarrassing, but they’re also potentially damaging to the lawyers’ subsequent arguments.
How was the information in Manafort’s filing so easily revealed? The legal team had covered the sensitive text with black boxes, but they had not actually removed the underlying text, so copying the selection into another document revealed it. The same can happen with text that’s “redacted” by changing the font color to match the background; change to a contrasting font color and the apparently missing text comes right back.
In more recent news, the US Department of Justice faced a deadline of August 25, 2022, to submit a redacted version of the affidavit supporting the FBI's Mar-a-Lago search. With so little time and so much at stake, a redaction error would have been especially costly.
Additionally, FOIA and open records requests, in particular, are usually under tight time pressure. Government agencies have only a limited window in which to respond to these requests, so time is of the essence. That added pressure can make the Sisyphean task of searching through records for confidential information even more daunting.
So, how do you avoid these risks by getting redaction right? Here are our top five best practices.
Best practices for redacting confidential information
In redacting confidential information, remember that there are two distinct challenges—and you have to get them both right. First, there’s the need to identify confidential information in the mass of non-sensitive data and flag it for redaction. Second, there must be an effective redaction that thoroughly sanitizes the disclosure and closes any “back doors” to revealing the redacted information. There’s one key that unifies these best practices: they all emphasize the importance of automatic redaction technology.
1. Don’t rely on forms to locate confidential information
Some of your disclosed documents may be forms that have been filled out. It’s tempting to try to save time by learning where personal data might be on these forms and effectively ignoring the rest of the document, but it would be a mistake to do so. Likewise, you shouldn’t do a simple search for the phrase “Social Security” when you’re looking for Social Security numbers; these numbers could be misplaced, used in other contexts, or referred to by an abbreviation or shortcut such as “SSN.” Remember that confidential information could be anywhere on the page, and to find all of it, you have to check all of the text. Fortunately, as the second best practice points out, you don’t have to check it all with your own eyes.
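To make the difference concrete, here is a minimal Python sketch that compares a label-based search with a search for the shape of the number itself. The pattern, the function names, and the sample text are our own illustrative assumptions; the pattern assumes a 3-2-4 digit layout with optional hyphens or spaces, and a real rule set would need to be broader.

```python
import re

# Assumed format: 3-2-4 digits, optionally separated by hyphens or spaces.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}[- ]?\d{2}[- ]?\d{4}(?!\d)")


def find_by_label(text: str) -> list[str]:
    """Naive approach: only inspect lines that contain the phrase 'Social Security'."""
    return [
        hit
        for line in text.splitlines()
        if "social security" in line.lower()
        for hit in SSN_PATTERN.findall(line)
    ]


def find_ssn_candidates(text: str) -> list[str]:
    """Pattern approach: return every substring shaped like an SSN, wherever it sits."""
    return [match.group(0) for match in SSN_PATTERN.finditer(text)]


# Hypothetical sample: one labeled number, one abbreviated label, one stray note.
sample = (
    "Social Security: 123-45-6789\n"
    "SSN 987-65-4321 on file\n"
    "Notes: applicant wrote 111 22 3333 in the margin\n"
)

print(find_by_label(sample))        # finds only the labeled number
print(find_ssn_candidates(sample))  # finds all three
```

A pattern this loose will also flag any unrelated nine-digit number, so the hits are best treated as candidates for review rather than confirmed redactions.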
2. Use technology to identify confidential information
Technology is the key to streamlining and simplifying redactions while simultaneously improving the accuracy and consistency of results. Technology offers solutions to both of the challenges of redaction, making it easy to pinpoint confidential information and enabling its complete removal.
One blog author referenced a recent case involving over 27,000 chat messages and more than 2 million individual redactions. That’s an overwhelming volume of messages for any human to sift through and an even more overwhelming number of redactions to apply manually. Powerful redaction software enables users to define patterns for confidential information, including Social Security numbers, phone numbers, email addresses, bank account numbers, and even proper names. This technique, known as auto-classification, allows users to define rules for text that should be redacted and then take advantage of digital search capabilities to rapidly scan through all of the data in a disclosure.
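As a rough illustration of rule-based scanning, the Python sketch below defines a small set of patterns and runs them over a batch of messages. The rule names, the patterns, and the sample messages are hypothetical and deliberately simple; this is not how any particular redaction product is implemented.

```python
import re
from collections import Counter

# Hypothetical rule set: each rule name maps to a deliberately simple pattern.
RULES = {
    "ssn": re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_phone": re.compile(r"(?<!\d)\(?\d{3}\)?[ .-]\d{3}[ .-]\d{4}(?!\d)"),
}


def scan(messages):
    """Yield (message_index, rule_name, matched_text) for every hit in the batch."""
    for index, message in enumerate(messages):
        for name, pattern in RULES.items():
            for match in pattern.finditer(message):
                yield index, name, match.group(0)


messages = [
    "Send the form to jane.doe@example.com by Friday.",
    "Her SSN is 123-45-6789; call 555-867-5309 to confirm.",
    "Nothing sensitive here.",
]

hits = list(scan(messages))
counts = Counter(name for _, name, _ in hits)  # hits per rule across the batch
for hit in hits:
    print(hit)
```

Because the rules are applied the same way to every message, the third message is cleared just as quickly as the second is flagged, which is the consistency that manual review struggles to match at volume.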
3. Include a reason code for each redaction
As much as you might sometimes want to, you can’t just redact everything in a document—that defeats the purpose of the disclosure, and it’s a violation of ethical standards. You need to have a reason for each redaction you apply should a recipient challenge the redaction. It’s therefore a common practice to automatically provide the reason for a redaction, often as white text on the black redaction box. Fortunately, as long as the user defines the reasons for a redaction when establishing a rule for an automated search, the same technology that identifies confidential information for redaction can automatically code the reason for the redaction. For example, if a user codes a redaction search for data that fits the pattern of a date of birth, they can include “personal data” or “personally identifiable information” as the rationale for that redaction. Then, anytime the software identifies a date of birth, it will both redact the text and code the reason for the redaction.
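One way to picture this is a rule that carries its reason along with its pattern. The Python sketch below is an assumption-laden illustration: the `RedactionRule` class, the reason labels, and the marker format are all hypothetical, and the date pattern matches any date in MM/DD/YYYY form, not only dates of birth.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RedactionRule:
    pattern: re.Pattern
    reason: str  # reason code displayed in place of the redacted text


# Hypothetical rules; the reason labels are illustrative only.
RULES = [
    RedactionRule(re.compile(r"(?<!\d)\d{2}/\d{2}/\d{4}(?!\d)"), "PII: date of birth"),
    RedactionRule(re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)"), "PII: SSN"),
]


def redact_with_reasons(text, rules=RULES):
    """Replace each match with a reason-coded marker and return an audit log.

    The log records the reason and the length of the removed text, not the
    text itself, so the log does not become a second copy of the secret.
    """
    log = []
    for rule in rules:
        def replace(match, rule=rule):
            log.append((rule.reason, len(match.group(0))))
            return f"[REDACTED - {rule.reason}]"
        text = rule.pattern.sub(replace, text)
    return text, log


redacted, log = redact_with_reasons("DOB 04/12/1985, SSN 123-45-6789.")
print(redacted)
print(log)
```

Keeping the reason on the rule means the rationale is decided once, when the rule is written, and is then applied identically to every match.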
4. Ensure that confidential information is removed, not just covered
As demonstrated by the Manafort case, it’s not enough to cover up or “white-out” confidential information; that information must be permanently and entirely stripped from the document, image, or file in which it exists. Manual means of covering or obscuring text are not only time-consuming but also inadequate. Automatic redaction technology, by contrast, removes the underlying text and “burns in” the redaction box so that the box cannot be lifted to reveal what was beneath it.
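The difference between covering and removing can be shown with a toy model. The Python sketch below is not a real file format: the `Page` class, with a text layer and a list of boxes drawn over it, is a hypothetical stand-in that we use only to show why a box on top of intact text fails.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Toy stand-in for a page: a text layer plus boxes drawn on top of it."""
    text: str
    boxes: list = field(default_factory=list)  # (start, end) spans hidden by a box

    def visible(self) -> str:
        """What a reader sees: covered characters appear as solid blocks."""
        chars = list(self.text)
        for start, end in self.boxes:
            chars[start:end] = "█" * (end - start)
        return "".join(chars)

    def copy_paste(self) -> str:
        """What select-and-copy returns: the text layer, ignoring any boxes."""
        return self.text


def cover(page: Page, start: int, end: int) -> None:
    """Unsafe: draw a box over the span but leave the text layer intact."""
    page.boxes.append((start, end))


def remove(page: Page, start: int, end: int) -> None:
    """Safer: overwrite the characters themselves, then draw the box."""
    page.text = page.text[:start] + "█" * (end - start) + page.text[end:]
    page.boxes.append((start, end))


covered = Page("Account 4417 is closed")
cover(covered, 8, 12)
removed = Page("Account 4417 is closed")
remove(removed, 8, 12)

print(covered.visible(), "|", covered.copy_paste())
print(removed.visible(), "|", removed.copy_paste())
```

Both pages look identical on screen, but only the second one survives a copy and paste, which is why a visual check alone cannot confirm that a redaction worked.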
5. Remove confidential information from text files and metadata
There are two “back doors” through which confidential information is often inadvertently disclosed despite its complete removal from the original file. First, with images that have been subject to optical character recognition (OCR) to translate text within the image, any confidential information must be redacted from both the image file and the accompanying text file. Second, files are typically accompanied by metadata files, such as load files and data files, some of which may contain the same information that was redacted from the original file. If these sources of information are not also stripped and sanitized, inadvertent disclosure of confidential information can still occur.
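A simple pre-production check can catch both back doors. In the Python sketch below, the record layout (an `extracted_text` string plus a `metadata` dictionary), the field names, and the single SSN pattern are assumptions made for illustration; real load files and metadata formats vary by platform.

```python
import re

SSN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")
MARKER = "[REDACTED]"


def sanitize_production(record: dict) -> dict:
    """Apply the same pattern to the extracted text and to every string metadata field."""
    return {
        "extracted_text": SSN.sub(MARKER, record["extracted_text"]),
        "metadata": {
            key: SSN.sub(MARKER, value) if isinstance(value, str) else value
            for key, value in record["metadata"].items()
        },
    }


def leaks(record: dict) -> list[str]:
    """List every place the pattern still matches, as a final check before production."""
    places = []
    if SSN.search(record["extracted_text"]):
        places.append("extracted_text")
    for key, value in record["metadata"].items():
        if isinstance(value, str) and SSN.search(value):
            places.append(f"metadata.{key}")
    return places


# Hypothetical record: the image was redacted, but its companions were not.
record = {
    "extracted_text": "Claimant SSN 123-45-6789",
    "metadata": {"Subject": "Re: SSN 123-45-6789", "PageCount": 2},
}

print(leaks(record))                       # both companions still leak
print(leaks(sanitize_production(record)))  # nothing left to find
```

Running the same rules over every companion file, rather than over the image alone, is what closes the gap between a document that looks redacted and a production that is.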
Technology is the key to efficient redaction of confidential information
Redacting confidential information to avoid inadvertent disclosure can be time-consuming, frustrating, and prone to errors—or it can be fast, easy, and thorough. Technology is what makes the difference, simplifying the process of identifying confidential information within a disclosure and then ensuring the complete removal of that information from the original file as well as any accompanying text and metadata files.
ZyLAB ONE includes an auto-redaction capability with an outstanding auto-classification tool. With ZyLAB ONE, you can create redaction rules for any type of confidential information and automatically identify and redact all of those types of data while adding a code for the redaction reason. It’s straightforward to configure rules that identify types of information, from names, emails, and employee identification numbers to banking information, Social Security numbers, and more. With ZyLAB ONE, redactions are simple, reliable, and complete. Contact us to learn more.