Thank you Goldman Sachs, free knowledge for compliance officers

Jeffrey Wolff | July 4, 2016 | Read time: 2 min

Recently I came across the CNBC article "You won't believe what gets an email flagged at Goldman: CNBC has the list". It is an interesting look at the keywords, sentences and patterns that catch the attention of compliance officers.

Of course, the Internet is full of articles about keywords you should avoid in your email because they raise flags, but I’ve never seen a “public” list like this before. It contains almost 200 keywords, sentences and patterns written as expressions. I know that many companies maintain their own internal lists for their specific domains, ranging from simple keyword lists to extensive use of the available query language or even advanced pattern recognition using text mining.

Depending on the domain, several ZyLAB user groups share these lists among themselves and keep refining them to increase recall and precision for e-Discovery, investigative or compliance purposes.

Goldman Sachs terms/searches
I {trusted}|{believed in}|{had faith in} you
"I {was|am} extremely[?]{pissed|angry|concerned|upset|agitated|bothered|distressed|
perturbed|worried|vexed|confused|flustered|discouraged|
rattled|daunted|demoralized|disheartened|

The Goldman Sachs list reminded me of the queries used in the Lehman Brothers bankruptcy investigation. Although those are more specific to the case at hand, some of them are quite similar:

Lehman Brothers Chapter 11 terms/searches
(fund* or cash) w/10 (transfer* or mov* or sweep*)
(large or big* or signific*) w/10 (collateral w/10 pledg* or mov*)
((antoncic* or *risk* or expos*) and ( caution or concern or increase or toxic or outsized or significant or chunky )) w/20 (*LBO* or *lever* or *buy-out* or *buyout* or bridge* or syndicat*)
shocked or speechless or stupid* or (huge mistake) or (big mistake) or dumb or (can’t believe) or (cannot believe) or (serious trouble) or (big trouble) or unsalvageable or (too late) or ((breach or violat*) w/5 (duty or duties or obligation*)) or (nothing we can do) or uncomfortable or (not comfortable) or (I don’t think we should) or (very sensitive) or (highly sensitive) or (very confidential) or (highly confidential) or (strongly disagree) or (do “not” share this) or (don’t share this) or (between you “and” me) or (just between us) or ((can’t or cannot or shouldn’t or (should not) or won’t or (will not)) w/5 (discuss or (talk about)) w/5 (email or e-mail or computer)) or (should w/5 (discuss or talk) w/5 (phone or (“in” person))

The complete Lehman Brothers search list can be found in Volume 7, Exhibit 5.1 of the Report of the Examiner in the Chapter 11 proceedings of Lehman Brothers Holdings Inc.
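To make the proximity operator in these searches concrete, here is a minimal Python sketch of what a query such as `(fund* or cash) w/10 (transfer* or mov* or sweep*)` is asking for. It assumes that `w/10` means "within 10 words, in either order" and that `*` matches any run of characters; the function names and the tokenizer are illustrative only and are not ZyLAB's implementation.

```python
import re
from fnmatch import fnmatchcase


def tokens(text):
    """Lowercase word tokens; a simplification of real index tokenization."""
    return re.findall(r"[a-z0-9'-]+", text.lower())


def positions(words, patterns):
    """Indexes of the words that match any wildcard pattern (e.g. 'transfer*')."""
    return [i for i, w in enumerate(words)
            if any(fnmatchcase(w, p) for p in patterns)]


def within(text, left, right, distance):
    """True if a 'left' term occurs within `distance` words of a 'right' term.

    Assumption: distance is counted in words and order does not matter.
    """
    words = tokens(text)
    left_pos = positions(words, left)
    right_pos = positions(words, right)
    return any(abs(i - j) <= distance for i in left_pos for j in right_pos)


# Approximates: (fund* or cash) w/10 (transfer* or mov* or sweep*)
LEFT = ["fund*", "cash"]
RIGHT = ["transfer*", "mov*", "sweep*"]

print(within("Please sweep the remaining cash into the London account today.",
             LEFT, RIGHT, 10))  # True
```

A real search engine evaluates this against an index rather than rescanning the text, so treat the sketch as a way to reason about which messages a query would hit, not as a performance model.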

 

Available now

ZyLAB users can run the searches from the Lehman investigation directly in ZyLAB, because the same syntax is used. Not all of them may be relevant, but the list is a great source of ideas for creating your own searches.

For interested ZyLAB users, I converted the Goldman Sachs list into ready-to-use ZyLAB searches. The full list can be downloaded below. Some changes had to be made because the syntax is different, but with a few find-and-replace actions the conversion was fairly easy. If you want to design your own searches or convert other lists, keep the following in mind.

  • By default, ZyLAB does not use a noise word list. You can configure one, but it is not required. This means that all words and single characters are indexed and searchable. If you do use a noise word list, make sure you know what it contains.
  • Sometimes you want to search for a search operator as an actual keyword. You can do that by quoting the keyword (“not”, “of”, “in”, “and”, “or”) so that it is searched as a term.
  • An effective way to add variations to your query is the ZyLAB Quorum search option, which lets you set the number of terms that should occur.
    • Syntax: [number of terms to match] of {term1, term2, term3, term?}
    • Example: 1 of {pissed, angry, concerned, upset, agitated}
    • A quorum search can be used as an alternative to an OR query, but it is easier to manage and faster.
    • The terms in a Quorum search can be queries themselves, which makes this a very powerful and efficient tool.
  • Searches can be made more manageable by using search macros for repetitive parts.
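
The quoting and Quorum tips above can be sketched in a few lines of Python. This is a rough illustration, not ZyLAB code: the helper names are made up, the operator list contains only the five words mentioned above (the real reserved-word list may be longer), straight double quotes are assumed to work for quoting, and "N of {...}" is assumed to mean "at least N of these terms occur".

```python
import re

# Operator words named in the text; the real reserved-word list may be longer.
OPERATORS = {"not", "of", "in", "and", "or"}


def quote_operators(phrase):
    """Wrap operator words in double quotes so they are searched as keywords."""
    return " ".join(f'"{w}"' if w.lower() in OPERATORS else w
                    for w in phrase.split())


def quorum(minimum, terms):
    """Build a quorum expression of the form: N of {term1, term2, ...}."""
    return f"{minimum} of {{{', '.join(terms)}}}"


def quorum_matches(text, minimum, terms):
    """Assumed semantics: at least `minimum` distinct terms occur in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sum(term.lower() in words for term in terms) >= minimum


MOOD = ["pissed", "angry", "concerned", "upset", "agitated"]

print(quote_operators("between you and me"))  # between you "and" me
print(quorum(1, MOOD))  # 1 of {pissed, angry, concerned, upset, agitated}
print(quorum_matches("I am upset and angry", 2, MOOD))  # True
```

Generating the quorum string from a Python list keeps the term list in one place, which is the same maintainability argument made for Quorum searches over long OR chains.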

Sample conversions

Goldman Sachs term                    ZyLAB search
a sure {bet}|{thing}                  a sure (1 of {bet, thing})
adjust your account|losses|profits    adjust your (account OR losses OR profits)
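
The two sample conversions can be reproduced with a pair of regular-expression replacements. The sketch below is my own illustration of the "find and replace" idea, not the procedure actually used for the full list. It handles only the two alternation shapes shown in the samples: braced alternatives such as `{bet}|{thing}` and bare alternatives such as `account|losses|profits`. Other Goldman Sachs patterns would need additional rules or manual review.

```python
import re

# Shape 1: {bet}|{thing}           -> (1 of {bet, thing})
BRACED = re.compile(r"\{[^{}|]+\}(?:\|\{[^{}|]+\})+")
# Shape 2: account|losses|profits  -> (account OR losses OR profits)
BARE = re.compile(r"\b\w+(?:\|\w+)+\b")


def braced_to_quorum(match):
    """Rewrite a run of braced alternatives as a '1 of {...}' quorum group."""
    terms = re.findall(r"\{([^{}]+)\}", match.group(0))
    return "(1 of {" + ", ".join(terms) + "})"


def bare_to_or(match):
    """Rewrite a run of pipe-separated words as a parenthesized OR group."""
    return "(" + " OR ".join(match.group(0).split("|")) + ")"


def convert(term):
    """Convert the two alternation shapes shown in the sample conversions."""
    term = BRACED.sub(braced_to_quorum, term)
    return BARE.sub(bare_to_or, term)


print(convert("a sure {bet}|{thing}"))
# a sure (1 of {bet, thing})
print(convert("adjust your account|losses|profits"))
# adjust your (account OR losses OR profits)
```

Braced alternatives are converted first so that the second pattern never sees their pipe characters. Check any converted search against a few known documents before relying on it.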

Download the full list of ZyLAB Searches.

Jeffrey Wolff
Jeffrey Wolff is a Certified E-Discovery Specialist who joined ZyLAB in May 2015 and serves as Director of E-Discovery Solutions. He brought with him over 20 years of experience in Information Systems and enterprise software. He has been involved in solution architecture, design, and implementation for major projects within the Department of Defense and Fortune 1000 corporations. Prior to joining ZyLAB, Jeffrey held senior positions within firms specializing in Microsoft SharePoint and enterprise search solutions, so he has vast technical knowledge in the fundamentals of information management and eDiscovery.
