<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ZyLAB eDiscovery &#38; Information Management</title>
	<atom:link href="http://www.zylab.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://www.zylab.com/blog</link>
	<description>eDiscovery &#38; Information Management</description>
	<lastBuildDate>Wed, 09 May 2012 17:18:08 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Technology Assisted Review, Concept Search and Predictive Coding: The Limitations &amp; Risks</title>
		<link>http://www.zylab.com/blog/?p=209</link>
		<comments>http://www.zylab.com/blog/?p=209#comments</comments>
		<pubDate>Wed, 09 May 2012 17:18:08 +0000</pubDate>
		<dc:creator>Johannes Scholtes</dc:creator>
				<category><![CDATA[Johannes Scholtes Posts]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=209</guid>
		<description><![CDATA[(The following was posted to the AIIM Community on May 9, 2012.) Technology Assisted Review (TAR) is a marketing term used in the eDiscovery community to describe the process of automatic classification of documents in a so-called legal review. Similar &#8230; <a href="http://www.zylab.com/blog/?p=209">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><em>(The following was posted to the AIIM Community on May 9, 2012.)</em></p>
<p>Technology Assisted Review (TAR) is a marketing term used in the eDiscovery community to describe the process of automatic classification of documents in a so-called legal review. Similar documents are classified based on training data or seed sets. Typical classes include Confidential, Privileged or Responsive. As the saying goes, “there’s more than one way to skin a cat”; TAR is also called Machine Assisted Review (MAR), Computer Assisted Review (CAR), Predictive Coding, Concept Search, or Meaning-based computing: all of which are marketing terms without any specific scientific meaning.</p>
<p>A recent US ruling by Judge Peck on the use of machine learning technology in legal review has created a lot of tumult in the eDiscovery community (see <a href="http://www.law.com/jsp/lawtechnologynews/PubArticleLTN.jsp?id=1202542221714">http://www.law.com/jsp/lawtechnologynews/PubArticleLTN.jsp?id=1202542221714</a> for more information and links to other articles). Not only has the opposing party filed complaints against the ruling (including accusing the judge of obtaining financial benefits from the vendor whose software was used), but the entire legal community seems to be engaged in heated debate on the topic.</p>
<p>Now that there is case law on the use of TAR, and it has been confirmed by other judges, one can expect eDiscovery software buyers to increasingly require Predictive Coding, Concept Search or other TAR-related capabilities.</p>
<p><strong>The Science Behind TAR </strong></p>
<p>I am an avid fan of artificial intelligence and machine learning. As a matter of fact, I have worked in the field since 1985 and hold the special chair of text mining at the University of Maastricht, where I teach my students how text mining is applied to tasks such as document classification, clustering and information extraction.</p>
<p>Machine learning and other techniques from artificial intelligence are not based on “hocus pocus”: they are based on solid mathematical and statistical frameworks in combination with common-sense or biology-inspired heuristics. In the case of text-mining, there is an extra complication: the content of textual documents has to be translated, so to speak, into numbers (probabilities, mathematical notions such as vectors, etc.) that machine learning algorithms can interpret. The choices that are made during this translation can highly influence the results of the machine learning algorithms.</p>
<p>For instance, the “bag-of-words” approach used by some products has several limitations that may result in having completely different documents ending up in the exact same vector for machine learning and having documents with the same meaning ending up as completely different vectors. See <a href="http://www.aiim.org/community/blogs/expert/Language-is-Not-Just-a-Jumbled-Bag-of-Words-Why-Natural-Language-Processing-Makes-a-Difference-in-Content-Analytics">http://www.aiim.org/community/blogs/expert/Language-is-Not-Just-a-Jumbled-Bag-of-Words-Why-Natural-Language-Processing-Makes-a-Difference-in-Content-Analytics</a> for more information on this topic. The garbage-in, garbage-out principle definitely applies here!</p>
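<p>To make both limitations concrete, here is a minimal Python sketch of a bag-of-words representation. It is an illustration only: the <code>bag_of_words</code> helper and the sample sentences are hypothetical and are not taken from any particular product.</p>
<pre><code class="language-python">from collections import Counter


def bag_of_words(text):
    """Count lowercase word occurrences, discarding word order."""
    return Counter(text.lower().split())


# Opposite meanings, identical bag-of-words representation.
doc_a = "the employee sued the company"
doc_b = "the company sued the employee"
print(bag_of_words(doc_a) == bag_of_words(doc_b))  # prints True

# Similar meaning, but no shared words at all.
doc_c = "counsel reviewed files"
doc_d = "lawyers examined documents"
shared = set(bag_of_words(doc_c)).intersection(bag_of_words(doc_d))
print(len(shared))  # prints 0
</code></pre>
<p>In the first pair, word order carries the meaning and is lost; in the second pair, the synonyms share no vocabulary, so a classifier that sees only word counts has nothing to connect them.</p>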
<p>Other complications arise when:</p>
<ul>
<li>More than one language is used in the document set, for instance, if some documents are in English and some are in Dutch. Multi-lingual documents, in which multiple languages appear within an individual document, cause even more problems.</li>
<li>There are many document categories: the more categories there are, the lower the quality of the document classification will be. This is logical, as it is easier to differentiate black from white than it is to differentiate 1,000 shades of gray.</li>
<li>There are not enough relevant training documents, which lowers the quality of classification. The number of required training documents grows faster than the number of classes: for 2 times more classes, one may need 4 times more training documents.</li>
<li>The documents use very different or very ambiguous language for the same topics (e.g. there are many synonyms and homonyms).</li>
</ul>
<p>Dealing with incremental document collections (e.g. new documents are added after training) will either result in lower quality or require the classifier to be retrained from scratch.</p>
<p>Several risk factors are listed here, but there are more, depending on the specific machine learning technology that is used. Technology based on (naive) Bayes classifiers (falsely) presumes statistical independence between measured features (e.g. word occurrences). Latent Semantic Indexing (LSI) and variants such as Probabilistic Latent Semantic Analysis (PLSA) effectively apply lossy information compression (in the case of LSI, a truncated singular value decomposition, or SVD) that may result in more (irreversible) information loss than required. Knowledge of the specific parameter settings is integral to gaining a full understanding of the quality of specific machine learning models.</p>
<p><strong>There is No Free Lunch </strong></p>
<p>Machine learning requires significant set-up: the classification model (aka the classifier) has to be trained and its quality tested. This is a time-consuming and demanding task that requires, at a minimum, manual tagging and evaluation of both the training set and the test set by more than one party (in order to prevent biased opinions). Testing has to be done according to best-practice standards used in the information retrieval community (e.g. see the proceedings of the TREC conferences organized by NIST). Deviation from such standards will be challenged in court. The time and expense involved should be factored into the cost-benefit analysis for the approach.</p>
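<p>Precision and recall are the measures most commonly reported in information retrieval evaluations of this kind. As a rough sketch of what such a test involves, the following Python snippet compares a classifier's output with reviewer tags; the <code>precision_recall</code> helper and the document identifiers are made up for illustration.</p>
<pre><code class="language-python">def precision_recall(predicted, relevant):
    """Return (precision, recall) for two collections of document ids."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted.intersection(relevant))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical test set: reviewers tagged 4 documents as responsive;
# the classifier flagged 5 documents, 3 of which were correct.
reviewer_tags = {"doc1", "doc2", "doc3", "doc4"}
classifier_hits = {"doc1", "doc2", "doc3", "doc7", "doc9"}
precision, recall = precision_recall(classifier_hits, reviewer_tags)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.60 recall=0.75
</code></pre>
<p>In practice the reviewer tags themselves have to be agreed upon by the parties first, which is where most of the manual effort goes.</p>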
<p>If the classifier does not work (e.g. a mutually agreed-upon, predefined quality level is not reached), the only remedy is to retrain the entire model with better training examples. Eventually, this process could negate any performance increases or cost savings that could have been achieved by applying the technology. If it turns out to be impossible to improve the model, all training and test efforts will have been a waste of time. This may very well happen in cases that suffer from the complications described above.</p>
<p>Additionally, one has to be able to explain and defend the application of machine learning technology in court. This may not be a trivial task, given that machine learning is based on state-of-the-art principles in linear algebra and probability calculus that are not commonly understood by those who may be involved in the lawsuit. Therefore, parties and the court will rely heavily on (expensive) expert witnesses.</p>
<p><strong>Summary </strong></p>
<p>So, when applying Predictive Coding, Concept Search or other names that refer to Technology Assisted Review, first become informed of the potential risks of using the technology for a particular case. It may be the right choice for some cases, but not for others.</p>
<p>If it is not possible or too risky to apply machine learning techniques, then there are also other forms of automatic document classification, such as rules-based document classification. These may be a better choice to use in certain cases, especially when defensibility is an issue. They come with their fair share of set-up, but in almost all cases they are more defensible and easier to manage.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=209</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Enterprise Technology Counsel Mary Mack takes an In-depth Look at the Move Toward eDiscovery Cost Reduction</title>
		<link>http://www.zylab.com/blog/?p=203</link>
		<comments>http://www.zylab.com/blog/?p=203#comments</comments>
		<pubDate>Thu, 12 Apr 2012 16:05:53 +0000</pubDate>
		<dc:creator>Brenda Mahedy</dc:creator>
				<category><![CDATA[Archive]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=203</guid>
		<description><![CDATA[How Your Company Can Cut Costs and Re-Think your eDiscovery Model At one point or another in the course of business, many companies will engage in litigation or have to answer to regulators. One common thread between companies is that &#8230; <a href="http://www.zylab.com/blog/?p=203">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><em>How Your Company Can Cut Costs and Re-Think your eDiscovery Model</em></p>
<p>At one point or another in the course of business, many companies will engage in litigation or have to answer to regulators. One common thread between companies is that the process of responding to litigation and regulatory inquiries can derail their normal business processes and involve unnecessary costs.  Fortunately, there are ways in which companies can become better prepared for inquiries and disclosures and even reduce costs.</p>
<p>Mary Mack, ZyLAB’s Enterprise Technology Counsel, discusses examples of achieving eDiscovery cost savings with the <a href="http://www.metrocorpcounsel.com/articles/18422/ediscovery-%E2%80%93-focus-cost-saving-and-winning">Metropolitan Corporate Counsel</a>, saying, “Automatic redaction, the reuse of work product, the random sampling and our search technology are the pillars of cost savings for our clients.” <a href="#_ftn1">[1]</a> Companies that partner with ZyLAB receive world-class eDiscovery technology and see firsthand how it provides not only immediate eDiscovery cost savings, but also litigation readiness and improved information governance.</p>
<p>Some key topics and excerpts from the <a href="http://www.zylab.com/downloads/MCC_April2012.pdf">article</a>:</p>
<p><strong>Da Silva Moore and Technology Assisted Review </strong></p>
<p>When asked her thoughts on the case and its impact on eDiscovery, Mary said, “I support Judge Peck’s judicial management toward reducing costs, speeding trials and encouraging cooperation. Judge Peck is one of a handful of jurists who understands and can communicate about computers, and who is willing to help educate the bar.”</p>
<p>In reference to some predictive coding products: “with the seed set protocol, counsel can find themselves forced to disclose the seed set(s) of documents fed into the software – even the nonresponsive documents.”</p>
<p>According to Mary, this is where the ZyLAB approach differs: “Using the ZyLAB rules-based approach, it is much less likely our clients will need to disclose a sample set of nonresponsive documents.”</p>
<p>Looking forward, Mary says, “I expect to see in the next year that a random sampling of what is left behind will be the most significant contribution to cost savings for corporations. This will allow them intelligently to exclude custodians, which is one of the most significant ways of reducing costs.”</p>
<p><strong>Having Better Control Over All Data: Reduce Overall Costs</strong></p>
<p>The article describes quantitative versus qualitative Early Case Assessments (ECA) and how “the legal team winds up with far more insight early on, which can reduce costs by accelerating the settlement process or reducing the scope.”</p>
<p>Mary also touches on the importance of supporting design drawings, such as AutoCAD files: “Some may have a specific need to find relevant schematics. Others may need automatic redaction of personally identifiable information, such as Social Security numbers. This results in double savings because it is both a huge time saver as well as an hourly rate saver.”</p>
<p>With respect to data monitoring, Mary addresses the possibility for corporations to “proactively monitor sensitive documents to avoid data seepage and loss of trade secrets. This is an enormous cost savings and also helps protect the corporation in this era of the FCPA, UK Bribery Act, Dodd-Frank and whistleblowers.”</p>
<p>She also communicated the possibility of leveraging multiple eDiscovery models strategically: “From a cost perspective, it is important to find an eDiscovery company that can support such seamless and cost-effective transitions from processing services, to SaaS, to on-premise.”</p>
<p><strong>Proactive eDiscovery</strong></p>
<p>A reactive approach to eDiscovery, even with the best technology at your side, can involve potentially unnecessary costs and disruptions to operations. During this interview, Mary discusses how corporations are repurposing eDiscovery software like the ZyLAB eDiscovery &amp; Production System: “This proactive eDiscovery tool leads to intelligent governance and litigation readiness for the long term and attacks costs at the source: too much data that has no business or legal purpose.”</p>
<p>To read the full interview with Mary Mack, and get access to all her valuable insights, visit the ZyLAB media coverage page <a href="http://www.zylab.com/NewsEvents/MediaCoverage.aspx">here</a>.</p>
<p><strong>Coming up…</strong> Don’t forget to download the upcoming whitepaper from George Socha concerning re-purposing eDiscovery software for proactive purposes and achieving a state of litigation readiness. To reserve a complimentary copy in advance, please visit this <a href="http://www.zylab.com/mcc0412.aspx">page</a>.</p>
<div>
<hr size="1" />
<div>
<p><a href="#_ftnref">[1]</a> http://www.metrocorpcounsel.com/articles/18422/ediscovery-–-focus-cost-saving-and-winning</p>
</div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=203</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Participate in ZyLAB’s Webinar on the issues of “Cross-border litigation: preparing for the unknown.”</title>
		<link>http://www.zylab.com/blog/?p=189</link>
		<comments>http://www.zylab.com/blog/?p=189#comments</comments>
		<pubDate>Wed, 28 Mar 2012 18:40:31 +0000</pubDate>
		<dc:creator>Brenda Mahedy</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[cross border litigation]]></category>
		<category><![CDATA[edisclosure]]></category>
		<category><![CDATA[ediscovery]]></category>
		<category><![CDATA[global litigation]]></category>
		<category><![CDATA[international litigation]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=189</guid>
		<description><![CDATA[In the world of global businesses, it is very likely that some companies will be involved in complex regulatory investigations that extend over international boundaries. In these instances, it can be difficult to address the multifaceted challenges of international litigation &#8230; <a href="http://www.zylab.com/blog/?p=189">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>In the world of global businesses, it is very likely that some companies will be involved in complex regulatory investigations that extend over international boundaries. In these instances, it can be difficult to address the multifaceted challenges of international litigation required to settle cross-border disputes.</p>
<p>European legal professionals are encouraged to join ZyLAB EU/UK’s teams for a <a href="https://www3.gotomeeting.com/register/961487814" target="_blank">webinar</a>, “Cross-border Litigation: Preparing for the Unknown,” this Thursday, March 29th.</p>
<p>This webinar will be led by e-Disclosure authority Chris Dale and information management expert Johannes Scholtes and will dive into the specifics of data privacy and how it impacts eDiscovery during cross-border litigation. It is part of ZyLAB EU/UK’s continuing European series of discussions for legal professionals and eDiscovery experts to explore the various challenges and scenarios when working within foreign courts and overseas legal systems.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=189</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Importance of Natural Language Processing for Content Analytics</title>
		<link>http://www.zylab.com/blog/?p=186</link>
		<comments>http://www.zylab.com/blog/?p=186#comments</comments>
		<pubDate>Tue, 27 Mar 2012 15:55:49 +0000</pubDate>
		<dc:creator>Johannes Scholtes</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[Johannes Scholtes Posts]]></category>
		<category><![CDATA[anaphora]]></category>
		<category><![CDATA[apposition]]></category>
		<category><![CDATA[ediscovery]]></category>
		<category><![CDATA[entity detection]]></category>
		<category><![CDATA[governance]]></category>
		<category><![CDATA[information archiving]]></category>
		<category><![CDATA[information management]]></category>
		<category><![CDATA[natural language processing]]></category>
		<category><![CDATA[normalization]]></category>
		<category><![CDATA[pos analysis]]></category>
		<category><![CDATA[predicate nominative]]></category>
		<category><![CDATA[text analysis]]></category>
		<category><![CDATA[text mining]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=186</guid>
		<description><![CDATA[State-of-the-art text analysis supports multiple languages, which is critical when investigations go global and involve collections of information in various languages. In such scenarios, the technology obviously adapts to differences in character sets and words, but the tools also &#8230; <a href="http://www.zylab.com/blog/?p=186">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>State-of-the-art text analysis supports multiple languages, which is critical when investigations go global and involve collections of information in various languages. In such scenarios, the technology obviously adapts to differences in character sets and words, but the tools also need to incorporate statistics and linguistic properties (e.g., conjunction, grammar, sentiments or meanings) of a language in order to achieve acceptable performance. This kind of Natural Language Processing can dramatically influence the insight that is drawn from text.</p>
<p>However, some text analytics products take a so-called “bag of words” (BOW) approach, in which all words (maybe with the exception of a list of high-frequency noise words) are dumped into a mathematical model, without any additional knowledge or interpretation of linguistic patterns and properties such as word order (“a good book” versus “book a good”), synonyms, spelling and syntactical variations, co-reference and pronoun resolution, or negations.</p>
<p>This “bag of words” approach takes simplicity one step too far. Here is why.</p>
<p>There are special issues that one has to take into account when applying, for instance, entity-, fact-, event- and concept extraction techniques in text mining and where Natural Language Processing can make the difference:</p>
<ul>
<li>Variant Identification and Grouping: Variant names sometimes need to be recognized as different forms of the same entity, so that entity counts are accurate and all appearances of a given entity can be located. For example, one needs to recognize that a later mention of &#8220;Smith&#8221; refers to the &#8220;Joe Smith&#8221; identified earlier, and group both as aliases of the same entity.</li>
<li>Normalization: Entities such as dates, currencies, and measurements are converted into standard formats, taking the guesswork out of the metadata creation, search, data mining, and link analysis processes. For example, “March 6, 2012” and “6 Mar 2012” can both be stored as 2012-03-06.</li>
<li>Entity Boundary Detection: Will the technology consider “Mr. and Ms. John Jones” as one or two entities? And what will the processor consider to be the start and end of an excerpt like “VP John M.P. Kaplan-Jones, Ph.D. M.D.”?</li>
</ul>
<p>Such basic normalizations will not only dramatically reduce the size of the data set; they will also result in better data analysis and visualization. Entities that would not be related without normalization can be the missing link between two data sets, especially if they are written differently in different parts of the data set or if they are not properly recognized as a singular or plural entity.</p>
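<p>As a rough illustration of date normalization, the Python sketch below maps a few differently written mentions of the same day onto one ISO 8601 string. The list of input formats and the <code>normalize_date</code> helper are assumptions made for this example; a production system would need far broader, locale-aware coverage.</p>
<pre><code class="language-python">from datetime import datetime

# Assumed input formats; month names are parsed using the current
# locale, which is taken to be English here.
KNOWN_FORMATS = ("%B %d, %Y", "%d %b %Y", "%Y-%m-%d")


def normalize_date(text):
    """Return the date as an ISO 8601 string, or None if no format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


mentions = ["March 6, 2012", "6 Mar 2012", "2012-03-06"]
print({normalize_date(m) for m in mentions})  # prints {'2012-03-06'}
</code></pre>
<p>Three distinct strings collapse into a single value, which is what makes it possible to link mentions of the same date across different parts of a data set.</p>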
<p>In addition, one of the other problems in the discovery and identification of entities, facts, events and concepts is the resolution of so-called <em>anaphora</em> and <em>co-references</em>. This is the linguistic problem of associating pairs of linguistic expressions that refer to the same entities in the real world.</p>
<p>Consider the following text:</p>
<p>“A man walks to the station and tries to catch the train. His name is John Doe. Later he meets his colleague, who has just bought a card for the same train. They work together at the Rail Company as technical employees and they are going to a meeting with colleagues in New York.”</p>
<p>The text contains various references and co-references. Various <em>anaphora</em> and co-references will have to be disambiguated before it is possible to fully understand and extract the more complex patterns of events. The following list shows examples of these (mutual) references:</p>
<ul>
<li><em>Pronominal Anaphora</em>: he, she, we, oneself, etc.</li>
<li><em>Proper Name Co-reference</em>: For example, multiple references to the same name.</li>
<li><em>Apposition</em>: the additional information given to an entity, such as “John Doe, the father of Peter Doe”.</li>
<li><em>Predicate Nominative</em>: the additional description given to an entity, for example “John Doe, who is the chairman of the soccer club”.</li>
<li><em>Identical Sets</em>: A number of reference sets referring to equivalent entities, such as “Giants”, “the best team”, and the “group of players” which all refer to the same group of people.</li>
</ul>
<p>There are various ways of approaching these problems: (i) with an in-depth linguistic analysis of a sentence, or (ii) using a large already-annotated corpus. Both techniques have their advantages and disadvantages. There is still a lot of research required in this area in the coming years to improve the quality of these types of analyses, but there are already many reliable techniques to resolve co-references and anaphora.</p>
<p>Now, if all these natural language processing techniques are applied, one will be able to:</p>
<ul>
<li>Extract 2-4x more (and better) entities, facts, events and concepts than other providers that use, for instance, bag of words technology which completely ignores linguistic patterns, negations, pronouns and co-references.</li>
<li>Increase overall extraction quality by 20-30%, up to 95+% recall, especially thanks to the proper interpretation of negations and better boundary detection.</li>
<li>Provide a smaller and better set for data analysis such as the derivation of relationship networks and correlation patterns between custodians in social networks and email.</li>
<li>Create much better training sets for machine learning algorithms such as those used for machine assisted review or predictive coding.</li>
</ul>
<p>Language is not a jumbled bag of words. Technology that ignores simple linguistic structures such as synonyms, spelling and syntax variations, co-reference and pronoun resolution, or negations will probably never exceed 60-65% recall and precision. This stunted performance is caused by the simple fact that lots of relevant information is ignored or wrongly used to build training sets for machine learning. To end users, 60-65% is only 10-15 percentage points better than random selection, which is often interpreted as unreliable behavior for eDiscovery, Governance, Enterprise Information Archiving and other Information Management initiatives. Yet surprisingly, many eDiscovery-related software products do not support Natural Language Processing and therefore completely ignore relevant linguistic patterns like those described here.</p>
<p>Of course, there are a few exceptions; some eDiscovery and information management products do offer Natural Language Processing as part of their text analysis.</p>
<p>For <em>text mining,</em> a very in-depth analysis is often not necessary; a reasonably light analysis can be sufficient to identify the most important elements of a sentence: the subject clause, the verb clause, potential proper nouns, references, and other relationships. In many cases <em>finite-state parsers</em> or <em>shallow parsers</em> can be used with the support of dictionaries. Such an analysis is also commonly known as a <em>part-of-speech</em> (<em>POS</em>) analysis.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=186</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing Our “Discover eDiscovery” Video: See Why ZyLAB is the Best in eDiscovery and Investigative Data Solutions</title>
		<link>http://www.zylab.com/blog/?p=183</link>
		<comments>http://www.zylab.com/blog/?p=183#comments</comments>
		<pubDate>Thu, 22 Mar 2012 14:49:06 +0000</pubDate>
		<dc:creator>Brenda Mahedy</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[ediscovery]]></category>
		<category><![CDATA[edrm]]></category>
		<category><![CDATA[electronic discovery reference model]]></category>
		<category><![CDATA[exploratory search]]></category>
		<category><![CDATA[information management]]></category>
		<category><![CDATA[litigation response]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=183</guid>
		<description><![CDATA[Every sizeable organization at some point will be subject to an investigation, often with little or no warning. The investigation may be related to a litigation response or it may be part of a regulatory probe. Deadlines are tight, best &#8230; <a href="http://www.zylab.com/blog/?p=183">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Every sizeable organization at some point will be subject to an investigation, often with little or no warning. The investigation may be related to a litigation response or it may be part of a regulatory probe. Deadlines are tight, best practices are critical, and you cannot afford to miss anything.</p>
<p>This is when organizations discover the difference between world-class, end-to-end eDiscovery technology and other products on the market. Watch our new “Discover eDiscovery” <a href="../../discover_ediscovery.aspx">video</a> to see how ZyLAB’s powerful exploratory search is at the heart of our industry-leading software for eDiscovery, investigations and governance.</p>
<p>One component of a robust eDiscovery and information management platform is exploratory search. Unlike more traditional search tools, a legal or exploratory search product supports all the disparate data in the enterprise and retrieves all of the potentially relevant hits—not just the most popular ones. With the increased usage of cloud computing, social media, and multimedia, partnering with an eDiscovery software and services leader that handles every data format, source, and language with ease is crucial.</p>
<p>For example, the ZyLAB eDiscovery &amp; Production System delivers advanced, exploratory searches across all electronically stored information in 400+ foreign languages. ZyLAB is the only eDiscovery and information management product that is optimized for 100% recall of relevant data through a combination of advanced search methods and the ability to interpret and handle complex files. Our technology reveals that crucial piece of evidence no matter how deeply it is buried in the organization. Plus, our software is proven, innovative, and fosters global best practices, such as Electronic Discovery Reference Model (EDRM) and Sedona.</p>
<p>So, when litigation—or the regulator—is knocking at your door, ZyLAB helps you save cost and time when there is none to spare.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=183</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New Cloud Collector Software Provides Evidence-Grade Email Collection from the Cloud</title>
		<link>http://www.zylab.com/blog/?p=175</link>
		<comments>http://www.zylab.com/blog/?p=175#comments</comments>
		<pubDate>Thu, 15 Mar 2012 15:34:36 +0000</pubDate>
		<dc:creator>Brenda Mahedy</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[ediscovery]]></category>
		<category><![CDATA[email ediscovery]]></category>
		<category><![CDATA[exchange activesync]]></category>
		<category><![CDATA[information management]]></category>
		<category><![CDATA[microsoft exchange]]></category>
		<category><![CDATA[office 365]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=175</guid>
		<description><![CDATA[Email eDiscovery tasks would be so much easier with a system that could cut through the mess of web-based email accounts and compile all information in one system with the same dexterity and intensity as collections from corporate servers and &#8230; <a href="http://www.zylab.com/blog/?p=175">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Email eDiscovery tasks would be so much easier with a system that could cut through the mess of web-based email accounts and compile all information in one system with the same dexterity and intensity as collections from corporate servers and data stores.</p>
<p>Nowadays professionals carry out many business tasks remotely and via cloud-based tools. Plus, more and more courts are compelling production of “personal” data within online email. It is critical that those sources are properly supported by eDiscovery and information management software.</p>
<p>With the multitude of cloud-based email products and with potential evidence being stored in calendars, contacts, messages, attachments and a variety of other places, this new collector can help alleviate some of the obstacles of getting to the information. Are custodians using Gmail? Office 365*? Not a problem.</p>
<p>The new Cloud Collector pulls data from Microsoft Exchange Online (part of Office 365*) and from email messaging applications that support the Exchange ActiveSync protocol, IMAP and Post Office Protocol (POP3), making it much easier to collect and search email data from the diverse systems used by custodians.</p>
<p>Being able to efficiently and effectively collect web-based email data and process it alongside all of the other ESI makes for a more complete picture of the case and potential evidence.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=175</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>From Quantitative to Qualitative Early Case Assessment</title>
		<link>http://www.zylab.com/blog/?p=172</link>
		<comments>http://www.zylab.com/blog/?p=172#comments</comments>
		<pubDate>Tue, 06 Mar 2012 16:09:07 +0000</pubDate>
		<dc:creator>Johannes Scholtes</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[Johannes Scholtes Posts]]></category>
		<category><![CDATA[automatic redaction]]></category>
		<category><![CDATA[de-duplication]]></category>
		<category><![CDATA[early case assessment]]></category>
		<category><![CDATA[ediscovery]]></category>
		<category><![CDATA[legal hold]]></category>
		<category><![CDATA[machine assisted review]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=172</guid>
		<description><![CDATA[It has recently become clear that a new breed of early case assessment (ECA) is emerging. In recent years, many eDiscovery vendors have focused on ECA as a means to trim down data volumes as fast as possible. As a &#8230; <a href="http://www.zylab.com/blog/?p=172">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>It has recently become clear that a new breed of early case assessment (ECA) is emerging. In recent years, many eDiscovery vendors have focused on ECA as a means to trim down data volumes as fast as possible. As a result, the potentially most insightful (albeit expensive) attorneys only see the data after it has been cleaned. The argument, of course, is that previewing pre-cleaned data is not worth the time of a $500/hour attorney. If they read all data in a linear fashion, that would be valid reasoning. I’ll call this scenario “Quantitative Early Case Assessment”.</p>
<p>Examples of technology to support this approach are:</p>
<ul>
<li>Smart identification of custodians and data locations with automatic legal hold interviews. During the legal hold interview, software (like ZyLAB’s) can provide the customer with interview tools to determine a (very) early case assessment. Such interview questions are based on ABA best practices for ECA.</li>
<li>Automatic and recurring collection of data, not only by location, file name, extension, and MIME type, but also by using full-text search to collect only relevant documents that match the negotiated Boolean.</li>
<li>Exact de-duplication: NIST, in-house NIST, against other custodians, within custodian, and in production.</li>
<li>Support in the review process for batch coding in the case of email threads, related documents, (near) duplicates, etc.</li>
<li>Organizing review documents in groups with Machine Assisted Review technology to identify potentially privileged, confidential and responsive documents.</li>
<li>Automatic redaction.</li>
</ul>
<p>But there are many cases where a more qualitative approach can be very efficient and valuable to a $500/hour attorney. This is especially true when negotiating a settlement, which happens in 97% of all cases. In this scenario, parallel to the quantitative approach described above, a team of attorneys (often mediators) searches and analyzes the data to look for strong evidence that will allow them to accelerate the negotiation and settlement process with the opposing party. All data that is ingested and processed is automatically enriched with advanced content analytics and is available for in-depth search and advanced analysis and visualization. Data can be mapped against the claims and relevant documents can be found quickly and easily. In addition, valuable insights gained from “the earliest early case assessment” help steer the eDiscovery strategy and budget, and minimize the legal risk, exposure and expenses from later stages.</p>
<p>Using this parallel approach, a settlement is often reached well before the quantitative data processing work is done, and clients are saved a lot of money. More and more high-end law firms prefer to make money providing valuable legal advice to their clients (for instance in a settlement process), which has a much higher margin for them than low-value, low-margin document processing and review services, and which also results in happier (returning) clients.</p>
<p>Yes, I can see why Qualitative Early Case Assessment is giving Quantitative Early Case Assessment a run for its money!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=172</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Corporate Governance is Finally Becoming a Reality—Thanks to eDiscovery</title>
		<link>http://www.zylab.com/blog/?p=167</link>
		<comments>http://www.zylab.com/blog/?p=167#comments</comments>
		<pubDate>Tue, 21 Feb 2012 21:46:19 +0000</pubDate>
		<dc:creator>Johannes Scholtes</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[Johannes Scholtes Posts]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=167</guid>
		<description><![CDATA[At LegalTech 2012 in New York, many vendors showcased eDiscovery products for use in-house and in the cloud to adequately respond to eDiscovery. It seems most products still emphasize how to react when being sued. But there was a new trend &#8230; <a href="http://www.zylab.com/blog/?p=167">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>At LegalTech 2012 in New York, many vendors showcased eDiscovery products for use in-house and in the cloud to adequately respond to eDiscovery. It seems most products still emphasize how to <em><span style="text-decoration: underline">react</span></em> when being sued.</p>
<p>But there was a new trend at the show this year. For the first time, visitors sought <em><span style="text-decoration: underline">proactive</span></em> information management, enterprise information archiving, legacy information clean-up, defensible disposal, data monitoring for internal investigations such as non-compliance, early fraud detection and other intelligent governance tools. The focus is shifting away from solely limiting risks and costs on specific matters and toward using the software for better knowledge management and for helping an organization implement its strategic goals.</p>
<p>For most vendors, the marketing focus has remained squarely on reactive eDiscovery solutions for the simple reason that these generated the most revenue and closed many times faster than traditional Governance, Archival and other Information Management solutions.</p>
<p>But we now see that organizations that have bought and deployed reactive eDiscovery solutions are looking to leverage the same technology, people and skills for more proactive programs! This is especially true when the litigation workload is low, which is often the case, as litigation does not occur on a predictable timeline. There is really no excuse anymore not to use the tools and skills that one already has to limit the risks and costs of risky custodians, locations and projects, not to mention the benefits of less storage, fewer duplicates, fewer backups and better knowledge management.</p>
<p>So, finally, the only and ultimate solution to reduce eDiscovery risks and cost is being implemented: proactive eDiscovery, Governance and Information Archiving programs. All thanks to reactive eDiscovery!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=167</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Another Great LegalTech!</title>
		<link>http://www.zylab.com/blog/?p=161</link>
		<comments>http://www.zylab.com/blog/?p=161#comments</comments>
		<pubDate>Wed, 08 Feb 2012 15:57:41 +0000</pubDate>
		<dc:creator>Johannes Scholtes</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[Johannes Scholtes Posts]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=161</guid>
		<description><![CDATA[I’ve just returned from another great, high-energy and exciting LegalTech conference in New York. I love that show! Large crowds of old and new friends and incredible buzz. There was no shortage of software solutions that could be deployed either &#8230; <a href="http://www.zylab.com/blog/?p=161">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I’ve just returned from another great, high-energy and exciting LegalTech conference in New York. I love that show! Large crowds of old and new friends and incredible buzz.</p>
<p>There was no shortage of software solutions that could be deployed either in-house or in the cloud, but the availability of eDiscovery solutions that are fully parallel with distributed Virtual Machine architectures was minimal. I predict that will be the buzz next year, as the capability makes perpetual scalability easily attainable. Some of the showcased solutions cover the entire EDRM, whereas others address just parts of it. Some vendors also provided legal expertise to implement a solid, legally defensible methodology including templates, a well-documented quality control methodology, and referenceable case law.</p>
<p>We saw the beginnings of new approaches to Early Case Assessment, where next to the traditional quantitative data-processing eDiscovery approach, some vendors also made it possible to use advanced text, audio and image search in combination with content analytics and data visualization for a more qualitative Early Case Assessment (ECA) approach. We also saw advancements in Machine Assisted Review (including automatic creation of random data-sampling sets), Machine Translation and Automatic Redaction for hundreds of languages, electronic file formats and different database repositories including popular cloud-based data collections.</p>
<p>But, besides all these high-tech innovations, one of the key questions remains: can you collect from “my repository”, which often consists of legacy email archives or other obsolete content management systems for which support no longer exists? I hope that all the eDiscovery pain and challenges related to these proprietary archives make organizations look for more enduring and sustainable open archive solutions in the future. In reality, if everything had been archived in XML, for instance, then collection would be a simple endeavor as well.</p>
<p>Another continuing question relates to legal production formats. Many law firms negotiate exotic, or at least uncommon, requests, or they have found out that TIFF is not the best format for producing XLS (formulas), PPT (animations) or video. Intelligent and flexible legal production functionality can make or break a deal.</p>
<p>Also, for the first time at LegalTech, many visitors shied away from the eDiscovery bells and whistles and wanted to learn more about “proactive eDiscovery” for information management, enterprise information archiving, legacy information clean-up, defensible disposal, data monitoring for internal investigations such as non-compliance, early fraud detection and other intelligent governance tools. The focus seems to be shifting away from solely limiting risks and costs on specific matters and toward using the software for better knowledge management and for helping an organization implement its strategic goals.</p>
<p>In the upcoming weeks, I plan to elaborate on these topics in my blog posts, so look out for more!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=161</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Calculate your eDiscovery eValuation at #LTNY Booth 325</title>
		<link>http://www.zylab.com/blog/?p=154</link>
		<comments>http://www.zylab.com/blog/?p=154#comments</comments>
		<pubDate>Fri, 27 Jan 2012 21:35:07 +0000</pubDate>
		<dc:creator>Brenda Mahedy</dc:creator>
				<category><![CDATA[Archive]]></category>
		<category><![CDATA[ZyLAB Posts]]></category>

		<guid isPermaLink="false">http://www.zylab.com/blog/?p=154</guid>
		<description><![CDATA[ZyLAB&#8217;s new &#8220;eDiscovery eValuation&#8221; is thrilling CFOs everywhere. What if eDiscovery technology could add millions to your bottom line? What if software could make your alternative fee bids more competitive? What if your clients gladly paid for your premium services? &#8230; <a href="http://www.zylab.com/blog/?p=154">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>ZyLAB&#8217;s new &#8220;eDiscovery eValuation&#8221; is thrilling CFOs everywhere. What if eDiscovery technology could add millions to your bottom line? What if software could make your alternative fee bids more competitive? What if your clients gladly paid for your premium services? What if you could quantify the value of in-house eDiscovery software? Our eDiscovery eValuation will give you this insight.</p>
<p>During LegalTech New York, ZyLAB will showcase financial models for organizations to make informed decisions about eDiscovery in-house versus outsourcing and to pinpoint unnecessary over-spending on eDiscovery work.</p>
<p>Stop by <strong>booth 325 at LegalTech New York (#LTNY)</strong>. All you need is a few minutes to see our state-of-the-art eDiscovery tools and calculate the impact they will have on your organization. See you there!</p>
<p>Or visit us at <a href="http://www.zylab.com">www.zylab.com</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zylab.com/blog/?feed=rss2&#038;p=154</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

