ZyLAB Whitepaper "Text Analysis: The Next Step in Search Technology" to Appear in June Edition of KMWorld Magazine
Whitepaper Outlines Challenges and Opportunities for Implementing Text Analysis Solutions to Find Information not Known Beforehand
McLean, VA - May 19, 2009 - ZyLAB, an innovative developer of Information Access Solutions, today announced that it has released a new educational whitepaper (appearing in the June issue of KMWorld Magazine) that provides guidance to organizations that need to better understand the challenges and opportunities associated with using text analysis solutions.
Whitepaper Overview:
- Text analysis differs from traditional search: whereas search requires users to know what they are looking for, text analysis attempts to discover information in patterns that are not known beforehand, using advanced techniques such as pattern recognition, natural language processing and machine learning. By focusing on patterns and characteristics, text analysis can produce better search results and deeper data analysis, providing quick retrieval of information that would otherwise remain hidden.
- Despite the limitations and challenges profiled in this whitepaper, on balance the next few years will see extensive application of text analysis in two areas: e-discovery and compliance. Associated with these are the cognate areas of bankruptcy settlements, due diligence processes, and the handling of data rooms during a takeover or a merger.
Whitepaper Key Sections:
- Challenges Facing Text Analysis
Perhaps the biggest challenge with text analysis is that increasing recall can compromise precision, meaning that users end up having to browse large collections of documents to verify their relevance. Standard approaches to countering decreasing precision rely on language-based technology. However, when text collections are not in one language, are not domain specific, and/or contain documents of variable sizes and types, these approaches often fail or are too sophisticated for users to understand what processes are actually taking place, which diminishes their control.
- Control of Unstructured Information
Text analysis uses various mathematical, statistical, linguistic and pattern-recognition techniques to allow automatic analysis of unstructured information as well as the extraction of high-quality and relevant data.
- Information Visualization
Text analysis is often mentioned in the same sentence as information visualization, in large part because visualization is one of the viable technical tools for information analysis after unstructured information has been structured.
- Text Analysis on Non-English Documents
Few text analysis solutions provide real coverage for languages other than English. Due to large investments by the US government, languages such as Arabic, Farsi, Urdu, Somali, Chinese and Russian are often well covered, but German, Spanish, French, Dutch and the Scandinavian languages are rarely fully supported. These limitations need to be taken into account when applying text analysis technology in international cases.
- A Prosperous Future for Text Analysis
As major legislative changes and stricter control systems will undoubtedly come into effect, companies will have to carry out regular (real-time) internal preventative investigations, deeper audits, and risk analyses. Text analysis technology will become an essential tool to help process and analyze the enormous amount of information in a timely fashion.