Text Mining
Text Mining refers to the automatic extraction of data from different
archived resources for the purpose of discovering new or previously
unknown information in unstructured textual data. This extracted
information is linked together to create new facts or to make logical
associations that can be investigated in depth through the use of
sophisticated technology, such as ZyLAB's visualization, discovery
and disclosure tools.
Innovative visualization tools help users search for and find key
information faster. More specifically, visualization enables
relevant information and its interrelationships to be seen at the
same time, in a given context. Advanced visualization tools enable
users to analyze large data sets as a whole and can provide a threefold
increase in the speed at which specific information in these data
sets can be found.
Visualization does, however, require structure, and structure is
not always available within complex data sets. Manual structuring
is time-consuming and expensive, so users need efficient structuring
tools. For this reason, ZyLAB developed several manual structuring
tools that have been integrated into standard ZyIMAGE products.
The ZyIMAGE Text Mining tools add (semi-)automatic structuring
capabilities that further advance the process of giving structure to
unstructured data.
Types of text mining
ZyLAB provides two levels of text mining functionality:
ZyLAB Standard Text Mining Suite (a standard component of
the ZyIMAGE XML Wrapper module) provides users with the following capabilities:
- File property extraction
- Document property extraction
- Concept extraction
- Automatic language recognition
- Hashing capabilities for unique document identification (based
upon SHA-1)
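As an illustration of hash-based document identification, the following minimal Python sketch computes a SHA-1 fingerprint over a document's raw bytes and groups documents that share one. It uses Python's standard hashlib module and is not ZyLAB's implementation; the function names document_fingerprint and find_duplicates are hypothetical.

```python
import hashlib


def document_fingerprint(data: bytes) -> str:
    """Return the SHA-1 hex digest of a document's raw bytes."""
    return hashlib.sha1(data).hexdigest()


def find_duplicates(documents: dict[str, bytes]) -> dict[str, list[str]]:
    """Group document names by fingerprint and keep only shared fingerprints.

    `documents` maps a (hypothetical) document name to its raw content.
    """
    groups: dict[str, list[str]] = {}
    for name, data in documents.items():
        groups.setdefault(document_fingerprint(data), []).append(name)
    # Only fingerprints seen more than once indicate identical content.
    return {digest: names for digest, names in groups.items() if len(names) > 1}
```

Because the fingerprint depends only on content, two files with different names but identical bytes receive the same identifier, while a single changed byte produces a different one.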
ZyLAB Professional Text Mining Suite builds further upon
the Standard Text Mining functionality to offer advanced features
such as:
- Entity extraction
- Fact extraction
- Summarization
- Document categorization
- Automatic taxonomy generation (ATG)
The ZyIMAGE Professional Text Mining Suite provides users with
an advanced set of text analysis tools that enable them to perform
entity extraction, fact extraction, summarization, automatic taxonomy
generation and document categorization. This ZyIMAGE tool suite
is unique in that it also supports 31 languages, a very large set
of linguistic capabilities for technology of this level. (See the
product specification for the up-to-date language set.)
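To give a rough idea of what entity extraction produces, the following Python sketch pulls e-mail addresses and ISO-style dates out of free text with regular expressions. It is a deliberately simple, pattern-based stand-in and does not reflect how the ZyIMAGE suite works internally; the patterns and the extract_entities function are assumptions made for illustration.

```python
import re

# Hypothetical patterns for illustration only; production entity extraction
# typically relies on linguistic analysis rather than fixed patterns.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


def extract_entities(text: str) -> dict[str, list[str]]:
    """Return every match for each entity type, in order of appearance."""
    return {label: pattern.findall(text) for label, pattern in PATTERNS.items()}
```

Applied to a sentence such as "Contact jane.doe@example.com before 2004-03-15.", the function returns the address under "email" and the date under "date", which is the kind of structured output that can then feed categorization or visualization.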