Using the text mining nodes to process a large collection of documents
can require a lot of computing time and resources. If you have limited
resources, it might be necessary to take one or more of the following
actions:
-
Use a sample of the document collection.
-
When using the Text Miner node, set some of the Parse properties to No,
such as Find Entities, Noun Groups, and Terms in Single Document.
-
When using the Text Parsing node, set some of the Detect properties to
No, such as Find Entities and Noun Groups.
-
In the Text Miner node, reduce the number of SVD dimensions or roll-up
terms. If you run into memory problems with the SVD approach, you can
roll up a set number of terms instead; the remaining terms are then
dropped automatically.
-
Use the Ignore properties of the Text Parsing node to limit parsing to
high-information words. You can do this by ignoring all parts of speech
other than nouns, proper nouns, noun groups, and verbs.
-
You can also use the Parse properties
in the Text Miner node to ignore all parts of speech other than nouns,
proper nouns, noun groups, and verbs.
-
For best results, make sure that sentences are properly structured, with
correct grammar, punctuation, and capitalization. Entity extraction does
not always generate reasonable results.
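
Two of the actions above, sampling the collection and limiting parsing to high-information words, can be illustrated outside the text mining nodes. The following is a minimal Python sketch and not part of the product. The function names, the tag labels, and the assumption that each term arrives already paired with a part-of-speech tag are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical tag labels; a real tagger or node uses its own names.
HIGH_INFO_TAGS = {"NOUN", "PROPN", "NOUN_GROUP", "VERB"}


def sample_documents(documents, fraction, seed=0):
    """Return a reproducible random sample of a list of documents."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    if not documents:
        return []
    # Keep at least one document so that later steps have input.
    size = max(1, round(len(documents) * fraction))
    rng = random.Random(seed)  # fixed seed makes the sample repeatable
    return rng.sample(documents, size)


def keep_high_info(tagged_terms):
    """Keep only terms tagged as noun, proper noun, noun group, or verb."""
    return [term for term, tag in tagged_terms if tag in HIGH_INFO_TAGS]


if __name__ == "__main__":
    docs = [f"document {i}" for i in range(100)]
    subset = sample_documents(docs, fraction=0.1, seed=42)
    print(len(subset), "of", len(docs), "documents kept")

    tagged = [("the", "DET"), ("miner", "NOUN"), ("parses", "VERB"),
              ("quickly", "ADV"), ("SAS", "PROPN")]
    print(keep_high_info(tagged))
```

Sampling first and filtering second keeps the expensive parsing work proportional to the sample size. The fixed seed is a design choice that makes runs comparable when you change other settings.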