public class ExtractionOptions extends Object
| Modifier | Constructor and Description |
|---|---|
| | `ExtractionOptions()` Initializes a new instance of the ExtractionOptions class. |
| | `ExtractionOptions(IndexingOptions options, IFieldExtractor customExtractor, IOcrConnector ocrConnector)` Initializes a new instance of the ExtractionOptions class. |
| | `ExtractionOptions(Object data)` Initializes a new instance of the ExtractionOptions class. |
| protected | `ExtractionOptions(Object state1, Object state2)` Initializes a new instance of the ExtractionOptions class. |

| Modifier and Type | Method and Description |
|---|---|
| boolean | `getAutoDetectEncoding()` Gets a value indicating whether to detect encoding automatically or not. |
| Object | `getCore()` |
| IFieldExtractor | `getCustomExtractor()` Gets the custom text extractor. |
| String | `getEncoding()` Gets the encoding used to extract text from text documents. |
| ImageIndexingOptions | `getImageIndexingOptions()` Gets the image indexing options for reverse image search. |
| MetadataIndexingOptions | `getMetadataIndexingOptions()` Gets the options for indexing metadata fields. |
| OcrIndexingOptions | `getOcrIndexingOptions()` Gets the options for OCR processing and indexing recognized text. |
| boolean | `getUseRawTextExtraction()` Gets a value indicating whether the raw mode is used for text extraction if possible. |
| void | `setAutoDetectEncoding(boolean value)` Sets a value indicating whether to detect encoding automatically or not. |
| void | `setCustomExtractor(IFieldExtractor value)` Sets the custom text extractor. |
| void | `setEncoding(String value)` Sets the encoding used to extract text from text documents. |
| void | `setUseRawTextExtraction(boolean value)` Sets a value indicating whether the raw mode is used for text extraction if possible. |

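The accessors summarized above follow the usual getter/setter pattern. The following is a minimal sketch of how they might be combined, not the library's own code: `ExtractionOptionsSketch` is a hypothetical stand-in that mirrors only the boolean and String accessors listed on this page, with the defaults stated in the method details (`false`, `null`, `true`). The helper name `configureForLegacyText` and the choice of `ISO-8859-1` are likewise assumptions made for illustration.

```java
// Hypothetical demo: configures the stand-in options for text files whose
// encoding is unknown, falling back to ISO-8859-1 (an arbitrary example).
final class ExtractionOptionsDemo {
    static ExtractionOptionsSketch configureForLegacyText() {
        ExtractionOptionsSketch options = new ExtractionOptionsSketch();
        options.setAutoDetectEncoding(true);     // try to detect the encoding
        options.setEncoding("ISO-8859-1");       // used as the default when detection is on
        options.setUseRawTextExtraction(false);  // prefer formatting over speed
        return options;
    }

    public static void main(String[] args) {
        ExtractionOptionsSketch options = configureForLegacyText();
        System.out.println("autoDetectEncoding   = " + options.getAutoDetectEncoding());
        System.out.println("encoding             = " + options.getEncoding());
        System.out.println("useRawTextExtraction = " + options.getUseRawTextExtraction());
    }
}

// Stand-in for illustration only: it mirrors the documented accessors and
// default values, and none of the real class's behavior.
final class ExtractionOptionsSketch {
    private boolean autoDetectEncoding = false;   // documented default: false
    private String encoding = null;               // documented default: null (UTF-8)
    private boolean useRawTextExtraction = true;  // documented default: true

    boolean getAutoDetectEncoding() { return autoDetectEncoding; }
    void setAutoDetectEncoding(boolean value) { autoDetectEncoding = value; }

    String getEncoding() { return encoding; }
    void setEncoding(String value) { encoding = value; }

    boolean getUseRawTextExtraction() { return useRawTextExtraction; }
    void setUseRawTextExtraction(boolean value) { useRawTextExtraction = value; }
}
```

With the real class, the same setter calls would presumably be made on an `ExtractionOptions` instance. How that instance is obtained and passed to the indexer is not covered on this page.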
public ExtractionOptions()
Initializes a new instance of the ExtractionOptions class.

public ExtractionOptions(IndexingOptions options, IFieldExtractor customExtractor, IOcrConnector ocrConnector)
Initializes a new instance of the ExtractionOptions class.
options - The options.
customExtractor - The custom extractor.
ocrConnector - The OCR connector.

public ExtractionOptions(Object data)
Initializes a new instance of the ExtractionOptions class.
data - The serialized data.

public boolean getAutoDetectEncoding()
Gets a value indicating whether to detect encoding automatically or not. The default value is false.

public Object getCore()

public IFieldExtractor getCustomExtractor()
Gets the custom text extractor. The default value is null.

public String getEncoding()
Gets the encoding used to extract text from text documents. The default value is null, which means that the default encoding UTF-8 is used. If AutoDetectEncoding is true, this value is used as the default encoding.

public ImageIndexingOptions getImageIndexingOptions()
public MetadataIndexingOptions getMetadataIndexingOptions()
public OcrIndexingOptions getOcrIndexingOptions()
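
The getEncoding() description above says that null stands for the default encoding, UTF-8. As a rough illustration of that rule only, the sketch below resolves an optional encoding name to a `java.nio.charset.Charset`. `EncodingFallback` and `resolve` are hypothetical names, and it is an assumption here that encoding names follow the Java charset naming scheme; the library's actual resolution logic is not documented on this page.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the library: applies the documented
// "null means UTF-8" rule to an optional encoding name.
final class EncodingFallback {
    static Charset resolve(String configuredEncoding) {
        if (configuredEncoding == null) {
            // No explicit encoding configured: fall back to UTF-8.
            return StandardCharsets.UTF_8;
        }
        // Throws UnsupportedCharsetException or IllegalCharsetNameException
        // if the name is unknown or malformed.
        return Charset.forName(configuredEncoding);
    }

    public static void main(String[] args) {
        System.out.println(resolve(null));
        System.out.println(resolve("ISO-8859-1"));
    }
}
```

Resolving the name eagerly like this surfaces a misspelled encoding at configuration time rather than in the middle of indexing.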

public boolean getUseRawTextExtraction()
Gets a value indicating whether the raw mode is used for text extraction if possible. The default value is true. The raw mode can significantly increase the indexing speed, but normal mode improves the formatting of the extracted text.

public void setAutoDetectEncoding(boolean value)
Sets a value indicating whether to detect encoding automatically or not. The default value is false.
value - A value indicating whether to detect encoding automatically or not.

public void setCustomExtractor(IFieldExtractor value)
Sets the custom text extractor. The default value is null.
value - The custom text extractor.

public void setEncoding(String value)
Sets the encoding used to extract text from text documents. The default value is null, which means that the default encoding UTF-8 is used. If AutoDetectEncoding is true, this value is used as the default encoding.
value - The encoding used to extract text from text documents.

public void setUseRawTextExtraction(boolean value)
Sets a value indicating whether the raw mode is used for text extraction if possible. The default value is true. The raw mode can significantly increase the indexing speed, but normal mode improves the formatting of the extracted text.
value - A value indicating whether the raw mode is used for text extraction if possible.

Copyright © 2026. All rights reserved.