GroupDocs.Search for .NET 25.1 Release Notes
Full List of Issues Covering all Changes in this Release
| Key | Summary | Category |
|---|---|---|
| SEARCHNET-3252 | Implement the ability to separately extract data from documents in the search network | Feature |
| SEARCHNET-3405 | Raises ArgumentException when opening a POTX file | Fix |
| SEARCHNET-3406 | Raises ArgumentOutOfRangeException when opening a PDF file | Fix |
Public API and Backward Incompatible Changes
Implement the ability to separately extract data from documents in the search network
This feature lets you extract data from documents in a search network without indexing the extracted data immediately. Because extraction is now a separate step, you can create a temporary network that dedicates significant computing resources to extraction for a short period. Once extraction is finished, those resources can be released, leaving only the smaller share of computing resources needed to search the network.
Public API changes
Property GroupDocs.Search.Common.ExtractedData Data has been added to GroupDocs.Search.Scaling.Events.DataExtractedEventArgs class.
Method Void Extract(System.Collections.Generic.IList<GroupDocs.Search.Common.Document>, System.Collections.Generic.IList<System.String>, GroupDocs.Search.Options.IndexingOptions) has been added to GroupDocs.Search.Scaling.Indexer class.
Property GroupDocs.Search.Scaling.SearchNetworkStatus Status has been added to GroupDocs.Search.Scaling.Indexer class.
Event System.EventHandler ExtractionCompleted has been added to GroupDocs.Search.Scaling.Events.NodeEventHub class.
Field GroupDocs.Search.Scaling.SearchNetworkStatus Extracting has been added to GroupDocs.Search.Scaling.SearchNetworkStatus enum.
Use cases
using System;
using System.Collections.Generic;
using System.IO;
using GroupDocs.Search.Common;
using GroupDocs.Search.Options;

// Assumes `node` is an initialized search network node and
// `filePaths` is a string[] with the paths of the documents to process.
List<ExtractedData> extractedData = new List<ExtractedData>();

// Subscribing to the event to receive data extraction results
node.Events.DataExtracted += (s, e) =>
{
    extractedData.Add(e.Data);
};

// Creating one stream-based document per file
Stream[] streams = new Stream[filePaths.Length];
Document[] documents = new Document[filePaths.Length];
string[] passwords = new string[filePaths.Length]; // Entries stay null: no passwords are supplied
for (int i = 0; i < filePaths.Length; i++)
{
    string filePath = filePaths[i];
    DateTime modificationDate = File.GetLastWriteTime(filePath);
    string fileName = Path.GetFileName(filePath);
    string extension = Path.GetExtension(filePath);
    Stream stream = File.OpenRead(filePath);
    streams[i] = stream;
    Document document = Document.CreateFromStream(
        fileName,
        modificationDate,
        extension,
        stream);
    documents[i] = document;
}

// Configuring extraction: image indexing and OCR are enabled for all image sources
IndexingOptions options = new IndexingOptions();
options.UseRawTextExtraction = false;
options.ImageIndexingOptions.EnabledForSeparateImages = true;
options.ImageIndexingOptions.EnabledForEmbeddedImages = true;
options.ImageIndexingOptions.EnabledForContainerItemImages = true;
options.OcrIndexingOptions.EnabledForSeparateImages = true;
options.OcrIndexingOptions.EnabledForEmbeddedImages = true;
options.OcrIndexingOptions.EnabledForContainerItemImages = true;

// Extracting data from documents
node.Indexer.Extract(documents, passwords, options);

// Closing the source streams once extraction has been requested
for (int i = 0; i < streams.Length; i++)
{
    streams[i].Close();
}

// Adding extracted data to the index
node.Indexer.Add(extractedData, options);
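
The example above passes the collected data to Add right after Extract returns. If extraction results may still be arriving at that point, one option is to wait for the new ExtractionCompleted event first. The following is a minimal sketch rather than documented usage: the ExtractionWaiter helper and its parameters are hypothetical, the node type is assumed to be GroupDocs.Search.Scaling.SearchNetworkNode, and the sketch assumes that ExtractionCompleted is raised on the node once the extraction request has finished.

```csharp
using System;
using System.Threading;
using GroupDocs.Search.Scaling;

// Hypothetical helper, not part of GroupDocs.Search.
static class ExtractionWaiter
{
    // Runs startExtraction and blocks until the node raises ExtractionCompleted
    // or the timeout elapses. Returns true if the event was observed in time.
    // Assumption: node.Events exposes the ExtractionCompleted event listed
    // under "Public API changes" and raises it when extraction is done.
    public static bool RunAndWait(
        SearchNetworkNode node,
        Action startExtraction,
        TimeSpan timeout)
    {
        using (ManualResetEventSlim completed = new ManualResetEventSlim(false))
        {
            EventHandler handler = (sender, e) => completed.Set();

            // Subscribe before starting so the event cannot be missed
            node.Events.ExtractionCompleted += handler;
            try
            {
                startExtraction();
                return completed.Wait(timeout);
            }
            finally
            {
                node.Events.ExtractionCompleted -= handler;
            }
        }
    }
}

// Possible usage with the variables from the example above
// (the ten-minute timeout is an arbitrary illustration):
//
// bool finished = ExtractionWaiter.RunAndWait(
//     node,
//     () => node.Indexer.Extract(documents, passwords, options),
//     TimeSpan.FromMinutes(10));
// if (finished)
// {
//     node.Indexer.Add(extractedData, options);
// }
```

The new Indexer.Status property, which can now report SearchNetworkStatus.Extracting, may serve as an alternative for polling progress. Its exact timing is not described in these notes, so treat that as an assumption to verify.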
Raises ArgumentException when opening a POTX file
Indexing of certain POTX documents, which previously raised ArgumentException, has been fixed.
Public API changes
None.
Use cases
None.
Raises ArgumentOutOfRangeException when opening a PDF file
Indexing of certain PDF documents, which previously raised ArgumentOutOfRangeException, has been fixed.
Public API changes
None.
Use cases
None.