GroupDocs.Search for .NET 25.1 Release Notes

This page contains release notes for GroupDocs.Search for .NET 25.1.

Full List of Issues Covering all Changes in this Release

Key	Summary	Category
SEARCHNET-3252	Implement the ability to separately extract data from documents in the search network	Feature
SEARCHNET-3405	Raises ArgumentException when opening a POTX file	Fix
SEARCHNET-3406	Raises ArgumentOutOfRangeException when opening a PDF file	Fix

Public API and Backward Incompatible Changes

Implement the ability to separately extract data from documents in the search network

This feature lets you extract data from documents in a search network without indexing the extracted data at the same time. Because extraction is separated from indexing, you can create a temporary network that dedicates significant computing resources to extraction for a short period. Once extraction is complete, those resources can be released, and only the smaller amount of computing resources needed for searching remains in use.

Public API changes

Property GroupDocs.Search.Common.ExtractedData Data has been added to GroupDocs.Search.Scaling.Events.DataExtractedEventArgs class.
Method Void Extract(System.Collections.Generic.IList<GroupDocs.Search.Common.Document>, System.Collections.Generic.IList<System.String>, GroupDocs.Search.Options.IndexingOptions) has been added to GroupDocs.Search.Scaling.Indexer class.
Property GroupDocs.Search.Scaling.SearchNetworkStatus Status has been added to GroupDocs.Search.Scaling.Indexer class.
Event System.EventHandler ExtractionCompleted has been added to GroupDocs.Search.Scaling.Events.NodeEventHub class.
Field GroupDocs.Search.Scaling.SearchNetworkStatus Extracting has been added to GroupDocs.Search.Scaling.SearchNetworkStatus enum.

Use cases

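// 'node' (a search network node) and 'filePaths' (string[]) are assumed to be defined elsewhere
// Collection for the extraction results
List<ExtractedData> extractedData = new List<ExtractedData>();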

// Subscribing to the event to receive data extraction results
node.Events.DataExtracted += (s, e) =>
{
    extractedData.Add(e.Data);
};

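// Creating a document from a stream for each file
Stream[] streams = new Stream[filePaths.Length];
Document[] documents = new Document[filePaths.Length];
// No passwords are set in this example, so every entry stays null
string[] passwords = new string[filePaths.Length];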
for (int i = 0; i < filePaths.Length; i++)
{
    string filePath = filePaths[i];
    DateTime modificationDate = File.GetLastWriteTime(filePath);
    string fileName = Path.GetFileName(filePath);
    string extension = Path.GetExtension(filePath);
    Stream stream = File.OpenRead(filePath);
    streams[i] = stream;
    Document document = Document.CreateFromStream(
        fileName,
        modificationDate,
        extension,
        stream);
    documents[i] = document;
}

IndexingOptions options = new IndexingOptions();
options.UseRawTextExtraction = false;
options.ImageIndexingOptions.EnabledForSeparateImages = true;
options.ImageIndexingOptions.EnabledForEmbeddedImages = true;
options.ImageIndexingOptions.EnabledForContainerItemImages = true;
options.OcrIndexingOptions.EnabledForSeparateImages = true;
options.OcrIndexingOptions.EnabledForEmbeddedImages = true;
options.OcrIndexingOptions.EnabledForContainerItemImages = true;

// Extracting data from documents
node.Indexer.Extract(documents, passwords, options);

// Closing the document streams
for (int i = 0; i < streams.Length; i++)
{
    streams[i].Close();
}

// Adding extracted data to the index
node.Indexer.Add(extractedData, options);
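
These notes do not state whether Extract returns before every DataExtracted event has been raised. If it can return earlier, the new ExtractionCompleted event listed under "Public API changes" offers a point to wait for before closing the streams and adding the collected data. The following is a minimal sketch of such a waiting step, not the library's own mechanism: CompletionWaiter and RunAndWait are hypothetical names, the helper uses only standard .NET types, and the timeout in the usage comment is an arbitrary value.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; it is not part of GroupDocs.Search.
public static class CompletionWaiter
{
    // Subscribes a handler to a completion event, starts the work,
    // and blocks until the event fires or the timeout elapses.
    // Returns true if the event fired within the timeout.
    public static bool RunAndWait(
        Action<EventHandler> subscribe,
        Action<EventHandler> unsubscribe,
        Action start,
        TimeSpan timeout)
    {
        var completed = new TaskCompletionSource<bool>();
        EventHandler handler = (sender, args) => completed.TrySetResult(true);

        // Subscribe before starting so that an early event is not missed
        subscribe(handler);
        try
        {
            start();
            return completed.Task.Wait(timeout);
        }
        finally
        {
            // Always detach the handler, even on timeout or exception
            unsubscribe(handler);
        }
    }
}

// Possible usage with the objects from the use case above
// (assumes extraction may still be running when Extract returns):
//
// bool finished = CompletionWaiter.RunAndWait(
//     h => node.Events.ExtractionCompleted += h,
//     h => node.Events.ExtractionCompleted -= h,
//     () => node.Indexer.Extract(documents, passwords, options),
//     TimeSpan.FromMinutes(10));
```

If RunAndWait returns false, the event did not arrive in time, so the collected data may be incomplete and should not be added to the index without a further check.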

Raises ArgumentException when opening a POTX file

Fixed an ArgumentException that was raised when indexing certain POTX documents.

Public API changes

None.

Use cases

None.

Raises ArgumentOutOfRangeException when opening a PDF file

Fixed an ArgumentOutOfRangeException that was raised when indexing certain PDF documents.

Public API changes

None.

Use cases

None.