GroupDocs.Search for .NET 24.3 Release Notes

Major Features

This release contains the following features, enhancements, and fixes:

  • Implement support for surrogate pairs
  • Enable Chinese, Japanese, Korean alphabets by default

Full List of Issues Covering all Changes in this Release

Key             Summary                                                 Category
SEARCHNET-3102  Implement support for surrogate pairs                   Breaking Change
SEARCHNET-3100  Enable Chinese, Japanese, Korean alphabets by default   Enhancement

Public API and Backward Incompatible Changes

Implement support for surrogate pairs

This change adds support for Unicode surrogate pairs. A surrogate pair is now indexed as a single character, and the Alphabet and Character Replacement dictionaries treat each surrogate pair as a single, indivisible character. For this reason, the dictionary members listed under Public API changes identify characters by Int32 code point instead of Char.
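As background, the short sketch below uses only standard .NET calls (nothing from GroupDocs.Search) to show why a single Char cannot hold such a character: a code point above U+FFFF occupies two UTF-16 code units. The code point U+20E97 is the one used in the use case further down.

```csharp
using System;

// U+20E97 lies outside the Basic Multilingual Plane,
// so UTF-16 needs two Char values (a surrogate pair) to store it.
int codePoint = 0x20E97;
string text = char.ConvertFromUtf32(codePoint);

Console.WriteLine(text.Length);                               // 2 (two UTF-16 code units)
Console.WriteLine(char.IsSurrogatePair(text[0], text[1]));    // True
Console.WriteLine(char.ConvertToUtf32(text, 0) == codePoint); // True (round trip)
```

An Int32 code point, by contrast, identifies the whole character with one value, which matches the Int32-based signatures in this release.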

Public API changes

Method System.Collections.Generic.IEnumerator<System.Char> GetEnumerator() has been removed from GroupDocs.Search.Dictionaries.Alphabet class.
Method GroupDocs.Search.Dictionaries.CharacterType GetCharacterType(Int32) has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method GroupDocs.Search.Dictionaries.CharacterType GetCharacterType(System.String) has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method System.Collections.Generic.IEnumerator<System.Int32> GetEnumerator() has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method Void SetRange(System.String[], GroupDocs.Search.Dictionaries.CharacterType) has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method Void SetRange(Int32[], GroupDocs.Search.Dictionaries.CharacterType) has been added to GroupDocs.Search.Dictionaries.Alphabet class.

Method Boolean Contains(Char) has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method System.Collections.Generic.IEnumerator<System.Char> GetEnumerator() has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Char GetReplacement(Char) has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Void AddRange(System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<System.Int32, System.Int32>>) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Boolean Contains(Int32) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method System.Collections.Generic.IEnumerator<System.Int32> GetEnumerator() has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Int32 GetReplacement(Int32) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Void RemoveRange(System.Collections.Generic.IEnumerable<System.Int32>) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Void RemoveRange(Int32[]) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.

Property Char Character has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Property Char Replacement has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Property Int32 Character has been added to GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Constructor CharacterReplacementPair(Int32, Int32) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Property Int32 Replacement has been added to GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
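Code that previously passed Char values to these dictionaries may need to produce Int32 code points instead. The following is a minimal sketch of one way to do that with standard .NET calls only. ToCodePoints is a hypothetical helper, not part of GroupDocs.Search; its result has the Int32[] shape that the SetRange(Int32[], CharacterType) overload listed above accepts.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of GroupDocs.Search): splits a string into
// Unicode code points, combining each surrogate pair into one Int32 value.
// char.ConvertToUtf32 throws ArgumentException on an unpaired surrogate.
static int[] ToCodePoints(string text)
{
    var codePoints = new List<int>();
    for (int i = 0; i < text.Length; i++)
    {
        // Reads one code point; consumes two Char values for a surrogate pair
        codePoints.Add(char.ConvertToUtf32(text, i));
        if (char.IsHighSurrogate(text[i]))
        {
            i++; // skip the low surrogate that was already consumed
        }
    }
    return codePoints.ToArray();
}

// "A" is one Char and U+20E97 is two, yet each yields exactly one code point
string sample = "A" + char.ConvertFromUtf32(0x20E97);
int[] result = ToCodePoints(sample);
Console.WriteLine(string.Join(", ", Array.ConvertAll(result, cp => "U+" + cp.ToString("X4"))));
// prints "U+0041, U+20E97"
```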

Use cases

The following example shows how to search for Unicode characters that are represented by surrogate pairs in UTF-16, as well as for sequences of characters that have the SeparateWord type.

C#

using GroupDocs.Search;
using GroupDocs.Search.Results;

string indexFolder = @"c:\MyIndex\";
string documentsFolder = @"c:\MyDocuments\";

// Creating an index in the specified folder
Index index = new Index(indexFolder);

// Indexing documents from the specified folder
index.Add(documentsFolder);

// Search for a character stored as a surrogate pair, built from its Unicode code point
string query1 = char.ConvertFromUtf32(0x20E97);
SearchResult result1 = index.Search(query1);

// Search for a sequence of Chinese characters
// Note that the query is enclosed in double quotes and the characters are separated by spaces
string query2 = "\"入 里 面\"";
SearchResult result2 = index.Search(query2);

Enable Chinese, Japanese, Korean alphabets by default

With this enhancement, Chinese, Japanese, and Korean text is indexed by default, without setting character types in the Alphabet dictionary. Each character of these languages has the SeparateWord type, so every character is indexed as a separate word. To search for a sequence of such characters, use phrase search syntax.
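The phrase form used in the surrogate pair use case (double quotes around the query, spaces between the characters) can be built programmatically. The sketch below is one way to do it with standard .NET calls. BuildPhraseQuery is a hypothetical helper, not part of GroupDocs.Search, and it assumes that every character in the input has the SeparateWord type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of GroupDocs.Search): puts a space between
// every code point and wraps the result in double quotes.
// Assumes all input characters have the SeparateWord type.
static string BuildPhraseQuery(string characters)
{
    var parts = new List<string>();
    for (int i = 0; i < characters.Length; i++)
    {
        // Keep a surrogate pair together as one part of the phrase
        int codePoint = char.ConvertToUtf32(characters, i);
        parts.Add(char.ConvertFromUtf32(codePoint));
        if (char.IsHighSurrogate(characters[i]))
        {
            i++; // skip the low surrogate that was already consumed
        }
    }
    return "\"" + string.Join(" ", parts) + "\"";
}

string query = BuildPhraseQuery("入里面");
Console.WriteLine(query == "\"入 里 面\""); // True
```

The resulting string has the same form as query2 in the surrogate pair use case and could be passed to index.Search in the same way.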

Public API changes

None.

Use cases

None.