GroupDocs.Search for .NET 24.3 Release Notes
Major Features
This release includes the following features, enhancements, and fixes:
- Implement support for surrogate pairs
- Enable Chinese, Japanese, Korean alphabets by default
Full List of Issues Covering all Changes in this Release
Key | Summary | Category |
---|---|---|
SEARCHNET-3102 | Implement support for surrogate pairs | Breaking Change |
SEARCHNET-3100 | Enable Chinese, Japanese, Korean alphabets by default | Enhancement |
Public API and Backward Incompatible Changes
Implement support for surrogate pairs
This change implements support for Unicode surrogate pairs. A surrogate pair is now indexed as a single character. In the Alphabet and Character Replacement dictionaries, a surrogate pair can likewise be managed as a single, indivisible character, which is why the affected members now take Int32 code points instead of Char values.
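As a brief illustration of why a single Char is not enough, the sketch below uses only standard .NET calls to show how one code point outside the Basic Multilingual Plane occupies two UTF-16 code units. The code point U+20E97 is the one used in the use case further down; nothing here depends on GroupDocs.Search.

```csharp
using System;

// U+20E97 lies outside the Basic Multilingual Plane,
// so UTF-16 encodes it as a surrogate pair (two Char values).
int codePoint = 0x20E97;
string text = char.ConvertFromUtf32(codePoint);

Console.WriteLine(text.Length);                               // 2
Console.WriteLine(char.IsSurrogatePair(text[0], text[1]));    // True
Console.WriteLine(char.ConvertToUtf32(text, 0) == codePoint); // True
```

Because the string holds two Char values for one character, an API keyed on Char cannot address it, whereas an Int32 code point can.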
Public API changes
Method System.Collections.Generic.IEnumerator<System.Char> GetEnumerator() has been removed from GroupDocs.Search.Dictionaries.Alphabet class.
Method GroupDocs.Search.Dictionaries.CharacterType GetCharacterType(Int32) has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method GroupDocs.Search.Dictionaries.CharacterType GetCharacterType(System.String) has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method System.Collections.Generic.IEnumerator<System.Int32> GetEnumerator() has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method Void SetRange(System.String[], GroupDocs.Search.Dictionaries.CharacterType) has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method Void SetRange(Int32[], GroupDocs.Search.Dictionaries.CharacterType) has been added to GroupDocs.Search.Dictionaries.Alphabet class.
Method Boolean Contains(Char) has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method System.Collections.Generic.IEnumerator<System.Char> GetEnumerator() has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Char GetReplacement(Char) has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Void AddRange(System.Collections.Generic.IEnumerable<System.Collections.Generic.KeyValuePair<System.Int32, System.Int32>>) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Boolean Contains(Int32) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method System.Collections.Generic.IEnumerator<System.Int32> GetEnumerator() has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Int32 GetReplacement(Int32) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Void RemoveRange(System.Collections.Generic.IEnumerable<System.Int32>) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Method Void RemoveRange(Int32[]) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementDictionary class.
Property Char Character has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Property Char Replacement has been removed from GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Property Int32 Character has been added to GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Constructor CharacterReplacementPair(Int32, Int32) has been added to GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
Property Int32 Replacement has been added to GroupDocs.Search.Dictionaries.CharacterReplacementPair class.
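The changes listed above can be sketched as follows. This is a minimal illustration rather than official sample code: it uses the Int32 overloads named in the list, and it assumes that the dictionaries are reached through index.Dictionaries.Alphabet and index.Dictionaries.CharacterReplacements, which this document does not state. The index folder and the chosen code points are hypothetical.

```csharp
using System.Collections.Generic;
using GroupDocs.Search;
using GroupDocs.Search.Dictionaries;

// Hypothetical index folder, for illustration only
Index index = new Index(@"c:\MyIndex\");

// Alphabet: characters are addressed by Int32 code point instead of Char
// (assumption: the dictionary is exposed as index.Dictionaries.Alphabet)
Alphabet alphabet = index.Dictionaries.Alphabet;
int cjkCodePoint = 0x20E97; // encoded as a surrogate pair in UTF-16
alphabet.SetRange(new int[] { cjkCodePoint }, CharacterType.Letter);
CharacterType type = alphabet.GetCharacterType(cjkCodePoint);

// Character replacements: keys and values are Int32 code points
// (assumption: exposed as index.Dictionaries.CharacterReplacements)
CharacterReplacementDictionary replacements = index.Dictionaries.CharacterReplacements;
int boldA = 0x1D400; // MATHEMATICAL BOLD CAPITAL A, outside the BMP
var pairs = new List<KeyValuePair<int, int>>
{
    new KeyValuePair<int, int>(boldA, 'A'), // replace with plain Latin 'A'
};
replacements.AddRange(pairs);

if (replacements.Contains(boldA))
{
    int replacement = replacements.GetReplacement(boldA);
}

// Remove the replacement again by code point
replacements.RemoveRange(new int[] { boldA });
```

Code that previously passed Char values to the removed members needs to pass code points instead; for a Char within the Basic Multilingual Plane, an implicit conversion to Int32 is enough.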
Use cases
The following example shows how to search for Unicode characters that are represented by surrogate pairs in UTF-16, as well as for sequences of characters that have the SeparateWord type.
C#
using GroupDocs.Search;
using GroupDocs.Search.Results;

string indexFolder = @"c:\MyIndex\";
string documentsFolder = @"c:\MyDocuments\";
// Creating an index in the specified folder
Index index = new Index(indexFolder);
// Indexing documents from the specified folder
index.Add(documentsFolder);
// Search for a character that is a surrogate pair in UTF-16, built from its Unicode code point
string query1 = char.ConvertFromUtf32(0x20E97);
SearchResult result1 = index.Search(query1);
// Search for a sequence of CJK characters
// Note that the query is enclosed in double quotes and the characters are separated by spaces
string query2 = "\"入 里 面\"";
SearchResult result2 = index.Search(query2);
Enable Chinese, Japanese, Korean alphabets by default
With this enhancement, Chinese, Japanese, and Korean text is indexed by default, without having to set the character type in Alphabet. Each character of these languages has the SeparateWord type. This means that to search for a sequence of such characters, you must use phrase search syntax: enclose the sequence in double quotes and separate the characters with spaces.
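Building such a phrase query by hand is error-prone for longer strings, so a small helper may be convenient. The sketch below is not part of GroupDocs.Search; ToPhraseQuery is a hypothetical name, and the code assumes well-formed UTF-16 input that consists only of characters with the SeparateWord type. It walks the string by code point, so a surrogate pair stays together as one element.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a library API): turns a run of CJK characters
// into a phrase query with one character per word.
static string ToPhraseQuery(string text)
{
    var parts = new List<string>();
    for (int i = 0; i < text.Length; i++)
    {
        // Ignore whitespace that is already present in the input
        if (char.IsWhiteSpace(text, i)) continue;

        // Reads a whole surrogate pair when one starts at index i
        int codePoint = char.ConvertToUtf32(text, i);
        parts.Add(char.ConvertFromUtf32(codePoint));

        // Skip the low surrogate that was consumed together with the high one
        if (char.IsHighSurrogate(text[i])) i++;
    }
    return "\"" + string.Join(" ", parts) + "\"";
}

string query = ToPhraseQuery("入里面"); // "\"入 里 面\""
Console.WriteLine(query);
```

The resulting string can be passed to index.Search in the same way as the phrase query in the surrogate pair use case.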
Public API changes
None.
Use cases
None.