GroupDocs.Parser for .NET 25.8 Release Notes

Full List of Issues Covering all Changes in this Release

Key | Summary | Category
PARSERNET-2708 | Change the concept of the parsing template so that it can be applied to any page in a document | Improvement
PARSERNET-2705 | Implement automatic matching of template fields to data in a document | Improvement
PARSERNET-2706 | Implement automatic selection of a template used to extract data from a document | Improvement
PARSERNET-2707 | Implement ability to scale template elements with a given coefficient | Improvement
PARSERNET-2709 | Implement the ability to set properties in option classes separately | Improvement
PARSERNET-2710 | Move page preview options from Parser settings to methods options | Improvement
PARSERNET-2711 | Implement Equals and GetHashCode methods for the Point class | Improvement

Public API and Backward Incompatible Changes

Change the concept of the parsing template so that it can be applied to any page in a document

Description

This improvement changes the parsing template from a multi-page concept to a single-page one. Template items no longer have a PageIndex property; instead, the page index is passed to the ParseByTemplate method through ParseByTemplateOptions.

Public API changes

Property System.Nullable<System.Int32> PageIndex has been removed from GroupDocs.Parser.Templates.TemplateItem class.
Property Int32 PageIndex has been added to GroupDocs.Parser.Options.ParseByTemplateOptions class.

Constructor ParseByTemplateOptions(Boolean) has been removed from GroupDocs.Parser.Options.ParseByTemplateOptions class.
Constructor ParseByTemplateOptions(Boolean, GroupDocs.Parser.Options.OcrOptions) has been removed from GroupDocs.Parser.Options.ParseByTemplateOptions class.
Constructor ParseByTemplateOptions(Int32) has been added to GroupDocs.Parser.Options.ParseByTemplateOptions class.
Constructor ParseByTemplateOptions(Int32, Boolean) has been added to GroupDocs.Parser.Options.ParseByTemplateOptions class.
Constructor ParseByTemplateOptions(Int32, Boolean, GroupDocs.Parser.Options.OcrOptions) has been added to GroupDocs.Parser.Options.ParseByTemplateOptions class.

Constructor TemplateBarcode(GroupDocs.Parser.Data.Rectangle, System.String, System.Nullable<System.Int32>) has been removed from GroupDocs.Parser.Templates.TemplateBarcode class.
Constructor TemplateBarcode(GroupDocs.Parser.Data.Rectangle, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>) has been removed from GroupDocs.Parser.Templates.TemplateBarcode class.
Constructor TemplateBarcode(GroupDocs.Parser.Data.Rectangle, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>, Boolean) has been removed from GroupDocs.Parser.Templates.TemplateBarcode class.
Constructor TemplateBarcode(GroupDocs.Parser.Data.Rectangle, System.String, System.Nullable<System.Double>) has been added to GroupDocs.Parser.Templates.TemplateBarcode class.
Constructor TemplateBarcode(GroupDocs.Parser.Data.Rectangle, System.String, System.Nullable<System.Double>, Boolean) has been added to GroupDocs.Parser.Templates.TemplateBarcode class.

Constructor TemplateField(GroupDocs.Parser.Templates.TemplatePosition, System.String, System.Nullable<System.Int32>) has been removed from GroupDocs.Parser.Templates.TemplateField class.
Constructor TemplateField(GroupDocs.Parser.Templates.TemplatePosition, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>) has been removed from GroupDocs.Parser.Templates.TemplateField class.
Constructor TemplateField(GroupDocs.Parser.Templates.TemplatePosition, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>, Boolean) has been removed from GroupDocs.Parser.Templates.TemplateField class.
Constructor TemplateField(GroupDocs.Parser.Templates.TemplatePosition, System.String, System.Nullable<System.Double>) has been added to GroupDocs.Parser.Templates.TemplateField class.
Constructor TemplateField(GroupDocs.Parser.Templates.TemplatePosition, System.String, System.Nullable<System.Double>, Boolean) has been added to GroupDocs.Parser.Templates.TemplateField class.

Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableLayout, System.String, System.Nullable<System.Int32>) has been removed from GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableLayout, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>) has been removed from GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableLayout, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>, Boolean) has been removed from GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableParameters, System.String, System.Nullable<System.Int32>) has been removed from GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableParameters, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>) has been removed from GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableParameters, System.String, System.Nullable<System.Int32>, System.Nullable<System.Double>, Boolean) has been removed from GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableLayout, System.String) has been added to GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableLayout, System.String, System.Nullable<System.Double>) has been added to GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableLayout, System.String, System.Nullable<System.Double>, Boolean) has been added to GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableParameters, System.String) has been added to GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableParameters, System.String, System.Nullable<System.Double>) has been added to GroupDocs.Parser.Templates.TemplateTable class.
Constructor TemplateTable(GroupDocs.Parser.Templates.TemplateTableParameters, System.String, System.Nullable<System.Double>, Boolean) has been added to GroupDocs.Parser.Templates.TemplateTable class.

Usage

TemplateField templateField = new TemplateField(
    new TemplateFixedPosition(new Rectangle(new Point(35, 160), new Size(110, 20))),
    "Address");
Template template = new Template(new TemplateItem[] { templateField });
ParseByTemplateOptions options = new ParseByTemplateOptions(3); // Setting the page index in the options
using (Parser parser = new Parser(documentPath))
{
    DocumentData data = parser.ParseByTemplate(template, options);
}
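
Because the page index now lives in the options rather than in the template items, one template can be reused on several pages. The following is a minimal sketch, assuming the pages to parse are known in advance. The TemplatePageRunner class, the ParsePages helper and its parameters are hypothetical names used only for illustration; the GroupDocs.Parser calls are the ones listed in the API changes above.

```csharp
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;
using GroupDocs.Parser.Templates;

public static class TemplatePageRunner
{
    // Hypothetical helper: applies the same single-page template to each requested page.
    public static Dictionary<int, DocumentData> ParsePages(
        string documentPath, Template template, IEnumerable<int> pageIndexes)
    {
        var results = new Dictionary<int, DocumentData>();
        using (Parser parser = new Parser(documentPath))
        {
            foreach (int pageIndex in pageIndexes)
            {
                // The page is selected through the options, not through the template items
                ParseByTemplateOptions options = new ParseByTemplateOptions(pageIndex);
                results[pageIndex] = parser.ParseByTemplate(template, options);
            }
        }
        return results;
    }
}
```

Keeping the results keyed by page index preserves the information that template items used to carry themselves.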

Implement automatic matching of template fields to data in a document

Description

This improvement automatically matches parsing template fields to the target data on a document page. Matching relies on special adjustment fields, which have the new IsHidden property set to true. It is needed when a document page has the structure specified in a template but differs from it in scale or offset.

Public API changes

Class AdjustmentFieldsOptions has been added to GroupDocs.Parser.Options namespace.
Constructor AdjustmentFieldsOptions() has been added to GroupDocs.Parser.Options.AdjustmentFieldsOptions class.
Property System.String FieldNamePrefix has been added to GroupDocs.Parser.Options.AdjustmentFieldsOptions class.
Property GroupDocs.Parser.Options.OcrOptions OcrOptions has been added to GroupDocs.Parser.Options.AdjustmentFieldsOptions class.
Property Int32 PageIndex has been added to GroupDocs.Parser.Options.AdjustmentFieldsOptions class.
Property System.Nullable<System.Double> RequestedPageWidth has been added to GroupDocs.Parser.Options.AdjustmentFieldsOptions class.

Method System.Collections.Generic.IEnumerable<GroupDocs.Parser.Templates.TemplateItem> GenerateAdjustmentFields(GroupDocs.Parser.Options.AdjustmentFieldsOptions) has been added to GroupDocs.Parser.Parser class.

Property Boolean IsHidden has been added to GroupDocs.Parser.Templates.TemplateField class.
Property System.String Value has been added to GroupDocs.Parser.Templates.TemplateField class.

Usage

The following example shows how to create a parsing template with adjustment fields and then use it to parse another document. The scale and offset adjustments work because the template contains adjustment fields.

// Generating adjustment fields
IEnumerable<TemplateItem> adjustmentFields;
using (Parser parser = new Parser(document1Path))
{
    AdjustmentFieldsOptions adjustmentFieldsOptions = new AdjustmentFieldsOptions();
    adjustmentFieldsOptions.OcrOptions = new OcrOptions(new PagePreviewOptions(144));
    adjustmentFields = parser.GenerateAdjustmentFields(adjustmentFieldsOptions);
}

// Creating a template
TemplateField[] targetFields = new TemplateField[]
{
    new TemplateField(new TemplateFixedPosition(new Rectangle(195, 395, 310, 410)), "Description", 595, false),
    new TemplateField(new TemplateFixedPosition(new Rectangle(235, 395, 256, 410)), "Word", 595, false),
    new TemplateField(new TemplateFixedPosition(new Rectangle(455, 395, 480, 410)), "Sum", 595, false),
    new TemplateField(new TemplateFixedPosition(new Rectangle(430, 505, 480, 520)), "Tax", 595, false),
};
// Concat is a LINQ extension method and requires "using System.Linq;"
IEnumerable<TemplateItem> allFields = adjustmentFields.Concat(targetFields);
Template template = new Template(allFields);

// Parsing by template the other document that has the same structure
using (Parser parser = new Parser(document2Path))
{
    ParseByTemplateOptions parseByTemplateOptions = new ParseByTemplateOptions();
    parseByTemplateOptions.PageIndex = 7;
    parseByTemplateOptions.UseOcr = true;
    parseByTemplateOptions.OcrOptions = new OcrOptions(new PagePreviewOptions(144));
    DocumentData result = parser.ParseByTemplate(template, parseByTemplateOptions);
}
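
To make "distortions in scale and offset" concrete, the sketch below estimates a uniform scale and an offset from two anchor points and then maps a template coordinate onto the page. It is only an illustration of the arithmetic, not the algorithm GroupDocs.Parser uses internally; the AdjustmentSketch class and its methods are hypothetical and do not depend on the library.

```csharp
using System;

public static class AdjustmentSketch
{
    // Estimates the scale and offsets that map template coordinates onto page coordinates,
    // given two anchors: (tx1, ty1) -> (px1, py1) and (tx2, ty2) -> (px2, py2).
    public static (double Scale, double OffsetX, double OffsetY) Estimate(
        double tx1, double ty1, double tx2, double ty2,
        double px1, double py1, double px2, double py2)
    {
        double templateDistance = Math.Sqrt(Math.Pow(tx2 - tx1, 2) + Math.Pow(ty2 - ty1, 2));
        double pageDistance = Math.Sqrt(Math.Pow(px2 - px1, 2) + Math.Pow(py2 - py1, 2));
        if (templateDistance == 0)
        {
            throw new ArgumentException("Template anchors must be distinct.");
        }

        // Ratio of the anchor distances gives the uniform scale
        double scale = pageDistance / templateDistance;
        // The offset is whatever remains after scaling the first anchor
        return (scale, px1 - scale * tx1, py1 - scale * ty1);
    }

    // Applies the estimated transform to a single template coordinate.
    public static (double X, double Y) Map(
        double x, double y, double scale, double offsetX, double offsetY)
    {
        return (scale * x + offsetX, scale * y + offsetY);
    }
}
```

With template anchors (100, 100) and (300, 100) found on the page at (160, 170) and (460, 170), the estimate is a scale of 1.5 and an offset of (10, 20), so the template point (195, 395) maps to (302.5, 612.5).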

Implement automatic selection of a template used to extract data from a document

Description

This improvement allows you to pass a collection of templates to the ParseByTemplate method and automatically select the required template for parsing.

Public API changes

Class TemplateCollection has been added to GroupDocs.Parser.Templates namespace.
Method Void Add(GroupDocs.Parser.Templates.Template) has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Method Void Clear() has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Property Int32 Count has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Method GroupDocs.Parser.Templates.Template Get(Int32) has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Method System.Collections.Generic.IEnumerator<GroupDocs.Parser.Templates.Template> GetEnumerator() has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Method Boolean Remove(GroupDocs.Parser.Templates.Template) has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Method Void RemoveAt(Int32) has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Constructor TemplateCollection() has been added to GroupDocs.Parser.Templates.TemplateCollection class.
Constructor TemplateCollection(System.Collections.Generic.IEnumerable<GroupDocs.Parser.Templates.Template>) has been added to GroupDocs.Parser.Templates.TemplateCollection class.

Method GroupDocs.Parser.Data.DocumentData ParseByTemplate(GroupDocs.Parser.Templates.TemplateCollection, GroupDocs.Parser.Options.ParseByTemplateOptions) has been added to GroupDocs.Parser.Parser class.

Constructor DocumentData(GroupDocs.Parser.Templates.Template, System.Collections.Generic.IEnumerable<GroupDocs.Parser.Data.FieldData>) has been added to GroupDocs.Parser.Data.DocumentData class.
Property GroupDocs.Parser.Templates.Template Template has been added to GroupDocs.Parser.Data.DocumentData class.
Constructor DocumentPageData(System.Collections.Generic.IEnumerable<GroupDocs.Parser.Data.FieldData>, Int32) has been removed from GroupDocs.Parser.Data.DocumentPageData class.
Constructor DocumentPageData(GroupDocs.Parser.Templates.Template, System.Collections.Generic.IEnumerable<GroupDocs.Parser.Data.FieldData>, Int32) has been added to GroupDocs.Parser.Data.DocumentPageData class.

Usage

The following example shows how to parse by template with automatic selection of the required template. Every template in the collection must contain adjustment fields; otherwise, automatic template selection will not work.

// templateCollection is a TemplateCollection that holds the candidate templates
using (Parser parser = new Parser(filePath))
{
    ParseByTemplateOptions parseByTemplateOptions = new ParseByTemplateOptions();
    parseByTemplateOptions.PageIndex = 3;
    parseByTemplateOptions.UseOcr = true;
    parseByTemplateOptions.OcrOptions = new OcrOptions(new PagePreviewOptions(144));
    DocumentData result = parser.ParseByTemplate(templateCollection, parseByTemplateOptions);
}
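
The example above can be extended to show where the collection comes from. The following is a minimal sketch that uses only the members listed in the API changes for this improvement; the TemplateSelection class, the ParseWithAutoSelection helper and its parameters are hypothetical names.

```csharp
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;
using GroupDocs.Parser.Templates;

public static class TemplateSelection
{
    // Hypothetical helper: builds a TemplateCollection and lets the parser pick a template.
    public static DocumentData ParseWithAutoSelection(
        string filePath, IEnumerable<Template> candidateTemplates, int pageIndex)
    {
        // Each candidate is expected to contain adjustment fields (see the note above)
        TemplateCollection templateCollection = new TemplateCollection();
        foreach (Template candidate in candidateTemplates)
        {
            templateCollection.Add(candidate);
        }

        using (Parser parser = new Parser(filePath))
        {
            ParseByTemplateOptions options = new ParseByTemplateOptions();
            options.PageIndex = pageIndex;
            options.UseOcr = true;
            options.OcrOptions = new OcrOptions(new PagePreviewOptions(144));
            return parser.ParseByTemplate(templateCollection, options);
        }
    }
}
```

The new Template property of the returned DocumentData presumably identifies the template that was selected, so it can be compared with the candidates. These notes do not describe what is returned when no template fits, so that case should be checked before relying on the result.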

Implement ability to scale template elements with a given coefficient

Description

This improvement allows you to rescale the coordinates of any template element by a given coefficient with a single method call.

Public API changes

Method GroupDocs.Parser.Templates.TemplateItem Scale(Double) has been added to GroupDocs.Parser.Templates.TemplateItem class.
Method GroupDocs.Parser.Templates.TemplateItem Scale(Double) has been added to GroupDocs.Parser.Templates.TemplateBarcode class.
Method GroupDocs.Parser.Templates.TemplateItem Scale(Double) has been added to GroupDocs.Parser.Templates.TemplateTable class.
Method GroupDocs.Parser.Templates.TemplateItem Scale(Double) has been added to GroupDocs.Parser.Templates.TemplateField class.

Usage

The following example shows how to create a template field from an existing template field with coordinates rescaled by a factor of 1.5.

TemplateField templateField = new TemplateField(
    new TemplateFixedPosition(new Rectangle(new Point(35, 160), new Size(110, 20))),
    "Address",
    720,
    false);
TemplateField scaledField = (TemplateField)templateField.Scale(1.5);
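
As a rough illustration of the arithmetic, the sketch below rescales the rectangle from the example by 1.5. It assumes that scaling multiplies every coordinate and dimension by the coefficient, which these notes do not spell out. ScaleSketch and Box are hypothetical stand-in types, not part of GroupDocs.Parser.

```csharp
using System;

public static class ScaleSketch
{
    // Stand-in for a rectangle given as a position (X, Y) and a size (Width, Height).
    public sealed class Box
    {
        public Box(double x, double y, double width, double height)
        {
            X = x;
            Y = y;
            Width = width;
            Height = height;
        }

        public double X { get; }
        public double Y { get; }
        public double Width { get; }
        public double Height { get; }
    }

    // Returns a new box with every coordinate multiplied by the coefficient;
    // the original box is left unchanged.
    public static Box Scale(Box box, double coefficient)
    {
        return new Box(
            box.X * coefficient,
            box.Y * coefficient,
            box.Width * coefficient,
            box.Height * coefficient);
    }
}
```

Under that assumption, the rectangle at (35, 160) with size 110 x 20 becomes a rectangle at (52.5, 240) with size 165 x 30.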

Implement the ability to set properties in option classes separately

Description

This improvement adds setters to the properties of some options classes, so that each property can be set individually after construction rather than only through constructor arguments.

Public API changes

Constructor OcrOptions() has been added to GroupDocs.Parser.Options.OcrOptions class.
Constructor FormattedTextOptions() has been added to GroupDocs.Parser.Options.FormattedTextOptions class.
Constructor HighlightOptions() has been added to GroupDocs.Parser.Options.HighlightOptions class.
Constructor PageAreaOptions() has been added to GroupDocs.Parser.Options.PageAreaOptions class.
Constructor PageTextAreaOptions() has been added to GroupDocs.Parser.Options.PageTextAreaOptions class.
Constructor ParseByTemplateOptions() has been added to GroupDocs.Parser.Options.ParseByTemplateOptions class.
Constructor TextOptions() has been added to GroupDocs.Parser.Options.TextOptions class.

Usage

The following example shows how to create an instance of an options class and set the value of only one property in it.

ParseByTemplateOptions options = new ParseByTemplateOptions();
options.UseOcr = true;
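
Since the properties now have setters and the class has a parameterless constructor, a C# object initializer can set several of them at once. The following is a minimal sketch; the OptionsFactory class and the CreateOcrOptionsForPage method are hypothetical names, while the properties are the ones used elsewhere in these notes.

```csharp
using GroupDocs.Parser.Options;

public static class OptionsFactory
{
    // Hypothetical helper: builds template parsing options with OCR enabled for one page.
    public static ParseByTemplateOptions CreateOcrOptionsForPage(int pageIndex)
    {
        // Only the listed properties are set; the others keep their defaults
        return new ParseByTemplateOptions
        {
            PageIndex = pageIndex,
            UseOcr = true,
            OcrOptions = new OcrOptions(new PagePreviewOptions(144))
        };
    }
}
```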

Move page preview options from Parser settings to methods options

Description

This improvement makes the GroupDocs.Parser API more intuitive with respect to page preview options. Page preview options are now passed only in the options of those Parser methods that use them.

Public API changes

Property GroupDocs.Parser.Options.PagePreviewOptions PagePreviewOptions has been removed from GroupDocs.Parser.Options.ParserSettings class.
Constructor ParserSettings(GroupDocs.Parser.Options.PagePreviewOptions) has been removed from GroupDocs.Parser.Options.ParserSettings class.
Constructor ParserSettings(GroupDocs.Parser.Options.ILogger, GroupDocs.Parser.Options.PagePreviewOptions) has been removed from GroupDocs.Parser.Options.ParserSettings class.
Constructor ParserSettings(GroupDocs.Parser.Options.OcrConnectorBase, GroupDocs.Parser.Options.PagePreviewOptions) has been removed from GroupDocs.Parser.Options.ParserSettings class.
Constructor ParserSettings(GroupDocs.Parser.Options.ILogger, GroupDocs.Parser.Options.OcrConnectorBase, GroupDocs.Parser.Options.PagePreviewOptions) has been removed from GroupDocs.Parser.Options.ParserSettings class.
Constructor ParserSettings(GroupDocs.Parser.Options.ILogger, GroupDocs.Parser.Options.OcrConnectorBase, GroupDocs.Parser.Options.ExternalResourceHandler, GroupDocs.Parser.Options.PagePreviewOptions) has been removed from GroupDocs.Parser.Options.ParserSettings class.

Property GroupDocs.Parser.Options.PagePreviewOptions PagePreviewOptions has been added to GroupDocs.Parser.Options.OcrOptions class.
Constructor OcrOptions(GroupDocs.Parser.Data.Rectangle, GroupDocs.Parser.Options.OcrEventHandler, Boolean) has been removed from GroupDocs.Parser.Options.OcrOptions class.
Constructor OcrOptions(GroupDocs.Parser.Options.PagePreviewOptions) has been added to GroupDocs.Parser.Options.OcrOptions class.
Constructor OcrOptions(GroupDocs.Parser.Data.Rectangle, GroupDocs.Parser.Options.OcrEventHandler, GroupDocs.Parser.Options.PagePreviewOptions, Boolean) has been added to GroupDocs.Parser.Options.OcrOptions class.

Property GroupDocs.Parser.Options.PagePreviewOptions PagePreviewOptions has been added to GroupDocs.Parser.Options.BarcodeOptions class.

Usage

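The following example shows how to pass page preview options through the options of the method that uses them, in this case through OcrOptions for parsing by template.

ParseByTemplateOptions options = new ParseByTemplateOptions();
options.UseOcr = true;
// Page preview options are now set in the method options, not in ParserSettings
options.OcrOptions = new OcrOptions(new PagePreviewOptions(144));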

Implement Equals and GetHashCode methods for the Point class

Description

This improvement implements comparison of Point class instances based on all of their properties, which means a Point can also be used as a dictionary key.
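
A minimal sketch of what this enables: two separate Point instances with the same coordinates should address the same dictionary entry. The PointLookup class and the TryFindLabel method are hypothetical names, and the lookup succeeds only on the assumption that equality works as described above.

```csharp
using System.Collections.Generic;
using GroupDocs.Parser.Data;

public static class PointLookup
{
    // Hypothetical helper: stores a label under one Point and looks it up with another.
    public static bool TryFindLabel(out string label)
    {
        var labels = new Dictionary<Point, string>();
        labels[new Point(35, 160)] = "Address";

        // A different instance with the same coordinates; with value-based Equals and
        // GetHashCode it is expected to find the entry stored above
        return labels.TryGetValue(new Point(35, 160), out label);
    }
}
```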

Public API changes

Method Boolean Equals(System.Object) has been added to GroupDocs.Parser.Data.Point class.
Method Boolean Equals(GroupDocs.Parser.Data.Point) has been added to GroupDocs.Parser.Data.Point class.
Method Int32 GetHashCode() has been added to GroupDocs.Parser.Data.Point class.

Usage

None.