GroupDocs.Editor for Java 18.9 Release Notes

Major Features

Cells support

The main feature of the third version of GroupDocs.Editor for Java is Cells support. The supported Cells formats are:

  • XLSX
  • Excel97-2003 XLS
  • XLSM format, which enables macros
  • XLTX
  • XLTM format, which enables macros
  • CSV (Comma Separated Value) format
  • Tab delimited text file
  • ODS (OpenDocument Spreadsheet)
  • Excel 2003 XML format
  • XLSB (binary)

For text-based formats like CSV or tab-delimited files, GroupDocs.Editor lets you specify a separator, which can be a character or a string. For non-text-based formats, GroupDocs.Editor lets you specify an opening password if the input document is encrypted. You can also specify a password when saving the document; in that case the output document will be password-protected.

With GroupDocs.Editor you need to specify which tab should be edited; you cannot edit multiple tabs simultaneously. Tabs are specified by a 0-based sequential index.

If GroupDocs.Editor works in trial mode, you can select only the first two tabs for editing. In addition, a trial message appears at the top of the document and a watermark is placed on every raster image within the document.

Cells module

  1. New properties in TextLoadOptions
  2. New option to optimize memory usage
  3. New option to exclude hidden worksheets
  4. Worksheet protection

Words module

  1. Document protection
  2. Reply comments and statuses
  3. New option to optimize memory usage
  4. PDF compliance

Metered license

The second main feature in 18.9 is Metered license support. You can now use a Metered license instead of the standard one.

Other Features

Along with Cells support, we have slightly improved the existing Words processing module.

  • Multiple consecutive spaces are now processed much better in round-trip scenarios (open-edit-save cycle).
  • Improved space processing for bidirectional text.
  • Improved list processing in round trip scenarios.
  • Other minor improvements.
  • Fixed several bugs and applied security improvements.

Full List of Issues Covering all Changes in this Release

Key | Summary | Category
EDITORNET-868 | Implement opening Cells documents and converting them to the HTML format | New Feature
EDITORNET-869 | Implement generating Cells documents from input HTML | New Feature
EDITORNET-870 | Add support of text-based Cells documents with ability to specify a separator | New Feature
EDITORNET-871 | Add support of opening encrypted documents with password | New Feature
EDITORNET-872 | Implement support of encrypting output Cells documents with setting a password | New Feature
EDITORNET-873 | Add support of Metered license system | New Feature
EDITORNET-874 | Improve processing of multiple consequent spaces in Words processing module for round trip scenarios | Improvement
EDITORNET-875 | Improve space processing for bidirectional text | Improvement
EDITORNET-876 | Improve list processing in round trip scenarios | Improvement
EDITORNET-911 | Implement support of generating the password-protected sheets in spreadsheet documents | New Feature
EDITORNET-927 | Implement support of additional parameters when processing text-based spreadsheet | New Feature
EDITORNET-928 | Implement ability to adjust memory usage during opening input Cells document | New Feature
EDITORNET-929 | Implement the ExcludeHiddenWorksheets option | New Feature
EDITORNET-930 | Implement ability to adjust memory usage during Words processing | New Feature
EDITORNET-931 | Add support of document protection during Words document generation | New Feature
EDITORNET-933 | Implement Reply comments and Done status | New Feature
EDITORNET-935 | Implement ability to select PDF standards compliance level when generating PDF from HTML | New Feature
EDITORNET-946 | Security improvements update | Improvement
EDITORNET-914 | Fix common bug in length and Resolution parsing modules | Bug
EDITORNET-895 | Fix ArgumentException with pages15.docx sample document | Bug

Public API and Backward Incompatible Changes

Metered license support

GroupDocs.Editor 18.9 now supports the Metered license system along with the usual licensing system from previous versions. This means that, instead of specifying a license file, you can switch GroupDocs.Editor from trial mode to licensed mode using Metered keys. The Metered class contains two methods and is responsible for setting keys:

/**
 * <p>
 * Provides methods to set metered key.
 * </p>
 */
public class Metered
{
    /**
     * <p>
     * Sets metered public and private key
     * </p>
     * @param publicKey public key
     * @param privateKey private key
     */
    public void setMeteredKey(String publicKey, String privateKey){}

    /**
     * <p>
     * Gets consumption quantity
     * </p>
     * @return consumption quantity
     */
    public static java.math.BigDecimal getConsumptionQuantity(){}
}

The first method sets a pair of public and private keys, while the second returns the quantity of data already consumed.

Cells support

The main feature of GroupDocs.Editor 18.9 for Java is full support for the whole variety of Cells documents, including XLS, XLSX, CSV, ODS and others. The EditorHandler class supports the new document formats automatically: when you invoke the ToHtml method with a Cells document stream, GroupDocs.Editor detects the document type itself. There are also option classes that let you tune the conversion process.

Cells to HTML

/**
 * <p>
 * Allows to specify custom options for loading documents of all supportable Cells (Excel-compatible) formats
 * </p>
 */
public class CellsToHtmlOptions
{
    public int getWorksheetIndex(){}
    public void setWorksheetIndex(int value){}

    public String getPassword(){}
    public void setPassword(String value){}

    public TextLoadOptions getTextOptions(){}
    public void setTextOptions(TextLoadOptions value){}
}

The CellsToHtmlOptions class lets you specify an opening password (if the document is encrypted), the index of the worksheet to open, and text options for text-based input documents (CSV, tab-delimited, semicolon-delimited etc.). Several things should be noted:

  1. The password is ignored if the input document is not encrypted or is text-based.
  2. Text options are ignored if the input document is not text-based.
  3. The worksheet index is 0-based. If the input Cells document contains only one tab, this option is ignored. The default value is 0 (first tab). If the specified index exceeds the number of tabs, an exception is thrown.
  4. The text options class lets you specify a separator (delimiter), which can be an arbitrary character or string.

The TextLoadOptions class contains only one string property, which specifies a separator (delimiter) for text-based Cells documents. This may be a single character (like a comma, semicolon, white space, tab, or anything else) or even a custom string. GroupDocs.Editor applies this separator to represent the input text-based Cells document in a proper view.

/**
* Subclass for loading text-based Cells documents (CSV, Tab-based etc.)
*/
public class TextLoadOptions
{
    /**
    * Allows to specify a string separator for text-based Cells documents
    */
    public String getSeparator(){}
    public void setSeparator(String value){}
}
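
The effect of a separator on a text-based document can be pictured with a minimal sketch. It does not call GroupDocs.Editor and is not how the library parses files; the class SeparatorSketch and its splitRow method are hypothetical names used only for illustration, and quoting or escaping rules that a real CSV parser must handle are ignored.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class SeparatorSketch {
    /** Splits one row of a text-based spreadsheet on a literal separator (a character or a string). */
    static List<String> splitRow(String row, String separator) {
        if (separator == null || separator.isEmpty()) {
            throw new IllegalArgumentException("separator must be a non-empty string");
        }
        // Pattern.quote makes the separator literal, so "|" or "." are not read as regex syntax.
        // The limit -1 keeps trailing empty cells instead of dropping them.
        return Arrays.asList(row.split(Pattern.quote(separator), -1));
    }

    public static void main(String[] args) {
        System.out.println(splitRow("a;b;c", ";"));   // single-character separator
        System.out.println(splitRow("1||2||", "||")); // custom string separator
    }
}
```

The same row yields different cells depending on the separator, which is why the separator has to match the one the file was written with.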

HTML to Cells

The CellsSaveOptions class tunes the backward conversion, from HTML to one of the output Cells formats.

/** 
 * Allows to specify custom options for generating and saving MS Excel-compliant documents
 */
public class CellsSaveOptions
{
    public String getPassword(){}
    public void setPassword(String value){}
 
    public byte getOutputFormat(){}
    public void setOutputFormat(byte value){}
}

The Password property, if set to a non-null and non-empty string, protects the generated output document with that password. The OutputFormat property is an enumeration that specifies the output document format; the default value is XLSX. If the output format is text-based (like CSV), the password is ignored even if specified.

Improvements and new features in Cells module

New properties in TextLoadOptions

GroupDocs.Editor supports different Cells formats when converting a document to HTML. Some Cells formats are binary, like XLSX, while others are textual, like CSV, TabDelimited and some others. The CellsToHtmlOptions class contains an inner class TextLoadOptions, which is designed especially for such text-based Cells formats. In version 18.9 we have added two new public options to this class: ConvertDateTimeData and ConvertNumericData. Both options are boolean and are false by default.

/**
* Gets or sets a value that indicates whether the string in text file is converted to the date data. Default is false.
*/
public boolean getConvertDateTimeData() {}
public void setConvertDateTimeData(boolean value){}

/**
* Gets or sets a value that indicates whether the string in text file is converted to numeric data. Default is false.
*/
public boolean getConvertNumericData() {}
public void setConvertNumericData(boolean value){}

By default, when opening a text-based Cells document, GroupDocs.Editor interprets the content of every cell as text. With these options you can specify whether GroupDocs.Editor should parse such content and try to convert it to numeric or date-time data.
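
What the two flags mean for a single cell can be sketched as follows. This is an illustration under stated assumptions, not the library's parsing logic: the class CellConversionSketch and its convert method are hypothetical, numbers are recognized with java.math.BigDecimal, and only ISO-8601 dates such as 2018-09-30 are recognized. The formats GroupDocs.Editor actually accepts are not described in these notes.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

final class CellConversionSketch {
    /** Returns a BigDecimal, a LocalDate, or the original String, depending on the enabled flags. */
    static Object convert(String text, boolean convertNumericData, boolean convertDateTimeData) {
        if (convertNumericData) {
            try {
                return new BigDecimal(text.trim());
            } catch (NumberFormatException notANumber) {
                // Not numeric: fall through to the next check.
            }
        }
        if (convertDateTimeData) {
            try {
                // Assumption: ISO-8601 dates only; a real parser would accept more formats.
                return LocalDate.parse(text.trim());
            } catch (DateTimeParseException notADate) {
                // Not a date: fall through.
            }
        }
        return text; // Both flags off, or nothing matched: the content stays textual.
    }

    public static void main(String[] args) {
        System.out.println(convert("42.5", false, false).getClass().getSimpleName());
        System.out.println(convert("42.5", true, false).getClass().getSimpleName());
        System.out.println(convert("2018-09-30", true, true).getClass().getSimpleName());
    }
}
```

With both flags left at false the sketch returns every value unchanged as a string, which corresponds to the default behavior described above.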

New option to optimize memory usage

By default GroupDocs.Editor works with Cells documents with maximum performance; in other words, it tries to perform the work in the least possible time. The drawback of this approach is that in some cases, especially when the processed document is huge, memory consumption may be a problem. If you are facing an OutOfMemoryException, or high memory consumption is unacceptable, you may turn on the OptimizeMemoryUsage option by setting it to true. GroupDocs.Editor will then significantly decrease memory usage, but at the cost of degraded performance. The OptimizeMemoryUsage boolean option is disabled by default and is located in the CellsToHtmlOptions class.

/**
* Enables memory optimization mechanisms during input document processing, which may degrade performance in some special cases, 
* but on the other hand decrease memory usage. Useful when processing huge documents and facing OutOfMemoryException. 
* Default is false (memory optimization is disabled for the sake of better performance).
*/
public boolean getOptimizeMemoryUsage(){}
public void setOptimizeMemoryUsage(boolean value){}

New option to exclude hidden worksheets

Almost any Cells format (excluding the text-based ones), like any spreadsheet-processing software (such as MS Excel), supports multiple worksheets (tabs). GroupDocs.Editor can process only a single tab at a time; the WorksheetIndex option selects that tab. Several binary Cells formats (like XLSX) support the concept of hidden worksheets (tabs). When you open a document with hidden tabs, you normally do not see them unless you manually make them visible. By default GroupDocs.Editor completely ignores the visibility status of worksheets, i.e. it processes all tabs as usual.

With the new ExcludeHiddenWorksheets option you can exclude hidden tabs from processing. When it is enabled, GroupDocs.Editor ignores hidden tabs as if they did not exist, and the WorksheetIndex option counts only visible tabs. For example, when a document has three tabs, where the first is hidden and the two subsequent ones (second and third) are visible, WorksheetIndex = 1 selects the last (third) tab, because it is the second visible one. In other words, the ExcludeHiddenWorksheets option, when turned on, modifies the behavior of the WorksheetIndex option.

/**
* Allows to exclude hidden worksheets in the input Cells document, so they will be totally ignored.
* Default is false - hidden worksheets are available and processed as normal.
*
* Remarks:
* Several binary Cells formats (like XLSX) support the hidden worksheets (tabs) concept.
* A document of such a format, if it has more than one worksheet, may contain additional hidden worksheets.
* By default such hidden worksheets are available for processing, but with this option they can be ignored,
* as if they were absent. When this option is enabled, you cannot select a hidden worksheet with the
* WorksheetIndex property.
*/
public boolean getExcludeHiddenWorksheets(){}
public void setExcludeHiddenWorksheets(boolean value){}

ExcludeHiddenWorksheets is a boolean property, which is disabled (false) by default, and is located in the CellsToHtmlOptions class.
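
The interaction between the two options comes down to index arithmetic, sketched below. The class WorksheetIndexSketch and its resolve method are hypothetical and do not belong to GroupDocs.Editor. The exception type is also an assumption, since these notes only state that an exception is thrown for an out-of-range index.

```java
import java.util.Arrays;
import java.util.List;

final class WorksheetIndexSketch {
    /**
     * Maps a 0-based worksheet index to the physical position of the tab.
     * hidden.get(i) is true when the tab at physical position i is hidden.
     */
    static int resolve(List<Boolean> hidden, int worksheetIndex, boolean excludeHiddenWorksheets) {
        if (worksheetIndex < 0) {
            throw new IllegalArgumentException("worksheet index must be non-negative");
        }
        int counted = 0; // tabs that have counted towards the index so far
        for (int i = 0; i < hidden.size(); i++) {
            if (excludeHiddenWorksheets && hidden.get(i)) {
                continue; // hidden tabs are skipped as if they did not exist
            }
            if (counted == worksheetIndex) {
                return i;
            }
            counted++;
        }
        // Assumption: illustrative exception type for an index beyond the selectable tabs.
        throw new IllegalArgumentException("worksheet index exceeds the number of selectable tabs");
    }

    public static void main(String[] args) {
        // Three tabs: the first is hidden, the second and third are visible.
        List<Boolean> hidden = Arrays.asList(true, false, false);
        System.out.println(resolve(hidden, 1, true));  // physical position with hidden tabs excluded
        System.out.println(resolve(hidden, 1, false)); // physical position with all tabs counted
    }
}
```

For the three-tab document from the example, index 1 resolves to physical position 2 (the third tab) when hidden tabs are excluded and to physical position 1 (the second tab) otherwise.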

Worksheet protection

Most binary Cells formats support document protection, where the document is protected with a password from modifications of a specific type. Most spreadsheet-processing software (like MS Excel) also supports this feature, and starting from v18.9 GroupDocs.Editor supports it too. When saving an edited document from HTML into a Cells format, you can apply document protection. It has no effect if you select a Cells format that does not support this feature, for example any text-based format like CSV.

/** 
* Allows to enable a worksheet protection for the output document. By default is NULL - protection is not applied.
*/
public WorksheetProtection getWorksheetProtection(){}
public void setWorksheetProtection(WorksheetProtection value){}

Document protection is controlled by the WorksheetProtection property in the CellsSaveOptions class. By default it is NULL, meaning document protection is disabled. To apply protection, create an instance of the WorksheetProtection class, fill it with the necessary values, and assign it to the WorksheetProtection property.

The WorksheetProtection class is listed below:

/**
* Encapsulates worksheet protection options, which allow to protect a worksheet in the output Cells document from modification of specified type
* with a specified password.
*
* Remarks:
* Most of Cells formats like XLSX allow to protect a worksheet from editing with password.
* This class allows to enable such protection and specify its options.
*/
public final class WorksheetProtection
{
    /**
    * Allows to specify a type of worksheet protection. By default is 'None' - protection is not applied.
    */
    public byte getProtectionType(){}
    public void setProtectionType(byte value){}
    /**
    * Password, which is used for protecting a worksheet. If NULL or empty string, the protection will not be applied.
    */
    public String getPassword(){}
    public void setPassword(String value){}
}

It has two properties: protection type (level) and password. By default the ProtectionType property is set to None and Password is set to NULL; in either case protection is not applied. To actually apply document protection, create an instance of the WorksheetProtection class, set a non-null and non-empty password, select a valid ProtectionType, and assign the instance to the CellsSaveOptions.WorksheetProtection property.

The WorksheetProtectionType is an enumeration, which contains all possible levels of document protection. They are listed below.

  1. None  — Protection is not applied (default value)
  2. All — User cannot modify anything on the worksheet
  3. Contents — User cannot enter data in the worksheet
  4. Objects — User cannot modify drawing objects
  5. Scenarios — User cannot modify saved scenarios
  6. Structure — User cannot modify the structure
  7. Window — User cannot modify the window
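
The rule for when worksheet protection actually takes effect can be summarized in a short sketch. It uses local stand-ins only: the class ProtectionSketch, its ProtectionType enum and the isApplied method are hypothetical and merely mirror the levels listed above. The real API exposes the protection type through byte-valued accessors, whose constants are not shown in these notes.

```java
final class ProtectionSketch {
    /** Local stand-in mirroring the protection levels listed above; not the library type. */
    enum ProtectionType { NONE, ALL, CONTENTS, OBJECTS, SCENARIOS, STRUCTURE, WINDOW }

    /** Protection takes effect only when both a real level and a usable password are set. */
    static boolean isApplied(ProtectionType type, String password) {
        boolean hasPassword = password != null && !password.isEmpty();
        boolean hasType = type != null && type != ProtectionType.NONE;
        return hasPassword && hasType;
    }

    public static void main(String[] args) {
        System.out.println(isApplied(ProtectionType.NONE, "secret")); // level missing
        System.out.println(isApplied(ProtectionType.ALL, ""));        // password missing
        System.out.println(isApplied(ProtectionType.ALL, "secret"));  // both present
    }
}
```

Only the last call reports protection as applied. The sketch leaves out the additional condition that the chosen output format must support protection at all.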

Improvements and new features in Words module

Document protection

Most Words formats, like DOCX and ODT, support document protection, where the user protects the document with a password from modifications of a specific type. Starting from v18.9.0, GroupDocs.Editor also supports this feature. When saving an edited document to one of the Words formats, you can apply a level of protection to the resulting document with the Protection property in the WordsSaveOptions class.

/**
* Allows to control and apply the document protection options for the Words document of any format, which supports document protection. 
* By default is NULL - document protection will not be used.
*/
public DocumentProtection getProtection(){}
public void setProtection(DocumentProtection value){}

By default this property is NULL — no protection is applied. In order to apply the protection, you need to create an instance of the DocumentProtection class and assign it to this property.

/**
* Encapsulates document protection options for the Words document, which is generated from HTML
*/
public final class DocumentProtection
{
    /**
    * Parameterless constructor - all parameters have default values
    */
    public DocumentProtection()
    {
    }
 
    /**
    * Allows to set all parameters during class instantiation
    *
    * @param protectionType Set the protection type of the document
    * @param password Set the protection password
    */
    public DocumentProtection(byte protectionType, String password)
    {
        this.setProtectionType(protectionType);
        this.setPassword(password);
    }
    /**
    * Allows to set a protection type of the document. By default is set to not protect the document at all.
    */
    public byte getProtectionType(){}
    public void setProtectionType(byte value){}
    /**
    * The password to protect the document with. If null or empty string - the protection will not be applied to the document.
    */
    public String getPassword(){}
    public void setPassword(String value){}
}

The DocumentProtection class contains two properties, both of which are vital for protecting the document: Password and ProtectionType. By default the ProtectionType property is set to NoProtection and Password is set to NULL; in either case protection is not applied. To actually apply document protection, create an instance of the DocumentProtection class, set a non-null and non-empty password, select a valid ProtectionType, and assign the instance to the WordsSaveOptions.Protection property.

The DocumentProtectionType is an enumeration, which contains all possible levels of document protection. They are listed below.

  1. NoProtection — The document is not protected. Default value.
  2. AllowOnlyRevisions — User can only add revision marks to the document
  3. AllowOnlyComments — User can only modify comments in the document
  4. AllowOnlyFormFields — User can only enter data in the form fields in the document
  5. ReadOnly — No changes are allowed to the document

Reply comments and statuses

GroupDocs.Editor has supported comments in documents since the first release. Newer versions of Office Open XML, however, introduced a comment hierarchy with root comments and reply comments, which can be treated as descendants (children) of the root comments. Starting from v18.9, GroupDocs.Editor recognizes and supports such comments. When opening a document as HTML for editing, GroupDocs.Editor preserves the comment hierarchy and restores it when saving the edited document to one of the Words formats. In addition, GroupDocs.Editor now supports the “Done” status for comments, which is also preserved in the backward conversion.

New option to optimize memory usage

When saving edited documents from HTML to one of the Words formats, GroupDocs.Editor works with maximum performance, trying to save the document in the least possible time. This approach may require a huge amount of memory when the document is big. If high memory consumption is not acceptable, or you are facing an OutOfMemoryException, you can enable the OptimizeMemoryUsage option in the WordsSaveOptions class. By default this boolean option is set to false: memory optimization is turned off for the sake of the best performance. When turned on, it significantly decreases memory consumption while generating large documents, at the cost of slower saving.

/**
* Enables memory optimization mechanisms during document generation from HTML, which degrades performance as the cost of decreasing memory usage.
* Setting this option to true can significantly decrease memory consumption while generating large documents at the cost of slower saving time.
* Default is false (memory optimization is disabled for the sake of better performance).
*/
public boolean getOptimizeMemoryUsage(){}
public void setOptimizeMemoryUsage(boolean value){}

PDF compliance

When opening a Words document, editing it in an HTML editor and saving it back, you may select not only Words formats but also PDF, using the PdfSaveOptions class. We have added a new Compliance option to this class, which sets the PDF standards compliance level of the generated PDF document.

/**
* Specifies the PDF standards compliance level for output documents. Default is PdfCompliance.Pdf15.
*/
public int getCompliance(){}
public void setCompliance(int value){}

This property is of the PdfCompliance type, which is an enumeration. By default all documents are generated in the PDF 1.5 standard. With this new option you may also select:

  1. PDF/A-1a standard. This level includes all the requirements of PDF/A-1b and additionally requires that document structure be included (also known as being “tagged”), with the objective of ensuring that document content can be searched and re-purposed. Please note that exporting the document structure significantly increases the memory consumption, especially for the large documents.

  2. PDF/A-1b standard. PDF/A-1b has the objective of ensuring reliable reproduction of the visual appearance of the document.