<dependency>
    <groupId>com.groupdocs</groupId>
    <artifactId>groupdocs-redaction</artifactId>
    <version>20.2</version>
</dependency>
compile(group: 'com.groupdocs', name: 'groupdocs-redaction', version: '20.2')
<dependency org="com.groupdocs" name="groupdocs-redaction" rev="20.2">
    <artifact name="groupdocs-redaction" ext="jar"/>
</dependency>
libraryDependencies += "com.groupdocs" % "groupdocs-redaction" % "20.2"

High Code Java API to Hide & Redact Sensitive Data


Product Page | Docs | Demos | API Reference | Examples | Blog | Free Support | Temporary License

GroupDocs.Redaction for Java is an on-premise API that enables your Java applications to hide and redact sensitive or classified data, content, information, or metadata, making it completely unreadable and non-searchable.

Document Redaction Java On-premise API Features

  • Preview the document by rendering it in JPEG, PNG, or BMP image format.
  • Text Redaction
    • Replace or hide the classified text
    • Search for an exact phrase and apply redaction on it.
    • Support for case-sensitive & case-insensitive search.
    • Support to use regular expressions (regex) search.
    • Option to use a colored box as well as a replacement string for redaction.
  • Metadata Redaction
    • Replace all or specific metadata values with empty (blank or minimal) values
    • Redact the metadata values
    • Apply filters to fetch the desired metadata for redaction
    • Use regular expressions (regex) to filter out the desired metadata for redaction
    • Ability to detect the metadata items for which redaction failed, was skipped, or was rejected
  • Annotation Redaction
    • Redact the annotation text or delete the annotations
    • Remove all or specific comments from the document
    • Search for specific strings within comments and then apply redaction to the matching ones
    • Ability to redact specific text from within the comment instead of redacting/removing the whole comment
  • Spreadsheet Redaction
    • Apply redaction to a specific Microsoft Excel® worksheet or column
    • Ability to apply filters to identify & designate the column to be redacted
  • Image Redaction
    • Redact classified & sensitive information within an image
    • Apply a colored box over the area that contains classified information
    • Ability to change image metadata by acting as an EXIF eraser
    • Detect text within the image via OCR and then redact that text
    • Search for a specific text to be redacted within the image using regular expressions (regex) via OCR
    • Apply area redaction or text redaction on the images embedded in documents
  • OCR to perform redaction on Images
    • Perform OCR on scanned documents
    • Perform OCR on images embedded within Microsoft Word® or PDF files
  • Create PDF files with image redaction
  • Support for rasterization to make redacted PDF non-searchable and without metadata
  • Keeping the document formatting intact even after the removal (redaction) of sensitive data
  • Implement a custom format handler for file formats that are not currently supported.
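
The text redaction options listed above (exact phrase, case-sensitive or case-insensitive search, regex search, replacement string) can be illustrated on a plain string. The following is a minimal sketch using only the JDK; it does not call the GroupDocs.Redaction API, and the class and method names (TextRedactionSketch, redactPhrase, redactRegex) are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Plain-JDK illustration of phrase and regex redaction on a string; not the GroupDocs API. */
final class TextRedactionSketch {

    /** Replaces every occurrence of an exact phrase, optionally ignoring case. */
    static String redactPhrase(String text, String phrase, boolean caseSensitive, String replacement) {
        int flags = caseSensitive ? 0 : (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        // Pattern.quote treats the phrase literally, so regex metacharacters are not interpreted.
        Pattern pattern = Pattern.compile(Pattern.quote(phrase), flags);
        // Matcher.quoteReplacement keeps '$' and '\' in the replacement literal.
        return pattern.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    /** Replaces every match of a regular expression with a fixed replacement string. */
    static String redactRegex(String text, String regex, String replacement) {
        return Pattern.compile(regex).matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String text = "John Doe paid with card 1234-5678-9012-3456. Contact john doe later.";
        // Case-insensitive exact phrase, then a regex for a 16-digit card number in groups of four.
        String withoutNames = redactPhrase(text, "John Doe", false, "[personal]");
        String redacted = redactRegex(withoutNames, "\\d{4}(-\\d{4}){3}", "[card]");
        System.out.println(redacted);
        // prints "[personal] paid with card [card]. Contact [personal] later."
    }
}
```

A real document redaction also has to handle formatting, metadata, and hidden content, which is what the library features above cover; this sketch only shows the matching and replacement step.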

Supported Document Redaction File Formats

The redaction operation on the document body & metadata is supported for the following file formats:

Microsoft Word®: DOC/DOT/DOCX/DOCM/DOTX/DOTM/RTF
Microsoft Excel®: XLSX/XLSM/XLTX/XLTM/XLS/XLT/CSV
Microsoft PowerPoint®: PPTX/PPT/PPSX/POT/PPS/PPTM/PPSM/POTM
Image: JPEG/TIF/TIFF/PNG/BMP/GIF
Fixed Layout: PDF

The redaction operation on the document annotations (comments) is supported for the following file formats:

Microsoft Word®: DOC/DOT/DOCX/DOCM/DOTX/DOTM/RTF
Microsoft Excel®: XLSX/XLSM/XLTX/XLTM/XLS/XLT/CSV
Microsoft PowerPoint®: PPTX/PPT/PPSX/POT/PPS/PPTM/PPSM/POTM
Fixed Layout: PDF

The redaction operation on the document embedded images is supported for the following file formats:

Microsoft Word®: DOC/DOT/DOCX/DOCM/DOTX/DOTM/RTF
Microsoft PowerPoint®: PPTX/PPT/PPSX/POT/PPS/PPTM/PPSM/POTM
Fixed Layout: PDF

The redaction operation using OCR (Optical Character Recognition) is supported for the following file formats:

Microsoft Word®: DOC/DOT/DOCX/DOCM/DOTX/DOTM
Microsoft PowerPoint®: PPTX/PPT/PPSX/POT/PPS/PPTM/PPSM/POTM
Image: JPEG/TIF/TIFF/PNG/BMP
Fixed Layout: PDF

For details and limitations, please visit Supported Document Formats.
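
The four format lists above differ in small ways (for example, RTF and GIF do not appear in the OCR list), so it can help to encode them as a lookup before choosing a redaction. The sketch below is a hypothetical helper built only from the lists on this page; SupportedFormatsSketch, Operation, and supports are illustrative names and are not part of GroupDocs.Redaction.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup table built from the format lists above; not part of GroupDocs.Redaction. */
final class SupportedFormatsSketch {

    enum Operation { BODY_AND_METADATA, ANNOTATIONS, EMBEDDED_IMAGES, OCR }

    private static final Map<String, Set<Operation>> SUPPORT = new HashMap<String, Set<Operation>>();

    static {
        String word = "DOC/DOT/DOCX/DOCM/DOTX/DOTM";
        String excel = "XLSX/XLSM/XLTX/XLTM/XLS/XLT/CSV";
        String powerPoint = "PPTX/PPT/PPSX/POT/PPS/PPTM/PPSM/POTM";
        String ocrImages = "JPEG/TIF/TIFF/PNG/BMP";

        register(Operation.BODY_AND_METADATA, word, "RTF", excel, powerPoint, ocrImages, "GIF", "PDF");
        register(Operation.ANNOTATIONS, word, "RTF", excel, powerPoint, "PDF");
        register(Operation.EMBEDDED_IMAGES, word, "RTF", powerPoint, "PDF");
        // RTF and GIF are absent from the OCR list in the text above.
        register(Operation.OCR, word, powerPoint, ocrImages, "PDF");
    }

    /** Records that each slash-separated extension in each group supports the operation. */
    private static void register(Operation operation, String... extensionGroups) {
        for (String group : extensionGroups) {
            for (String extension : group.split("/")) {
                String key = extension.toLowerCase(Locale.ROOT);
                Set<Operation> operations = SUPPORT.get(key);
                if (operations == null) {
                    operations = EnumSet.noneOf(Operation.class);
                    SUPPORT.put(key, operations);
                }
                operations.add(operation);
            }
        }
    }

    /** Returns true if the file name's extension is listed for the given operation. */
    static boolean supports(String fileName, Operation operation) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension to look up
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Set<Operation> operations = SUPPORT.get(extension);
        return operations != null && operations.contains(operation);
    }

    public static void main(String[] args) {
        System.out.println(supports("contract.rtf", Operation.ANNOTATIONS)); // prints "true"
        System.out.println(supports("contract.rtf", Operation.OCR));         // prints "false"
    }
}
```

Only the extensions spelled out in the lists are registered, so a spelling that is not listed (such as JPG) returns false here even if the library would accept it.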

System Requirements

  • Microsoft Windows: Windows Desktop & Server (x86, x64), Microsoft Azure
  • macOS: Mac OS X
  • Linux: Ubuntu, OpenSUSE, CentOS, and others
  • Java Versions: J2SE 7.0 (1.7), J2SE 8.0 (1.8) or above (for example Java 10)

GroupDocs.Redaction for Java does not require any external software or third-party tool to be installed. Follow one of the methods described in Installation and Configuration.
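
If an application needs to confirm at startup that it runs on one of the Java versions listed above, the standard java.specification.version system property can be checked. This is a minimal sketch assuming a minimum of Java 7 as stated in the requirements; JavaVersionCheckSketch and its methods are hypothetical names, not part of the library.

```java
/** Hypothetical startup check for the minimum Java version listed above. */
final class JavaVersionCheckSketch {

    /** Returns the feature version: 7 for "1.7", 8 for "1.8", 10 for "10". */
    static int featureVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Before Java 9 the property looked like "1.7" or "1.8"; from Java 9 on it is "9", "10", ...
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    /** Returns true if the version is Java 7 or above. */
    static boolean meetsMinimum(String specVersion) {
        return featureVersion(specVersion) >= 7;
    }

    public static void main(String[] args) {
        String current = System.getProperty("java.specification.version");
        System.out.println("Java " + current + " supported: " + meetsMinimum(current));
    }
}
```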

Get Started

GroupDocs hosts all Java APIs at the GroupDocs Repository. You can use the GroupDocs.Redaction for Java API directly in your Maven projects with a simple configuration. For detailed instructions, please visit the Installation from GroupDocs Repository using Maven documentation page.

Sample Java code to convert DOCX to rasterized & redacted PDF

// Requires java.io.ByteArrayInputStream, java.io.ByteArrayOutputStream, java.io.FileOutputStream
// and the GroupDocs.Redaction classes used below.
ByteArrayInputStream inputStream = null;
// Step 1: rasterize the DOCX into an in-memory PDF before applying image redactions
final Redactor rasterizer = new Redactor("C:\\Temp\\sample.docx");
try
{
    // Perform annotation and textual redactions here, if needed
    ByteArrayOutputStream stream = new ByteArrayOutputStream();
    RasterizationOptions options = new RasterizationOptions();
    options.setEnabled(true);
    rasterizer.save(stream, options);
    inputStream = new ByteArrayInputStream(stream.toByteArray());
    stream.close();
}
finally { rasterizer.close(); }
if (inputStream != null)
{
    // Step 2: re-open the rasterized PDF document to redact its pages as images
    final Redactor redactor = new Redactor(inputStream);
    try
    {
        // Cover a 1050x720 area starting at point (1160, 2375) with a blue box
        RedactorChangeLog result = redactor.apply(new ImageAreaRedaction(new java.awt.Point(1160, 2375),
            new RegionReplacementOptions(java.awt.Color.BLUE, new java.awt.Dimension(1050, 720))));
        if (result.getStatus() != RedactionStatus.Failed)
        {
            final FileOutputStream fileStream = new FileOutputStream("C:\\Temp\\sample_docx_Raster.pdf");
            try
            {
                // The document is already rasterized, so save it without rasterizing again
                RasterizationOptions options = new RasterizationOptions();
                options.setEnabled(false);
                redactor.save(fileStream, options);
            }
            finally { fileStream.close(); }
        }
    }
    finally { redactor.close(); inputStream.close(); }
}


Version   Release Date
24.1      January 25, 2024
23.9      September 8, 2023
23.7      July 3, 2023
23.3      March 29, 2023
23.2      February 2, 2023
22.10     October 19, 2022
20.7      January 25, 2022
19.11     January 25, 2022
21.12     December 10, 2021
21.6      June 25, 2021
21.1      January 29, 2021
20.11     November 13, 2020
20.2      February 28, 2020
20.1      January 16, 2020
19.6      June 11, 2019