public final class WordsMetadataExtractor extends MetadataExtractor
Provides the functionality to extract the metadata from text documents.
Supported formats:
Format | Description |
---|---|
.DOC | Microsoft Word Text document |
.DOT | Microsoft Word Text template |
.DOCX | Microsoft Office Open XML Text document |
.DOCM | Microsoft Word macro-enabled document |
.RTF | Rich Text Format text file |
.ODT | OpenDocument text |
.HTML (.XHTML, .HTM) | Hypertext Markup Language document |
.MHTML (.MHT) | Web Archive Single File |
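Because the extractor handles only the formats listed above, a caller may want to check a file name before handing the document over. The following is a minimal sketch of such a check. The class `SupportedWordFormats` and its method `isSupported` are hypothetical helpers, not part of the library, and the extension set is simply copied from the list above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, NOT part of the library.
public final class SupportedWordFormats {
    // Extensions copied from the "Supported formats" list (lower case, no dot).
    private static final Set<String> EXTENSIONS = new HashSet<>(Arrays.asList(
            "doc", "dot", "docx", "docm", "rtf", "odt",
            "html", "xhtml", "htm", "mhtml", "mht"));

    private SupportedWordFormats() {
    }

    // Returns true when the file name ends with one of the listed extensions.
    public static boolean isSupported(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension at all
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(extension);
    }
}
```

The check looks at the file name only. It does not inspect the file content, so a misnamed file can still be rejected by the extractor itself.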
Extracting the metadata:
// Create a metadata extractor for text documents
MetadataExtractor metadataExtractor = new WordsMetadataExtractor();
// Extract the metadata from the stream (an open InputStream over the document)
MetadataCollection metadata = metadataExtractor.extractMetadata(stream);
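The sample above assumes that a `stream` variable already exists. One way to obtain it from a file and make sure it is closed afterwards is sketched below. `DocumentStreams`, `StreamReader` and `withDocumentStream` are hypothetical names introduced for illustration; only standard JDK calls are used, so the block runs without the library.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, NOT part of the library.
public final class DocumentStreams {
    // Callback that consumes the open stream and produces a result.
    @FunctionalInterface
    public interface StreamReader<T> {
        T read(InputStream stream) throws IOException;
    }

    private DocumentStreams() {
    }

    // Opens the file, hands the buffered stream to the reader, and always closes it.
    public static <T> T withDocumentStream(Path file, StreamReader<T> reader) throws IOException {
        try (InputStream stream = new BufferedInputStream(Files.newInputStream(file))) {
            return reader.read(stream);
        }
    }
}
```

With the library on the classpath, the call from the sample could then be written as `DocumentStreams.withDocumentStream(path, s -> new WordsMetadataExtractor().extractMetadata(s))`. This assumes that `extractMetadata` accepts an `InputStream` and declares no checked exception other than `IOException`.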
Constructor and Description |
---|
WordsMetadataExtractor() Initializes a new instance of the WordsMetadataExtractor class. |
Modifier and Type | Method and Description |
---|---|
protected MetadataCollection | extractMetadataFromStream(InputStream stream, LoadOptions loadOptions) Extracts the metadata from the stream. |
Methods inherited from class MetadataExtractor:
extractMetadata (four overloads)
public WordsMetadataExtractor()
Initializes a new instance of the WordsMetadataExtractor class.
protected MetadataCollection extractMetadataFromStream(InputStream stream, LoadOptions loadOptions)
Extracts the metadata from the stream.
extractMetadataFromStream in class MetadataExtractor
Parameters:
stream - The stream of the document.
loadOptions - The options for loading the file.
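The method is protected and is declared in MetadataExtractor, while callers use the inherited public extractMetadata methods. One plausible reading is a template-method arrangement, where the public entry point delegates to the protected, format-specific hook. The sketch below illustrates that arrangement only. Every type in it (`BaseExtractor`, `Options`, `ByteCountingExtractor`) is a hypothetical stand-in, and the delegation itself is an assumption, not the library's documented implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// All types below are hypothetical stand-ins, NOT the library's classes.
public final class ExtractorSketch {
    // Stand-in for the load options argument.
    static final class Options {
    }

    // Stand-in for the abstract base class.
    abstract static class BaseExtractor {
        // Public entry point: supplies default options and delegates to the hook
        // (assumed behavior, not confirmed by the documentation).
        public final String extract(InputStream stream) {
            if (stream == null) {
                throw new IllegalArgumentException("stream must not be null");
            }
            try {
                return extractFromStream(stream, new Options());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Protected hook that each format-specific subclass implements.
        protected abstract String extractFromStream(InputStream stream, Options options)
                throws IOException;
    }

    // Stand-in for a format-specific extractor; it only counts the bytes it reads.
    static final class ByteCountingExtractor extends BaseExtractor {
        @Override
        protected String extractFromStream(InputStream stream, Options options)
                throws IOException {
            int count = 0;
            while (stream.read() != -1) {
                count++;
            }
            return "bytes=" + count;
        }
    }

    private ExtractorSketch() {
    }

    // Runs the stand-in extractor over a three-byte in-memory stream.
    public static String demo() {
        return new ByteCountingExtractor().extract(new ByteArrayInputStream(new byte[] {1, 2, 3}));
    }
}
```

Under this reading, application code never calls the protected method directly; it goes through the public entry point, as in the sample at the top of the page.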
Copyright © 2018. All rights reserved.