public final class MarkdownFormattedTextExtractor extends TextExtractor implements ITextExtractorWithFormatter
Provides the formatted text extractor for Markdown (.md) documents.
Extracts a line of characters from a document:
// Create a text extractor for Markdown documents
TextExtractor extractor = new MarkdownFormattedTextExtractor(stream);
// Extract a line of the text
String line = extractor.extractLine();
// If the line is null, then the end of the file is reached
while (line != null) {
    // Print a line to the console
    System.out.println(line);
    // Extract another line
    line = extractor.extractLine();
}
Extracts all characters from a document:
// Create a text extractor for Markdown documents
TextExtractor extractor = new MarkdownFormattedTextExtractor(stream);
// Extract all the text
System.out.println(extractor.extractAll());
To set a formatter, use the DocumentFormatter property:
// Create a formatted text extractor for Markdown documents
MarkdownFormattedTextExtractor extractor = new MarkdownFormattedTextExtractor(stream);
// Set an HTML formatter for formatting
extractor.setDocumentFormatter(new HtmlDocumentFormatter()); // all the text will be formatted as HTML
By default, text is formatted as plain text by Formatters.Plain.PlainDocumentFormatter.
Constructor and Description |
---|
MarkdownFormattedTextExtractor(InputStream stream) Initializes a new instance of the MarkdownFormattedTextExtractor class from a stream. |
MarkdownFormattedTextExtractor(String fileName) Initializes a new instance of the MarkdownFormattedTextExtractor class from a file path. |
Modifier and Type | Method and Description |
---|---|
protected void | dispose(boolean disposing) Releases the unmanaged resources used by the extractor. |
DocumentFormatter | getDocumentFormatter() Gets a DocumentFormatter. |
protected String | prepareLine() Returns a line of the text. |
void | reset() Resets the current document. |
void | setDocumentFormatter(DocumentFormatter value) Sets a DocumentFormatter. |
Inherited methods: checkDisposed, close, dispose, extractAll, extractLine, extractText, extractTextLine, getEncoding, getMediaType, getPassword, isDisposed, setEncoding, setMediaType
public MarkdownFormattedTextExtractor(String fileName)
Initializes a new instance of the MarkdownFormattedTextExtractor class.
Parameters:
fileName - The path to the file.
public MarkdownFormattedTextExtractor(InputStream stream)
Initializes a new instance of the MarkdownFormattedTextExtractor class.
Parameters:
stream - The stream of the document.
public DocumentFormatter getDocumentFormatter()
Gets a DocumentFormatter.
Specified by:
getDocumentFormatter in interface ITextExtractorWithFormatter
Returns:
An instance of the DocumentFormatter. The default is PlainDocumentFormatter.
By default, text is formatted as plain text by the PlainDocumentFormatter class. You can set any other formatter, or null if you want to use the default formatter.
public void setDocumentFormatter(DocumentFormatter value)
Sets a DocumentFormatter.
Specified by:
setDocumentFormatter in interface ITextExtractorWithFormatter
Parameters:
value - An instance of the DocumentFormatter. The default is PlainDocumentFormatter.
By default, text is formatted as plain text by the PlainDocumentFormatter class. You can set any other formatter, or null if you want to use the default formatter.
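The null-handling rule described above can be illustrated with a small, self-contained sketch. It does not use the GroupDocs classes: `Formatter`, `PlainFormatter`, `HtmlFormatter`, and `FormatterHolder` are hypothetical stand-ins, and the fallback logic is an assumption about the documented behaviour, not the library's actual implementation.

```java
public class FormatterFallbackSketch {
    /** Hypothetical stand-in for DocumentFormatter. */
    interface Formatter {
        String format(String text);
    }

    /** Hypothetical stand-in for PlainDocumentFormatter: returns the text unchanged. */
    static final class PlainFormatter implements Formatter {
        @Override
        public String format(String text) {
            return text;
        }
    }

    /** Hypothetical stand-in for an HTML formatter: wraps the text in a paragraph tag. */
    static final class HtmlFormatter implements Formatter {
        @Override
        public String format(String text) {
            return "<p>" + text + "</p>";
        }
    }

    /** Holds the current formatter; passing null selects the default (assumed behaviour). */
    static final class FormatterHolder {
        private static final Formatter DEFAULT = new PlainFormatter();
        private Formatter formatter = DEFAULT;

        Formatter getFormatter() {
            return formatter;
        }

        void setFormatter(Formatter value) {
            // Fall back to the plain formatter when null is supplied
            formatter = (value != null) ? value : DEFAULT;
        }
    }

    /** Formats the text with the given formatter, or with the default when null is passed. */
    static String render(String text, Formatter formatter) {
        FormatterHolder holder = new FormatterHolder();
        holder.setFormatter(formatter);
        return holder.getFormatter().format(text);
    }

    public static void main(String[] args) {
        System.out.println(render("Hello", new HtmlFormatter()));
        System.out.println(render("Hello", null));
    }
}
```

With the real extractor, the equivalent calls would be `setDocumentFormatter(new HtmlDocumentFormatter())` and `setDocumentFormatter(null)`.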
public void reset()
Resets the current document. After a reset, the extractLine method returns the first line of the document.
reset in class TextExtractor
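The reset contract can be sketched without the library. The `LineSource` class below is a hypothetical stand-in for a line-based extractor, not the GroupDocs class; it only models the behaviour stated here: `extractLine()` returns null at the end of the document, and after `reset()` the next call returns the first line again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ResetSketch {
    /** Hypothetical stand-in for a line-based extractor. */
    static final class LineSource {
        private final List<String> lines;
        private int position = 0;

        LineSource(String... lines) {
            this.lines = Arrays.asList(lines);
        }

        /** Returns the next line, or null once the end of the document is reached. */
        String extractLine() {
            return position < lines.size() ? lines.get(position++) : null;
        }

        /** Moves back to the start so the next extractLine() call returns the first line. */
        void reset() {
            position = 0;
        }
    }

    /** Reads every remaining line until extractLine() returns null. */
    static List<String> readAll(LineSource source) {
        List<String> result = new ArrayList<>();
        for (String line = source.extractLine(); line != null; line = source.extractLine()) {
            result.add(line);
        }
        return result;
    }

    /** Reads the whole document, resets it, and returns the first line read afterwards. */
    static String firstLineAfterReset(LineSource source) {
        readAll(source);
        source.reset();
        return source.extractLine();
    }

    public static void main(String[] args) {
        LineSource source = new LineSource("# Title", "Body text");
        System.out.println(firstLineAfterReset(source));
    }
}
```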
protected void dispose(boolean disposing)
Releases the unmanaged resources used by the extractor.
dispose in class TextExtractor
Parameters:
disposing - true if invoked from dispose(); otherwise, false.
protected String prepareLine()
Returns a line of the text.
prepareLine in class TextExtractor
Copyright © 2018. All rights reserved.