public final class WordExtractor extends Object implements POIOLE2TextExtractor
| Constructor | Description |
|---|---|
| WordExtractor(DirectoryNode dir) | |
| WordExtractor(HWPFDocument doc) | Create a new Word Extractor |
| WordExtractor(InputStream is) | Create a new Word Extractor |
| WordExtractor(POIFSFileSystem fs) | Create a new Word Extractor |
| Modifier and Type | Method | Description |
|---|---|---|
| String[] | getCommentsText() | |
| HWPFDocument | getDocument() | |
| String[] | getEndnoteText() | |
| HWPFDocument | getFilesystem() | |
| String | getFooterText() | Deprecated (3.8 beta 4). |
| String[] | getFootnoteText() | |
| String | getHeaderText() | Deprecated (3.8 beta 4). |
| String[] | getMainTextboxText() | |
| String[] | getParagraphText() | Get the text from the Word file, as an array with one String per paragraph. |
| String | getText() | Grab the text, based on the WordToTextConverter. |
| String | getTextFromPieces() | Grab the text out of the text pieces. |
| boolean | isCloseFilesystem() | |
| void | setCloseFilesystem(boolean doCloseFilesystem) | |
| static String | stripFields(String text) | Removes any fields (e.g. macros, page markers, etc.) from the string. |
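Since getParagraphText() returns one String per paragraph, callers often want to tidy that array before using it. The following is a minimal sketch of such post-processing, assuming (this page does not state it) that entries may end with a line terminator and that empty paragraphs are not wanted. ParagraphCleanupSketch and nonEmptyParagraphs are hypothetical names and are not part of WordExtractor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Apache POI: tidies an array shaped like
// the result of getParagraphText().
final class ParagraphCleanupSketch {

    // Returns the paragraphs with trailing CR/LF removed, skipping null
    // entries and entries that contain only whitespace.
    static List<String> nonEmptyParagraphs(String[] paragraphs) {
        List<String> cleaned = new ArrayList<>();
        for (String paragraph : paragraphs) {
            if (paragraph == null) {
                continue;
            }
            // Assumption: an entry may end with "\r", "\n" or "\r\n".
            String stripped = paragraph.replaceAll("[\\r\\n]+\\z", "");
            if (!stripped.trim().isEmpty()) {
                cleaned.add(stripped);
            }
        }
        return cleaned;
    }

    public static void main(String[] args) {
        // Stand-in data; in real use this array would come from getParagraphText().
        String[] sample = {"Title\r\n", "\r\n", "Body text\r"};
        for (String paragraph : nonEmptyParagraphs(sample)) {
            System.out.println(paragraph);
        }
    }
}
```

The helper takes a plain String[] so it can be tried without a Word file at hand.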
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from implemented interfaces: getDocSummaryInformation, getMetadataTextExtractor, getRoot, getSummaryInformation, close

public WordExtractor(InputStream is) throws IOException
Parameters: is - InputStream containing the word file
Throws: IOException

public WordExtractor(POIFSFileSystem fs) throws IOException
Parameters: fs - POIFSFileSystem containing the word file
Throws: IOException

public WordExtractor(DirectoryNode dir) throws IOException
Throws: IOException

public WordExtractor(HWPFDocument doc)
Parameters: doc - The HWPFDocument to extract from

public String[] getParagraphText()

public String[] getFootnoteText()

public String[] getMainTextboxText()

public String[] getEndnoteText()

public String[] getCommentsText()

@Deprecated public String getHeaderText()

@Deprecated public String getFooterText()

public String getTextFromPieces()

public String getText()
Specified by: getText in interface POITextExtractor

public static String stripFields(String text)

public HWPFDocument getDocument()
Specified by: getDocument in interface POIOLE2TextExtractor
Specified by: getDocument in interface POITextExtractor

public void setCloseFilesystem(boolean doCloseFilesystem)
Specified by: setCloseFilesystem in interface POITextExtractor

public boolean isCloseFilesystem()
Specified by: isCloseFilesystem in interface POITextExtractor

public HWPFDocument getFilesystem()
Specified by: getFilesystem in interface POITextExtractor
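
To illustrate the idea behind stripFields(String text), here is a simplified, standalone sketch of field removal. It is not the library's implementation. It assumes the Word binary format convention that a field starts with character 0x13, optionally separates its code from its displayed result with 0x14, and ends with 0x15. It also assumes that the displayed result should be kept while the field code is dropped. FieldStripSketch and its constants are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified illustration only; not Apache POI's stripFields implementation.
final class FieldStripSketch {

    // Assumed field marker characters of the Word binary format.
    static final char FIELD_BEGIN = '\u0013';
    static final char FIELD_SEPARATOR = '\u0014';
    static final char FIELD_END = '\u0015';

    // Drops field codes and field markers, keeping ordinary text and the
    // displayed result of each field. Unbalanced markers are not repaired:
    // a stray separator or end marker outside any field is kept as-is, and
    // text after an unclosed begin marker is treated as field code.
    static String stripFields(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // One entry per open field: TRUE while still inside its code part.
        Deque<Boolean> inCode = new ArrayDeque<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == FIELD_BEGIN) {
                inCode.push(Boolean.TRUE);
            } else if (c == FIELD_SEPARATOR && !inCode.isEmpty()) {
                // The innermost field switches from code to result.
                inCode.pop();
                inCode.push(Boolean.FALSE);
            } else if (c == FIELD_END && !inCode.isEmpty()) {
                inCode.pop();
            } else if (!inCode.contains(Boolean.TRUE)) {
                // Emit only when no enclosing field is in its code part.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "Page " + FIELD_BEGIN + " PAGE " + FIELD_SEPARATOR + "1"
                + FIELD_END + " of 3";
        System.out.println(stripFields(raw)); // prints "Page 1 of 3"
    }
}
```

A stack is used so that a field nested inside another field's code part is dropped together with that code, while a field nested inside a result contributes its own result text.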