Document Types
DWR processes all file types. For the file extensions listed below, it also extracts content, assigns a category, or both.
The special file extensions and their treatment are:

General Document Containers
Contents of these containers are always extracted.
7z
lzh
rar
zip
zipx
tar
gz
tgz

Mail Containers
Contents of these containers are always extracted, preserving family relationships.
dbx
eml
emlx
mbox
mbs
mbx
msg
p7m (encrypted email format)
pmm
pst
qim
sbd

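To illustrate what preserving family relationships means for a mail container, here is a minimal Python sketch that parses a single message and records each attachment as a child of the parent email. The function name `extract_family`, the record fields, and the `DOC0001.1` style of child identifier are hypothetical choices for illustration; DWR's own record layout and numbering are not described on this page.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_family(raw_message: bytes, parent_id: str) -> list:
    """Return one record for the email plus one record per attachment.

    Each attachment record carries the parent's id, so the family
    (parent email and its children) can be reassembled later.
    """
    # policy.default makes the parser return an EmailMessage,
    # which provides iter_attachments().
    msg = message_from_bytes(raw_message, policy=policy.default)
    subject = msg["Subject"]
    records = [{
        "id": parent_id,
        "parent_id": None,  # the email itself is the head of the family
        "name": str(subject) if subject is not None else None,
    }]
    for index, part in enumerate(msg.iter_attachments(), start=1):
        records.append({
            "id": f"{parent_id}.{index}",  # hypothetical child id scheme
            "parent_id": parent_id,
            "name": part.get_filename(),
        })
    return records


# Build a small sample message with one attachment and extract its family.
sample = EmailMessage()
sample["Subject"] = "Quarterly report"
sample.set_content("See attached.")
sample.add_attachment(
    b"\x00\x01\x02",
    maintype="application",
    subtype="octet-stream",
    filename="data.bin",
)
family = extract_family(sample.as_bytes(), "DOC0001")
```

In this sketch `family` holds two records: the email with no parent, and `data.bin` pointing back to `DOC0001`. Formats such as pst or msg would need a dedicated parser; the standard library `email` package only covers RFC 5322 style messages such as eml.
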
Office Documents
These files are searched for embedded documents.
docx
pptx
ppsx
xlsx

Other Known Extensions
These extensions are recognized, but their content is not extracted.
ace
arc
arj
bz (Unix archive)
bz2
bza
cpio
czip
dmg
gca
gza
gzip
hqx
lha
lzs
nsf (Lotus Notes mail)
ost (Outlook/Exchange cache file)
pak
pk3
rpm
sea
sit (Mac StuffIt file)
sitx (Mac StuffIt extended file)
snm (Unix mail)
taz
tbz
wad
war
x-stu

Load File Formats
These are recognized as standard production load files.
opt
lfp

Assumed Image File Types
bmp
dwg
gif
jpeg
jpg
png
tif
tiff

Assumed Binary File Types
bin
cmd
com
dll
exe
msp
ocx
reg
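
The groupings on this page amount to a lookup from file extension to treatment. A minimal Python sketch of such a lookup follows. The function name `classify_extension`, the category labels, and the "unlisted" fallback are illustrative assumptions, not DWR's internal names, and DWR's actual matching rules (for example, how compound extensions are handled) are not documented here.

```python
from pathlib import PurePath

# Extension groups copied from the lists on this page.
# The category labels (dictionary keys) are hypothetical.
CATEGORIES = {
    "general container": {
        "7z", "lzh", "rar", "zip", "zipx", "tar", "gz", "tgz",
    },
    "mail container": {
        "dbx", "eml", "emlx", "mbox", "mbs", "mbx",
        "msg", "p7m", "pmm", "pst", "qim", "sbd",
    },
    "office document": {"docx", "pptx", "ppsx", "xlsx"},
    "other known": {
        "ace", "arc", "arj", "bz", "bz2", "bza", "cpio", "czip", "dmg",
        "gca", "gza", "gzip", "hqx", "lha", "lzs", "nsf", "ost", "pak",
        "pk3", "rpm", "sea", "sit", "sitx", "snm", "taz", "tbz", "wad",
        "war", "x-stu",
    },
    "load file": {"opt", "lfp"},
    "image": {"bmp", "dwg", "gif", "jpeg", "jpg", "png", "tif", "tiff"},
    "binary": {"bin", "cmd", "com", "dll", "exe", "msp", "ocx", "reg"},
}


def classify_extension(filename: str) -> str:
    """Return the category label for a file name, or "unlisted"."""
    # Only the final suffix is considered, compared case-insensitively.
    ext = PurePath(filename).suffix.lstrip(".").lower()
    for category, extensions in CATEGORIES.items():
        if ext in extensions:
            return category
    return "unlisted"
```

For example, `classify_extension("Report.DOCX")` returns `"office document"`, and a name with no listed extension, such as `notes.txt`, returns `"unlisted"`. Because only the final suffix is examined, `backup.tar.gz` is matched on `gz`.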