Microsoft's IE7 Alters Gmail Contents

A friend sent out the latest clip for Ghost Humpers, our episodic mockumentary. Downloading the QuickTime movie from Gmail gave me a zipped version, which I thought was odd. After a little searching, I found that Gmail has an option to download attachments as a zip file: simply replace "disp=attd" with "disp=zip" in the attachment URL. IE7 was changing this parameter to "disp=indzip" on its own, with the same result. This piqued my curiosity.
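
As a small illustration of that URL change, here is a sketch in shell. The example URL is made up for the demonstration; only the "disp=attd" and "disp=zip" values come from what I saw in Gmail, and to_zip_url is just a name I picked.

```shell
#!/bin/sh
# Rewrite the "disp" parameter of an attachment URL from attd to zip.
# to_zip_url is a hypothetical helper name, not anything Gmail or IE7 provides.
to_zip_url() {
    printf '%s\n' "$1" | sed 's/disp=attd/disp=zip/'
}

# Hypothetical URL shape, used only to show the substitution.
to_zip_url 'https://mail.example.com/mail/?view=att&disp=attd'
```

The same substitution with "disp=indzip" on the right-hand side would reproduce the value IE7 was producing.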

I searched Gmail for more attachments. A WMV file gave me an unmodified URL, as did a DOC file. Files with the Apple-specific extensions MOV, MOVIE, MOOV, MOVIEPROJ, QT, and QTCH gave the modified URL. What was going on?

Firefox, Safari, and Opera all gave me the unmodified URLs for every file. Therefore, there were only two options: either Google was giving IE7 a modified URL, or Microsoft had coded IE7 to look for a list of extensions within Gmail and modify the attachment URL. The first option seemed highly unlikely.

The result of what Microsoft had done was obvious: a stumbling block stood in the way of getting to files downloaded through Gmail's webmail. It required unzipping the file, and it only appeared with Apple-specific files. Why would Microsoft go to all the trouble of coding IE7 to make it harder for Gmail users to get to files that weren't from Microsoft? And what other kinds of files was Microsoft's product coded to look for? There was one way I could find out.

By copying every page of extensions from filext.com to a text file called extensions.txt, I had a list of every extension they are aware of. How many extensions were in the list?

wc -l extensions.txt
This showed 24656 lines. The file contained a bunch of superfluous information on every line, so I removed everything except the extension itself:
sed -i 's/ .*//' extensions.txt
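The file also contained many duplicate file extensions. Since uniq only drops adjacent duplicates, sorting first makes the result independent of the order of the list:
sort -u extensions.txt > ext.txt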
The file ext.txt contained 14037 lines. Now to create 14037 files:
while IFS= read -r line; do echo "${line}.${line}" > "${line}.${line}"; done < ext.txt
Using Evolution and Gmail's SMTP server, I sent myself all these files in batches of about 100. I wrote down the first and last extension of each batch to keep track of my progress, and noted every batch that was rejected. Of 137 batches sent, 27 were rejected for containing extensions banned by Google.
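
I did the batching by hand, but it could be sketched roughly as follows. The function names and the batch_ prefix are my own placeholders, and the batch size of 100 only approximates what I actually sent.

```shell
#!/bin/sh
# make_batches LIST PREFIX [SIZE]
# Split a list of extensions into files of at most SIZE lines (default 100).
# split names the pieces PREFIXaa, PREFIXab, PREFIXac, and so on.
make_batches() {
    split -l "${3:-100}" "$1" "$2"
}

# batch_ranges PREFIX
# Print the first and last extension of each batch, for a progress log.
batch_ranges() {
    for f in "$1"*; do
        printf '%s: %s .. %s\n' "$f" "$(head -n 1 "$f")" "$(tail -n 1 "$f")"
    done
}
```

Running make_batches ext.txt batch_ followed by batch_ranges batch_ would give one line per batch, which is the same bookkeeping I kept on paper.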

By dividing up each rejected batch into smaller and smaller chunks, a list of 33 banned extensions emerged:

ADE
ADP
BAT
CHM
CMD
COM
CPL
EXE
HTA
INS
ISP
JSE
LIB
MDE
MSC
MSP
MST
PIF
SCR
SCT
SHB
SYS
TAR
TAZ
TGZ
VB
VBE
VBS
VXD
WSC
WSF
WSH
ZIP
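
The narrowing-down of rejected batches amounts to a repeated halving, which might be sketched like this. In my case "rejected" meant Gmail refusing the message; here batch_rejected is a hypothetical stand-in that checks a batch against a file of known banned extensions (BANNED_FILE, also an assumption) so the sketch can run offline.

```shell
#!/bin/sh
# Hypothetical stand-in for "send this batch and see whether it is rejected".
# A batch counts as rejected if any of its lines appears in $BANNED_FILE.
batch_rejected() {
    grep -qxFf "$BANNED_FILE" "$1"
}

# find_banned FILE
# Print every extension in FILE that triggers a rejection, by splitting
# rejected batches in half until a single extension is left.
find_banned() {
    batch_rejected "$1" || return 0          # accepted batch: nothing to find
    n=$(($(wc -l < "$1")))                   # arithmetic strips any padding
    if [ "$n" -le 1 ]; then
        cat "$1"                             # a lone rejected extension
        return 0
    fi
    half=$(( (n + 1) / 2 ))
    head -n "$half" "$1" > "$1.a"            # first half
    tail -n +"$((half + 1))" "$1" > "$1.b"   # second half
    find_banned "$1.a"
    find_banned "$1.b"
    rm -f "$1.a" "$1.b"                      # clean up the temporary halves
}
```

With a real mail setup, batch_rejected would be replaced by whatever sends the batch and reports the outcome; the halving logic stays the same.
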
From within Gmail's webmail, using IE7, I opened each batch and slid the cursor down every attachment while watching the corresponding URL. It was immediately apparent that IE7 was programmed with an exception list rather than a target list, as almost every URL was modified. There were 31 exceptions:
DOC
GIF
GIF2
GIF87
GIF89A
GIFA
GIFENX
GIFF
GG
ID3
JPE
JPEG
JPG
MP2
MP3
MPE
MPEG
MPG
MPGA
PDF
PNG
PPS
PPT
SGIF
TIF
TIFF
WAV
WAVEBNK
WMA
WMV
XLS
Google already has safeguards in place that ban executable files by extension, so Microsoft can't claim to be protecting users. It went to all the trouble of building Gmail-specific code into IE7 in order to give people zipped versions of files that aren't on its list. No other browser puts this petty stumbling block in people's way.

Note: IE7 only mucks with Gmail in standard mode.

-Caleb, 6-12-2008

Addendum: Google's list of banned extensions has no bearing on the article's premise. However, it should be noted that TAR, TAZ, TGZ, and ZIP are banned if they are invalid files or contain executables. The article left open the "highly unlikely" possibility of Google being responsible. This was reduced to "infinitesimally small" by changing the user agent of Firefox to IE7 and testing. (K5 Discussion)