Debian Bug report logs - #838860
Version 32+ Flash (SWF) files detected as 'application/octet-stream' (data), not 'application/x-shockwave-flash' by file (libmagic1)
Report forwarded
to debian-bugs-dist@lists.debian.org, Christoph Biedl <debian.axhn@manchmal.in-ulm.de>: Bug#838860; Package file.
(Sun, 25 Sep 2016 20:30:04 GMT)
Acknowledgement sent
to Laurence Parry <greenreaper@gmail.com>:
New Bug report received and forwarded. Copy sent to Christoph Biedl <debian.axhn@manchmal.in-ulm.de>.
(Sun, 25 Sep 2016 20:30:04 GMT)
Package: file
Version: 1:5.22+15-2+deb8u2
Tags: upstream
Flash files compiled with -swf-version=32 or above are being recognized as
'application/octet-stream' (data) rather than
'application/x-shockwave-flash' due to a restriction in the magic
definition file.
Command:
file -b --mime-type test.swf
Expected output:
application/x-shockwave-flash
Actual output (with jessie and testing):
application/octet-stream
Hex dump of first 16 bytes:
hd -n 16 test.swf
00000000  43 57 53 20 f9 27 53 00  78 9c 94 9a 55 50 1d d1  |CWS .'S.x...UP..|
(the full file has not yet been publicly released by the creator)
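For reference, the eight fixed header bytes in the dump above can be decoded with a few lines of Python. This is only an illustrative sketch; the function name parse_swf_header is made up for this note and is not part of file or libmagic.

```python
import struct

def parse_swf_header(first8: bytes):
    """Split the fixed 8-byte SWF header into (signature, version, length).

    Hypothetical helper for illustration only.
    """
    if len(first8) < 8:
        raise ValueError("need at least 8 bytes")
    # 3-byte signature, 1-byte version, 4-byte little-endian length
    signature, version, length = struct.unpack("<3sBI", first8[:8])
    if signature not in (b"FWS", b"CWS", b"ZWS"):
        raise ValueError("not an SWF signature")
    return signature.decode("ascii"), version, length

# First eight bytes of the hex dump in this report
header = bytes.fromhex("43 57 53 20 f9 27 53 00")
print(parse_swf_header(header))  # ('CWS', 32, 5449721)
```

The version byte 0x20 is 32, which is also the ASCII code for a space.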
This bug was introduced in 2014 in version 1.10 of the flash magic
definition used by file (via libmagic1 / libmagic-mgc) in an attempt to fix
Debian bug #745546
https://github.com/file/file/commit/281578a58328ed76ea2b00c03c3e45f36203c354#diff-ea5efd5565ac4dfd72536c835cab977c
This appears to be the current upstream version. The version in wheezy is
not affected.
It was assumed that the version number would remain below 32 "for the time
being". This time has passed. Version 32 was published in May 2016, and it
is already up to 34:
http://www.adobe.com/devnet/articles/flashplayer-air-feature-list.html
We detected this issue when our web application refused an SWF file created
by an animator.
It may be prudent to assume that the full version byte may be used.
However, this would trigger the issue mentioned in #745546:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=745546
i.e. misdetection of this file as a 516MB SWF:
http://git.openttd.org/?p=trunk.git;a=blob;f=os/dos/cwsdpmi/cwsdpmi.txt
Alternatives which would preserve the fix for #745546 might be to permit
versions below 48 ('0') or 65 ('A'), and/or to test for a sane length, e.g.
0 string CWS Macromedia Flash data (compressed),
>3 byte x version %d,
>>4 lelong <0x20000000 length %d bytes
!:mime application/x-shockwave-flash
This refuses a 512MB compressed Flash file. I am not aware of anyone who's
created such a file, but it is technically possible (e.g. Flash games with
very large embedded flash videos).
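To illustrate the length test in the proposed rule, here is a rough Python equivalent. It is a sketch of the idea, not of how libmagic evaluates magic entries. The name looks_like_cws is hypothetical, and the eight bytes used for cwsdpmi.txt are an assumption: the text is taken to begin with "CWSDPMI ", which is consistent with the 516MB figure quoted above.

```python
MAX_LENGTH = 0x20000000  # 512 MiB, the bound used in the proposed rule

def looks_like_cws(head: bytes) -> bool:
    """Hypothetical check: 'CWS' signature plus a declared length below 512 MiB."""
    if len(head) < 8 or head[:3] != b"CWS":
        return False
    # Bytes 4..7 hold the declared length, little-endian
    length = int.from_bytes(head[4:8], "little")
    return length < MAX_LENGTH

swf_head = bytes.fromhex("43575320f9275300")  # from the hex dump in this report
text_head = b"CWSDPMI "                       # ASSUMED start of cwsdpmi.txt

print(looks_like_cws(swf_head))   # True
print(looks_like_cws(text_head))  # False: "PMI " reads as 0x20494d50 bytes
```

Under that assumption the text file is rejected because its "length" of 0x20494d50 exceeds the bound, while the version byte ('D', 68) is never consulted.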
We've worked around this bug by adding a previous version of the magic
definition to /etc/magic for now.
--
Laurence "GreenReaper" Parry
http://www.greenreaper.co.uk/ - https://inkbunny.net
Information forwarded
to debian-bugs-dist@lists.debian.org: Bug#838860; Package file.
(Mon, 17 Oct 2016 19:33:07 GMT)
Acknowledgement sent
to Christoph Biedl <debian.axhn@manchmal.in-ulm.de>:
Extra info received and forwarded to list.
(Mon, 17 Oct 2016 19:33:07 GMT)
From: Christoph Biedl <debian.axhn@manchmal.in-ulm.de>
To: Laurence Parry <greenreaper@gmail.com>, 838860@bugs.debian.org
Subject: Re: Bug#838860: Version 32+ Flash (SWF) files detected as
'application/octet-stream' (data), not 'application/x-shockwave-flash' by
file (libmagic1)
Laurence Parry wrote...
> It was assumed that the version number would remain below 32 "for the time
> being". This time has passed. Version 32 was published in May 2016, and it
> is already up to 34:
> http://www.adobe.com/devnet/articles/flashplayer-air-feature-list.html
> We detected this issue when our web application refused an SWF file created
> by an animator.
Thanks for the catch, although this is rather bad news for the file
program. As any value from 32 on is a printable character, there will
always be a risk of mis-detection.
> Alternatives which would preserve the fix for #745546 might be to permit
> versions below 48 ('0') or 65 ('A'), and/or to test for a sane length, e.g.
>
> 0 string CWS Macromedia Flash data (compressed),
> >3 byte x version %d,
> >>4 lelong <0x20000000 length %d bytes
> !:mime application/x-shockwave-flash
>
> This refuses a 512MB compressed Flash file. I am not aware of anyone who's
> created such a file, but it is technically possible (e.g. Flash games with
> very large embedded flash videos).
I'm not really happy about this and could use more ideas. Assuming you
have a major collection of such files, is there anything in the
following header octets (FrameSize, FrameRate, FrameCount) that
is fairly certain not to be printable?
Christoph
Information forwarded
to debian-bugs-dist@lists.debian.org, Christoph Biedl <debian.axhn@manchmal.in-ulm.de>: Bug#838860; Package file.
(Tue, 18 Oct 2016 04:27:03 GMT)
Acknowledgement sent
to "Laurence Parry" <greenreaper@gmail.com>:
Extra info received and forwarded to list. Copy sent to Christoph Biedl <debian.axhn@manchmal.in-ulm.de>.
(Tue, 18 Oct 2016 04:27:03 GMT)
To: "Christoph Biedl" <debian.axhn@manchmal.in-ulm.de>,
<838860@bugs.debian.org>
Subject: Re: Bug#838860: Version 32+ Flash (SWF) files detected as 'application/octet-stream' (data), not 'application/x-shockwave-flash' by file (libmagic1)
Date: Tue, 18 Oct 2016 05:24:02 +0100
Perhaps, though the SWF format does not make it easy . . .
== FWS ==
https://www.adobe.com/content/dam/Adobe/en/devnet/swf/pdf/swf-file-format-spec.pdf
(See Appendix A for another walkthrough)
All integer values are little-endian byte order, but big-endian bit order
within bytes. Signed integers have typical twos-complement arithmetic
including sign-extension.
To start with, FrameSize is a RECT - a variable-length structure starting
with an _unsigned_ five-bit value determining how many bits the other four
_signed_ bit-values (Xmin, Xmax, Ymin, Ymax) each have. If it starts 01011
in bitstream order, the next eleven bits are Xmin, and so on.
On the plus side, "FrameSize RECT always has Xmin and Ymin value of 0." So
we could create 31 cases depending on the value of the ninth octet equating
to a particular bitmask and then check for 0-values for Xmin and Ymin [which
vary in length and, for Ymin, position, depending on their length].
In other words, in this particular case, we check in bitstream order for:
[01011|00000000000|xxxxxxxxxxx|00000000000]
[mask |   Xmin    |   Xmax    |   Ymin    ]
I foresee lots of & and ^, unfortunately. But it should be possible. Could
short-cut it a bit, since for all but the 1- and 2-bit cases, the rest of
the ninth octet must be 0 in order to match Xmin, so it's not necessary to
mask the ninth octet to match the first five bits.
FrameRate and FrameCount might be useful, too. Note that as integers, they
are byte-aligned, with zero-padding at the end of the preceding RECT if
necessary.
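As a concrete illustration of the RECT layout described above, the following Python sketch decodes the structure from raw bytes. It is not taken from file or any Adobe tool; parse_rect is a hypothetical name, and the sample bytes in the usage line are a constructed 11000 x 8000 twip rectangle, not data from the reported file.

```python
def parse_rect(data: bytes):
    """Decode an SWF RECT from the start of data (hypothetical helper).

    Returns (nbits, xmin, xmax, ymin, ymax, size_in_bytes).
    """
    if not data:
        raise ValueError("empty input")
    nbits = data[0] >> 3              # unsigned 5-bit field width
    total_bits = 5 + 4 * nbits
    size = (total_bits + 7) // 8      # RECT is zero-padded to a whole byte
    if len(data) < size:
        raise ValueError("truncated RECT")
    # Big-endian bit order within bytes: read the bytes as one big integer.
    stream = int.from_bytes(data[:size], "big")
    pad = size * 8 - total_bits       # padding bits at the end

    def field(index: int) -> int:
        # index 0..3 selects Xmin, Xmax, Ymin, Ymax in bitstream order
        shift = pad + (3 - index) * nbits
        raw = (stream >> shift) & ((1 << nbits) - 1)
        # Two's-complement sign extension
        if nbits and raw >= 1 << (nbits - 1):
            raw -= 1 << nbits
        return raw

    return (nbits, field(0), field(1), field(2), field(3), size)

# Constructed example: Nbits = 15, Xmax = 11000, Ymax = 8000 (twips)
print(parse_rect(bytes.fromhex("7800055f00000fa000")))
# (15, 0, 11000, 0, 8000, 9)
```

The returned size tells where the byte-aligned FrameRate and FrameCount fields would begin.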
--
There is another problem: those octets are only guaranteed to be available
for FWS. In the case of CWS or ZWS, the files are compressed after Length
with ZLIB (introduced in SWF 6) or LZMA (SWF 13) respectively.
The file in question was CWS, and I understand this to be the default option
in current versions of Adobe software, which are also the ones most likely
to be saving files in the latest versions. Reviewing an assortment of the
latest SWF files uploaded to our website, the division is 60%/40% CWS/FWS.
The compressed length relates to the actual length of the file, but I don't
think libmagic can use that. However, the files must be in the corresponding
compressed formats, which have their own headers that may be of use.
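Outside libmagic, a program with access to a zlib library can still reach the FrameSize octets of a CWS file by inflating only the first few bytes after Length. The sketch below shows the idea in Python; swf_body_prefix is a hypothetical name, and ZWS is deliberately left unsupported here.

```python
import zlib

def swf_body_prefix(data: bytes, want: int = 16) -> bytes:
    """Return up to `want` bytes of the SWF body that follows the 8-byte header.

    Hypothetical helper: FWS is read directly, CWS is inflated partially.
    """
    sig = data[:3]
    if sig == b"FWS":
        return data[8:8 + want]
    if sig == b"CWS":
        # max_length stops inflation once `want` bytes have been produced
        return zlib.decompressobj().decompress(data[8:], want)
    raise ValueError("unsupported or unknown signature: %r" % sig)
```

This does not help a magic definition, which only sees the raw bytes, but it may be useful for applications that validate uploads themselves.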
== ZLIB (CWS) ==
https://www.ietf.org/rfc/rfc1950.txt
CM (compression method) nibble is always 8, and the CINFO (compression info)
nibble which defines the base-2 logarithm of the LZ77 window size, minus
eight, must be 7 or below. In all the files I have examined, it is 7;
however it could theoretically be something else. This means the ninth byte
of a CWS file is 0xN8 , where N <= 7; and commonly it is 0x78 ('x'). [Note:
it is perfectly possible for an uncompressed FWS file to have an 0x78 in the
9th position.]
The flag octet after it is commonly 0x9C ('œ' in Windows-1252) but this is
not guaranteed; I have also seen 0xDA ('Ú'), and other values may be
expected, so I would not rely on it. Beyond that is the possible
dictionary ID and then compressed data.
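The constraints above can be combined with one more rule from RFC 1950 that is not mentioned in this message: CMF and FLG, read as a 16-bit big-endian number, must be a multiple of 31. A minimal Python sketch, with the hypothetical name plausible_zlib_header, might look like this:

```python
def plausible_zlib_header(cmf: int, flg: int) -> bool:
    """Check the two zlib header octets (octets 9 and 10 of a CWS file).

    Hypothetical helper based on RFC 1950, section 2.2.
    """
    cm = cmf & 0x0F        # compression method, low nibble
    cinfo = cmf >> 4       # log2(window size) - 8, high nibble
    if cm != 8 or cinfo > 7:
        return False
    # FCHECK: (CMF * 256 + FLG) must be a multiple of 31
    return (cmf * 256 + flg) % 31 == 0

print(plausible_zlib_header(0x78, 0x9C))  # True
print(plausible_zlib_header(0x78, 0xDA))  # True
print(plausible_zlib_header(0x78, 0x74))  # False ("xt" as plain text)
```

Because only about one in 31 second octets satisfies the check, this should reject most plain text that happens to start with "CWS", although it cannot rule out every collision.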
== LZMA (ZWS) ==
http://www.7-zip.org/a/lzma-specification.7z
with a summary at
https://svn.python.org/projects/external/xz-5.0.3/doc/lzma-file-format.txt
I don't have any of these SWF files to hand, but the specification above
notes that LZMA Utils only creates files with lc/lp/pb values 3/0/2. This
would correspond to a properties byte of 0x5d (9th octet). There is also a
little-endian dictionary size and a file length, which may be all FF if it
is unknown. For comparison, one bare .lzma file looks like this:
00000000  5d 00 00 80 00 ff ff ff  ff ff ff ff ff 00 16 e9  |]...............|
But it is technically possible to create a valid LZMA stream with other
property bytes, and presumably these would be valid SWF files as well.
Perhaps it's possible to delegate to the LZMA and ZLIB magic to test this?
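To show how the bare .lzma header above breaks down, here is a small Python sketch that decodes its 13 bytes as described in the linked lzma-file-format.txt. The name parse_lzma_alone_header is hypothetical, and the sketch says nothing about where these fields sit inside a ZWS file.

```python
def parse_lzma_alone_header(header: bytes) -> dict:
    """Decode the 13-byte header of a bare .lzma file (hypothetical helper)."""
    if len(header) < 13:
        raise ValueError("need at least 13 bytes")
    props = header[0]
    if props >= 9 * 5 * 5:
        raise ValueError("invalid properties byte")
    # props = (pb * 5 + lp) * 9 + lc
    lc = props % 9
    lp = (props // 9) % 5
    pb = props // 45
    dict_size = int.from_bytes(header[1:5], "little")
    raw_size = int.from_bytes(header[5:13], "little")
    # All FF means the uncompressed size is unknown
    size = None if raw_size == 0xFFFFFFFFFFFFFFFF else raw_size
    return {"lc": lc, "lp": lp, "pb": pb,
            "dict_size": dict_size, "uncompressed_size": size}

sample = bytes.fromhex("5d 00 00 80 00 ff ff ff ff ff ff ff ff")
print(parse_lzma_alone_header(sample))
# {'lc': 3, 'lp': 0, 'pb': 2, 'dict_size': 8388608, 'uncompressed_size': None}
```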
--
Laurence "GreenReaper" Parry
http://greenreaper.co.uk - https://inkbunny.net
Information forwarded
to debian-bugs-dist@lists.debian.org, Christoph Biedl <debian.axhn@manchmal.in-ulm.de>: Bug#838860; Package file.
(Fri, 20 Jan 2017 20:57:03 GMT)
Acknowledgement sent
to Laurence Parry <greenreaper@hotmail.com>:
Extra info received and forwarded to list. Copy sent to Christoph Biedl <debian.axhn@manchmal.in-ulm.de>.
(Fri, 20 Jan 2017 20:57:03 GMT)
To: "838860@bugs.debian.org" <838860@bugs.debian.org>
Subject: Re: Bug#838860: Version 32+ Flash (SWF) files detected as
'application/octet-stream' (data), not 'application/x-shockwave-flash' by
file (libmagic1)
Date: Fri, 20 Jan 2017 20:56:05 +0000
Should this be submitted upstream via https://bugs.gw.com? I have not done
so myself because the FAQ suggests that the maintainer should, if necessary.
I appreciate that it'd be nice to have a Debian-developed resolution, as
this issue was triggered by a fix for another Debian issue. However, it'd
also be nice to resolve this upstream before Debian 9 is released, as there
will be an increasing number of Flash files with such versions over its
lifetime.
Ideally all three styles of SWF file could be distinguished from
regular text files - but failing that, reverting the version test but
testing for the presence of ZLIB compression bytes for CWS files, as
documented above, would at least avoid a regression of the original issue
with CWSDPMI.TXT in openttd:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=745546
If more sample files are required for testing, I suspect Newgrounds would
provide a fertile source.
--
Laurence "GreenReaper" Parry
Information forwarded
to debian-bugs-dist@lists.debian.org: Bug#838860; Package file.
(Wed, 25 Jan 2017 20:45:02 GMT)
Acknowledgement sent
to Christoph Biedl <debian.axhn@manchmal.in-ulm.de>:
Extra info received and forwarded to list.
(Wed, 25 Jan 2017 20:45:02 GMT)
From: Christoph Biedl <debian.axhn@manchmal.in-ulm.de>
To: Laurence Parry <greenreaper@gmail.com>, 838860@bugs.debian.org
Subject: Re: Bug#838860: Version 32+ Flash (SWF) files detected as
'application/octet-stream' (data), not 'application/x-shockwave-flash' by
file (libmagic1)
Laurence Parry wrote...
> Perhaps, though the SWF format does not make it easy . . .
Thanks a lot for your input, and sorry for the long delay. I'll try to
find a solution that covers at least the vast majority of the files that
are out there. Frankly speaking, file(1) cannot be perfect and will
never be. But we can at least aim.
> == FWS ==
(...)
> On the plus side, "FrameSize RECT always has Xmin and Ymin value of 0." So
> we could create 31 cases depending on the value of the ninth octet equating
> to a particular bitmask and then check for 0-values for Xmin and Ymin [which
> vary in length and, for Ymin, position, depending on their length].
>
> In other words, in this particular case, we check in bitstream order for:
> [01011|00000000000|xxxxxxxxxxx|00000000000]
> [mask |   Xmin    |   Xmax    |   Ymin    ]
>
> I foresee lots of & and ^, unfortunately. But it should be possible. Could
> short-cut it a bit, since for all but the 1- and 2-bit cases, the rest of
> the ninth octet must be 0 in order to match Xmin, so it's not necessary to
> mask the ninth octet to match the first five bits.
That is something to work on. Most notably, a mask length of six or more
requires that the following octet has a value of at most 0x1f, i.e. it is
non-printable. This leaves six cases to examine, which is feasible.
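The constraint can be written down as a small Python check on the ninth and tenth octets of an FWS file. This is a sketch that generalises the observation to every field width, assuming Xmin is always zero as the specification states; fws_framesize_prefix_ok is a hypothetical name, not something from file's sources.

```python
def fws_framesize_prefix_ok(ninth: int, tenth: int) -> bool:
    """Check that every Xmin bit visible in octets 9 and 10 is zero.

    Hypothetical helper; assumes Xmin == 0 for all valid SWF files.
    """
    nbits = ninth >> 3                       # 5-bit field width
    # Xmin bits sharing the ninth octet with the width field (at most 3)
    k9 = min(nbits, 3)
    mask9 = ((1 << k9) - 1) << (3 - k9)
    # Xmin bits that spill into the top of the tenth octet (at most 8)
    k10 = min(max(nbits - 3, 0), 8)
    mask10 = (0xFF << (8 - k10)) & 0xFF
    return (ninth & mask9) == 0 and (tenth & mask10) == 0

print(fws_framesize_prefix_ok(0x78, 0x00))  # True: width 15, Xmin bits all zero
print(fws_framesize_prefix_ok(0x30, 0x20))  # False: width 6, tenth octet > 0x1f
```

For widths of six and above, mask10 always covers the top three bits, so the tenth octet cannot exceed 0x1f, matching the case analysis above.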
> == ZLIB (CWS) ==
> CM (compression method) nibble is always 8, and the CINFO (compression info)
> nibble which defines the base-2 logarithm of the LZ77 window size, minus
> eight, must be 7 or below. In all the files I have examined, it is 7;
> however it could theoretically be something else. This means the ninth byte
> of a CWS file is 0xN8 , where N <= 7; and commonly it is 0x78 ('x'). [Note:
> it is perfectly possible for an uncompressed FWS file to have an 0x78 in the
> 9th position.]
You brought back old memories. I remember I had to detect compressed
files before, might have been git's packed files. However, this is one
of the places where I'd sacrifice perfection for a solution that is good
enough for most cases.
> == LZMA (ZWS) ==
> I don't have any of these SWF files to hand, but the specification above
> notes that LZMA Utils only creates files with lc/lp/pb values 3/0/2. This
> would correspond to a properties byte of 0x5d (9th octet). There is also a
> little-endian dictionary size and a file length, which may be all FF if it
> is unknown. For comparison, one bare .lzma file looks like this:
>
> 00000000  5d 00 00 80 00 ff ff ff  ff ff ff ff ff 00 16 e9  |]...............|
So we'll have to guess here anyway. For all three I'll try to come up
with something suitable within the next few hours (uploads targeting
stretch should be done by tomorrow). Upstreaming them will be my job,
too.
> Perhaps it's possible to delegate to the LZMA and ZLIB magic to test this?
I'll keep that in mind. It might require a major change in file's
architecture.
Christoph
Reply sent
to Christoph Biedl <debian.axhn@manchmal.in-ulm.de>:
You have taken responsibility.
(Thu, 26 Jan 2017 01:51:08 GMT)
Notification sent
to Laurence Parry <greenreaper@gmail.com>:
Bug acknowledged by developer.
(Thu, 26 Jan 2017 01:51:08 GMT)
From: Christoph Biedl <debian.axhn@manchmal.in-ulm.de>
To: 838860-close@bugs.debian.org
Subject: Bug#838860: fixed in file 1:5.29-3
Date: Thu, 26 Jan 2017 01:48:40 +0000
Source: file
Source-Version: 1:5.29-3
We believe that the bug you reported is fixed in the latest version of
file, which is due to be installed in the Debian FTP archive.
A summary of the changes between this version and the previous one is
attached.
Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to 838860@bugs.debian.org,
and the maintainer will reopen the bug report if appropriate.
Debian distribution maintenance software
pp.
Christoph Biedl <debian.axhn@manchmal.in-ulm.de> (supplier of updated file package)
(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing ftpmaster@ftp-master.debian.org)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Format: 1.8
Date: Thu, 26 Jan 2017 00:29:24 +0100
Source: file
Binary: file libmagic1 libmagic-mgc libmagic-dev python-magic python3-magic
Architecture: source powerpc all
Version: 1:5.29-3
Distribution: unstable
Urgency: medium
Maintainer: Christoph Biedl <debian.axhn@manchmal.in-ulm.de>
Changed-By: Christoph Biedl <debian.axhn@manchmal.in-ulm.de>
Description:
file - Recognize the type of data in a file using "magic" numbers
libmagic-dev - Recognize the type of data in a file using "magic" numbers - deve
libmagic-mgc - File type determination library using "magic" numbers (compiled m
libmagic1 - Recognize the type of data in a file using "magic" numbers - libr
python-magic - Recognize the type of data in a file using "magic" numbers - Pyth
python3-magic - Recognize the type of data in a file using "magic" numbers - Pyth
Closes: 838860 852476
Changes:
file (1:5.29-3) unstable; urgency=medium
.
* Restore full local.support-local-definitions-in-etc-magic patch.
Closes: #852476
* Include all upstream commits since the 5.29 release
* Improve detection of Flash data. Closes: #838860
Checksums-Sha1:
c52d158b6f0a0ee08270f3ba8ede2c05e64662eb 2124 file_5.29-3.dsc
75e0d09a4cebfed521d516c6123bb1eee6dc2a7c 41052 file_5.29-3.debian.tar.xz
0984aae9455e27ebd10f1cab3f016dd8cacd9080 14118 file-dbgsym_5.29-3_powerpc.deb
c2539a91298a2905acb655a1d6049755bc12fae8 6836 file_5.29-3_powerpc.buildinfo
4f84255c33e689ae9d0e9f6cc0308342b1eada77 63652 file_5.29-3_powerpc.deb
6819c81cd5c68f9274bd67874bd95a0eac3b1376 115776 libmagic-dev_5.29-3_powerpc.deb
0f073252673b476b8828bd87b14a298f7571802b 221996 libmagic-mgc_5.29-3_powerpc.deb
014942600d88a905b00b96c6d98b20daa11f2aea 152734 libmagic1-dbgsym_5.29-3_powerpc.deb
04a659f1cf0e1056069c238c8baeda66899606fd 108606 libmagic1_5.29-3_powerpc.deb
18390f355a242ef06452367abd5665549b4f114c 47860 python-magic_5.29-3_all.deb
1a23720b895815ecec411ec98a6ea0ce4f8ac591 47928 python3-magic_5.29-3_all.deb
Checksums-Sha256:
e72d2d4b53a2872f36fefe6a8bc38068ec5882ab44f59b54b8357debd9d64315 2124 file_5.29-3.dsc
0c4265ea108b6f25cc8cb742542ed013ef207ce5a66b8cdfd121b1c596d3f28d 41052 file_5.29-3.debian.tar.xz
3830dc2c1e80aa8fd0cf1d91ab659f04efea751a3d8565a77b842eb4ef1379a3 14118 file-dbgsym_5.29-3_powerpc.deb
858ad02d35d87bd88c7203e966f297a822c0a252b7f3a93398486862f0559d60 6836 file_5.29-3_powerpc.buildinfo
6c198c174949df76c7ac7602cf260d2d2373d62c3796b307f12ebfe6cbc27a07 63652 file_5.29-3_powerpc.deb
34bfdac21f777056aaf53aa66d907a5d03790407ac7c618dd846c0e693f38bf7 115776 libmagic-dev_5.29-3_powerpc.deb
8591bca019ba4dbd1cb35f0b6a84449ffc64cacb0c611efb54dae5539cf1fc80 221996 libmagic-mgc_5.29-3_powerpc.deb
141355d450901acf42afce535e7eff847a280321cb72a0185044500a291841d5 152734 libmagic1-dbgsym_5.29-3_powerpc.deb
4dfdf2f90c42ae7fdd22ab6fe815afeb03286e6805167a967a4be90428f25354 108606 libmagic1_5.29-3_powerpc.deb
5cb59dc176157d6f9dd88585a81d369317ac381e86f44052821d911207b6dd33 47860 python-magic_5.29-3_all.deb
e88e03ac9445a52ef2bf14fec160380752708a94bcc36e1847b26bcfc798913b 47928 python3-magic_5.29-3_all.deb
Files:
677843489a933c3207832bf5f6cd718d 2124 utils standard file_5.29-3.dsc
cc72c748d7e75ac55d87bde9da4b50e7 41052 utils standard file_5.29-3.debian.tar.xz
b35d1ffde9f796941217f0ea30d69ed0 14118 debug extra file-dbgsym_5.29-3_powerpc.deb
53d0918ec98a6be81dd5779227f98ab9 6836 utils standard file_5.29-3_powerpc.buildinfo
6cf834d69b450dbd9420f5a118188db3 63652 utils standard file_5.29-3_powerpc.deb
e8026b2faa0f309d86171d281e7c2132 115776 libdevel optional libmagic-dev_5.29-3_powerpc.deb
245eb1eeb1b453c465172315e0b1364b 221996 libs standard libmagic-mgc_5.29-3_powerpc.deb
a6738ecda11706f8db8d5e48ba7834b2 152734 debug extra libmagic1-dbgsym_5.29-3_powerpc.deb
d1c62b8334731f087adf8ce19fd5c9ca 108606 libs standard libmagic1_5.29-3_powerpc.deb
3a4a78ddba1da926b543358185deb18a 47860 python optional python-magic_5.29-3_all.deb
00a9c345ef8145f0af7b794619d9c195 47928 python optional python3-magic_5.29-3_all.deb
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAEBCgAGBQJYiU5qAAoJEMQsWOtZFJL9XhoQANHZp7zfXFuMgK8wX8FrFbpJ
M7tj4LEcZ3K9P3gT/ALZLmqSDf6DrqUnBzHIlTc+hZI7OfFJ16HKfleBlnFX7L1G
RK1fLdn5I16oPLOYfF+nt+UN7gqFlCenI0W3SFlLR5uvylA+FcZ9eONsChONIOCi
uIieN4PxrHSqdVepqL9uyW0t5koEgQGTFAuW9/LJE9PlBN9ygxZTzXn9CzcW8bRi
1MnSqiZ/ZknXWwTSULuc1E/D/k5hm29+sRXPaIwSaVuGbXcYOoOsWkWWLn9IYMrh
hcvxX6kyXv/ZvVhZJEmit6PxuHnMIdlw7ppNBcV9Jtuw1jMVlyJj3oP4I6e/B0zY
057VefYnIlfm5Jl4Uv36v9O7/kYSEYwudCruWz4uwXLpKoagpWNApez2MZQnkxGx
WwjLkG09AqzW+SOHhfuIjceTWdphRCOiLOwOil2BMs2Y/vlvTckmMZXpF1ugt2NA
hzbFSnP4f+tII/hjV4vlxVn3nJECoVOtpJa+zOGutK3EOPCsPCTGV50zvsArDf3D
p7eF/KFdz8p8abkwSWZBuVAWT4MwZR8xwSfRUE9oxCyMXnS5y0YJYGn1oQUG9ORf
3gKUs21ttI2LzZwr8LpFgZ6WUYb33h32RsPYskEGZASjEV0lZn7pgDR4WYfq3hUw
m+1qkXJOLurh7g2kfIMf
=gym/
-----END PGP SIGNATURE-----
Bug archived.
Request was from Debbugs Internal Request <owner@bugs.debian.org>
to internal_control@bugs.debian.org.
(Thu, 23 Feb 2017 07:34:21 GMT)