Hello, I have a CentOS 7.9.2009 system with the 'file' command at version 5.11. Because this version is very old, file --mime-type does not recognise docx files correctly. It reports the MIME type as 'application/msword', which is wrong; it should be 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'.
I tried updating the /etc/magic file and also put a magic file from another openSUSE installation under $HOME/magic, but I get a lot of errors.
How can I get up-to-date file/magic handling on this distribution?
How to get new file magic file?
Re: How to get new file magic file?
The standard answer is that one either uses the packages that the distro provides or uses a different distro.
While Red Hat does backport features to Enterprise Linux (https://access.redhat.com/solutions/57665), RHEL 7 (and hence CentOS 7) was released in 2014, no longer receives feature updates, and will reach end of life "soon" (June 2024).
Note that man file mentions $HOME/.magic, not $HOME/magic.
The section in EL9's version of magic that seems to be about msooxml:
#------------------------------------------------------------------------------
# $File: msooxml,v 1.13 2019/11/27 13:12:55 christos Exp $
# msooxml: file(1) magic for Microsoft Office XML
# From: Ralf Brown <ralf.brown@gmail.com>
# .docx, .pptx, and .xlsx are XML plus other files inside a ZIP
# archive. The first member file is normally "[Content_Types].xml".
# but some libreoffice generated files put this later. Perhaps skip
# the "[Content_Types].xml" test?
# Since MSOOXML doesn't have anything like the uncompressed "mimetype"
# file of ePub or OpenDocument, we'll have to scan for a filename
# which can distinguish between the three types
0 name msooxml
>0 string word/ Microsoft Word 2007+
!:mime application/vnd.openxmlformats-officedocument.wordprocessingml.document
>0 string ppt/ Microsoft PowerPoint 2007+
!:mime application/vnd.openxmlformats-officedocument.presentationml.presentation
>0 string xl/ Microsoft Excel 2007+
!:mime application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
>0 string visio/ Microsoft Visio 2013+
!:mime application/vnd.ms-visio.drawing.main+xml
# start by checking for ZIP local file header signature
0 string PK\003\004
!:strength +10
# make sure the first file is correct
>0x1E use msooxml
>0x1E regex \\[Content_Types\\]\\.xml|_rels/\\.rels|docProps
# skip to the second local file header
# since some documents include a 520-byte extra field following the file
# header, we need to scan for the next header
>>(18.l+49) search/6000 PK\003\004
# now skip to the *third* local file header; again, we need to scan due to a
# 520-byte extra field following the file header
>>>&26 search/6000 PK\003\004
# and check the subdirectory name to determine which type of OOXML
# file we have. Correct the mimetype with the registered ones:
# https://technet.microsoft.com/en-us/library/cc179224.aspx
>>>>&26 use msooxml
>>>>&26 default x
# OpenOffice/Libreoffice orders ZIP entry differently, so check the 4th file
>>>>>&26 search/6000 PK\003\004
>>>>>>&26 use msooxml
>>>>>>&26 default x Microsoft OOXML
>>>>>&26 default x Microsoft OOXML
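If updating the magic database on CentOS 7 turns out to be impractical, the same idea as the rules above can be approximated outside of file(1). The following is only a rough Python sketch, not what file(1) does internally: it reads the ZIP member list through the standard zipfile module (the central directory) instead of scanning local file headers, and it insists on a "[Content_Types].xml" member, which is stricter than the regex in the rule. The function name guess_ooxml_mime is made up for illustration; the MIME strings are copied from the magic fragment.

```python
import io
import zipfile

# MIME types copied from the msooxml magic rules quoted above.
OOXML_MIME_BY_PREFIX = {
    "word/": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "ppt/": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "xl/": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "visio/": "application/vnd.ms-visio.drawing.main+xml",
}


def guess_ooxml_mime(path_or_file):
    """Return an OOXML MIME type, or None if the input does not look like OOXML.

    Hypothetical helper: accepts a path or a seekable binary file object.
    """
    try:
        with zipfile.ZipFile(path_or_file) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        # Not a ZIP container at all, so it cannot be OOXML.
        return None

    # Assumption: a real OOXML package always carries this member somewhere.
    if "[Content_Types].xml" not in names:
        return None

    # The first member under a known subdirectory decides the type,
    # mirroring the word/, ppt/, xl/ and visio/ tests in the magic rules.
    for name in names:
        for prefix, mime in OOXML_MIME_BY_PREFIX.items():
            if name.startswith(prefix):
                return mime
    return None
```

Calling guess_ooxml_mime("report.docx") on a real Word document should return the wordprocessingml MIME type, and anything that is not a ZIP archive yields None, so the result of file --mime-type can still be used as a fallback for other formats.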