Skip to content

Commit

Permalink
Allow globs to match against a suffix of a file's extensions
Browse files Browse the repository at this point in the history
This has the effect of allowing a glob `*.html` to match the file
`foo.en.html`. For compatibility, this is only allowed with
`cabal-version: 3.0` or later; for earlier spec versions, a warning
will be generated by `cabal check` if there are files affected by this
change in behaviour.

Fixes haskell#5057. Fixes haskell#784. Closes haskell#5061.
  • Loading branch information
quasicomputational authored and 23Skidoo committed Jun 13, 2018
1 parent 185ad50 commit bc3c477
Show file tree
Hide file tree
Showing 17 changed files with 219 additions and 58 deletions.
9 changes: 9 additions & 0 deletions Cabal/ChangeLog.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,15 @@
`cxx-options`, `cpp-options` are not deduplicated anymore
([#4449](https://github.com/haskell/cabal/issues/4449)).
* Deprecated `cabal hscolour` in favour of `cabal haddock --hyperlink-source` ([#5236](https://github.com/haskell/cabal/pull/5236/)).
* With `cabal-version: 3.0`, when matching a wildcard, the
requirement for the full extension to match exactly has been
loosened. Instead, if the wildcard's extension is a suffix of the
file's extension, the file will be selected. For example,
previously `foo.en.html` would not match `*.html`, and
`foo.solaris.tar.gz` would not match `*.tar.gz`, but now both
do. This may lead to files unexpectedly being included by `sdist`;
please audit your package descriptions if you rely on this
behaviour to keep sensitive data out of distributed packages.

----

Expand Down
53 changes: 51 additions & 2 deletions Cabal/Distribution/PackageDescription/Check.hs
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ import Distribution.Types.CondTree
import Distribution.Types.ExeDependency
import Distribution.Types.UnqualComponentName
import Distribution.Utils.Generic (isAscii)
import Distribution.Verbosity
import Distribution.Version
import Language.Haskell.Extension
import System.FilePath
Expand Down Expand Up @@ -111,7 +112,7 @@ data PackageCheck =
-- quite legitimately refuse to publicly distribute packages with these
-- problems.
| PackageDistInexcusable { explanation :: String }
deriving (Eq)
deriving (Eq, Ord)

instance Show PackageCheck where
show notice = explanation notice
Expand Down Expand Up @@ -1840,7 +1841,13 @@ checkDevelopmentOnlyFlags pkg =
-- package and expects to find the package unpacked in at the given file path.
--
checkPackageFiles :: PackageDescription -> FilePath -> NoCallStackIO [PackageCheck]
checkPackageFiles pkg root = checkPackageContent checkFilesIO pkg
checkPackageFiles pkg root = do
contentChecks <- checkPackageContent checkFilesIO pkg
missingFileChecks <- checkPackageMissingFiles pkg root
-- Sort because different platforms will provide files from
-- `getDirectoryContents` in different orders, and we'd like to be
-- stable for test output.
return (sort contentChecks ++ sort missingFileChecks)
where
checkFilesIO = CheckPackageContentOps {
doesFileExist = System.doesFileExist . relative,
Expand Down Expand Up @@ -2136,6 +2143,48 @@ checkTarPath path
++ "Files with an empty name cannot be stored in a tar archive or in "
++ "standard file systems."

-- ------------------------------------------------------------
-- * Checks for missing content
-- ------------------------------------------------------------

-- | Similar to 'checkPackageContent', 'checkPackageMissingFiles' inspects
-- the files included in the package, but is primarily looking for files in
-- the working tree that may have been missed.
--
-- Because Hackage necessarily checks the uploaded tarball, it is too late to
-- check these on the server; these checks only make sense in the development
-- and package-creation environment. Hence we can use IO, rather than needing
-- to pass a 'CheckPackageContentOps' dictionary around.
checkPackageMissingFiles :: PackageDescription -> FilePath -> NoCallStackIO [PackageCheck]
checkPackageMissingFiles = checkGlobMultiDot

-- | Before Cabal 3.0, the extensions of globs had to match the file
-- exactly. This has been relaxed in 3.0 to allow matching only the
-- suffix. This warning detects when pre-3.0 package descriptions are
-- omitting files purely because of the stricter check.
checkGlobMultiDot :: PackageDescription
-> FilePath
-> NoCallStackIO [PackageCheck]
checkGlobMultiDot pkg root =
fmap concat $ for allGlobs $ \(field, dir, glob) -> do
--TODO: baked-in verbosity
results <- matchDirFileGlob' normal (specVersion pkg) (root </> dir) glob
return
[ PackageDistSuspiciousWarn $
"In '" ++ field ++ "': the pattern '" ++ glob ++ "' does not"
++ " match the file '" ++ file ++ "' because the extensions do not"
++ " exactly match (e.g., foo.en.html does not exactly match *.html)."
++ " To enable looser suffix-only matching, set 'cabal-version: 3.0' or higher."
| GlobWarnMultiDot file <- results
]
where
adjustedDataDir = if null (dataDir pkg) then "." else dataDir pkg
allGlobs = concat
[ (,,) "extra-source-files" "." <$> extraSrcFiles pkg
, (,,) "extra-doc-files" "." <$> extraDocFiles pkg
, (,,) "data-files" adjustedDataDir <$> dataFiles pkg
]

-- ------------------------------------------------------------
-- * Utils
-- ------------------------------------------------------------
Expand Down
106 changes: 77 additions & 29 deletions Cabal/Distribution/Simple/Glob.hs
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE RankNTypes #-}

Expand All @@ -14,19 +15,23 @@
-- Simple file globbing.

module Distribution.Simple.Glob (
GlobSyntaxError(..),
GlobResult(..),
globMatches,
matchFileGlob,
matchDirFileGlob,
matchDirFileGlob',
fileGlobMatches,
parseFileGlob,
explainGlobSyntaxError,
GlobSyntaxError(..),
Glob,
) where

import Prelude ()
import Distribution.Compat.Prelude

import Control.Monad (guard)

import Distribution.Simple.Utils
import Distribution.Verbosity
import Distribution.Version
Expand All @@ -39,6 +44,21 @@ import System.FilePath (joinPath, splitExtensions, splitDirectories, takeFileNam
-- slash and backslash as its path separators, if we left in the
-- separators from the glob we might not end up properly normalised.

data GlobResult a
= GlobMatch a
-- ^ The glob matched the value supplied.
| GlobWarnMultiDot a
-- ^ The glob did not match the value supplied because the
-- cabal-version is too low and the extensions on the file did
-- not precisely match the glob's extensions, but rather the
-- glob was a proper suffix of the file's extensions; i.e., if
-- not for the low cabal-version, it would have matched.
deriving (Show, Eq, Ord, Functor)

-- | Extract the matches from a list of 'GlobResult's.
globMatches :: [GlobResult a] -> [a]
globMatches input = [ a | GlobMatch a <- input ]

data GlobSyntaxError
= StarInDirectory
| StarInFileName
Expand Down Expand Up @@ -86,35 +106,59 @@ explainGlobSyntaxError filepath VersionDoesNotSupportGlob =

data IsRecursive = Recursive | NonRecursive

data MultiDot = MultiDotDisabled | MultiDotEnabled

data Glob
= GlobStem String Glob
= GlobStem FilePath Glob
-- ^ A single subdirectory component + remainder.
| GlobFinal GlobFinal

data GlobFinal
= FinalMatch IsRecursive String
= FinalMatch IsRecursive MultiDot String
-- ^ First argument: Is this a @**/*.ext@ pattern?
-- Second argument: the extensions to accept.
-- Second argument: should we match against the exact extensions, or accept a suffix?
-- Third argument: the extensions to accept.
| FinalLit FilePath
-- ^ Literal file name.

fileGlobMatches :: Glob -> FilePath -> Bool
fileGlobMatches pat = fileGlobMatchesSegments pat . splitDirectories
-- | Returns 'Nothing' if the glob didn't match at all, or 'Just' the
-- result if the glob matched (or would have matched with a higher
-- cabal-version).
fileGlobMatches :: Glob -> FilePath -> Maybe (GlobResult FilePath)
fileGlobMatches pat candidate = do
match <- fileGlobMatchesSegments pat (splitDirectories candidate)
return (candidate <$ match)

fileGlobMatchesSegments :: Glob -> [FilePath] -> Bool
fileGlobMatchesSegments _ [] = False
fileGlobMatchesSegments :: Glob -> [FilePath] -> Maybe (GlobResult ())
fileGlobMatchesSegments _ [] = Nothing
fileGlobMatchesSegments pat (seg : segs) = case pat of
GlobStem dir pat' ->
dir == seg && fileGlobMatchesSegments pat' segs
GlobStem dir pat' -> do
guard (dir == seg)
fileGlobMatchesSegments pat' segs
GlobFinal final -> case final of
FinalMatch Recursive ext ->
FinalMatch Recursive multidot ext -> do
let (candidateBase, candidateExts) = splitExtensions (last $ seg:segs)
in ext == candidateExts && not (null candidateBase)
FinalMatch NonRecursive ext ->
guard (not (null candidateBase))
checkExt multidot ext candidateExts
FinalMatch NonRecursive multidot ext -> do
let (candidateBase, candidateExts) = splitExtensions seg
in null segs && ext == candidateExts && not (null candidateBase)
FinalLit filename ->
null segs && filename == seg
guard (null segs && not (null candidateBase))
checkExt multidot ext candidateExts
FinalLit filename -> do
guard (null segs && filename == seg)
return (GlobMatch ())

checkExt
:: MultiDot
-> String -- ^ The pattern's extension
-> String -- ^ The candidate file's extension
-> Maybe (GlobResult ())
checkExt multidot ext candidate
| ext == candidate = Just (GlobMatch ())
| ext `isSuffixOf` candidate = case multidot of
MultiDotDisabled -> Just (GlobWarnMultiDot ())
MultiDotEnabled -> Just (GlobMatch ())
| otherwise = Nothing

parseFileGlob :: Version -> FilePath -> Either GlobSyntaxError Glob
parseFileGlob version filepath = case reverse (splitDirectories filepath) of
Expand All @@ -127,14 +171,14 @@ parseFileGlob version filepath = case reverse (splitDirectories filepath) of
| null ext -> Left NoExtensionOnStar
| otherwise -> Right ext
_ -> Left LiteralFileNameGlobStar
foldM addStem (GlobFinal $ FinalMatch Recursive ext) segments
foldM addStem (GlobFinal $ FinalMatch Recursive multidot ext) segments
| otherwise -> Left VersionDoesNotSupportGlobStar
(filename : segments) -> do
pat <- case splitExtensions filename of
("*", ext) | not allowGlob -> Left VersionDoesNotSupportGlob
| '*' `elem` ext -> Left StarInExtension
| null ext -> Left NoExtensionOnStar
| otherwise -> Right (FinalMatch NonRecursive ext)
| otherwise -> Right (FinalMatch NonRecursive multidot ext)
(_, ext) | '*' `elem` ext -> Left StarInExtension
| '*' `elem` filename -> Left StarInFileName
| otherwise -> Right (FinalLit filename)
Expand All @@ -145,24 +189,27 @@ parseFileGlob version filepath = case reverse (splitDirectories filepath) of
addStem pat seg
| '*' `elem` seg = Left StarInDirectory
| otherwise = Right (GlobStem seg pat)
multidot
| version >= mkVersion [3,0] = MultiDotEnabled
| otherwise = MultiDotDisabled

matchFileGlob :: Verbosity -> Version -> FilePath -> IO [FilePath]
matchFileGlob :: Verbosity -> Version -> FilePath -> IO [GlobResult FilePath]
matchFileGlob verbosity version = matchDirFileGlob verbosity version "."

-- | Like 'matchDirFileGlob'', but will 'die'' when the glob matches
-- no files.
matchDirFileGlob :: Verbosity -> Version -> FilePath -> FilePath -> IO [FilePath]
matchDirFileGlob :: Verbosity -> Version -> FilePath -> FilePath -> IO [GlobResult FilePath]
matchDirFileGlob verbosity version dir filepath = do
matches <- matchDirFileGlob' verbosity version dir filepath
when (null matches) $ die' verbosity $
when (null $ globMatches matches) $ die' verbosity $
"filepath wildcard '" ++ filepath
++ "' does not match any files."
return matches

-- | Match files against a glob, starting in a directory.
--
-- The returned values do not include the supplied @dir@ prefix.
matchDirFileGlob' :: Verbosity -> Version -> FilePath -> FilePath -> IO [FilePath]
matchDirFileGlob' :: Verbosity -> Version -> FilePath -> FilePath -> IO [GlobResult FilePath]
matchDirFileGlob' verbosity version rawDir filepath = case parseFileGlob version filepath of
Left err -> die' verbosity $ explainGlobSyntaxError filepath err
Right pat -> do
Expand All @@ -185,20 +232,21 @@ matchDirFileGlob' verbosity version rawDir filepath = case parseFileGlob version
-- literal.
let (prefixSegments, final) = splitConstantPrefix pat
joinedPrefix = joinPath prefixSegments
files <- case final of
FinalMatch recursive exts -> do
case final of
FinalMatch recursive multidot exts -> do
let prefix = dir </> joinedPrefix
candidates <- case recursive of
Recursive -> getDirectoryContentsRecursive prefix
NonRecursive -> filterM (doesFileExist . (prefix </>)) =<< getDirectoryContents prefix
let checkName candidate =
let checkName candidate = do
let (candidateBase, candidateExts) = splitExtensions $ takeFileName candidate
in not (null candidateBase) && exts == candidateExts
return $ filter checkName candidates
guard (not (null candidateBase))
match <- checkExt multidot exts candidateExts
return (joinedPrefix </> candidate <$ match)
return $ mapMaybe checkName candidates
FinalLit fn -> do
exists <- doesFileExist (dir </> joinedPrefix </> fn)
return [ fn | exists ]
return $ fmap (joinedPrefix </>) files
return [ GlobMatch (joinedPrefix </> fn) | exists ]

unfoldr' :: (a -> Either r (b, a)) -> a -> ([b], r)
unfoldr' f a = case f a of
Expand Down
2 changes: 1 addition & 1 deletion Cabal/Distribution/Simple/Haddock.hs
Original file line number Diff line number Diff line change
Expand Up @@ -266,7 +266,7 @@ haddock pkg_descr lbi suffixes flags' = do
CBench _ -> when (flag haddockBenchmarks) $ smsg >> doExe component

for_ (extraDocFiles pkg_descr) $ \ fpath -> do
files <- matchFileGlob verbosity (specVersion pkg_descr) fpath
files <- fmap globMatches $ matchFileGlob verbosity (specVersion pkg_descr) fpath
for_ files $ copyFileTo verbosity (unDir $ argOutputDir commonArgs)

-- ------------------------------------------------------------------------------
Expand Down
4 changes: 2 additions & 2 deletions Cabal/Distribution/Simple/Install.hs
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ import Distribution.Package
import Distribution.PackageDescription
import Distribution.Simple.LocalBuildInfo
import Distribution.Simple.BuildPaths (haddockName, haddockPref)
import Distribution.Simple.Glob (matchDirFileGlob)
import Distribution.Simple.Glob (matchDirFileGlob, globMatches)
import Distribution.Simple.Utils
( createDirectoryIfMissingVerbose
, installDirectoryContents, installOrdinaryFile, isInSearchPath
Expand Down Expand Up @@ -238,7 +238,7 @@ installDataFiles verbosity pkg_descr destDataDir =
srcDataDir = if null srcDataDirRaw
then "."
else srcDataDirRaw
files <- matchDirFileGlob verbosity (specVersion pkg_descr) srcDataDir file
files <- globMatches <$> matchDirFileGlob verbosity (specVersion pkg_descr) srcDataDir file
let dir = takeDirectory file
createDirectoryIfMissingVerbose verbosity True (destDataDir </> dir)
sequence_ [ installOrdinaryFile verbosity (srcDataDir </> file')
Expand Down
10 changes: 7 additions & 3 deletions Cabal/Distribution/Simple/SrcDist.hs
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,9 @@ listPackageSources verbosity pkg_descr0 pps = do
listPackageSourcesMaybeExecutable :: Verbosity -> PackageDescription -> IO [FilePath]
listPackageSourcesMaybeExecutable verbosity pkg_descr =
-- Extra source files.
fmap concat . for (extraSrcFiles pkg_descr) $ \fpath -> matchFileGlob verbosity (specVersion pkg_descr) fpath
fmap concat . for (extraSrcFiles pkg_descr) $ \fpath ->
fmap globMatches $
matchFileGlob verbosity (specVersion pkg_descr) fpath

-- | List those source files that should be copied with ordinary permissions.
listPackageSourcesOrdinary :: Verbosity
Expand Down Expand Up @@ -214,12 +216,14 @@ listPackageSourcesOrdinary verbosity pkg_descr pps =
then "."
else srcDataDirRaw
in fmap (fmap (srcDataDir </>)) $
matchDirFileGlob verbosity (specVersion pkg_descr) srcDataDir filename
fmap globMatches $
matchDirFileGlob verbosity (specVersion pkg_descr) srcDataDir filename

-- Extra doc files.
, fmap concat
. for (extraDocFiles pkg_descr) $ \ filename ->
matchFileGlob verbosity (specVersion pkg_descr) filename
fmap globMatches $
matchFileGlob verbosity (specVersion pkg_descr) filename

-- License file(s).
, return (licenseFiles pkg_descr)
Expand Down
14 changes: 9 additions & 5 deletions Cabal/doc/developing-packages.rst
Original file line number Diff line number Diff line change
Expand Up @@ -990,11 +990,15 @@ describe the package as a whole:
- ``*`` wildcards are only allowed in place of the file name, not
in the directory name or file extension. It must replace the
whole file name (e.g., ``*.html`` is allowed, but
``chapter-*.html`` is not). Furthermore, if a wildcard is used
it must be used with an extension, so ``data-files: data/*`` is
not allowed. When matching a wildcard plus extension, a file's
full extension must match exactly, so ``*.gz`` matches
``foo.gz`` but not ``foo.tar.gz``.
``chapter-*.html`` is not). If a wildcard is used, it must be
used with an extension, so ``data-files: data/*`` is not
allowed.

- Prior to Cabal 3.0, when matching a wildcard plus extension, a
file's full extension must match exactly, so ``*.gz`` matches
``foo.gz`` but not ``foo.tar.gz``. This restriction has been
lifted when ``cabal-version: 3.0`` or greater so that ``*.gz``
does match ``foo.tar.gz``

- ``*`` wildcards will not match if the file name is empty (e.g.,
``*.html`` will not match ``foo/.html``).
Expand Down
Loading

0 comments on commit bc3c477

Please sign in to comment.