[#64] Add annotations for the copypaste check #246

Open
wants to merge 3 commits into
base: YuriRomanowski/#64-Implement-copy-paste-protection

4 changes: 3 additions & 1 deletion CHANGES.md
@@ -36,11 +36,13 @@ Unreleased
    + Now we call references to anchors in current file (e.g. `[a](#b)`) as
      `file-local` references instead of calling them `current file` (which was ambiguous).
  * [#233](https://github.com/serokell/xrefcheck/pull/233)
-   + Now xrefxcheck does not follow redirect links by default. It fails for permanent

Review comment (Contributor): 👍

+   + Now xrefcheck does not follow redirect links by default. It fails for permanent
      redirect responses (i.e. 301 and 308) and passes for temporary ones (i.e. 302, 303, 307).
  * [#231](https://github.com/serokell/xrefcheck/pull/231)
    + Anchor analysis takes now into account the appropriate case-sensitivity depending on
      the configured Markdown flavour.
+ * [#240](https://github.com/serokell/xrefcheck/pull/240)
+   + Now xrefcheck is able to detect possible copy-pastes relying on links and their names.

  0.2.2
  ==========
16 changes: 16 additions & 0 deletions README.md
@@ -45,6 +45,7 @@ Comparing to alternative solutions, this tool tries to achieve the following poi
  * Supports external links (`http`, `https`, `ftp` and `ftps`).
  * Detects broken and ambiguous anchors in local links.
  * Integration with GitHub Actions.
+ * Detects possible bad copy-pastes of links.

  ## Dependencies [↑](#xrefcheck)

@@ -148,6 +149,21 @@ There are several ways to fix this:
      * By default, `xrefcheck` will ignore links to localhost.
      * This behavior can be disabled by removing the corresponding entry from the `ignoreExternalRefsTo` list in the config file.

+ 1. How do I disable copy-paste check for specific links?
+     * Add a `<!-- xrefcheck: no duplication check in link -->` annotation before the link:
+       ```md
+       <!-- xrefcheck: no duplication check in link -->
+       Links with bad copypaste:
+       [good link](https://good.link.uri/).
+       [copypasted link](https://good.link.uri/).
+       ```
+       ```md
+       A [good link](https://good.link.uri/)
+       followed by an <!-- xrefcheck: no duplication check in link --> [copypasted intentionally](https://good.link.uri/).
+       ```
+     * You can use a `<!-- xrefcheck: no duplication check in paragraph -->` annotation to disable copy-paste check in a paragraph.
+     * You can use a `<!-- xrefcheck: no duplication check in file -->` annotation at the top of the file to disable copy-paste check within an entire file.
+

  ## Further work [↑](#xrefcheck)

  - [ ] Support link detection in different languages, not only Markdown.
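
Note on the README hunk above: the three annotations differ only in their final word, so the scope could be recognised with a single prefix match. The following is a minimal sketch under that assumption, not the parser used in this PR; `CpcScope` and `parseCpcAnnotation` are hypothetical names, and the input is assumed to be the comment body with `<!--`, `-->` and surrounding whitespace already removed.

```haskell
import Data.List (stripPrefix)

-- | Scope of a "no duplication check" annotation.
-- Hypothetical type, not taken from the xrefcheck sources.
data CpcScope = CpcLink | CpcParagraph | CpcFile
  deriving (Show, Eq)

-- | Recognise the body of an annotation comment.
-- Assumes the comment delimiters and outer whitespace are already stripped.
parseCpcAnnotation :: String -> Maybe CpcScope
parseCpcAnnotation txt = do
  -- Fails with Nothing when the common prefix is absent.
  rest <- stripPrefix "xrefcheck: no duplication check in " txt
  case rest of
    "link"      -> Just CpcLink
    "paragraph" -> Just CpcParagraph
    "file"      -> Just CpcFile
    _           -> Nothing
```

Anything that is not one of the three documented forms, including the existing `ignore ...` annotations, yields `Nothing` in this sketch and would have to be handled elsewhere.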
25 changes: 15 additions & 10 deletions src/Xrefcheck/Core.hs
@@ -12,6 +12,7 @@ module Xrefcheck.Core where
  import Universum

  import Control.Lens (makeLenses)
+ import Control.Lens.Combinators (makeLensesWith)
  import Data.Aeson (FromJSON (..), withText)
  import Data.Char (isAlphaNum)
  import Data.Char qualified as C
@@ -70,14 +71,17 @@ instance Given ColorMode => Buildable Position where

  -- | Full info about a reference.
  data Reference = Reference
-   { rName   :: Text
+   { rName           :: Text
      -- ^ Text displayed as reference.
-   , rLink   :: Text
+   , rLink           :: Text
      -- ^ File or site reference points to.
-   , rAnchor :: Maybe Text
+   , rAnchor         :: Maybe Text
      -- ^ Section or custom anchor tag.
-   , rPos    :: Position
+   , rPos            :: Position
+   , rCheckCopyPaste :: Bool
+     -- ^ Whether to check bad copy/paste for this link
    } deriving stock (Show, Generic, Eq, Ord)
+ makeLensesWith postfixFields ''Reference

  -- | Context of anchor.
  data AnchorType
@@ -102,9 +106,9 @@ data FileInfoDiff = FileInfoDiff
    }
  makeLenses ''FileInfoDiff

- diffToFileInfo :: FileInfoDiff -> FileInfo
- diffToFileInfo (FileInfoDiff refs anchors) =
-   FileInfo (DList.toList refs) (DList.toList anchors)
+ diffToFileInfo :: Bool -> FileInfoDiff -> FileInfo
+ diffToFileInfo cpcEnabledInFile (FileInfoDiff refs anchors) =
+   FileInfo (DList.toList refs) (DList.toList anchors) cpcEnabledInFile

  instance Semigroup FileInfoDiff where
    FileInfoDiff a b <> FileInfoDiff c d = FileInfoDiff (a <> c) (b <> d)
@@ -114,13 +118,14 @@ instance Monoid FileInfoDiff where

  -- | All information regarding a single file we care about.
  data FileInfo = FileInfo
-   { _fiReferences :: [Reference]
-   , _fiAnchors :: [Anchor]
+   { _fiReferences     :: [Reference]
+   , _fiAnchors        :: [Anchor]
+   , _fiCopyPasteCheck :: Bool
    } deriving stock (Show, Generic)
  makeLenses ''FileInfo

  instance Default FileInfo where
-   def = diffToFileInfo mempty
+   def = diffToFileInfo True mempty

  data ScanPolicy
    = OnlyTracked
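
Note on the Core.hs changes above: the diff adds one flag per reference (`rCheckCopyPaste`) and one per file (`_fiCopyPasteCheck`), but this part of the PR does not show where they are consumed. One plausible reading is that the file-level flag overrides the per-link ones. The sketch below illustrates that reading with simplified stand-in records (plain `String` fields, no lenses); `refsToCheck` is a hypothetical helper, not a function from the PR.

```haskell
-- Simplified stand-ins for the records in the diff; only the fields
-- relevant to the copy-paste check are kept.
data Reference = Reference
  { rName           :: String
  , rLink           :: String
  , rCheckCopyPaste :: Bool
  } deriving (Show, Eq)

data FileInfo = FileInfo
  { fiReferences     :: [Reference]
  , fiCopyPasteCheck :: Bool
  } deriving (Show)

-- | References that remain subject to the copy-paste check
-- (hypothetical helper). If the file opted out, nothing is checked;
-- otherwise only references whose own flag is still set are kept.
refsToCheck :: FileInfo -> [Reference]
refsToCheck fi
  | fiCopyPasteCheck fi = filter rCheckCopyPaste (fiReferences fi)
  | otherwise           = []
```

Under this reading, the `def = diffToFileInfo True mempty` default keeps the check enabled for files that carry no annotation.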
13 changes: 11 additions & 2 deletions src/Xrefcheck/Scan.hs
@@ -117,18 +117,27 @@ data ScanErrorDescription
    = LinkErr
    | FileErr
    | ParagraphErr Text
+   | LinkErrCpc
+   | FileErrCpc
+   | ParagraphErrCpc Text
    | UnrecognisedErr Text
    deriving stock (Show, Eq)

  instance Buildable ScanErrorDescription where
    build = \case
      LinkErr -> [int||Expected a LINK after "ignore link" annotation|]
+     LinkErrCpc -> [int||Expected a LINK after "no duplication check in link" annotation|]
      FileErr -> [int||Annotation "ignore all" must be at the top of \
        markdown or right after comments at the top|]
+     FileErrCpc -> [int||Annotation "no duplication check in file" must be at the top of \
+       markdown or right after comments at the top|]
      ParagraphErr txt -> [int||Expected a PARAGRAPH after \
        "ignore paragraph" annotation, but found #{txt}|]
-     UnrecognisedErr txt -> [int||Unrecognised option "#{txt}" perhaps you meant \
-       <"ignore link"|"ignore paragraph"|"ignore all">|]
+     ParagraphErrCpc txt -> [int||Expected a PARAGRAPH after \
+       "no duplication check in paragraph" annotation, but found #{txt}|]
+     UnrecognisedErr txt -> [int||Unrecognised option "#{txt}", perhaps you meant
+       "ignore <link|paragraph|all>"
+       or "no duplication check in <link|paragraph|file>"?|]

Review comment (Contributor):

The two kinds of annotation suggestions look a bit different. Maybe something like this would be better:

"ignore <link|paragraph|all>"
or "no duplication check in <link|paragraph|file>"

or

<"ignore link"|"ignore paragraph"|"ignore all">
or <"no duplication check in link"|"no duplication check in paragraph"|"no duplication check in file">


  specificFormatsSupport :: [([Extension], ScanAction)] -> FormatsSupport
  specificFormatsSupport formats = \ext -> M.lookup ext formatsMap
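
Note on the review comment above: one way to keep the two suggestion families in the same shape is to render both from data with a single function, so they cannot drift apart when an option is added. This is a minimal sketch of the first format proposed in the comment; `renderFamily` and `suggestionHint` are hypothetical names and are not part of the PR.

```haskell
import Data.List (intercalate)

-- | Render one family of annotations as a quoted @prefix <a|b|c>@ string.
renderFamily :: String -> [String] -> String
renderFamily prefix opts =
  "\"" ++ prefix ++ " <" ++ intercalate "|" opts ++ ">\""

-- | Hint that could follow "perhaps you meant" in the
-- "Unrecognised option" error (hypothetical helper).
suggestionHint :: String
suggestionHint = intercalate " or "
  [ renderFamily "ignore" ["link", "paragraph", "all"]
  , renderFamily "no duplication check in" ["link", "paragraph", "file"]
  ]
```

Here `suggestionHint` evaluates to `"ignore <link|paragraph|all>" or "no duplication check in <link|paragraph|file>"`, quotes included.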