FunctionBodyLengthRule should not count comment lines #330
let kindsInRange = syntax.tokens.filter { token in
let tokenLine = contents.lineAndCharacterForByteOffset(token.offset)
return tokenLine?.line == line.index
}.map({ $0.type }).flatMap(SyntaxKind.init)
indentation is off here
Thanks for doing this @marcelofabri, this makes the rule much more useful! I was a bit worried about the impact this would have on performance, and turns out it's a fairly large one 😬. As a reference point, the With this PR, it takes 46.092s 😱. By changing the
public func syntaxKindsByLine(startLine: Int? = nil,
endLine: Int? = nil) -> [(Int, [SyntaxKind])] {
let contents = self.contents as NSString
let linesAndKinds = syntaxMap.tokens.map { token -> (Int, SyntaxKind) in
let tokenLine = contents.lineAndCharacterForByteOffset(token.offset)
return (tokenLine!.line, SyntaxKind(rawValue: token.type)!)
}.filter { line, _ in
return line >= (startLine ?? 0) && line <= (endLine ?? Int.max)
}
var results = [Int: [SyntaxKind]]()
for lineAndKind in linesAndKinds {
results[lineAndKind.0] = (results[lineAndKind.0] ?? []) + [lineAndKind.1]
}
return Array(zip(results.keys, results.values))
}
However, I still think that's too much of a performance regression, especially for larger projects. One thing we could do to minimize the performance impact here would be to only count the number of comment-only lines after confirming that the function exceeds the maximum line count when accounting for all lines, which is much faster. For example, in
where endLine - startLine > limit && lineCount(file, startLine: startLine, endLine: endLine) > limit {
Actually we could move that to its own function:
...
where exceedsLineCountExcludingComments(file, startLine, endLine, limit)
...
private func exceedsLineCountExcludingComments(file: File, _ start: Int, _ end: Int,
_ limit: Int) -> Bool {
return end - start > limit && lineCount(file, startLine: start, endLine: end) > limit
}
In this case, SwiftLint goes back to linting in 2.866s, although that's not exactly a great benchmark now because SwiftLint's functions never exceed 40 lines, so the benchmark doesn't hit the slow path. Could you please apply these changes to help reduce the performance impact? If you find different ways to achieve similar performance gains, I'm certainly open to that too!
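The two-stage check described above can be sketched as follows. This is only an illustration in current Swift syntax, not the code from this PR: `isCommentOnly` and `exceedsLimitExcludingComments` are hypothetical helpers, and a line is treated as comment-only when it starts with `//`, whereas the real rule relies on SourceKit syntax tokens.

```swift
// Hypothetical stand-in for the token-based check: a line is treated as
// comment-only if, after leading whitespace, it starts with "//".
func isCommentOnly(_ line: String) -> Bool {
    let trimmed = line.drop(while: { $0 == " " || $0 == "\t" })
    return trimmed.hasPrefix("//")
}

// Returns true only if the body still exceeds `limit` once comment-only
// lines are left out.
func exceedsLimitExcludingComments(_ body: [String], limit: Int) -> Bool {
    // Cheap check first: if even the raw line count fits, skip the slow path.
    guard body.count > limit else { return false }
    // Slow path: inspect every line, but only for functions that might violate.
    let counted = body.filter { !isCommentOnly($0) }.count
    return counted > limit
}
```

Because most functions are short, the `guard` returns early for nearly every function and the per-line work is paid only by candidates for a violation.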
@jpsim I think that if we called Anyway, I have committed the changes you proposed. I was worried about performance too 😬 I know that it deserves another issue, but we should think about adding performance tests to the project, even linting other projects.
By using the associated objects approach, I was able to make it run in ~ 3.401s (without the I believe it could be better if I could drop the conversion needed to store/read the associated object:
private var syntaxKindsByLines: [(Int, [SyntaxKind])] {
if let lines = objc_getAssociatedObject(self, &keyLines) as? [Int],
syntaxKinds = objc_getAssociatedObject(self, &keySyntaxKinds) as? [[String]] {
return Array(zip(lines, syntaxKinds.map { $0.flatMap(SyntaxKind.init) }))
}
let newValue = syntaxKindsByLine()
let lines = newValue.map { $0.0 }
let kinds = newValue.map { $0.1.map { $0.rawValue } }
objc_setAssociatedObject(self, &keyLines, lines,
.OBJC_ASSOCIATION_RETAIN_NONATOMIC)
objc_setAssociatedObject(self, &keySyntaxKinds, kinds,
.OBJC_ASSOCIATION_RETAIN_NONATOMIC)
return newValue
}
However, this made the tests actually slower with the shortcut (because I called
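For comparison, here is a minimal sketch of the same compute-once idea without Objective-C associated objects, so no conversion is needed to store or read the value. It is an assumption-laden illustration in current Swift syntax: `LineKindCache`, its string key, and the `[Int: [String]]` value shape are all hypothetical, and the class is not thread-safe.

```swift
// Hypothetical side cache keyed by an identifier for the file (e.g. its path).
// Values map a line number to the syntax kind names found on that line.
final class LineKindCache {
    private var storage: [String: [Int: [String]]] = [:]
    // Number of times the expensive computation actually ran.
    private(set) var computeCount = 0

    func kinds(forFile id: String, compute: () -> [Int: [String]]) -> [Int: [String]] {
        // Return the stored value if this file was already processed.
        if let cached = storage[id] { return cached }
        computeCount += 1
        let value = compute()
        storage[id] = value
        return value
    }
}
```

The trade-off is that the cache has to be owned and invalidated by the caller, whereas associated objects live and die with the file object itself.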
The number of line-counting calls can be reduced by moving the count out of the parameter enumeration, from:
// enumerate parameters
for parameter in parameters.reverse() {
// count lines
let lineCountExcludingComments =…
// check violation
if lineCountExcludingComments > parameter.value {
…
}
}
to:
// count lines
let lineCountExcludingComments =…
// enumerate parameters
for parameter in parameters.reverse() {
// check violation
if lineCountExcludingComments > parameter.value {
…
}
}
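A self-contained version of the second form, under stated assumptions: `violatedSeverity`, the `(name, limit)` threshold tuples, and the `countLines` closure are hypothetical names used only to show that the count is computed once and then compared against each threshold. It uses current Swift syntax rather than the Swift 2 used in this thread.

```swift
// Returns the name of the highest threshold that the body exceeds, or nil.
// `thresholds` is assumed to be sorted from lowest to highest limit.
func violatedSeverity(body: [String],
                      thresholds: [(name: String, limit: Int)],
                      countLines: ([String]) -> Int) -> String? {
    // Count once, outside the loop, however many thresholds there are.
    let count = countLines(body)
    // Walk the thresholds from highest to lowest and stop at the first hit.
    for threshold in thresholds.reversed() where count > threshold.limit {
        return threshold.name
    }
    return nil
}
```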
let contents = self.contents as NSString
let kindsWithLines = syntaxMap.tokens.map { token -> (Int, SyntaxKind) in
let tokenLine = contents.lineAndCharacterForByteOffset(token.offset)
return (tokenLine!.line, SyntaxKind(rawValue: token.type)!)
SyntaxKinds for the whole file are created here every time a function in the file is checked. .filter should be applied before they are created, to reduce how many are built.
This is because currently .filter needs the line. If we filtered first, lineAndCharacterForByteOffset would be called twice.
@marcelofabri it might be better to avoid eagerly creating the SyntaxKinds by mapping twice:
let kindsWithLines = syntaxMap.tokens.map { token -> (Int, SyntaxToken) in
let tokenLine = contents.lineAndCharacterForByteOffset(token.offset)
return (tokenLine!.line, token)
}.filter { line, token in
return line >= (startLine ?? 0) && line <= (endLine ?? Int.max)
}.map { (line, token) -> (Int, SyntaxKind) in
return (line, SyntaxKind(rawValue: token.type)!)
}
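As a variation on the same idea, the filter-then-convert order can also be expressed with a lazy chain, which avoids building intermediate arrays. This is a sketch only, in current Swift syntax: `Token` and `Kind` are hypothetical stand-ins for the SourceKitten types, the line number is assumed to be already known for each token, and unknown kinds are skipped instead of force-unwrapped.

```swift
// Hypothetical stand-ins for SyntaxToken and SyntaxKind.
struct Token {
    let line: Int
    let type: String
}

enum Kind: String {
    case comment, keyword, identifier
}

func kinds(in tokens: [Token], from start: Int, to end: Int) -> [(Int, Kind)] {
    let pairs = tokens.lazy
        // Discard tokens outside the requested range before converting anything.
        .filter { $0.line >= start && $0.line <= end }
        // Convert only the survivors; tokens with an unknown type are dropped.
        .compactMap { token -> (Int, Kind)? in
            guard let kind = Kind(rawValue: token.type) else { return nil }
            return (token.line, kind)
        }
    return Array(pairs)
}
```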
How do you recommend we do that? XCTest supports performance tests, but those only really matter when running on a given device. What we really want is to compare with running on the previous version of the test running on the exact same device, although I don't think there's any elegant way to set that up.
@jpsim I haven't thought about it a lot, but I was thinking about XCTest performance tests. I'm still not sure why it wouldn't work. Is it because Travis won't have a baseline to compare to?
XCTest performance tests compare the test output with a previously recorded baseline that's associated with a unique device identifier (in this case, the Mac on which these tests are running). This wouldn't work on Travis because they use VMs to run tests, and they're not guaranteed to have the same unique device ID. Even if/when they did, the fact that these are running on shared hardware means that we can't really trust a single run of performance tests.
I'm thinking a different way to measure performance in PRs would be to install the latest version of SwiftLint through homebrew and compare the total time to lint SwiftLint from the latest release and the current PR. Ideally Travis would be able to post back a comment to the PR with the before & after lint times for us to review. Even better would be to have a way to run SwiftLint in a benchmarking mode that logs out the time spent in each rule.
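To make the "time spent in each rule" idea concrete, here is a rough sketch. It does not describe an existing SwiftLint feature: `timeEach` is a hypothetical helper, the rule names are placeholders, and each "rule" is just a closure. It uses current Swift syntax.

```swift
import Foundation

// Runs each named closure once and reports the elapsed time per name.
// systemUptime is used as a clock that is not affected by wall-clock changes.
func timeEach(_ rules: [(name: String, run: () -> Void)]) -> [(name: String, seconds: Double)] {
    return rules.map { rule -> (name: String, seconds: Double) in
        let start = ProcessInfo.processInfo.systemUptime
        rule.run()
        let elapsed = ProcessInfo.processInfo.systemUptime - start
        return (name: rule.name, seconds: elapsed)
    }
}

// Placeholder workloads standing in for real rules.
let report = timeEach([
    (name: "function_body_length", run: { _ = (0..<1_000).reduce(0, +) }),
    (name: "line_length", run: { _ = "abc".count })
])
for entry in report {
    print("\(entry.name): \(entry.seconds)s")
}
```

As noted above, a single run on shared hardware is noisy, so numbers like these are more useful for comparing rules within one run than for comparing separate CI runs.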
@marcelofabri this is looking good. What are the next steps (if any) you want to take with this PR?
Sorry, I have been saying too much about performance. 🙇
@@ -90,6 +90,22 @@ extension File {
}
}

internal func syntaxKindsByLine(startLine: Int? = nil,
endLine: Int? = nil) -> [(Int, [SyntaxKind])] {
indentation is off here
@jpsim @norio-nomura I've added a new property on Could you guys please take another look and share your thoughts?
let longerFunctionBodyWithComments = "func abc() {" +
Repeat(count: 40, repeatedValue: " // this is a comment\n").joinWithSeparator("") +
"}\n"
XCTAssertEqual(violations(longerFunctionBodyWithComments), [])
Sorry, I had overlooked that the tests are not sufficient.
The rule should probably also pass the following tests:
let longFunctionBodyWithComments = "func abc() {" +
Repeat(count: 40, repeatedValue: "\n").joinWithSeparator("") +
"// comment only line should be ignored.\n" +
"}\n"
XCTAssertEqual(violations(longFunctionBodyWithComments), [])
let longerFunctionBodyWithComments = "func abc() {" +
Repeat(count: 41, repeatedValue: "\n").joinWithSeparator("") +
"// comment only line should be ignored.\n" +
"}\n"
XCTAssertEqual(violations(longerFunctionBodyWithComments), [StyleViolation(
ruleDescription: FunctionBodyLengthRule.description,
location: Location(file: nil, line: 1, character: 1),
reason: "Function body should span 40 lines or less: currently spans 41 lines")])
let longFunctionBodyWithMultilineComments = "func abc() {" +
Repeat(count: 40, repeatedValue: "\n").joinWithSeparator("") +
"/* multi line comment only line should be ignored.\n*/\n" +
"}\n"
XCTAssertEqual(violations(longFunctionBodyWithMultilineComments), [])
let longerFunctionBodyWithMultilineComments = "func abc() {" +
Repeat(count: 41, repeatedValue: "\n").joinWithSeparator("") +
"/* multi line comment only line should be ignored.\n*/\n" +
"}\n"
XCTAssertEqual(violations(longerFunctionBodyWithMultilineComments), [StyleViolation(
ruleDescription: FunctionBodyLengthRule.description,
location: Location(file: nil, line: 1, character: 1),
reason: "Function body should span 40 lines or less: currently spans 41 lines")])
Shouldn't longerFunctionBodyWithComments and longerFunctionBodyWithMultilineComments not trigger any violations?
Forget that, I didn't realize that the comments were outside Repeat. My bad!
Those tests actually caught a bug: the number of lines that the function spans, as reported in the reason. Should we update it to subtract the comment-only lines? This might be confusing for someone who isn't very familiar with the rule.
Also, if we follow that idea, any ideas on how to return the "fixed" line count from exceedsLineCountExcludingComments? I was thinking about returning a tuple, but I'm not sure if that's the best way.
"This might be confusing for someone who isn't very familiar with the rule."
I think it would be better to have a reason explaining that comment-only lines are ignored.
"Those tests actually caught a bug"
In my testing, SourceKitten returns a single token for one multiline comment block, but your code expects every token to sit on a single line. So the count of ignored comment-only lines is not correct for multiline comments.
We should fill in missing lines in syntaxKindsByLines, which will also help avoid counting whitespace-only lines in the function body line count.
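One way to picture "filling in missing lines" is sketched below. It is a simplified illustration in current Swift syntax, not the implementation in this PR: `SpanToken` is a hypothetical token that already knows its first and last line (the real tokens carry a byte offset and length), and kinds are plain strings.

```swift
// Hypothetical simplified token: a kind plus the first and last line it covers.
// Assumes startLine <= endLine.
struct SpanToken {
    let kind: String
    let startLine: Int
    let endLine: Int
}

// Registers a token's kind on every line it spans, so a multiline comment
// marks all of its lines instead of only the first one.
func kindsByLine(_ tokens: [SpanToken]) -> [Int: [String]] {
    var result: [Int: [String]] = [:]
    for token in tokens {
        for line in token.startLine...token.endLine {
            result[line, default: []].append(token.kind)
        }
    }
    return result
}

// Counts lines in start...end that hold at least one non-comment token.
// Lines with no tokens at all (blank or whitespace-only) are not counted.
func countedLines(from start: Int, to end: Int, kinds: [Int: [String]]) -> Int {
    return (start...end).filter { line in
        guard let lineKinds = kinds[line], !lineKinds.isEmpty else { return false }
        return !lineKinds.allSatisfy { $0 == "comment" }
    }.count
}
```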
@marcelofabri @norio-nomura this PR seems fine to me now. Other than a rebase, what else needs to be done?
Maybe make the reason message clearer by explaining that comment-only and whitespace-only lines are excluded.
I've updated the reason message to
What do you guys think? |
Maybe this instead?
|
@scottrhoyt I think we should disable codecov's automated comments and just have it as a GitHub status check instead. The automated comments are a bit too noisy for my taste.
👏 thanks!!!
…ents FunctionBodyLengthRule should not count comment lines
🎉
💯
Fixes #258.