Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lines counts are incorrect when symlinks are present #193

Closed
benhoyt opened this issue Sep 1, 2020 · 13 comments
Closed

Lines counts are incorrect when symlinks are present #193

benhoyt opened this issue Sep 1, 2020 · 13 comments
Labels
bug Something isn't working

Comments

@benhoyt
Copy link

benhoyt commented Sep 1, 2020

When there are one or more symlinks present in a directory, scc gives wildly incorrect line counts -- close, but not exactly equal to, the file size in bytes of the file in question. Running scc -d doesn't help, and there doesn't seem to be any --ignore-symlinks option.

To Reproduce

First create a single file, test.py, as follows:

# this is a comment.

import os

for e in os.scandir('.'):
	if e.is_file():
		print(e)

For the record, the output of wc is as follows:

$ wc *.py
 7 14 87 test.py

Then run scc -- it shows 7 lines as expected:

───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       1         7        2         1        4          3
───────────────────────────────────────────────────────────────────────────────

Then create a symlink link.py pointing to test.py, and run wc again to check:

$ ln -s test.py link.py
$ wc *.py
  7  14  87 link.py
  7  14  87 test.py
 14  28 174 total

However, now scc says 88 lines:

───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        88       12         2       74          5
───────────────────────────────────────────────────────────────────────────────

I would have expected scc to say 14 lines, and scc -d to say 7.

As mentioned, 88 is close to but not exactly equal to the file size (87 bytes). Seems like a hint as to what the issue is (though I don't know the codebase at all).

Desktop:

  • OS: Linux (Ubuntu 20.04)
  • scc version 2.12.0
@boyter boyter added the bug Something isn't working label Sep 1, 2020
@boyter
Copy link
Owner

boyter commented Sep 1, 2020

Now that is an unusual one. Thanks for the excellent bug report. Will investigate whats going on soon.

@boyter
Copy link
Owner

boyter commented Sep 2, 2020

Interestingly this works as you would expect on macOS

$ ls -lha
total 8
drwxr-xr-x   4 bb100123  527232157   128B  2 Sep 11:29 .
drwxr-xr-x  24 bb100123  527232157   768B  2 Sep 11:28 ..
lrwxr-xr-x   1 bb100123  527232157     7B  2 Sep 11:29 link.py -> test.py
-rw-r--r--   1 bb100123  527232157    87B  2 Sep 11:28 test.py

# bb100123 @ VCOANSYD256197 in ~/Documents/kablamo/scc/examples/temp on git:master x [11:30:15]
$ scc
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Total                        2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $169
Estimated Schedule Effort 0.566539 months
Estimated People Required 0.035495
───────────────────────────────────────────────────────────────────────────────
Processed 174 bytes, 0.000 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────


# bb100123 @ VCOANSYD256197 in ~/Documents/kablamo/scc/examples/temp on git:master x [11:30:18]
$ scc -d
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       1         7        2         1        4          3
───────────────────────────────────────────────────────────────────────────────
Total                        1         7        2         1        4          3
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $81
Estimated Schedule Effort 0.429654 months
Estimated People Required 0.022604
───────────────────────────────────────────────────────────────────────────────
Processed 87 bytes, 0.000 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────

@boyter
Copy link
Owner

boyter commented Sep 2, 2020

On a 18.04 Ubuntu instance it is working as expected as well.

root@Ubuntu-1804-bionic-64-minimal ~/test # ls -lha
total 12K
drwxr-xr-x  2 root root 4.0K Sep  2 03:32 .
drwx------ 13 root root 4.0K Sep  2 03:32 ..
lrwxrwxrwx  1 root root    7 Sep  2 03:32 link.py -> test.py
-rw-r--r--  1 root root   87 Sep  2 03:32 test.py
root@Ubuntu-1804-bionic-64-minimal ~/test # scc
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines     Code  Comments   Blanks Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        14        8         2        4          6
───────────────────────────────────────────────────────────────────────────────
Total                        2        14        8         2        4          6
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $169
Estimated Schedule Effort 0.566539 months
Estimated People Required 0.035495
───────────────────────────────────────────────────────────────────────────────

root@Ubuntu-1804-bionic-64-minimal ~/test # scc *.py
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines     Code  Comments   Blanks Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        14        8         2        4          6
───────────────────────────────────────────────────────────────────────────────
Total                        2        14        8         2        4          6
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $169
Estimated Schedule Effort 0.566539 months
Estimated People Required 0.035495
───────────────────────────────────────────────────────────────────────────────

root@Ubuntu-1804-bionic-64-minimal ~/test # scc -d
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines     Code  Comments   Blanks Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       1         7        4         1        2          3
───────────────────────────────────────────────────────────────────────────────
Total                        1         7        4         1        2          3
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $81
Estimated Schedule Effort 0.429654 months
Estimated People Required 0.022604
───────────────────────────────────────────────────────────────────────────────

Can you try building from master and letting me know what result you get? I wonder if there is a bug in 2.12 that might be down to the version of Go I used. For the above tests I used Go 1.15.

@benhoyt
Copy link
Author

benhoyt commented Sep 2, 2020

Interesting! I was running the snap version (2.12.0), but when I grab the source and build and use that version it works as expected (see below). Any idea what might have fixed it?

~/h/scc/test$ ../scc --version
scc version 2.13.0 (beta)
~/h/scc/test$ ../scc
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Total                        2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $169
Estimated Schedule Effort 0.507871 months
Estimated People Required 0.029696
───────────────────────────────────────────────────────────────────────────────
Processed 174 bytes, 0.000 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────

~/h/scc/test$ scc --version
scc version 2.12.0
~/h/scc/test$ scc
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        88       12         2       74          5
───────────────────────────────────────────────────────────────────────────────
Total                        2        88       12         2       74          5
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $1,755
Estimated Schedule Effort 1.376323 months
Estimated People Required 0.151051
───────────────────────────────────────────────────────────────────────────────

@boyter
Copy link
Owner

boyter commented Sep 2, 2020

https://github.com/boyter/scc/projects/4

There are quite a few changes from what you were using on snap to now. I think that #173 resolved it as its the only one that seems even slightly related.

Ill have a play around with that theory by trying the version just before it merged in sometime tonight after work.

@benhoyt
Copy link
Author

benhoyt commented Sep 2, 2020

Cool, thanks. Would it make sense for scc to ignore symlinks by default? At least symlinks that link to paths under the tree you're searching? It seems to me that it will always double-count in cases like that.

@boyter
Copy link
Owner

boyter commented Sep 2, 2020

Its one of those ones that I tend to get polarising opinions on both sides. It is one of the main reasons why the -d option is in there, where you want to avoid duplicates.

Happy to listen to arguments on either side though as I am prepared to shift ground for good reasons.

@benhoyt
Copy link
Author

benhoyt commented Sep 2, 2020

Interesting. I haven't thought about it deeply -- what are the arguments for counting symlinked code? Particularly with symlinks that point to files under the root path, it seems like that's the definition of double counting. For example, in the juju/juju codebase, we have a directory with a single .py file in it, and then a dozen or so symlinks pointing to that file. So it was being counted 12 or so times. Took me quite a while to figure out what was happening (--by-file was very helpful).

I guess my argument is this: for a tool that counts lines (and prints the estimated cost of development), it seems incorrect to count symlinked code. If that symlinked code is under the root path, you definitely haven't written twice as much, so it shouldn't be counted twice. If the symlink points outside the root path, you probably haven't written that code either, or at least don't want it to be counted as code under the root path.

@boyter
Copy link
Owner

boyter commented Sep 2, 2020

The main counter argument is that it is counting what is actually needed to deliver the program as a whole. Although some use it to determine how many lines will be processed by other tools as well which may or may not process symlinks.

Although i'm going to agree with you and remove symlinks by default and make it an option to process them.

#194

@boyter
Copy link
Owner

boyter commented Sep 3, 2020

So this was actually worse than I thought. Even with the fixed version it was inconsistent depending on where the symlink was.

For example if the symlinks in the example were in the directory something and you ran scc twice like so

scc ./something && scc ./something/*.py

You would get different results from both runs because of how the fix was implemented.

So... I modified it so its consistent between runs.

 bb100123 @ VCOANSYD256197 in ~/Documents/kablamo/scc on git:master o [18:17:13]
$ scc ./examples/symlink/
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       1         7        2         1        4          3
───────────────────────────────────────────────────────────────────────────────
Total                        1         7        2         1        4          3
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $81
Estimated Schedule Effort 0.385161 months
Estimated People Required 0.018912
───────────────────────────────────────────────────────────────────────────────
Processed 87 bytes, 0.000 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────


# bb100123 @ VCOANSYD256197 in ~/Documents/kablamo/scc on git:master o [18:19:53]
$ scc --include-symlink ./examples/symlink/
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Total                        2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $169
Estimated Schedule Effort 0.507871 months
Estimated People Required 0.029696
───────────────────────────────────────────────────────────────────────────────
Processed 174 bytes, 0.000 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────


# bb100123 @ VCOANSYD256197 in ~/Documents/kablamo/scc on git:master o [18:20:03]
$ scc ./examples/symlink/*.py
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       1         7        2         1        4          3
───────────────────────────────────────────────────────────────────────────────
Total                        1         7        2         1        4          3
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $81
Estimated Schedule Effort 0.385161 months
Estimated People Required 0.018912
───────────────────────────────────────────────────────────────────────────────
Processed 87 bytes, 0.000 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────


# bb100123 @ VCOANSYD256197 in ~/Documents/kablamo/scc on git:master o [18:20:09]
$ scc --include-symlink ./examples/symlink/*.py
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Python                       2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Total                        2        14        4         2        8          6
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop $169
Estimated Schedule Effort 0.507871 months
Estimated People Required 0.029696
───────────────────────────────────────────────────────────────────────────────
Processed 174 bytes, 0.000 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────

Anyway the default is off, but you can turn it on, in which case it will follow the real file, although it probably breaks if you symlink a symlink, but you know what, if you are doing that you deserve the pain. Although thinking about it I think Qt does it... so I might need to fix that... something for tomorrow anyway.

@benhoyt
Copy link
Author

benhoyt commented Sep 3, 2020

Thanks for digging in and fixing this!

Minor thing (assuming it's not too late): I'd recommend using --include-symlinks, plural. As singular it reads a bit funny to me because there may be more than one symlink. Similar to the follow_symlinks flag in Python's os.DirEntry functions (for example).

@boyter
Copy link
Owner

boyter commented Sep 3, 2020

Nothing is ever set in stone :)

Changed. Also confirmed that symlinks to symlinks will work as expected.

Going to close this one as all looks OK to me now. Will make it into the next release once I get a that last feature in.

@boyter boyter closed this as completed Sep 3, 2020
@benhoyt
Copy link
Author

benhoyt commented Sep 3, 2020

Awesome, thank you!

@boyter boyter mentioned this issue Sep 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants