Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First comment line in C# is counted as Code #72

Closed
ptupitsyn opened this issue May 2, 2019 · 6 comments
Closed

First comment line in C# is counted as Code #72

ptupitsyn opened this issue May 2, 2019 · 6 comments
Labels
bug Something isn't working

Comments

@ptupitsyn
Copy link

Describe the bug
When first line in C# file contains single-line comment (starts with //), it is counted as code.

To Reproduce

  1. Download file
    FooClass.zip
  2. Extract
  3. Scc

Expected behavior

───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines     Code  Comments   Blanks Complexity
───────────────────────────────────────────────────────────────────────────────
C#                           1        14       11         2        1          0
───────────────────────────────────────────────────────────────────────────────
Total                        1        14       11         2        1          0
───────────────────────────────────────────────────────────────────────────────

Actual behavior

───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines     Code  Comments   Blanks Complexity
───────────────────────────────────────────────────────────────────────────────
C#                           1        14       12         1        1          0
───────────────────────────────────────────────────────────────────────────────
Total                        1        14       12         1        1          0
───────────────────────────────────────────────────────────────────────────────

Desktop

  • Windows 10
@boyter boyter added the bug Something isn't working label May 2, 2019
@boyter
Copy link
Owner

boyter commented May 2, 2019

This is an unusual one. When run via command line it does indeed fail. With the following test however,

func TestCountStatsIssue72(t *testing.T) {
	ProcessConstants()
	fileJob := FileJob{
		Language: "C#",
	}

	fileJob.Content = []byte(`// Comment 1
namespace Baz
{
    using System;

    public class FooClass
    {
        public void Test(string report)
        {
          // Comment 2
          throw new NotImplementedException();
        }
    }
}`)

	CountStats(&fileJob)

	if fileJob.Lines != 14 {
		t.Errorf("Expected 14 lines")
	}

	if fileJob.Code != 11 {
		t.Errorf("Expected 11 lines got %d", fileJob.Code)
	}

	if fileJob.Comment != 2 {
		t.Errorf("Expected 2 lines got %d", fileJob.Comment)
	}

	if fileJob.Blank != 1 {
		t.Errorf("Expected 1 lines got %d", fileJob.Blank)
	}
}

It actually passes. I had a look at the encoding of the file, and the issue is that it is encoded as UTF-8-BOM. If you change it to UTF-8 it works as expected, re-encode as UTF-8-BOM and back to the failure. Just a guess (ill confirm later) but I suspect #73 is the same issue. Seeing as UTF-8-BOM adds some bytes to the start of the file this explains why the first line is counted as code.

image

ef bb bf is the BOM for this file.

I am not sure about the best way to approach resolving this. BOM is not recommended for use with UTF-8 http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf

Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature. See the “Byte Order Mark” subsection in Section 16.8, Specials, for more information.

From wikipedia,

BOM use is optional. Its presence interferes with the use of UTF-8 by software that does not expect non-ASCII bytes at the start of a file but that could otherwise handle the text stream.

It is possible to check for the special characters and then strip them https://en.wikipedia.org/wiki/Byte_order_mark#Byte_order_marks_by_encoding but I would need to see if this is a common pattern for dealing with this. @dbaggerman thoughts?

@ptupitsyn is there any special reason you are using BOM for some files?

@boyter
Copy link
Owner

boyter commented May 2, 2019

Looking into it, seems removing the BOM is a pretty common pattern. I will add a check which checks for the presence of the first byte, and if so check for the full BOM and if found skip those bytes before processing the rest of the file.

@boyter
Copy link
Owner

boyter commented May 2, 2019

func TestCountStatsIssue72(t *testing.T) {
	ProcessConstants()
	fileJob := FileJob{
		Language: "C#",
	}

	fileJob.Content = []byte(`   // Comment 1
namespace Baz
{
    using System;

    public class FooClass
    {
        public void Test(string report)
        {
          // Comment 2
          throw new NotImplementedException();
        }
    }
}`)

	// Set the BOM
	fileJob.Content[0] = 239
	fileJob.Content[1] = 187
	fileJob.Content[2] = 191

	CountStats(&fileJob)

	if fileJob.Lines != 14 {
		t.Errorf("Expected 14 lines")
	}

	if fileJob.Code != 11 {
		t.Errorf("Expected 11 lines got %d", fileJob.Code)
	}

	if fileJob.Comment != 2 {
		t.Errorf("Expected 2 lines got %d", fileJob.Comment)
	}

	if fileJob.Blank != 1 {
		t.Errorf("Expected 1 lines got %d", fileJob.Blank)
	}
}

Above replicates the bug.

@boyter
Copy link
Owner

boyter commented May 2, 2019

There is a fix for this specific issue in PR #74 however I want it to be expanded to be more generic before it is merged in.

@ptupitsyn If you need a fix now you can build from that branch. Once we have the final fix merged in ill close this down.

@ptupitsyn
Copy link
Author

I understand that BOM is not recommended, and modern IDEs and editors don't insert it.
This file originates from a third party codebase, I just removed all the irrelevant stuff and used fake names.

Skipping BOM bytes seems to be a good approach.

Thanks a lot for the quick response!

@boyter
Copy link
Owner

boyter commented May 9, 2019

Merged in. Build from master to get this.

Will be made into the next release, when #76 is finished off.

@boyter boyter closed this as completed May 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants