Special symbols like © seem to mess with sourcekit results #42

Closed
nathankot opened this issue May 10, 2016 · 9 comments

Comments

@nathankot
Collaborator

nathankot commented May 10, 2016

As per what @galeo discovered in nathankot/company-sourcekit#16, I'll post my findings here:

Completion without the copyright symbol, offset is at: CGRect(|):

$ sourcekitten complete --text '# ; import AVFoundation; CGRect()' --offset 32 | head
[{
  "sourcetext" : "origin: <#T##CGPoint#>, size: <#T##CGSize#>"
}, ... ]

Completion with the copyright symbol, offset is at CGRect(|):

$ sourcekitten complete --text '# ©; import AVFoundation; CGRect()' --offset 33 | head
[{
  "sourcetext" : "()",
}, ... ]

Completion with the copyright symbol, offset is (seemingly) incorrect at CGRect()|:

$ sourcekitten complete --text '# ©; import AVFoundation; CGRect()' --offset 34 | head
[{
  "sourcetext" : "origin: <#T##CGPoint#>, size: <#T##CGSize#>",
}, ... ]

It looks like Xcode isn't considering the © character at all.

I'm not sure if this is desired behavior on SourceKit's part, but it'd be interesting to get your input, guys @terhechte @seanfarley

@terhechte
Owner

I wonder if this also applies to other characters or if this is a special case with only the © symbol.

@nathankot
Collaborator Author

Most likely applies to others as well

@terhechte
Owner

Does it only happen if it is in the same line? Or also if it is somewhere before the current cursor? To remedy this, we'd probably need to open the file, jump to the correct offset, and go back to make sure that none of those special characters are in there, right?

@seanfarley

I imagine this is due to Unicode, since some Unicode characters take up more than one byte when encoded. That's a bit hand-wavy, I realize, but if sourcekit is expecting a byte string (warning: this is just a guess), then the counting will be off for any non-ASCII text. You can see this in python2:

$ python2.7 -c 'print len("😈")'
4

In the case of "©", we can see why adding 1 seemingly works:

$ python2.7 -c 'print len("©")'
2
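
The same distinction can be sketched in Python 3, where `len` on a `str` counts code points and the encoded `bytes` object counts UTF-8 bytes. This only illustrates the guess above and assumes the text is encoded as UTF-8; the helper name `utf8_len` is made up for this example.

```python
def utf8_len(text: str) -> int:
    """Return the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


for symbol in ("a", "©", "😈"):
    # len(symbol) counts code points; utf8_len(symbol) counts encoded bytes.
    print(symbol, len(symbol), utf8_len(symbol))
```

Each of these symbols is one code point, but "©" needs two bytes and "😈" needs four, so a character-based offset falls behind a byte-based one by the number of extra bytes.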

@nathankot
Collaborator Author

Nice :) I propose we fix this in either sourcekittendaemon or sourcekitten:

  7> "©".utf8.count
$R2: Distance = 2
  8> "©".characters.count
$R3: Distance = 1

@nathankot
Collaborator Author

Actually, now that I think about it, this really has to be fixed in the editor integrations, doesn't it? Otherwise the top layers would need to do some magic to translate a character offset into a UTF-8 byte offset.
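
A rough sketch of that translation, in Python 3 for illustration only. It assumes the editor reports a zero-based character (code point) offset and that the consumer wants a zero-based UTF-8 byte offset, which matches the behaviour observed in the first comment but is not something confirmed here from SourceKit documentation. `char_to_byte_offset` is a hypothetical helper, not part of sourcekitten or sourcekittendaemon.

```python
def char_to_byte_offset(text: str, char_offset: int) -> int:
    """Translate a zero-based character offset into a UTF-8 byte offset."""
    if not 0 <= char_offset <= len(text):
        raise ValueError("char_offset is outside the text")
    # Only the text before the cursor affects the byte offset.
    return len(text[:char_offset].encode("utf-8"))


plain = "# ; import AVFoundation; CGRect()"
with_symbol = "# ©; import AVFoundation; CGRect()"

# Cursor placed right after the opening parenthesis: CGRect(|)
print(char_to_byte_offset(plain, plain.index("(") + 1))
print(char_to_byte_offset(with_symbol, with_symbol.index("(") + 1))
```

For the two inputs from the first comment this prints 32 and 34, the offsets that produced the argument completions. Because only the prefix before the cursor is encoded, a multi-byte character anywhere earlier in the buffer shifts the result, not just one on the same line.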

@nathankot
Collaborator Author

In emacs:

(position-bytes (point))

nathankot added a commit to nathankot/company-sourcekit that referenced this issue May 11, 2016
@nathankot
Collaborator Author

This has been fixed in company-sourcekit :) I'll add a note to the readme for sourcekittendaemon and close this.

@seanfarley
Copy link

👍
