Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

readtags: canonicalize the input file name based on CWD ptag #3304

Merged
merged 16 commits into from
Apr 30, 2023

Conversation

masatake
Copy link
Member

@masatake masatake commented Mar 10, 2022

See #3168.

Just illustrating an idea. Still far from merging.

  • revise code
  • update the readtags.1 man page
  • consider Windows
    • driver letters (C:/)
    • backslash (\usr\include)
  • test cases

@masatake
Copy link
Member Author

Do I have to consider windows' file name like C:/foo/bar?

extra-cmds/readtags-cmd.c Outdated Show resolved Hide resolved
@masatake
Copy link
Member Author

masatake commented Sep 17, 2022

Automatically growing hashtable improves the performance incredibly!

[yamato@dev64]~/var/ctags-github% ls -lh ~/.citre/upstream-linux.tags
-rw-r--r--. 1 yamato yamato 3.0G Aug 21 18:22 /home/yamato/.citre/upstream-linux.tags
[yamato@dev64]~/var/ctags-github% time ./readtags -c -t  ~/.citre/upstream-linux.tags -l > /dev/null

./readtags -c -t ~/.citre/upstream-linux.tags -l > /dev/null  11.15s user 0.97s system 99% cpu 12.182 total
[yamato@dev64]~/var/ctags-github% time ./readtags -t  ~/.citre/upstream-linux.tags -l > /dev/null 

./readtags -t ~/.citre/upstream-linux.tags -l > /dev/null  9.40s user 0.98s system 99% cpu 10.428 total

@masatake masatake force-pushed the canon-filenames branch 2 times, most recently from 09613db to 1a1b7e7 Compare September 18, 2022 17:54
@codecov
Copy link

codecov bot commented Sep 18, 2022

Codecov Report

Patch coverage: 92.56% and project coverage change: +0.13 🎉

Comparison is base (7e5c8dc) 82.83% compared to head (9250706) 82.96%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3304      +/-   ##
==========================================
+ Coverage   82.83%   82.96%   +0.13%     
==========================================
  Files         226      226              
  Lines       54754    54962     +208     
==========================================
+ Hits        45353    45598     +245     
+ Misses       9401     9364      -37     
Impacted Files Coverage Δ
extra-cmds/readtags-cmd.c 59.72% <77.94%> (+4.66%) ⬆️
main/fname.c 96.26% <92.68%> (+35.06%) ⬆️
extra-cmds/printtags.c 91.75% <100.00%> (+0.44%) ⬆️
extra-cmds/utiltest.c 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

extra-cmds/readtags-cmd.c Outdated Show resolved Hide resolved
@masatake
Copy link
Member Author

I decided to just reject a Windows directory in this pull request.

@masatake masatake marked this pull request as ready for review October 10, 2022 10:01
@masatake
Copy link
Member Author

@AmaiKinono, for solving the first item in #3168, I added the -C,--canonicalize-input option to readtags. I would like you to try this pull request.

@masatake
Copy link
Member Author

I wonder if these things should be done on the ctags side.
There are tags generators other than u-ctags, so it is good that ctags has -C,--canonicalize-input option.

@masatake masatake force-pushed the canon-filenames branch 4 times, most recently from 57b8b62 to e8aa1ec Compare October 14, 2022 09:35
@k-takata k-takata changed the title readtags: canicalize the input file name based on CWD ptag readtags: canonicalize the input file name based on CWD ptag Oct 14, 2022
@masatake
Copy link
Member Author

@k-takata, thank you for fixing the title.

@masatake masatake force-pushed the canon-filenames branch 5 times, most recently from b6b5051 to 185f8b8 Compare October 15, 2022 19:52
@masatake
Copy link
Member Author

I merged main/fname.c unexpectedly in 8f2e2ee.
No test case for the functions in the file at all.

I must add the test cases for the functions to utiltest in this pull request.

man/readtags.1.rst.in Outdated Show resolved Hide resolved
@masatake masatake added this to the 6.1 milestone Dec 4, 2022
@masatake
Copy link
Member Author

#3600 must be solved first.
So we don't have to think about backslash.

@masatake
Copy link
Member Author

masatake commented Apr 29, 2023

@AmaiKinono Sorry for delaying finishing this PR.

I'm working on -A, --absolute-input in long-form, options now.
-A option canonicalizes the input field like -C but doesn't make the result relative to the value of CWD ptag. The options work on my PC. Updating the man page and writing test cases are on my TODO list.

I want to explain when -A is useful.
I'm thinking about making readtags able to take multiple tags files.

I frequently read multiple source trees in a set. e.g., systemd+linux kernel, glibc + linux kernel, gimp + gtk, and so on. The combinations of the source tree can be unpredictable.
Even in this situation, I want to run ctags on linux kernel source tree only once.
I don't want to run ctags twice: once for linux kernel only and once for linux kernel + systemd.

If readtags can take multiple tags files when searching, we can reuse the tags file for linux kernel when reading "glibc + linux", "systemd + linux", "audit + linux", ....

If readtags reports the input field of a tag in relative form, a client tool like citre must convert it to an absolute form. Converting is easy if a tool deals with only one tags file; just referring to its CWD ptag is enough. However, if a tool deals with multiple tags files, there are multiple CWD ptags. -A comes here. readtags returns absolute file names. So no conversion is needed.

That means I would like to set a list of strings to the value slot of citre-tags-file-alist in the future.

@masatake
Copy link
Member Author

I found one more interesting idea: --assume-cwd=... option.
With this option, you can override or set !_TAG_PROC_CWD in a tags file.

masatake and others added 16 commits April 30, 2023 11:16
…ileName

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Partially close universal-ctags#3168.

TODO:
* revise code,

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
…edo-tags

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Co-authored-by: AmaiKinono <amaikinono@gmail.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
@masatake masatake merged commit 4622e54 into universal-ctags:master Apr 30, 2023
@AmaiKinono
Copy link
Member

If readtags can take multiple tags files when searching, we can reuse the tags file for linux kernel when reading "glibc + linux", "systemd + linux", "audit + linux", ....

That's an exciting feature (and actually requested by a lot of Citre users, on a Chinese Emacs forum). However, when parsing a tagline, a client tool needs some metadata of the tags file, so if readtags outputs taglines in multiple tags files, it may needs to align their metadata. I'll list the metadata used by Citre here:

  • time: The last update time of all the metadata. Before calling readtags, Citre compares the last modified time of the tags file, to decide if the metadata needs update. For multiple tags file, Citre could just compare time to the most latest modified time of the tags files.
  • remotep: Whether the tags file is a remote file. If it is, we add the remote prefix to the input field. We should forbid the user to combine tags files from different hosts, so this one needs no special treatment.
  • dir: The current working directory when generating the tags file. We don't need this if we use -A.
  • os: Whether dir is unix-style (so we know if the tags file is generated on Unix or Windows). This has several usage, but should become irrelevant if we use -A.
  • kind-table: A hash table constructed from !TAG_KIND_DESCRIPTION ptag, used to convert single-letter kind to full-length kind. This one is a bit tricky as different tags files may have different !TAG_KIND_DESCRIPTION ptag. One solution is adding an option to readtags to print the kind field in its full length form. Another solution is to let Citre convert single letter kinds based on a table constructed from ctags --list-kinds-full. This shouldn't be a big deal as the kind is only used for showing a tag to the user, and using full-length kind when generating the tags file is encouraged anyway.

So, the conclusion is, readtags may need another option to output the kind field in its full-length form, but it's okay to not have it. But this is only for Citre, and we may need to consider what to do if tags files have different ptag values, as ptags are intended for client tools to utilize.

That means I would like to set a list of strings to the value slot of citre-tags-file-alist in the future.

I find the relevant code in Citre to be quite convoluted. I plan to deprecate citre-tags-file-alist, and make it possible to specify tags file (or tags files, in the future) using a file-local variable. So, users could set it in .dir-locals.el or modify it in find-file-hook.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants