Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verilog: create tag of instance's reference name #3469

Closed
my2817 opened this issue Aug 25, 2022 · 33 comments
Closed

Verilog: create tag of instance's reference name #3469

my2817 opened this issue Aug 25, 2022 · 33 comments
Assignees

Comments

@my2817
Copy link
Contributor

my2817 commented Aug 25, 2022

I use ctags in my work, with emacs+citre. I write some HACK code to meet my needs. And put the info at here if some others are searching for same solutions.

  1. add a new kind K_REFERENCE to ctreate tag of instance's reference name. So that we can get a list from the tags file that how many times or where a module been referenced.
  2. (HACK) To give info of a instance's reference name. Update instance's scope field. Append instance's reference name to scope field.

master...my2817:ctags:master


The name of the parser: verilog

The content of input file:

module test (/*AUTOARG*/);
   input i_a, i_b;
   output o_c;

   ref1 int1 ();
   ref1 int2 ();
   ref3 # (.A (aaa),
           .B (bbb))
   int3 ();

endmodule // test

The tags output you are not satisfied with:


The tags output you expect:


i_a     ./v_test/test.v /^   input i_a, i_b;$/;"        kind:port       line:21 language:Verilog        scope:module:test       roles:def
i_b     ./v_test/test.v /^   input i_a, i_b;$/;"        kind:port       line:21 language:Verilog        scope:module:test       roles:def
int1    ./v_test/test.v /^   ref1 int1 ();$/;"  kind:instance   line:24 language:Verilog        scope:module:test:ref1  roles:def
int2    ./v_test/test.v /^   ref1 int2 ();$/;"  kind:instance   line:25 language:Verilog        scope:module:test:ref1  roles:def
int3    ./v_test/test.v /^   int3 ();$/;"       kind:instance   line:28 language:Verilog        scope:module:test:ref3  roles:def
o_c     ./v_test/test.v /^   output o_c;$/;"    kind:port       line:22 language:Verilog        scope:module:test       roles:def
ref1    ./v_test/test.v /^   ref1 int1 ();$/;"  kind:reference  line:24 language:Verilog        scope:module:test       roles:def
ref1    ./v_test/test.v /^   ref1 int2 ();$/;"  kind:reference  line:25 language:Verilog        scope:module:test       roles:def
ref3    ./v_test/test.v /^   ref3 # (.A (aaa),$/;"      kind:reference  line:26 language:Verilog        scope:module:test       roles:def
test    ./v_test/test.v /^module test (\/*AUTOARG*\/);$/;"      kind:module     line:20 language:Verilog        roles:def
@masatake
Copy link
Member

Though I don't know well about Verilog, what you have done is interesting. I have worked hard on improving ctags for person like you.

Can we say ref1 is an alias for int1?
ref1 can be used as just another name for int1 till ref1 int2 ();?

@masatake
Copy link
Member

I read a document about Verilog quickly.
Now I know ref1 is a name of module defined somewhere.
In such a case, we should use reference tags (https://docs.ctags.io/en/latest/output-tags.html?highlight=reference%20tags#reference-tags) for extracting ref1.

ref1 is a module defined somewhere, and is referenced for defining an instance.
So if we don't mind the length of the name of a role, an ideal parser emits

ref1    ./v_test/test.v /^   ref1 int1 ();$/;"  kind:module  line:24 language:Verilog        scope:module:test       roles:definingAnInstance
ref1    ./v_test/test.v /^   ref1 int2 ();$/;"  kind:reference  line:25 language:Verilog        scope:module:test       roles:definingAnInstance

About ref3, I could not understand.

   ref3 # (.A (aaa),
           .B (bbb))

Could you explain the text after #?

int1    ./v_test/test.v /^   ref1 int1 ();$/;"  kind:instance   line:24 language:Verilog        scope:module:test:ref1  roles:def

scope: field is not suitable for the purpose.
typeref: field is better.

@masatake
Copy link
Member

Now I understand ref3.
I should skip the newline in the middle of lines.

ref3 #(.A (aaa), .B(bbb)) int3()

masatake added a commit to masatake/ctags that referenced this issue Aug 25, 2022
This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.

This commit does two things:

* fill typeref fields of instances with module names, and
* extract the module names used for defining instances as
  defInstance role of module kind.

TODO: Update ctags-lang-verilog(7).
@masatake
Copy link
Member

See #3470.

@my2817
Copy link
Contributor Author

my2817 commented Aug 26, 2022

Thanks all of your hard work. I do the changes at a old version of ctags. And don't even notice the usage of reference tag.

@masatake
Copy link
Member

/tmp$ git clone https://github.com/masatake/ctags
...                                        
/tmp$ cd ctags                         
/tmp/ctags$ git checkout origin/verilog-reftag
...                       
HEAD is now at 07a74fcc1 Verilog: fill typeref fields of instances                                                       
/tmp/ctags$ ./autogen.sh;./configure;make
...           
/tmp/ctags$ cat foo.v            
module test (/*AUTOARG*/);                                                                                                                                                                                          
   input i_a, i_b;                                                                                                                                                                                                  
   output o_c;                  
                                                                                                          
   ref1 int1 ();                                                                                          
   ref1 int2 ();                                                                                          
   ref3 # (.A (aaa),                                                                                      
           .B (bbb))                                                                                      
   int3 ();                                                                                               
                                                                                                          
endmodule // test
                                                     
/tmp/ctags$ ./ctags -o - ./foo.v                                                                          
i_a     ./foo.v /^   input i_a, i_b;$/;"        p       module:test                                       
i_b     ./foo.v /^   input i_a, i_b;$/;"        p       module:test                                       
int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1                       
int2    ./foo.v /^   ref1 int2 ();$/;"  i       module:test     typeref:module:ref1                       
int3    ./foo.v /^   int3 ();$/;"       i       module:test     typeref:module:ref3                       
o_c     ./foo.v /^   output o_c;$/;"    p       module:test                                               
test    ./foo.v /^module test (\/*AUTOARG*\/);$/;"      m          
                                                                                                          
/tmp/ctags$ ./ctags --extras=+r -o - ./foo.v                                                              
i_a     ./foo.v /^   input i_a, i_b;$/;"        p       module:test                
i_b     ./foo.v /^   input i_a, i_b;$/;"        p       module:test                
int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1
int2    ./foo.v /^   ref1 int2 ();$/;"  i       module:test     typeref:module:ref1
int3    ./foo.v /^   int3 ();$/;"       i       module:test     typeref:module:ref3                       
o_c     ./foo.v /^   output o_c;$/;"    p       module:test                                               
ref1    ./foo.v /^   ref1 int1 ();$/;"  m       module:test                                      
ref1    ./foo.v /^   ref1 int2 ();$/;"  m       module:test              
ref3    ./foo.v /^   ref3 # (.A (aaa),$/;"      m       module:test                
test    ./foo.v /^module test (\/*AUTOARG*\/);$/;"      m                          
                                                                                                          
/tmp/ctags$ ./ctags --extras=+r --fields=+r -o - ./foo.v                 
i_a     ./foo.v /^   input i_a, i_b;$/;"        p       module:test     roles:def
i_b     ./foo.v /^   input i_a, i_b;$/;"        p       module:test     roles:def
int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1     roles:def
int2    ./foo.v /^   ref1 int2 ();$/;"  i       module:test     typeref:module:ref1     roles:def                        
int3    ./foo.v /^   int3 ();$/;"       i       module:test     typeref:module:ref3     roles:def                        
o_c     ./foo.v /^   output o_c;$/;"    p       module:test     roles:def                                                
ref1    ./foo.v /^   ref1 int1 ();$/;"  m       module:test     roles:defInstance                                        
ref1    ./foo.v /^   ref1 int2 ();$/;"  m       module:test     roles:defInstance                                        
ref3    ./foo.v /^   ref3 # (.A (aaa),$/;"      m       module:test     roles:defInstance                                
test    ./foo.v /^module test (\/*AUTOARG*\/);$/;"      m       roles:def                                                

/tmp/ctags$ 

@hirooih
Copy link
Contributor

hirooih commented Aug 26, 2022

@my2817

Thank you for your proposal.
First let me confirm your request. Below is my understanding. Please correct me if I'm misunderstanding.


The tags output you are not satisfied with:

The output is empty in your report. But it should be:

$ ./ctags --fields=+lnrzKZ --extras= -o - test.v 
i_a     test.v  /^   input i_a, i_b;$/;"        kind:port       line:2  language:Verilog        scope:module:test       roles:def
i_b     test.v  /^   input i_a, i_b;$/;"        kind:port       line:2  language:Verilog        scope:module:test       roles:def
int1    test.v  /^   ref1 int1 ();$/;"  kind:instance   line:5  language:Verilog        scope:module:test       roles:def
int2    test.v  /^   ref1 int2 ();$/;"  kind:instance   line:6  language:Verilog        scope:module:test       roles:def
int3    test.v  /^   int3 ();$/;"       kind:instance   line:9  language:Verilog        scope:module:test       roles:def
o_c     test.v  /^   output o_c;$/;"    kind:port       line:3  language:Verilog        scope:module:test       roles:def
test    test.v  /^module test (\/*AUTOARG*\/);$/;"      kind:module     line:1  language:Verilog        roles:def

What you are expecting is adding the following 3 lines.

ref1    ./v_test/test.v /^   ref1 int1 ();$/;"  kind:reference  line:24 language:Verilog        scope:module:test       roles:def
ref1    ./v_test/test.v /^   ref1 int2 ();$/;"  kind:reference  line:25 language:Verilog        scope:module:test       roles:def
ref3    ./v_test/test.v /^   ref3 # (.A (aaa),$/;"      kind:reference  line:26 language:Verilog        scope:module:test       roles:def

You need them to get:

how many times or where a module been referenced.

I did not know about citre at all. (we should add citre on u-ctag document.
It seems that citre does not utilize the information added, as far as I took a look at the document quickly.


As @masatake san wrote we should use typeref field.
This is one of the items on the TODO list:

typeref field for instances (cf. #2703)

Proposed by @masatake san; #2703 (comment)

I didn't think it was useful, so I haven't implemented it yet. I honestly don't understand yet, but let's implement it.

(Oh, you are the contributor of #2703, @my2817.)


@masatake san,

Can we say ref1 is an alias for int1?

No.

ref1 can be used as just another name for int1 till ref1 int2 ();?

No.

ref1 is the name of a module. int1 is the name of an instance of module ref1.
For example, ref1 is a logic block of a HW component, i.e. a memory. We can use the component several times in a design. We give a name for each instance (device).
I hope this explains.

The followings are OK.

/tmp/ctags$ ./ctags -o - ./foo.v
...
int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1                       
int2    ./foo.v /^   ref1 int2 ();$/;"  i       module:test     typeref:module:ref1                       
int3    ./foo.v /^   int3 ();$/;"       i       module:test     typeref:module:ref3                       
...

But the followings are wrong. Either ref1 or ref3 is not module definition.

/tmp/ctags$ ./ctags --extras=+r -o - ./foo.v                                                              
...
int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1
int2    ./foo.v /^   ref1 int2 ();$/;"  i       module:test     typeref:module:ref1
int3    ./foo.v /^   int3 ();$/;"       i       module:test     typeref:module:ref3                       
...
ref1    ./foo.v /^   ref1 int1 ();$/;"  m       module:test                                      
ref1    ./foo.v /^   ref1 int2 ();$/;"  m       module:test              
ref3    ./foo.v /^   ref3 # (.A (aaa),$/;"      m       module:test                
...

I'm not sure about the followings:

/tmp/ctags$ ./ctags --extras=+r --fields=+r -o - ./foo.v                 
...
int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1     roles:def
int2    ./foo.v /^   ref1 int2 ();$/;"  i       module:test     typeref:module:ref1     roles:def                        
int3    ./foo.v /^   int3 ();$/;"       i       module:test     typeref:module:ref3     roles:def                        
...
ref1    ./foo.v /^   ref1 int1 ();$/;"  m       module:test     roles:defInstance                                        
ref1    ./foo.v /^   ref1 int2 ();$/;"  m       module:test     roles:defInstance                                        
ref3    ./foo.v /^   ref3 # (.A (aaa),$/;"      m       module:test     roles:defInstance                                
...
``` 

If the `roles:definstance` change the meaning of the kind field, `m` in this case, it should be OK.  But the behavior should be documented.

masatake added a commit to masatake/ctags that referenced this issue Aug 26, 2022
This one is partially based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

Capturing the module names of instances is also proposed in universal-ctags#3469.
However, it is not in the scope of this commit.
masatake added a commit to masatake/ctags that referenced this issue Aug 26, 2022
This one is partially based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

Capturing the module names of instances is also proposed in universal-ctags#3469.
However, it is not in the scope of this commit.
@masatake
Copy link
Member

The followings are OK.

/tmp/ctags$ ./ctags -o - ./foo.v
...
int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1                       
int2    ./foo.v /^   ref1 int2 ();$/;"  i       module:test     typeref:module:ref1                       
int3    ./foo.v /^   int3 ();$/;"       i       module:test     typeref:module:ref3                       
...

I made a pull request only focusing on adding typeref fields.
@hirooih could you look into #3471?

masatake added a commit to masatake/ctags that referenced this issue Aug 27, 2022
This one is partially based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

Capturing the module names of instances is also proposed in universal-ctags#3469.
However, it is not in the scope of this commit.
@masatake
Copy link
Member

2.(HACK) To give info of a instance's reference name. Update instance's scope field. Append instance's reference name to scope field.

We merged the change based on your proposal. We use typeref field instead of scope field to record the info.

  1. add a new kind K_REFERENCE to ctreate tag of instance's reference name. So that we can get a list from the tags file that how many times or where a module been referenced.

Let's think about 1.

modref inst();

modref is a module but it is not a definition.

Originally ctags is a tool for tagging definitions.
However, there were demands to tag references.
A typical usage for reference tags is for answering the question "where is this function called?". #569

Eventually, we introduced the concept of "reference tags" to Universal Ctags.
I liked the concept of "kinds" of the definitions. This concept made ctags different from etags (and gtags). The kinds answer a question like "what is this?"

When I designed "reference tags", I introduced "roles" of the references.
The roles answer questions like "how is it used?" or "how it is referenced?".

Though we introduced the concept "role", it is not used widely in Universal Ctags.
The reasons are:
A. designing roles is not easy.
B. tagging reference tags can make a tags file too large.

I don't have much experience in designing roles. So I think it is not easy.
To break the situation, we must try designing.

modref inst();

modref is a module. So it should have the "module" kind.
The question is about the role of modref module.
My idea:
modref module is referenced for defining an instance inst.
So its role is defining an instance. However, it is too long.
So my choice was defInstane.
Do you have any good ideas?

You can see roles defined in ctags with ./ctags --list-roles=all.

To make delay the decision, we can choose something generic role names like "used" or "referenced".

Designing roles and kinds for a language is one of the most exciting activities in ctags development. Let's enjoy it.

@masatake
Copy link
Member

Extracting reference tags is disabled by default.
--extras=+r is for enabling it.
Roles for tags are not printed by default.
--fields=+r is for printing them.

masatake added a commit to masatake/ctags that referenced this issue Aug 28, 2022
…e tags

This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.
masatake added a commit to masatake/ctags that referenced this issue Aug 28, 2022
…e tags

This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.
@masatake
Copy link
Member

See #3473.
I used "defInstance" as the name for the role.
Changing it is easy.

citre has the ability to display reference tags.

@my2817
Copy link
Contributor Author

my2817 commented Aug 28, 2022

I have tested on #3473. It works well.
Thank you for your help and patience.

@hirooih
Copy link
Contributor

hirooih commented Aug 28, 2022

@masatake san,

I'm a little bit confused.
We added typeref field on #3471 and it was merged. (And it is enabled by default.)
Are you proposing to add #3473 which provides the same information as typeref?


In Nov. 2020 @my2817 contributed #2703, which provided instance kind. We added typeref to the kind this time.
A instance kind is actually a reference tag of a module.

I've took a look #569 and #680, but not sure how a reference tag looks like?
I thought Its kind was reference and it had typeref field, but I could not find a good example.
If this is correct, what we should do is to change instance kind to reference kind in Verilog parser.

@masatake
Copy link
Member

masatake commented Aug 28, 2022

A instance kind is actually a reference tag of a module.

This is an incorrect understanding of ctags.

modref inst();

inst is a newly introduced name. So inst should be extracted as a definition tag.
modref is not a newly introduced name. It is defined as a module SOMEWHERE. So modref HERE should be extracted as a reference tag.

inst is an instance. So instance kind is used for tagging.
modref is a module. So module kind is used for tagging.
inst is an instance of modref module, so the tag of inst fills the typefield field with modref.

For definition tags, ctags assigns an implicit role called definition to the tag.
inst has the definition role.

For reference tags, a parser assigns a role to the tag explicitly as I showed #3473.
In #3473, I introduced defInstance role in module kind and I assigned it to modref.

I will repeat the same explanation with targeting C language.

1 typedef double d;  /* Edited */
2 struct point {
3   d x, y;
4 };
5 struct point P;
name line# kind role note
d 1 typedef definition
double 1 typedef (or typename) /* edited (was: typeref*) */ defType a reference tag. double is defined in a C compiler?
point 2 struct definition
d 3 typedef defMember (or defVar) a reference tag.
x 3 member definition
y 3 member definition
point 5 struct defVar a reference tag.
P 5 var definition

I assumed the relationship between d and x is similar to the relationship between modref and inst.

@hirooih
Copy link
Contributor

hirooih commented Aug 29, 2022

@masatake san,

Thank you for your explanation. We are getting closer.

A instance kind is actually a reference tag of a module.

I should write this as:

A instance kind is a definition (role) of a reference type of a module.

d on the line 3 and point on line 5 differs from my understanding in your table above.

  • d on line3: kind: typeref, role: defType
  • point (or struct point) on line 5: kind: typeref, role: defType (or defStruct)

I assumed the relationship between d and x is similar to the relationship between modref and inst.

This is my understanding, too.

  • Module modref was defined somewhere else. (double in above)
  • modref: kind: typeref, role: defType (or defModule). (double in above)
  • inst: kind: module_instance, role: definition (x or y in above)

$ cat foo.c
typedef d double;
struct point {
  d x, y;
};
struct point P;
$ ./ctags -o - --sort=no --extras=+r --fields=+rK foo.c
d       foo.c   /^typedef d double;$/;" typedef typeref:typename:double file:   roles:def
point   foo.c   /^struct point {$/;"    struct  file:   roles:def
x       foo.c   /^  d x, y;$/;" member  struct:point    typeref:typename:d      file:   roles:def
y       foo.c   /^  d x, y;$/;" member  struct:point    typeref:typename:d      file:   roles:def
P       foo.c   /^struct point P;$/;"   variable        typeref:struct:point    roles:def

We implemented the typeref field on #3471:

int1    ./foo.v /^   ref1 int1 ();$/;"  i       module:test     typeref:module:ref1

The new PR, #3473, will emit the following line:

ref1	input.v	/^   ref1 int1 ();$/;"	m	module:test	roles:defInstance

Here kind is module. This means roles:defInstance changes the meaning of kind field. It would confuse client tools.
I think a dedicated type, i.e.typeref, kind is better if we add this line.

And #3473 is equivalent with adding the following lines on the example in C above:

d       foo.c   /^  d x, y;$/;" typeref  typeref:typename:double      file:   roles:defType
P       foo.c   /^struct point P;$/;"   typeref        typeref:struct:point    roles:defType (or defStruct)

The current C parser does not emit them. I think they are verbose.

@masatake
Copy link
Member

I'm sorry I took a big mistake.

typedef d double;

This is wrong.

What I should write was:

typedef double d;

@hirooih, can I update my comment #3469 (comment) ?

@hirooih
Copy link
Contributor

hirooih commented Aug 29, 2022

Oh, I overlooked, too. (VS code was warning me on the line.)

@hirooih, can I update my comment #3469 (comment) ?

Sure.

Here is my fixed version.

$ cat foo.c
typedef double d;
struct point {
  d x, y;
};
struct point P;
$ ./ctags -o - --sort=no --extras=+r --fields=+rK foo.c
d       foo.c   /^typedef double d;$/;" typedef typeref:typename:double file:   roles:def
point   foo.c   /^struct point {$/;"    struct  file:   roles:def
x       foo.c   /^  d x, y;$/;" member  struct:point    typeref:typename:d      file:   roles:def
y       foo.c   /^  d x, y;$/;" member  struct:point    typeref:typename:d      file:   roles:def
P       foo.c   /^struct point P;$/;"   variable        typeref:struct:point    roles:def

Important points are not changed.

@masatake
Copy link
Member

masatake commented Aug 30, 2022

Updated. I found another mistake in the table about the C language input.
In the table, I wrote typeref. It is wrong I should write typedef instead.
All I updated are marked with /* Edited */.

@hirooih, could you update your original comments if my changes make your comment inconsistent?

@hirooih
Copy link
Contributor

hirooih commented Aug 30, 2022

@hirooih, could you update your original comments if my changes make your comment inconsistent?

Your fix is rather worse in the following point:

I think a dedicated type, i.e.typeref, kind is better if we add this line.

Your fix uses the same kind as the tags for roles:def.
Again, your proposal is equivalent with adding the following lines on the example in C above (I changed the kind fields):

d       foo.c   /^  d x, y;$/;" typedef  typeref:typename:double      file:   roles:defType
P       foo.c   /^struct point P;$/;"   struct        typeref:struct:point    roles:defType (or defStruct)

A client tool has to distinguish:

d       foo.c   /^typedef double d;$/;" typedef typeref:typename:double file:   roles:def

or

point   foo.c   /^struct point {$/;"    struct  file:   roles:def

from

d       foo.c   /^  d x, y;$/;" typedef  typeref:typename:double      file:   roles:defType

or

P       foo.c   /^struct point P;$/;"   struct        typeref:struct:point    roles:defType (or defStruct)

respectively by using roles field.
A client tool which does not know about new roles field cannot distinguish them.
And the new lines do not give us no new information.

@masatake
Copy link
Member

Again, your proposal is equivalent with adding the following lines on the example in C above (I changed the kind fields):

d       foo.c   /^  d x, y;$/;" typedef  typeref:typename:double      file:   roles:defType
P       foo.c   /^struct point P;$/;"   struct        typeref:struct:point    roles:defType (or defStruct)

Yes.

A client tool has to distinguish:

d       foo.c   /^typedef double d;$/;" typedef typeref:typename:double file:   roles:def

from

d       foo.c   /^  d x, y;$/;" typedef  typeref:typename:double      file:   roles:defType

respectively by using roles field.

Yes.

A client tool which does not know about new roles field cannot distinguish them.

Yes.

So I didn't set --extras=+r by default.

So a client which does not know about new roles field should not use --extras=+r.
Most of the client tools don't know the new roles field.
In the Universe, citre is of of the two exception. citre knows about new roles field.
Another one is readtags.

If a client tool knows about the new roles field, what kind of information ctags can provide for it?

And the new lines do not give us no new information.

You may be correct.
x and y have typeref fields, so questions like "where type d is used for defining members of structs" can be answered without the reference tag for d IF a user has much time.

@my2817 wrote:

add a new kind K_REFERENCE to ctreate tag of instance's reference name. So that we can get a list from the tags file that how many times or where a module been referenced.

How can we count them?
If we don't have reference tags:

set i 0
foreach $tag in (list-tags "tags"); do
   if (has-typeref field $tag) and (equal (kind-of-typeref $tag) "struct")
      and (equal (name-of-typeref $tag) "d")
      then i++
done
print i

We can count "d" but it takes a long time if the tags file is large.
This algorithm is O(N); N is the number of tags in "tags: file.

If we implement reference tags and --sort is not NO:

set i 0
foreach $tag in (list-tags "tags" having "d" as name); do
   if (roles-includes $tag "defMember")
   then i++
done
print i

This algorithm is O(log(N)).

@hirooih
Copy link
Contributor

hirooih commented Aug 30, 2022

Both algorithms are O(N), I think...

Am I missing anything?
--sort should not matter. This may be a hint...

@masatake
Copy link
Member

Assume (list-tags "tags" having "d" as name) uses the same algorithm used in libreadtags.
It uses binary search.

See http://ctags.sourceforge.net/tool_support.html.
--sort is the matter.

@masatake
Copy link
Member

upstream-linux.tags is not small. It is about 3GB. I accessed this file with citre/emacs as part of my daily job frequently. The tags is generated with --sort=yes.

Assume a situation finding "schedule".

[yamato@dev64]~/.citre% ls -lh upstream-linux.tags
-rw-r--r--. 1 yamato yamato 3.0G Aug 21 18:22 upstream-linux.tags
[yamato@dev64]~/.citre% cat  upstream-linux.tags  > /dev/null
[yamato@dev64]~/.citre% time grep -e '^schedule\>' upstream-linux.tags|wc -l
39
grep --color=auto -e '^schedule\>' upstream-linux.tags  1.95s user 0.61s system 99% cpu 2.576 total
wc -l  0.00s user 0.00s system 0% cpu 2.575 total
[yamato@dev64]~/.citre% time readtags -t upstream-linux.tags schedule | wc -l                        
39
readtags -t upstream-linux.tags schedule  0.00s user 0.00s system 63% cpu 0.003 total
wc -l  0.00s user 0.00s system 70% cpu 0.001 total
[yamato@dev64]~/.citre% time grep -e '^schedule\>' upstream-linux.tags|wc -l 
39
grep --color=auto -e '^schedule\>' upstream-linux.tags  1.91s user 0.51s system 99% cpu 2.432 total
wc -l  0.00s user 0.00s system 0% cpu 2.431 total
[yamato@dev64]~/.citre% time readtags -t upstream-linux.tags schedule | wc -l
39
readtags -t upstream-linux.tags schedule  0.00s user 0.00s system 70% cpu 0.003 total
wc -l  0.00s user 0.00s system 65% cpu 0.002 total

readtags is much faster than grep because readtags uses binary search.
grep is slow? No. It is very fast.

[yamato@dev64]~/.citre% time readtags -t upstream-linux.tags -Q '(eq? $name "schedule")' -l | wc -l 
39
readtags -t upstream-linux.tags -Q '(eq? $name "schedule")' -l  12.54s user 0.89s system 99% cpu 13.489 total
wc -l  0.00s user 0.00s system 0% cpu 13.488 total

readtags name <<< grep < readtags -l

readtags name uses binary search.
readtags -l uses linear search.

I wonder whether "schedule" defined in linux is for function, struct or member.

[yamato@dev64]~/.citre% time readtags -t upstream-linux.tags -Q '(eq? $name "schedule")' -F '(list $kind "\n")' -l | sort | uniq -c | sort -nr
     16 member
     10 local
      7 parameter
      3 label
      1 prototype
      1 function
      1 export
readtags -t upstream-linux.tags -Q '(eq? $name "schedule")' -F  -l  12.62s user 0.92s system 99% cpu 13.598 total
sort  0.00s user 0.00s system 0% cpu 13.598 total
uniq -c  0.00s user 0.00s system 0% cpu 13.597 total
sort -nr  0.00s user 0.00s system 0% cpu 13.597 total
[yamato@dev64]~/.citre% time readtags -t upstream-linux.tags -F '(list $kind "\n")' schedule | sort | uniq -c | sort -nr 
     16 member
     10 local
      7 parameter
      3 label
      1 prototype
      1 function
      1 export
readtags -t upstream-linux.tags -F '(list $kind "\n")' schedule  0.00s user 0.00s system 74% cpu 0.002 total
sort  0.00s user 0.00s system 89% cpu 0.002 total
uniq -c  0.00s user 0.00s system 73% cpu 0.002 total
sort -nr  0.00s user 0.00s system 90% cpu 0.003 total

It seems that "schedule" is used frequently as the name for members.
However, "schedule" in Linux kernel is schedule():

[yamato@dev64]~/.citre% time readtags -t upstream-linux.tags -Q '(and (eq? $name "schedule") (eq? $kind "function"))' -F '(list $input ": " $line "=>" $pattern "\n")' -l
/home/yamato/var/linux/kernel/sched/core.c: 6563=>/^asmlinkage __visible void __sched schedule(void)$/
readtags -t upstream-linux.tags -Q  -F  -l  13.08s user 0.90s system 99% cpu 14.035 total
[yamato@dev64]~/.citre% time readtags -t upstream-linux.tags -Q '(eq? $kind "function")' -F '(list $input ": " $line "=>" $pattern "\n")' schedule 
/home/yamato/var/linux/kernel/sched/core.c: 6563=>/^asmlinkage __visible void __sched schedule(void)$/
readtags -t upstream-linux.tags -Q '(eq? $kind "function")' -F  schedule  0.00s user 0.00s system 66% cpu 0.003 total

@hirooih
Copy link
Contributor

hirooih commented Aug 31, 2022

Assume (list-tags "tags" having "d" as name) uses the same algorithm used in libreadtags.
It uses binary search.

Now I understand.
If a word to be searched is at the beginning of line, the current implementation of libreadtags can search it quickly by using binary search. That's it.

I never used readtags (and libreadtags) and don't know its implementation details. So I could not understand at all from your explanation yesterday:-)

I have two suggestions.

  • A reference tag which you are proposing on this thread is different from the reference tag documented. Please define a new terminology for it.
  • As I wrote above it is better to give "the new reference tag" a new kind, for example typeref.

I cannot stop you from adding the new reference tag to C parser.
I don't like adding it to Verilog parser, but I cannot stop you, again.

@masatake
Copy link
Member

A reference tag which you are proposing on this thread is different from the reference tag documented. Please define a new terminology for it.

How are they different?

@hirooih
Copy link
Contributor

hirooih commented Aug 31, 2022

How are they different?

I see. In the example of manual only followings are reference tags, and there are no reference of variables.

TYPE        reftag.c        /^#undef TYPE$/;"       d       file:
foo.h       reftag.c        /^#include "foo.h"/;"   h
stdio.h     reftag.c        /^#include <stdio.h>/;" h

d (define?) kind is used for the reference tag of TYPE.
I withdraw my two suggestions.

When you implement the reference tag for variables, let's add it on this examples.

@masatake
Copy link
Member

Oh, I see.
In the manual, I wrote (and should wrote) only implemented things.
In this issue, I have written about an imaginary (or ideal) parser.

d (define?) kind is used for the reference tag of TYPE.

d means macros.

$  ~/bin/ctags --list-kinds-full=C | grep ^d
d       macro      yes     no      2      C      macro definitions
$  ~/bin/ctags --list-roles=C.d | grep ^d
d/macro    condition off     used in part of #if/#ifdef/#elif conditions
d/macro    undef     on      undefined
$  cat /tmp/foo.h
#undef TEST
#ifdef TEST
#endif
$  ~/bin/ctags --fields=+Kr --extras=+r -o - /tmp/foo.h
TEST	/tmp/foo.h	/^#undef TEST$/;"	macro	roles:undef
$  ~/bin/ctags --fields=+Kr --extras=+r -o - --roles-C++.d=+'{condition}' /tmp/foo.h
TEST	/tmp/foo.h	/^#ifdef TEST$/;"	macro	roles:condition
TEST	/tmp/foo.h	/^#undef TEST$/;"	macro	roles:under

As I wrote above it is better to give "the new reference tag" a new kind, for example typeref.

In my understanding, your idea is "flat approach";use only kinds to categorize tags.
My idea is two-layer approach for categorizing;use kinds for the first level categorizing, and roles for the second level.

The number of kinds a parser can define is limited: [a-eg-z][A-Z].
So, I was afraid the flat approach can hit the limit.

I cannot stop you from adding the new reference tag to C parser.
I don't like adding it to Verilog parser, but I cannot stop you, again.

Understandable. Designing roles is wild West of ctags.
None knows how roles should be in a parser.

However, I found an obviously useful case.
#2428
Reference tags for external entities can help a client tool solve names and their locations.
So, about reference tags, I'm only working on extracting external entities.

@hirooih
Copy link
Contributor

hirooih commented Sep 1, 2022

However, I found an obviously useful case.
#2428

Which is "the obviously useful case" in this ticket?


BTW after understanding your intention, I have no reason to refuse the PR #3473.
Before reviewing the PR let me ask you one more question.

If you will add reference tags for C variable types, how will you name the role of them?

d       foo.c   /^  d x, y;$/;" typedef  typeref:typename:double      file:   roles:defType
P       foo.c   /^struct point P;$/;"   struct        typeref:struct:point    roles:defType (or defStruct)

In this example defType (or defStruct) is used.

From the manual:

If a tag is definition tag, the roles field has “def” as its value.

How about "declaration tag", the roles field "decl" as its value?


#3472 tags a reference to a module. "defInstance" is not a proper name.
It is same as reference tags for C variables. This is a typical use-case, so we should give it a general name.

But I found decl is already used for a name of roles. I hope the followings are similar use-cases...:

$ ./ctags --list-roles|grep decl
Basic          f/function        decl               on      declared
RpmSpec        p/patch           decl               on      declared for applying later

@masatake
Copy link
Member

masatake commented Sep 4, 2022

Which is "the obviously useful case" in this ticket?

I will add an explanation about it to docs/*.rst. Please, wait for a while.

If you will add reference tags for C variable types, how will you name the role of them?

This is a question hard to answer.
I don't have any established design or idea.
(An exception is about reference tags for external entities.)

If you will add reference tags for C variable types, how will you name the role of them?
...
In this example defType (or defStruct) is used.

After reading your comment decl looks better.

I wonder where a type is referenced in C language.
A. cast operation
B. sizeof operator
C. return type of a function
D. parameter and variable type of a parameter
E. typedef
F. struct/union member

decl can cover C, D, E, and F.


#3472 tags a reference to a module. "defInstance" is not a proper name.
It is same as reference tags for C variables. This is a typical use-case, so we should give it a general name.

But I found decl is already used for a name of roles. I hope the followings are similar use-cases...:

Generalizing across languages is impossible.
Reusing a role name for different purposes is o.k.

I assume a client tool supporting roles may work in the target language-specific way.
Actually, citre has a special for Verilog (https://github.com/universal-ctags/citre/blob/master/citre-lang-verilog.el). So using the same role in different languages never becomes an issue.

Even about kinds, we cannot expect parsers are well consistent.

@hirooih
Copy link
Contributor

hirooih commented Sep 4, 2022

decl can cover C, D, E, and F.

ref or `reference for A and B?

Generalizing across languages is impossible.
...
Even about kinds, we cannot expect parsers are well consistent.

I understand this and agree with you.
But it is still better not to be unnecessarily different.
I always refer the names of C parser when I name a kind or etc.

@masatake
Copy link
Member

masatake commented Sep 4, 2022

decl can cover C, D, E, and F.

ref or `reference for A and B?

I'm not sure. cast for A and size for B are my idea.

We can increase granularity:
C. freturnDecl
D. varDecl
E. typeDecl
F. memberDecl

None knows how the role should be. So there is no answer.
I think it is hard for us to keep compatibility.
(In this reason, I thought about parser versioning.)
It is up to what kind of tool will you develop based on the information available via roles field.

So just decl may be the best as starting.
referened or used are the most generic role name.
decl is a bit more specific. If we have really no idea, referened or used can be a choice. I will update my pull request with decl.

masatake added a commit to masatake/ctags that referenced this issue Sep 4, 2022
…e tags

This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.
@hirooih
Copy link
Contributor

hirooih commented Sep 4, 2022

I'm not sure. cast for A and size for B are my idea.

We can increase granularity:
C. freturnDecl
D. varDecl
E. typeDecl
F. memberDecl

I see. I did not think of it. But this is one of the good choices.

masatake added a commit to masatake/ctags that referenced this issue Sep 4, 2022
…e tags

This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.
masatake added a commit to masatake/ctags that referenced this issue Sep 4, 2022
…ce tags

This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.
masatake added a commit to masatake/ctags that referenced this issue Sep 5, 2022
…ce tags

This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.
@masatake
Copy link
Member

masatake commented Sep 6, 2022

@my2817 We have reimplemented all your ideas in more Universal Ctags way.

As the extended game of this issue, I will work on extracting include, and import.

@masatake masatake closed this as completed Sep 6, 2022
kumarstack55 pushed a commit to kumarstack55/ctags that referenced this issue Sep 11, 2022
…ce tags

This one is conceptually based on
universal-ctags/ctags@master...my2817:ctags:master
reported and written by @my2817 at universal-ctags#3469.

The test case is also taken from universal-ctags#3469 submitted by @my2817.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants