main: use the interval tree for filling scope field #3678

Merged
merged 26 commits into from
Dec 11, 2023

masatake
Member

@masatake masatake commented Mar 26, 2023

CXX main: insert tags to the interval table when "end:" field is set

Close #3637.

With this change, the combination of {scope=_intervaltab}{postrun}
regex flags works.

$ cat /tmp/input.c
int
foo(void)
{
        switch (x)
        {
        case A:
                break;
        case B:
                break;
        }
        return 0;
}

int
bar(void)
{
        switch (x)
        {
        case C:
                break;
        case D:
                break;
        }
        return 0;
}

$ cat args.ctags
--sort=no
--kinddef-C=H,caseHandler,case handler
--regex-C=/^[ \t]*case[ \t]*+([a-zA-Z0-9_]+):.*/\1/H/{scope=_intervaltab}{postrun}

$ ../../../ctags --options=NONE --options=args.ctags -o - /tmp/input.c
foo /tmp/input.c    /^foo(void)$/;" f       typeref:typename:int
bar /tmp/input.c    /^bar(void)$/;" f       typeref:typename:int
A   /tmp/input.c    /^      case A:$/;"     H       function:foo
B   /tmp/input.c    /^      case B:$/;"     H       function:foo
C   /tmp/input.c    /^      case C:$/;"     H       function:bar
D   /tmp/input.c    /^      case D:$/;"     H       function:bar

The scope fields for A, B, C, and D are filled in correctly.
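
To make the result concrete, here is a minimal sketch (not the ctags implementation) of the lookup a scope-filling step has to perform: given tags that carry start and end lines, find the innermost one that encloses a given line. The struct and function names are hypothetical, and the line ranges in the note below are read off /tmp/input.c above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a tag that has an "end:" field. */
struct span_tag {
	const char *name;
	unsigned long start; /* line where the tag begins */
	unsigned long end;   /* line where the tag ends (inclusive) */
};

/* Return the innermost tag whose [start, end] range contains `line`,
 * or NULL when no tag encloses it. Among enclosing tags, the one with
 * the latest start line is treated as the innermost. */
static const struct span_tag *
enclosing_tag(const struct span_tag *tags, size_t n, unsigned long line)
{
	const struct span_tag *best = NULL;

	assert(tags != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (tags[i].start <= line && line <= tags[i].end
		    && (best == NULL || tags[i].start > best->start))
			best = &tags[i];
	}
	return best;
}
```

With foo spanning lines 2 to 12 and bar spanning lines 15 to 25, line 6 (case A) resolves to foo and line 19 (case C) resolves to bar. The linear scan is only for illustration; this pull request uses an interval tree for the lookup.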

TODO

  • write about scope=interval in the man pages
  • rename _interval to interval
  • handle the case when the starting line of a tag is changed.
  • mline-regex-flags
  • optscript interface
  • optimize, reducing vStringNCopyS
  • add a case for updating lineNumber
  • add more description for the commit main: delay reading all inputs when the parser just prepares its base parser

@codecov

codecov bot commented Mar 26, 2023

Codecov Report

Attention: 73 lines in your changes are missing coverage. Please review.

Comparison is base (4cd897e) 85.20% compared to head (45457aa) 85.25%.

Files                     Patch %   Lines
main/rbtree.c             68.63%    53 Missing ⚠️
main/lregex.c             85.55%    13 Missing ⚠️
main/entry.c              90.76%    6 Missing ⚠️
main/rbtree_augmented.h   93.75%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3678      +/-   ##
==========================================
+ Coverage   85.20%   85.25%   +0.05%     
==========================================
  Files         229      230       +1     
  Lines       55369    55502     +133     
==========================================
+ Hits        47175    47320     +145     
+ Misses       8194     8182      -12     


@masatake masatake changed the title from "CXX,lregex: use the interval tree for filling scope" to "CXX,lregex: use the interval tree for filling scope field" Mar 26, 2023
Review comment on man/ctags-optlib.7.rst.in (outdated, resolved)
@masatake masatake force-pushed the lregex--intervaltab branch 2 times, most recently from 5e3ede5 to ad68d6e Compare April 3, 2023 20:49
@masatake masatake changed the title from "CXX,lregex: use the interval tree for filling scope field" to "lregex: use the interval tree for filling scope field" Apr 3, 2023
@masatake masatake changed the title from "lregex: use the interval tree for filling scope field" to "main: use the interval tree for filling scope field" Apr 3, 2023
@masatake
Member Author

masatake commented Apr 3, 2023

I added insertToIntervalTabMaybe in one commit and removed it in another commit in this pull request.
I should not do such trial and error in a pull request.

I must reorganize this pull request.

@masatake masatake force-pushed the lregex--intervaltab branch 5 times, most recently from 41168a8 to 66b297d Compare April 5, 2023 23:05
@masatake masatake marked this pull request as ready for review April 5, 2023 23:06
@masatake masatake force-pushed the lregex--intervaltab branch 2 times, most recently from 1a6fa63 to 6d2cb7a Compare April 9, 2023 00:19
@masatake
Member Author

masatake commented Apr 9, 2023

@pidgeon777 Could you review man/ctags-optlib.7.rst.in ?

I added three items to our man page:

  • the postrun flag,
  • the scope=intervaltab flag, and
  • an example of "using the interval table"

I'm not good at English. So I wonder if what I wrote makes sense.

@masatake masatake force-pushed the lregex--intervaltab branch 2 times, most recently from 2ac2237 to 3ba4e09 Compare April 9, 2023 15:44
@pidgeon777

pidgeon777 commented Apr 16, 2023

> @pidgeon777 Could you review man/ctags-optlib.7.rst.in ?
>
> I added three items to our man page:
>
>   • the postrun flag,
>   • the scope=intervaltab flag, and
>   • an example of "using the interval table"
>
> I'm not good at English. So I wonder if what I wrote makes sense.

Later today I'll give it a look 👍.

@masatake
Member Author

masatake commented May 9, 2023

I found a bug in running nested subparsers. It blocks merging this pull request.

@masatake masatake force-pushed the lregex--intervaltab branch 2 times, most recently from 3c092e1 to 69063b0 Compare December 9, 2023 17:59
@masatake masatake added this to the 6.1 milestone Dec 9, 2023
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 572d6b1)
…hen setting

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit d64f762)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 6363e5e)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 9972125)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 62d1d47)
…fensive

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 627bf65)
…y when setting

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 3a1949f)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 179b20e)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 306a556)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 6c774ab)
…h/interval-tree (c920eeb)

rbtree.[ch] and rbtree_augmented.h are imported from the repo.

Reapply the following changes for adjusting the source to the ctags source tree.

* clean up whitespaces (c5f7501),
* adapt to ctags coding style (b85d98b),
  + provide more portable container_of
  + rename the top-level ifdef condition that guards against including the .h file twice,
  + use CTAGS_INLINE instead of inline for portability,
  + use CTAGA_ATTR_ALIGNED instead of __attribute__((aligned(sizeof(long)))),
* apply suggestions from code review (for windows) (e9626ab)
  + `unsigned long` is 32-bit on 64-bit Windows, `uintptr_t` should be used instead.
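
Two of the portability points above can be sketched in a few lines. This is an illustration under assumptions, not the imported code: `my_container_of`, `struct node`, and the helper names are hypothetical. It shows a container_of built only on `offsetof`, and a parent pointer packed together with a color bit in a `uintptr_t`, which stays pointer-sized on 64-bit Windows where `unsigned long` is 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable container_of: recover the enclosing struct from a pointer to
 * one of its members using only offsetof, with no compiler extensions. */
#define my_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* A tree node that packs its parent pointer and a 1-bit color into one
 * word. uintptr_t can hold a pointer value; unsigned long cannot on
 * 64-bit Windows, so the upper half of the address would be lost. */
struct node {
	uintptr_t parent_color;
	struct node *left, *right;
};

enum { COLOR_RED = 0, COLOR_BLACK = 1 };

static void node_set_parent_color(struct node *n, struct node *parent, int color)
{
	/* The struct holds pointers, so its address is at least 2-byte
	 * aligned and bit 0 is free to carry the color. */
	assert(((uintptr_t)parent & 1) == 0);
	n->parent_color = (uintptr_t)parent | (uintptr_t)(color & 1);
}

static struct node *node_parent(const struct node *n)
{
	return (struct node *)(n->parent_color & ~(uintptr_t)1);
}

static int node_color(const struct node *n)
{
	return (int)(n->parent_color & 1);
}

/* Example user type that embeds a node. */
struct item {
	int key;
	struct node link;
};
```

Given a `struct node *` obtained while walking the tree, `my_container_of(p, struct item, link)` yields the `struct item` that embeds it.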

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 1097d30)
…sheh/interval-tree (c920eeb)

Apply the following changes for adjusting the source to the ctags
source tree.

* clean up whitespaces (c5f7501), and
* use CTAGS_INLINE instead of inline for portability(b85d98b).

This commit only imports the code. It will be used to implement
interval tables in u-ctags in the future.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit b61865f)
We call the interval tree "intervaltab".

If the cork queue is enabled in a parser, a tag entry whose
"end:" field has a non-zero value is stored in the
intervaltab.

We can search for tags in the intervaltab with

   int queryIntervalTabByLine(unsigned long lineNum),
   int queryIntervalTabByRange(unsigned long start, unsigned long end), or
   int queryIntervalTabByCork(int corkIndex).
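
As a rough illustration of what a by-line query can look like, the sketch below implements a stabbing query on a small augmented binary search tree. It is not the intervaltab code: the tree here is unbalanced (the imported tree is a red-black tree), every name is hypothetical, and using 0 as the "no tag" return value is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: one tag's line range plus the augmented field. */
struct ival {
	unsigned long start, end; /* inclusive line range of the tag */
	unsigned long max_end;    /* largest `end` in this subtree */
	int cork_index;           /* payload; 0 means "no tag" in this sketch */
	struct ival *left, *right;
};

static void ival_init(struct ival *n, unsigned long start,
		      unsigned long end, int cork_index)
{
	assert(start <= end && cork_index != 0);
	n->start = start;
	n->end = end;
	n->max_end = end;
	n->cork_index = cork_index;
	n->left = n->right = NULL;
}

/* Plain BST insert keyed by `start`; keeps max_end up to date on the
 * path from the root to the new leaf. */
static struct ival *ival_insert(struct ival *root, struct ival *n)
{
	if (root == NULL)
		return n;
	if (n->start < root->start)
		root->left = ival_insert(root->left, n);
	else
		root->right = ival_insert(root->right, n);
	if (root->max_end < n->max_end)
		root->max_end = n->max_end;
	return root;
}

/* Find the interval containing `line` that has the largest start.
 * For properly nested ranges that is the innermost enclosing tag. */
static const struct ival *ival_stab(const struct ival *n, unsigned long line)
{
	if (n == NULL || n->max_end < line)
		return NULL; /* nothing in this subtree reaches `line` */
	if (n->start <= line) {
		/* Right subtree holds later starts: try it first. */
		const struct ival *r = ival_stab(n->right, line);
		if (r != NULL)
			return r;
		if (line <= n->end)
			return n;
	}
	return ival_stab(n->left, line);
}

static int query_by_line(const struct ival *root, unsigned long line)
{
	const struct ival *hit = ival_stab(root, line);
	return hit ? hit->cork_index : 0;
}
```

The `max_end` field is what makes this an interval tree: a whole subtree is skipped as soon as no interval in it can reach the queried line.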

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 4a2d580)
…ember is updated

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 30bc335)
Close universal-ctags#3637.

With this change, the combination of {scope=intervaltab}{postrun}
regex flags works. The flags allow you to use tags extracted by
built-in C code (including C code translated with optlib2c) as the
source of scope information.

    $ cat /tmp/input.c
    int
    foo(void)
    {
	    switch (x)
	    {
	    case A:
		    break;
	    case B:
		    break;
	    }
	    return 0;
    }

    int
    bar(void)
    {
	    switch (x)
	    {
	    case C:
		    break;
	    case D:
		    break;
	    }
	    return 0;
    }

    $ cat args.ctags
    --sort=no
    --kinddef-C=H,caseHandler,case handler
    --regex-C=/^[ \t]*case[ \t]*+([a-zA-Z0-9_]+):.*/\1/H/{scope=intervaltab}{postrun}

    $ ../../../ctags --options=NONE --options=args.ctags -o - /tmp/input.c
    foo	/tmp/input.c	/^foo(void)$/;"	f	typeref:typename:int
    bar	/tmp/input.c	/^bar(void)$/;"	f	typeref:typename:int
    A	/tmp/input.c	/^	case A:$/;"	H	function:foo
    B	/tmp/input.c	/^	case B:$/;"	H	function:foo
    C	/tmp/input.c	/^	case C:$/;"	H	function:bar
    D	/tmp/input.c	/^	case D:$/;"	H	function:bar

The scope fields for A, B, C, and D are filled in correctly.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 15ea95d)
… parser

This commit fixes a bug that appeared in the SUBPARSER_SUB_RUNS_BASE
base/subparser combination when --regex-<LANG> is used for
implementing the subparser.

In such a combination, the subparser may call
scheduleRunningBaseparser() in its "parser" method.
Calling scheduleRunningBaseparser is only for running
the base parser. The base parser may run the subparser.

The expected steps are <E>:
1. A subparser schedules running its base parser.
2. The base parser runs.
   The patterns of --lang-<SUBPARSER> options are applied during
   this step.

However, the original code did <W>:
1. A subparser schedules running its base parser.
   The patterns of --lang-<SUBPARSER> options are applied after
   scheduling.
2. The base parser runs.

This out-of-order execution caused no visible problem because the
extracted tags are the same in <E> and <W>. However, it becomes a
problem if applying the patterns of --lang-<SUBPARSER> options
requires information about tags already extracted by the base parser.

In <W>, no tag information is available when the patterns are
applied because the base parser has not run yet at step 1.
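
A toy model of the two orderings may help. It is a sketch under assumptions, not ctags code: `run_base_parser` and `apply_pattern` are hypothetical stand-ins, and the base parser is reduced to recording one tag that spans lines 2 to 12. Applying a pattern succeeds only if an enclosing base-parser tag is already recorded.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TAGS 8

/* Hypothetical store of line ranges recorded by the base parser. */
struct scope_store {
	unsigned long start[MAX_TAGS];
	unsigned long end[MAX_TAGS];
	size_t count;
};

/* Stand-in for the base parser: records one tag spanning lines 2..12. */
static void run_base_parser(struct scope_store *s)
{
	assert(s->count < MAX_TAGS);
	s->start[s->count] = 2;
	s->end[s->count] = 12;
	s->count++;
}

/* Stand-in for applying a subparser pattern at `line`: returns 1 when a
 * base-parser tag enclosing the line is already available, else 0. */
static int apply_pattern(const struct scope_store *s, unsigned long line)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->start[i] <= line && line <= s->end[i])
			return 1;
	return 0;
}

/* Order <E>: the base parser's tags exist when the pattern is applied. */
static int order_expected(unsigned long line)
{
	struct scope_store s = { {0}, {0}, 0 };
	run_base_parser(&s);
	return apply_pattern(&s, line);
}

/* Order <W>: the pattern is applied before the base parser has run. */
static int order_wrong(unsigned long line)
{
	struct scope_store s = { {0}, {0}, 0 };
	int found = apply_pattern(&s, line);
	run_base_parser(&s);
	return found;
}
```

For line 6, `order_expected` finds an enclosing tag and `order_wrong` does not, even though both orderings record the same base-parser tag in the end.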

Currently ctags provides no way for a subparser to access the
information of tags already extracted by its base parser. But we
plan to introduce such a feature soon.

TODO: a test case

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit efe6413)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 49eb201)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit d4c8105)
…ex patterns

The original code read all inputs only if the parser had mline-regex-<LANG> or
_mtable-regex-<LANG> patterns. It didn't consider regex-<LANG> patterns.

As a result, regex-<LANG> patterns defined in .ctags were never run
if a crafted built-in parser stopped parsing before reaching EOF.
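
The change in the condition can be summarized with a small sketch. The struct and function names are hypothetical and do not correspond to identifiers in the ctags source; the sketch only restates the before and after conditions described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the patterns attached to one parser. */
struct pattern_counts {
	unsigned regex;        /* --regex-<LANG> patterns */
	unsigned mline_regex;  /* --mline-regex-<LANG> patterns */
	unsigned mtable_regex; /* --_mtable-regex-<LANG> patterns */
};

/* Before: single-line regex patterns did not force reading to EOF. */
static bool must_read_all_before(const struct pattern_counts *c)
{
	assert(c != NULL);
	return c->mline_regex > 0 || c->mtable_regex > 0;
}

/* After: any kind of regex pattern forces the remaining input to be
 * read, so the patterns still run if the built-in parser stops early. */
static bool must_read_all_after(const struct pattern_counts *c)
{
	assert(c != NULL);
	return c->regex > 0 || c->mline_regex > 0 || c->mtable_regex > 0;
}
```

A parser with only regex-<LANG> patterns is the case that differs between the two functions.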

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit fac3024)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit 2256ddf)
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
(cherry picked from commit db23b45)
…} flags

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
Signed-off-by: Masatake YAMATO <yamato@redhat.com>
…fields

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
@masatake masatake merged commit f144bc8 into universal-ctags:master Dec 11, 2023
44 of 46 checks passed
Successfully merging this pull request may close these issues.

[Question] Defining new regex-based tags with scope defined by built-in ctags kinds