add nim language parser and fix makefile #401

Closed
wants to merge 1 commit into from


jangko
Contributor

@jangko jangko commented Jun 26, 2015

And now I have tested it with valgrind on Ubuntu 15.04.

@masatake
Member

Could you try the fuzz target, like this:

make fuzz LANGUAGES=Nim VG=1 SHRINK=1

The target tries breaking your parser.

valgrind --leak-check=full ./ctags --options=NONE ---kinds= --fields=* --libexec-dir=./libexec --data-dir=./data -G -o - --language-force=Nim Units/php-bug681824.d/input.php

With the above command line, valgrind reports some problems:

==17795== Invalid read of size 1
==17795== at 0x4A0AC13: strcmp (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17795== by 0x406BC0: lookupKeyword (keyword.c:146)
==17795== by 0x42895F: endOperator.isra.32 (nim.c:848)
==17795== by 0x428A17: getOperator (nim.c:875)
==17795== by 0x428DA9: parseToken (nim.c:1285)
==17795== by 0x428DA9: rawGetTok (nim.c:1530)
==17795== by 0x428DA9: getToken (nim.c:1536)
==17795== by 0x429568: getTok (nim.c:1595)
==17795== by 0x429568: debug_token (nim.c:1620)
==17795== by 0x430DF0: newCommentStmt (nim.c:3640)
==17795== by 0x430DF0: simpleStmt (nim.c:4129)
==17795== by 0x42CE33: complexOrSimpleStmt (nim.c:4228)
==17795== by 0x42D427: parseAll (nim.c:4318)
==17795== by 0x42D427: findNimTags (nim.c:4330)
==17795== by 0x40EFFF: createTagsForFile (parse.c:1687)
==17795== by 0x4105DE: createTagsWithFallback (parse.c:1731)
==17795== by 0x4105DE: parseFile (parse.c:1809)
==17795== by 0x40959D: createTagsForEntry (main.c:241)
==17795== Address 0x4cf3a20 is 0 bytes inside a block of size 32 free'd
==17795== at 0x4A08B5D: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17795== by 0x411E9A: eRealloc (routines.c:241)
==17795== by 0x413B7F: vStringResize (vstring.c:34)
==17795== by 0x413B7F: vStringAutoResize (vstring.c:52)
==17795== by 0x413B7F: vStringPut (vstring.c:98)
==17795== by 0x429188: scanComment (nim.c:779)
==17795== by 0x429188: parseToken (nim.c:1273)
==17795== by 0x429188: rawGetTok (nim.c:1530)
==17795== by 0x429188: getToken (nim.c:1536)
==17795== by 0x429568: getTok (nim.c:1595)
==17795== by 0x429568: debug_token (nim.c:1620)
==17795== by 0x42AF20: identOrLiteral (nim.c:2306)
==17795== by 0x42AF20: primary (nim.c:3044)
==17795== by 0x42C2A8: simpleExprAux (nim.c:2462)
==17795== by 0x42C364: simpleExpr (nim.c:2470)
==17795== by 0x42DB45: parseExpr (nim.c:2897)
==17795== by 0x42A400: primarySuffix (nim.c:2420)
==17795== by 0x42AC04: primary (nim.c:3049)
==17795== by 0x42A4C9: primary (nim.c:2935)
==17795==
==17795== Invalid read of size 1
==17795== at 0x4A0AC13: strcmp (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17795== by 0x406ADB: addKeyword (keyword.c:123)
==17795== by 0x4289AB: endOperator.isra.32 (nim.c:852)
==17795== by 0x428A17: getOperator (nim.c:875)
==17795== by 0x428DA9: parseToken (nim.c:1285)
==17795== by 0x428DA9: rawGetTok (nim.c:1530)
==17795== by 0x428DA9: getToken (nim.c:1536)
==17795== by 0x429568: getTok (nim.c:1595)
==17795== by 0x429568: debug_token (nim.c:1620)
==17795== by 0x430DF0: newCommentStmt (nim.c:3640)
==17795== by 0x430DF0: simpleStmt (nim.c:4129)
==17795== by 0x42CE33: complexOrSimpleStmt (nim.c:4228)
==17795== by 0x42D427: parseAll (nim.c:4318)
==17795== by 0x42D427: findNimTags (nim.c:4330)
==17795== by 0x40EFFF: createTagsForFile (parse.c:1687)
==17795== by 0x4105DE: createTagsWithFallback (parse.c:1731)
==17795== by 0x4105DE: parseFile (parse.c:1809)
==17795== by 0x40959D: createTagsForEntry (main.c:241)
==17795== Address 0x4cf3a20 is 0 bytes inside a block of size 32 free'd
==17795== at 0x4A08B5D: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17795== by 0x411E9A: eRealloc (routines.c:241)
==17795== by 0x413B7F: vStringResize (vstring.c:34)
==17795== by 0x413B7F: vStringAutoResize (vstring.c:52)
==17795== by 0x413B7F: vStringPut (vstring.c:98)
==17795== by 0x429188: scanComment (nim.c:779)
==17795== by 0x429188: parseToken (nim.c:1273)
==17795== by 0x429188: rawGetTok (nim.c:1530)
==17795== by 0x429188: getToken (nim.c:1536)
==17795== by 0x429568: getTok (nim.c:1595)
==17795== by 0x429568: debug_token (nim.c:1620)
==17795== by 0x42AF20: identOrLiteral (nim.c:2306)
==17795== by 0x42AF20: primary (nim.c:3044)
==17795== by 0x42C2A8: simpleExprAux (nim.c:2462)
==17795== by 0x42C364: simpleExpr (nim.c:2470)
==17795== by 0x42DB45: parseExpr (nim.c:2897)
==17795== by 0x42A400: primarySuffix (nim.c:2420)
==17795== by 0x42AC04: primary (nim.c:3049)
==17795== by 0x42A4C9: primary (nim.c:2935)
==17795==

@jangko
Contributor Author

jangko commented Jun 26, 2015

Aha! It turns out I had been misusing addKeyword without realizing that it never makes a copy of the string. The problem can be solved if I maintain a separate list of the keywords that are dynamically created during parsing.
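
A minimal sketch of the ownership problem described here, assuming a hypothetical miniature table (`OwnedTable`, `owned_add`, `owned_lookup` and `owned_free` are illustrative names, not the ctags keyword.c API). If a table keeps only the caller's pointer, a later realloc of the caller's buffer leaves the table pointing at freed memory, which matches what the valgrind report shows. Copying the string on insertion is one way to avoid that:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature keyword table; NOT the ctags keyword.c API. */
enum { TABLE_MAX = 16 };

typedef struct {
    char  *names[TABLE_MAX]; /* heap copies owned by the table */
    int    ids[TABLE_MAX];
    size_t count;
} OwnedTable;

/* Store a private copy of `name`, so the caller may reuse its buffer. */
int owned_add(OwnedTable *t, const char *name, int id)
{
    size_t n = strlen(name) + 1;
    char *copy;

    if (t->count >= TABLE_MAX)
        return -1;
    copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, n);
    t->names[t->count] = copy;
    t->ids[t->count] = id;
    t->count++;
    return 0;
}

/* Return the id registered for `name`, or -1 if it is unknown. */
int owned_lookup(const OwnedTable *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return t->ids[i];
    return -1;
}

/* Release every copy the table owns. */
void owned_free(OwnedTable *t)
{
    for (size_t i = 0; i < t->count; i++)
        free(t->names[i]);
    t->count = 0;
}
```

With this shape, a caller can overwrite or resize its token buffer right after `owned_add` without affecting later lookups, and `owned_free` releases the copies once parsing is done.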

Thanks to @masatake for pointing out the problem. I will fix it soon and test extensively with the fuzz target.

Kind of like playing a tower defense video game: the bugs keep flooding in as we add more defenses to the code.

exciting though

@masatake
Member

The critical difference from video games is that you can enjoy the result of clearing the stage with all nim programmers who need code completion, code browsing and code navigation.

Happy hacking.

@jangko jangko force-pushed the master branch 2 times, most recently from c3d5e0a to d6355ac Compare June 29, 2015 11:40
@jangko
Contributor Author

jangko commented Jun 29, 2015

I already fixed many things related to previous issues:

  • make units LANGUAGES=Nim VG=1
  • make fuzz LANGUAGES=Nim VG=1 SHRINK=1
  • valgrind --leak-check=full ./ctags --options=NONE ---kinds= --fields=* --libexec-dir=./libexec --data-dir=./data -G -o - --language-force=Nim Units/php-bug681824.d/input.php
  • add more test units
  • also test with injected noise for few characters: 'a', '0', '!', '@', '#', '$', '%', '^'(so slow)

are there more ways to break the parser?

@masatake
Member

sparse says(make SPARSE=1):

In parseReturnOrRaise, kind is defined twice. The parameter kind is not used.

In parseReturnOrRaise, a nimKind value is assigned to a TNodeKind-typed variable. Really dangerous.
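
To illustrate why mixing two enumeration types is risky in C, here is a small sketch with made-up types (`TagKind`, `NodeKind` and their enumerators are assumptions for illustration, not the definitions in nim.c). C converts between enum types silently, so a value of one type assigned to a variable of the other simply keeps its number and changes its meaning. An explicit mapping function is one way to keep the two apart:

```c
/* Two unrelated enumerations whose first values both equal 0.
   All names here are hypothetical, not copied from nim.c. */
typedef enum { K_PROC, K_TYPE, K_VAR, K_NONE } TagKind;
typedef enum {
    nkReturnStmt, nkRaiseStmt, nkProcDef, nkTypeSection, nkVarSection
} NodeKind;

/* Translate a node kind to a tag kind on purpose, case by case,
   instead of reinterpreting the numeric value. */
TagKind tag_kind_for_node(NodeKind node)
{
    switch (node) {
    case nkProcDef:     return K_PROC;
    case nkTypeSection: return K_TYPE;
    case nkVarSection:  return K_VAR;
    case nkReturnStmt:
    case nkRaiseStmt:   return K_NONE;
    }
    return K_NONE;
}
```

Without such a mapping, `TagKind k = nkReturnStmt;` compiles and yields `K_PROC`, because both enumerators are 0. Tools such as sparse can flag this kind of mix-up.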

@masatake
Member

Generally I don't care about the details of style.
However, it must be consistent within a file.

I'm talking about the position of braces.

In some lines you write:

if (...) {

In other lines you write:

if (...)
{
   ...

In yet other lines you write:

if (...)
  {
   ...

Sometimes you put a space after if/while before '('. Sometimes you don't.

(The change is too large to review at once. I will do it incrementally.)

@masatake
Member

The initial if and while at the top of fillLexer can be unified with do/while.

@masatake
Member

Is the column field of nimParser used?

@jangko
Contributor Author

jangko commented Jun 30, 2015

The initial if and while at the top of fillLexer can be unified with do/while. 

done

column field of nimParser is used?

Not used; I have already removed it. You really have a hawk's eye. Are you an ex-sniper or what?

sparse says(make SPARSE=1):
In parseReturnOrRaise, kind is defined twice. The parameter kind is not used.
In parseReturnOrRaise nimKind value is assigned to TNodeKind typed variable. Really dangerous.

I agree with you, that was very dangerous. That's why I never regret turning to Nim and using C only occasionally; C can be really dangerous if we are not careful with it. Fixed in two more places.

About if/while and curly braces: it looks like I left C behind for too long. Fixed most of them, if not all.

@masatake
Member

masatake commented Jul 1, 2015

I'm sorry, but it will take a long time to merge.

I wonder why this parser is long.
I am thinking two things:

  1. There is some dead code. The column field you fixed is one example.
  2. There is some generic code that should ideally be part of the ctags main part.
    python.c had a nestlevel feature. It is now separated. The feature is put in nestlevel.c.
    As a result python.c becomes small.

I'm thinking about writing a tool to find dead-code based on ctags's c-parser.

About 2. do you have any idea? You know well about nim.c:-P.

@jangko
Contributor Author

jangko commented Jul 1, 2015

I'm sorry but it will take long time for merging.

I'm not in a hurry. Although we are not making a nuclear plant control system, we are still going to make better software. No need to hurry.

I wonder why this parser is long.

I know why: because Nim's type system is more sophisticated than that of any other language available in ctags' parsers directory. 😁 Don't take it seriously....

  1. There are some dead-codes. column field you fixed is one of examples.

I know about that; they are artifacts from the official/original Nim parser. I left many 'do nothing' if/else blocks. I'll try to remove them (including irrelevant stuff) only if they have no side effects on the rest of the parser.

I'm thinking about writing a tool to find dead-code based on ctags's c-parser.

that would be awesome. 👍

There are some generic code that should be part of ctags/main part ideally. python.c had a
nestlevel feature. It is now separated. The feature is put in nestlevel.c. As the result python.c
becomes small.

I'm going to investigate it; I also want to reduce the parser size.
Although both Python and Nim use indentation, in Nim indentation is part of the grammar, so perhaps I need to rewrite much of the parser if nestlevel.c is going to be used.
I will try to remove both dead code and irrelevant code as much as possible, while thinking about how to restructure the parser safely.

@masatake
Member

Finally I got time to write the tool.

The tool reports many unused enumerators and one typedef.

check "enumerators (values inside an enumeration)"
==============================
    nkAddr: 1 - NOT USED
    nkArgList: 1 - NOT USED
    nkBlockExpr: 1 - NOT USED
    nkBlockType: 1 - NOT USED
    nkBreakState: 1 - NOT USED
    nkCStringToString: 1 - NOT USED
    nkCharLit: 1 - NOT USED
    nkChckRange: 1 - NOT USED
    nkChckRange64: 1 - NOT USED
    nkChckRangeF: 1 - NOT USED
    nkCheckedFieldExpr: 1 - NOT USED
    nkClosedSymChoice: 1 - NOT USED
    nkClosure: 1 - NOT USED
    nkConstTy: 1 - NOT USED
    nkConv: 1 - NOT USED
    nkDerefExpr: 1 - NOT USED
    nkDotCall: 1 - NOT USED
    nkElifExpr: 1 - NOT USED
    nkElseExpr: 1 - NOT USED
    nkEnumFieldDef: 1 - NOT USED
    nkExportExceptStmt: 1 - NOT USED
    nkFastAsgn: 1 - NOT USED
    nkFloat128Lit: 1 - NOT USED
    nkFloat32Lit: 1 - NOT USED
    nkFloat64Lit: 1 - NOT USED
    nkFloatLit: 1 - NOT USED
    nkGotoState: 1 - NOT USED
    nkHiddenAddr: 1 - NOT USED
    nkHiddenCallConv: 1 - NOT USED
    nkHiddenDeref: 1 - NOT USED
    nkHiddenStdConv: 1 - NOT USED
    nkHiddenSubConv: 1 - NOT USED
    nkImportAs: 1 - NOT USED
    nkImportExceptStmt: 1 - NOT USED
    nkInt16Lit: 1 - NOT USED
    nkInt32Lit: 1 - NOT USED
    nkInt64Lit: 1 - NOT USED
    nkInt8Lit: 1 - NOT USED
    nkIteratorTy: 1 - NOT USED
    nkMetaNode_Obsolete: 1 - NOT USED
    nkMutableTy: 1 - NOT USED
    nkNone: 1 - NOT USED
    nkObjConstr: 1 - NOT USED
    nkObjDownConv: 1 - NOT USED
    nkObjUpConv: 1 - NOT USED
    nkOfInherit: 1 - NOT USED
    nkOpenSymChoice: 1 - NOT USED
    nkParForStmt: 1 - NOT USED
    nkPattern: 1 - NOT USED
    nkRange: 1 - NOT USED
    nkReturnToken: 1 - NOT USED
    nkSharedTy: 1 - NOT USED
    nkState: 1 - NOT USED
    nkStaticTy: 1 - NOT USED
    nkStmtListType: 1 - NOT USED
    nkStrLit: 1 - NOT USED
    nkStringToCString: 1 - NOT USED
    nkSym: 1 - NOT USED
    nkTableConstr: 1 - NOT USED
    nkType: 1 - NOT USED
    nkUInt16Lit: 1 - NOT USED
    nkUInt32Lit: 1 - NOT USED
    nkUInt64Lit: 1 - NOT USED
    nkUInt8Lit: 1 - NOT USED
    nkUIntLit: 1 - NOT USED
    tkAtomic: 1 - NOT USED
    tkColonColon: 1 - NOT USED
    tkEnd: 1 - NOT USED
    tkFunc: 1 - NOT USED
    tkInfixOpr: 1 - NOT USED
    tkInterface: 1 - NOT USED
    tkOut: 1 - NOT USED
    tkPostfixOpr: 1 - NOT USED
    tkPrefixOpr: 1 - NOT USED
    tkSpaces: 1 - NOT USED
    wAcyclic: 1 - NOT USED
    wAddr: 1 - NOT USED
    wAlign: 1 - NOT USED
    wAlignas: 1 - NOT USED
    wAlignof: 1 - NOT USED
    wAnd: 1 - NOT USED
    wAs: 1 - NOT USED
    wAsm: 1 - NOT USED
    wAsmNoStackFrame: 1 - NOT USED
    wAssertions: 1 - NOT USED
    wAtomic: 1 - NOT USED
    wAuto: 1 - NOT USED
    wBind: 1 - NOT USED
    wBlock: 1 - NOT USED
    wBool: 1 - NOT USED
    wBorrow: 1 - NOT USED
    wBoundchecks: 1 - NOT USED
    wBreak: 1 - NOT USED
    wBreakpoint: 1 - NOT USED
    wByCopy: 1 - NOT USED
    wByRef: 1 - NOT USED
    wCallconv: 1 - NOT USED
    wCase: 1 - NOT USED
    wCast: 1 - NOT USED
    wCatch: 1 - NOT USED
    wCdecl: 1 - NOT USED
    wChar: 1 - NOT USED
    wChar16_t: 1 - NOT USED
    wChar32_t: 1 - NOT USED
    wChecks: 1 - NOT USED
    wClass: 1 - NOT USED
    wClosure: 1 - NOT USED
    wCodegenDecl: 1 - NOT USED
    wColonColon: 1 - NOT USED
    wCompile: 1 - NOT USED
    wCompileTime: 1 - NOT USED
    wCompilerproc: 1 - NOT USED
    wComputedGoto: 1 - NOT USED
    wConcept: 1 - NOT USED
    wConst: 1 - NOT USED
    wConst_cast: 1 - NOT USED
    wConstexpr: 1 - NOT USED
    wConstructor: 1 - NOT USED
    wContinue: 1 - NOT USED
    wConverter: 1 - NOT USED
    wDeadCodeElim: 1 - NOT USED
    wDebugger: 1 - NOT USED
    wDecltype: 1 - NOT USED
    wDefault: 1 - NOT USED
    wDefer: 1 - NOT USED
    wDefine: 1 - NOT USED
    wDelegator: 1 - NOT USED
    wDelete: 1 - NOT USED
    wDeprecated: 1 - NOT USED
    wDestroy: 1 - NOT USED
    wDestructor: 1 - NOT USED
    wDirty: 1 - NOT USED
    wDiscard: 1 - NOT USED
    wDiscardable: 1 - NOT USED
    wDistinct: 1 - NOT USED
    wDiv: 1 - NOT USED
    wDo: 1 - NOT USED
    wDot: 1 - NOT USED
    wDouble: 1 - NOT USED
    wDynamic_cast: 1 - NOT USED
    wDynlib: 1 - NOT USED
    wEffects: 1 - NOT USED
    wElif: 1 - NOT USED
    wElse: 1 - NOT USED
    wEmit: 1 - NOT USED
    wEnd: 1 - NOT USED
    wEnum: 1 - NOT USED
    wEquals: 1 - NOT USED
    wError: 1 - NOT USED
    wExcept: 1 - NOT USED
    wExperimental: 1 - NOT USED
    wExplicit: 1 - NOT USED
    wExport: 1 - NOT USED
    wExportc: 1 - NOT USED
    wExtern: 1 - NOT USED
    wFalse: 1 - NOT USED
    wFastcall: 1 - NOT USED
    wFatal: 1 - NOT USED
    wFieldChecks: 1 - NOT USED
    wFinal: 1 - NOT USED
    wFinally: 1 - NOT USED
    wFloat: 1 - NOT USED
    wFloatchecks: 1 - NOT USED
    wFor: 1 - NOT USED
    wFriend: 1 - NOT USED
    wFrom: 1 - NOT USED
    wFunc: 1 - NOT USED
    wGcSafe: 1 - NOT USED
    wGeneric: 1 - NOT USED
    wGensym: 1 - NOT USED
    wGlobal: 1 - NOT USED
    wGoto: 1 - NOT USED
    wGuard: 1 - NOT USED
    wHeader: 1 - NOT USED
    wHint: 1 - NOT USED
    wHints: 1 - NOT USED
    wIf: 1 - NOT USED
    wImmediate: 1 - NOT USED
    wImplicitStatic: 1 - NOT USED
    wImport: 1 - NOT USED
    wImportCompilerProc: 1 - NOT USED
    wImportCpp: 1 - NOT USED
    wImportObjC: 1 - NOT USED
    wImportc: 1 - NOT USED
    wIn: 1 - NOT USED
    wInOut: 1 - NOT USED
    wInclude: 1 - NOT USED
    wIncompleteStruct: 1 - NOT USED
    wInfChecks: 1 - NOT USED
    wInheritable: 1 - NOT USED
    wInject: 1 - NOT USED
    wInjectStmt: 1 - NOT USED
    wInline: 1 - NOT USED
    wInt: 1 - NOT USED
    wInterface: 1 - NOT USED
    wIs: 1 - NOT USED
    wIsnot: 1 - NOT USED
    wIterator: 1 - NOT USED
    wLet: 1 - NOT USED
    wLib: 1 - NOT USED
    wLine: 1 - NOT USED
    wLinearScanEnd: 1 - NOT USED
    wLinedir: 1 - NOT USED
    wLinetrace: 1 - NOT USED
    wLink: 1 - NOT USED
    wLinksys: 1 - NOT USED
    wLocks: 1 - NOT USED
    wLong: 1 - NOT USED
    wMacro: 1 - NOT USED
    wMagic: 1 - NOT USED
    wMerge: 1 - NOT USED
    wMethod: 1 - NOT USED
    wMinus: 1 - NOT USED
    wMixin: 1 - NOT USED
    wMod: 1 - NOT USED
    wMutable: 1 - NOT USED
    wNamespace: 1 - NOT USED
    wNanChecks: 1 - NOT USED
    wNew: 1 - NOT USED
    wNil: 1 - NOT USED
    wNilchecks: 1 - NOT USED
    wNimcall: 1 - NOT USED
    wNoForward: 1 - NOT USED
    wNoInit: 1 - NOT USED
    wNoInline: 1 - NOT USED
    wNoconv: 1 - NOT USED
    wNodecl: 1 - NOT USED
    wNoexcept: 1 - NOT USED
    wNoreturn: 1 - NOT USED
    wNosideeffect: 1 - NOT USED
    wNot: 1 - NOT USED
    wNotin: 1 - NOT USED
    wNullptr: 1 - NOT USED
    wObjChecks: 1 - NOT USED
    wObject: 1 - NOT USED
    wOf: 1 - NOT USED
    wOff: 1 - NOT USED
    wOn: 1 - NOT USED
    wOneWay: 1 - NOT USED
    wOperator: 1 - NOT USED
    wOptimization: 1 - NOT USED
    wOr: 1 - NOT USED
    wOut: 1 - NOT USED
    wOverflowchecks: 1 - NOT USED
    wOverride: 1 - NOT USED
    wPacked: 1 - NOT USED
    wPassc: 1 - NOT USED
    wPassl: 1 - NOT USED
    wPatterns: 1 - NOT USED
    wPop: 1 - NOT USED
    wPragma: 1 - NOT USED
    wPrivate: 1 - NOT USED
    wProc: 1 - NOT USED
    wProcVar: 1 - NOT USED
    wProfiler: 1 - NOT USED
    wProtected: 1 - NOT USED
    wPtr: 1 - NOT USED
    wPublic: 1 - NOT USED
    wPure: 1 - NOT USED
    wPush: 1 - NOT USED
    wRaise: 1 - NOT USED
    wRaises: 1 - NOT USED
    wRangechecks: 1 - NOT USED
    wReads: 1 - NOT USED
    wRef: 1 - NOT USED
    wRegister: 1 - NOT USED
    wReinterpret_cast: 1 - NOT USED
    wRequiresInit: 1 - NOT USED
    wReturn: 1 - NOT USED
    wSafecall: 1 - NOT USED
    wSafecode: 1 - NOT USED
    wShallow: 1 - NOT USED
    wShl: 1 - NOT USED
    wShort: 1 - NOT USED
    wShr: 1 - NOT USED
    wSideeffect: 1 - NOT USED
    wSigned: 1 - NOT USED
    wSize: 1 - NOT USED
    wSizeof: 1 - NOT USED
    wStacktrace: 1 - NOT USED
    wStar: 1 - NOT USED
    wStatic: 1 - NOT USED
    wStatic_assert: 1 - NOT USED
    wStatic_cast: 1 - NOT USED
    wStdErr: 1 - NOT USED
    wStdIn: 1 - NOT USED
    wStdOut: 1 - NOT USED
    wStdcall: 1 - NOT USED
    wStruct: 1 - NOT USED
    wSubsChar: 1 - NOT USED
    wSwitch: 1 - NOT USED
    wSyscall: 1 - NOT USED
    wTags: 1 - NOT USED
    wTemplate: 1 - NOT USED
    wThis: 1 - NOT USED
    wThread: 1 - NOT USED
    wThreadVar: 1 - NOT USED
    wThread_local: 1 - NOT USED
    wThrow: 1 - NOT USED
    wTrue: 1 - NOT USED
    wTry: 1 - NOT USED
    wTuple: 1 - NOT USED
    wType: 1 - NOT USED
    wTypedef: 1 - NOT USED
    wTypeid: 1 - NOT USED
    wTypename: 1 - NOT USED
    wUnchecked: 1 - NOT USED
    wUndef: 1 - NOT USED
    wUnion: 1 - NOT USED
    wUnroll: 1 - NOT USED
    wUnsigned: 1 - NOT USED
    wUsing: 1 - NOT USED
    wVar: 1 - NOT USED
    wVarargs: 1 - NOT USED
    wVirtual: 1 - NOT USED
    wVoid: 1 - NOT USED
    wVolatile: 1 - NOT USED
    wWarning: 1 - NOT USED
    wWarnings: 1 - NOT USED
    wWatchPoint: 1 - NOT USED
    wWchar_t: 1 - NOT USED
    wWhen: 1 - NOT USED
    wWhile: 1 - NOT USED
    wWith: 1 - NOT USED
    wWithout: 1 - NOT USED
    wWrite: 1 - NOT USED
    wWrites: 1 - NOT USED
    wXor: 1 - NOT USED
    wYield: 1 - NOT USED
check "class, struct, and union members"
==============================
check "typedefs"
==============================
    TSpecialWords: 1 - NOT USED
check "variable definitions"
==============================
check "union names"
==============================
check "structure names"
==============================
check "function definitions"
==============================
    NimParser: 1 - THIS IS THE ENTRY POINT
check "macro definitions"
==============================

@masatake
Member

You can use the indent command at the top of the ctags srcdir, like:

$ indent parsers/nim.c -o parsers/nim.c.out       

@jangko
Contributor Author

jangko commented Aug 10, 2015

wow, that is awesome! i'll fix it as soon as i have more time.

@vhda
Contributor

vhda commented Aug 10, 2015

I'm curious about how you created such a report.
Did you use a lint tool, like "splint", for example?

@masatake
Member

@vhda, I used a tool named ctags.

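#!/bin/bash

SRC=parsers/nim.c
ENTRY_POINT=NimParser

# Emit tags for the given file, using line numbers as the ex command.
run_ctags()
{
    ./ctags --quiet --options=NONE -o - --excmd=number "$1"
}

# Read tags from stdin. For each tag of kind $1, count the lines of $SRC
# that mention its name as a whole word. A count of 1 means the name
# appears only on its own definition line.
count()
{
    while IFS=$'\t' read -r name file line kind scope file_scope; do
        if [[ "${kind}" = "$1" ]]; then
            c=$(grep -c "\<$name\>" "$SRC")
            if [[ $c = 1 ]]; then
                echo -n "   $name": "$c"
                echo ' - NOT USED'
            elif [[ $c = 0 ]]; then
                echo -n "   $name": "$c"
                echo ' - SCRIPT BUG'
            else
                # Used at least once besides the definition; print nothing.
                : echo ' - ok used'
            fi
        fi
    done
}

# Print a header for kind letter $1, then run the check for that kind.
check()
{
    ./ctags --list-kinds=C | grep "^$1" | {
        IFS=' ' read -r letter desc
        echo check \""${desc}"\"
        echo ==============================
        run_ctags "$SRC" | count "$1"
    }
}

for c in e m t v u s f d; do
    check "$c"
done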
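
For readers who want the counting step outside the shell, here is a rough C sketch of what the `grep "\<$name\>"` plus line count does: it counts the lines of a source text that contain an identifier as a whole word. The function names (`count_lines_with_word`, `line_has_word`) are made up for this illustration, and the sketch assumes that "word" characters are letters, digits and underscore.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Word characters: letters, digits and underscore (an assumption that
   mirrors the usual meaning of \< and \> in grep). */
static int is_ident_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return 1 if `word` occurs in line[0..len) as a whole word. */
static int line_has_word(const char *line, size_t len, const char *word)
{
    size_t wlen = strlen(word);

    if (wlen == 0 || wlen > len)
        return 0;
    for (size_t i = 0; i + wlen <= len; i++) {
        if (memcmp(line + i, word, wlen) != 0)
            continue;
        int left_ok = (i == 0) || !is_ident_char(line[i - 1]);
        int right_ok = (i + wlen == len) || !is_ident_char(line[i + wlen]);
        if (left_ok && right_ok)
            return 1;
    }
    return 0;
}

/* Count the lines of `text` that mention `word` as a whole word. */
size_t count_lines_with_word(const char *text, const char *word)
{
    size_t count = 0;

    while (*text != '\0') {
        const char *eol = strchr(text, '\n');
        size_t len = eol ? (size_t)(eol - text) : strlen(text);

        if (line_has_word(text, len, word))
            count++;
        text += len;
        if (*text == '\n')
            text++;
    }
    return count;
}
```

As in the script, a result of 1 for a defined name suggests that only the definition line mentions it. Note that this counts lines rather than occurrences, so a name defined and used on the same line would still look unused.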

@masatake
Member

I got time to think about this change.

@jangko, as you wrote, your parser comes from the Nim interpreter (compiler?) implementation itself.

So you may want to keep the difference between your parser and the original code small.
In that case you may not want to remove unused code from your parser. Removing it makes
the difference large.

On the other hand, ctags people here want a parser to be small and to follow the conventions (e.g. coding style) of ctags. Modifications that make ctags people happy make the difference large.

This is the fundamental issue in merging your code. I would like us to understand this issue well.
If anything is unclear, tell me.

So the question is: which side are you standing on? The Nim implementation side or the ctags side?

If you stand on the Nim implementation side, we will accept your code with minimal modification. However, you have to be the primary maintainer of your parser.

If you stand on the ctags side, we have to rearrange your code massively. For that purpose we need much more sophisticated whitebox test cases. However, ctags people will maintain the parser with your help.

We have to agree with you about this basic policy.

@jangko
Contributor Author

jangko commented Aug 14, 2015

nim interpreter(compiler?)

It is a true compiler with a built-in VM that executes code at compile time, which provides excellent metaprogramming.

So the question is where are you standing on? Nim implementation side or ctags side?

I understand the issue well; the way I built the parser reflects which side I'm standing on.
I choose to be on the Nim implementation side, because it will be easier to keep the parser up to date with any changes on the language side (and there will be changes).

@masatake
Member

@jangko, I see. O.K. In that case you don't have to remove unused code, as long as keeping the code helps you maintain the parser. Please restore the code you removed during the discussion here.

I will rethink how to merge your parser. It will take some time.

@jangko
Contributor Author

jangko commented Aug 14, 2015

@masatake: I also need to fix two more bugs in the parser that I found recently. Don't bother merging it soon.

@masatake
Member

I have to write one thing.

The purpose of the parser in the compiler and that of ctags are different.
The compiler one should reject wrong input.
The ctags one should try to accept it (if possible), e.g. by just ignoring a syntactically wrong line.
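
A small sketch of the "accept if possible" attitude, under strong simplifying assumptions: it only looks for top-level lines of the form `proc <name>` and silently skips anything it cannot read. `collect_proc_names`, `parse_proc_line` and `NAME_MAX_LEN` are hypothetical names for this illustration; this is not how nim.c works.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 32

/* Try to read "proc <identifier>" at the very start of a line.
   Returns 1 and fills `name` on success; returns 0 for any other line,
   including malformed ones. No error is ever reported. */
static int parse_proc_line(const char *line, size_t len, char *name)
{
    static const char kw[] = "proc ";
    size_t kwlen = sizeof kw - 1;
    size_t i, start, n;

    if (len <= kwlen || memcmp(line, kw, kwlen) != 0)
        return 0;
    i = kwlen;
    while (i < len && line[i] == ' ')
        i++;
    if (i >= len || !isalpha((unsigned char)line[i]))
        return 0;
    start = i;
    while (i < len && (isalnum((unsigned char)line[i]) || line[i] == '_'))
        i++;
    n = i - start;
    if (n >= NAME_MAX_LEN)
        return 0;
    memcpy(name, line + start, n);
    name[n] = '\0';
    return 1;
}

/* Collect up to `max` proc names from `text`, skipping unreadable lines. */
size_t collect_proc_names(const char *text, char names[][NAME_MAX_LEN],
                          size_t max)
{
    size_t found = 0;

    while (*text != '\0' && found < max) {
        const char *eol = strchr(text, '\n');
        size_t len = eol ? (size_t)(eol - text) : strlen(text);

        if (parse_proc_line(text, len, names[found]))
            found++;
        /* Otherwise: not a definition, or a broken one. Move on. */
        text += len;
        if (*text == '\n')
            text++;
    }
    return found;
}
```

A compiler front end would stop at a line such as `proc (((`; a tagger in this style just produces no tag for it and continues with the next line.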

@jangko
Contributor Author

jangko commented Aug 14, 2015

The purpose of the parser in the compiler and that of ctags are different.

Yes, they are different, and nim.c has been modified to do exactly that since its inception.

@masatake
Member

Good!

@@ -16,6 +16,8 @@ OPT = /O2

ctags: ctags.exe

dctags: dctags.exe

ctags.exe: respmvc
Member


Do you need this change in nim parser?
(I don't think so.)

Contributor Author


Not only for the Nim parser; anyone who needs to build in debug mode will need that.

@masatake
Member

masatake commented Oct 2, 2015

I'm sorry for working so slowly.

It seems that your parser adds keywords to the ctags keyword table by calling addKeyword during parsing. As far as I know, an added keyword cannot be removed. So if you run ctags on two files like:

$ ctags a.nim b.nim

keywords added while parsing a.nim affect the result of parsing b.nim.

Is my guess correct?

@masatake
Member

masatake commented Oct 2, 2015

BTW, I recommend that you not use your master branch for hacking on the Nim parser.

@jangko
Contributor Author

jangko commented Oct 2, 2015

It seems that your parser add keyword to the ctags keyword table with calling addKeyword during parsing. As far as I know the added keyword cannot be removed......affect on the result of parsing b.nim

I was not aware of this. Thank you.
The Nim parser really needs to add keywords during parsing. Is there any possibility that ctags will provide a local keyword table for a parser, and not only the global one?

And a signal to the parser when the input source changes, so the parser knows it is time to free the local keyword table?

@masatake
Member

masatake commented Oct 2, 2015

How about htable.h and htable.c?
(However, they are not tested well.)

A function for hashing is needed....
See line 1151 of http://sourceforge.net/p/droite/code/HEAD/tree/es-lang-c-stdc99/src/es-lang-c-stdc99.c

ctags calls findNimTags for each input file.
So you can allocate and destroy a hashtable for an input file at the head and end of findNimTags.
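
The per-file lifetime suggested here can be sketched as follows. This is only an illustration: it uses a plain linked list instead of a hash table to stay short, and every name in it (`LocalKeywords`, `local_add`, `local_has`, `local_destroy`, `parse_one_file`) is hypothetical rather than part of ctags or htable.h. The point is that the table is created at the start of the per-file entry point and destroyed at its end, so nothing leaks from one input file into the next.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Entry {
    char *name;            /* heap copy owned by the table */
    struct Entry *next;
} Entry;

typedef struct {
    Entry *head;
} LocalKeywords;

static int local_has(const LocalKeywords *t, const char *name)
{
    for (const Entry *e = t->head; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return 1;
    return 0;
}

static int local_add(LocalKeywords *t, const char *name)
{
    size_t n = strlen(name) + 1;
    Entry *e = malloc(sizeof *e);

    if (e == NULL)
        return -1;
    e->name = malloc(n);
    if (e->name == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->name, name, n);
    e->next = t->head;
    t->head = e;
    return 0;
}

static void local_destroy(LocalKeywords *t)
{
    Entry *e = t->head;

    while (e != NULL) {
        Entry *next = e->next;
        free(e->name);
        free(e);
        e = next;
    }
    t->head = NULL;
}

/* Stand-in for a per-file entry point such as findNimTags. Returns how
   many of the given operators were new to this file's table. */
size_t parse_one_file(const char *const *operators, size_t n)
{
    LocalKeywords table = { NULL };   /* created at the head */
    size_t fresh = 0;

    for (size_t i = 0; i < n; i++)
        if (!local_has(&table, operators[i]) &&
            local_add(&table, operators[i]) == 0)
            fresh++;

    local_destroy(&table);            /* destroyed at the end */
    return fresh;
}
```

Checking `local_has` before `local_add` also avoids registering the same name twice within a file. With a process-wide table, an operator first seen in a.nim would already be present when b.nim is parsed; here each call starts from an empty table.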

Before implementing, I recommend that you write a test case for which the current implementation generates an unwanted tags file. The "units" test assumes only one input file is given. To test with two or more input files, you must use "tmain".

@masatake masatake added Parsers and removed Parsers labels Oct 2, 2015
@data-man

@jangko

Are you planning to finish this PR and adapt it to the current versions of Nim & ctags?
If not, I will continue with your permission. :)

@jangko
Contributor Author

jangko commented Apr 1, 2018

If not, I will continue with your permission. :)

permission granted. thank you very much

@data-man

Sorry, I missed @jangko's message. :(

@masatake: This PR can be closed. I'll create a new one.

@jangko
Contributor Author

jangko commented May 20, 2018

@data-man: 👍

@qingkong1998

Objective-C is supported well, but Swift is not supported. I have been asking about this for three months or longer. I hope ctags can improve its support for Swift and Objective-C; it would make things much more convenient. Thanks for all the effort.

@qingkong1998

Another question: can ctags list an outline for a supported language in which every class, function, variable
and so on is sorted under the scope it belongs to? For example:
class A:
member b
member c
........

class D:
member e
member f
........

and so on like above, not just mixed together in one list?

@0xACE

0xACE commented Feb 6, 2019

@jangko

Are you planning to finish this PR and adapt it to the current versions of Nim & ctags?
If not, I will continue with your permission. :)

@data-man do you still have plans to submit a patch for ctags to implement nim support?

I noticed there is a nim ctags in Nim 0.19.4, although it ends up with basically an empty tags file...

@krux02

krux02 commented Oct 22, 2019

I hope this PR gets reactivated at some point, because it would be useful for me.
I tried it in the Nim source directory. This is my result:

$ ctags -e $(find . -iname '*.nim')
ctags: main/keyword.c:125: addKeyword: Assertion `("Already in table" == ((void *)0))' failed.
SIGABRT

I recommend drastically simplifying the parser. I don't know exactly what it does, but it does too much in my opinion. It should only care about the most important parts:

  • top level when statements (parse both branches, skip condition)
  • top level function/method/macro/template definitions

Everything else can be skipped.

@0xACE

0xACE commented Oct 22, 2019

@krux02 I had trouble with this PR, so I created 0xACE/ntags until things get better... The patch I applied is old, but at least it works; I uploaded it just to ease your pain. When ntags doesn't point to a definition, I fall back to nimsuggest. I hope it makes things better for you.

Btw, I'm sort of tempted to move from vim to emacs. As I recall, the last time I checked nim-mode had some severe bugs. Is it still troublesome to use?

@krux02

krux02 commented Oct 22, 2019

Regarding emacs: I can't say that I like it, but on the other hand I don't think there is anything better than it. My contribution to it is for the most part to disable all the crap that gets in the way of using emacs as a code editor. That means everything that blocks the input needs to die. The only thing that I actually improved is syntax highlighting; this works pretty fast by now. The rest I set to a state where it is disabled by default, or I put a note in the readme on how to disable it. But I think there is still some input blocking from nim-smie that I still don't understand. Regarding nimsuggest, I got used to not using it. I was so frustrated with it not working that I gave up on it long ago. ctags and jumping to compilation errors is enough for me. Then I also improved gdb support, which matters because emacs is a better gdb frontend than the terminal user interface (but not by much). So what works great so far:

  • syntax highlighting.
  • jump to errors from compile buffer
  • working with big files (if smie doesn't get in the way)
  • working with multiple projects.
  • ctags (I had a regular expression once that worked as well)

Since you work in vim, how is the world there? What is frustrating in the vim world? Since I thought about testing vim as well.

Almost forgot to say it. Emacs has its own TAGS file format. In ctags you generate it with ctags -e where -e is for emacs. So I don't think I can use your ntags tool.

@0xACE

0xACE commented Oct 22, 2019

I see. If I work on ntags -e I'll let you know, but don't keep high expectations: ntags is a very basic toy, and a lot of the time I have to fall back to nimsuggest.

I use native vim which has poor support, and is very basic. But the good news is that leorize has built a good neovim plugin, it seems like the complete package for full nim support, I guess all it is missing is opening up documentation (unless he already implemented it)... Lately he has focused on making it easier to implement native vim support which I guess I might try to implement if I don't try out emacs first...

in my native vim setup i have:

  • syntax highlighting (basically a collection of regex i guess)
  • jump to errors from compile buffer
  • working with big files (not sure if the neovim plugin handles long lines well but I think he addresses it.)
  • working with multiple projects (well, really all my setup does is call nim c or utilize PMunch's nimcr)
  • ntags, and where ntags fails i fallback to nimsuggest

But i keep my setup very minimal...

This pull request was closed.
7 participants