[fparser2] capture symbol information #201
fparser2 implements the rules specified in the Fortran spec. However, in certain cases more than one rule can match, and it is not possible to determine which rule is valid without access to symbol table information. The two known cases are:

1. Statement function or array access: it is not possible to determine whether a statement is a statement function or an array access when it is the last declaration in a declaration block or the first executable statement in an execution block.
2. Function or array access: it is not possible to determine whether a statement is a (non-intrinsic) function or an array access.

There may be a third, where a symbol is referenced before it is declared e.g.

fparser2 should add a symbol table which it populates as code is parsed. I do not know whether the parser would then need to run in two phases or could simply use the current information in the symbol table as it goes along.

Note that there will still be cases where we don't know what something is (e.g. because it is declared in another module which has not been parsed), and we need to decide what to do in this circumstance. We could abort, always parse all referenced modules (I don't think this is feasible in general), or have an additional node which says it is one of n matches, i.e. we catch the ambiguity.
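To make the proposal concrete, here is a minimal Python sketch of a symbol table that is populated as declarations are seen and then consulted when a `name(...)` reference is matched. It is not fparser2 code: `SymbolTable`, `Kind` and `classify_reference` are hypothetical names invented for illustration, and the "one of n matches" result is modelled simply as a list of candidates.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Kind(Enum):
    ARRAY = auto()
    FUNCTION = auto()


@dataclass
class SymbolTable:
    """Hypothetical symbol table, filled in as declarations are parsed."""
    symbols: dict = field(default_factory=dict)

    def declare(self, name, kind):
        # Fortran names are case-insensitive, so normalise on the way in.
        self.symbols[name.lower()] = kind

    def lookup(self, name):
        # Returns None when the name has not been declared (yet).
        return self.symbols.get(name.lower())


def classify_reference(name, table):
    """Classify a `name(...)` reference using the table.

    Returns a list of candidate interpretations. More than one entry
    means the ambiguity could not be resolved from the table alone.
    """
    kind = table.lookup(name)
    if kind is Kind.ARRAY:
        return ["array_access"]
    if kind is Kind.FUNCTION:
        return ["function_call"]
    # Unknown name, e.g. declared in a module that has not been parsed.
    return ["array_access", "function_call"]


table = SymbolTable()
table.declare("A", Kind.ARRAY)
table.declare("f", Kind.FUNCTION)

print(classify_reference("a", table))
print(classify_reference("F", table))
print(classify_reference("g", table))
```

Returning every remaining candidate, rather than aborting, corresponds to the "additional node" option above: the caller can decide later whether to abort, parse the referenced module, or keep the ambiguity.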
This is also a problem when trying to distinguish between an array slice and a character string section.
See failing test in This is due to two sub-rules in
Since I think the following is related / can be resolved by the same means, I put it here: Variable names, procedure names, etc. can shadow intrinsic functions. Consider the following example:
The problem is the next-to-last line, where it is not obvious that this is an array element access and not a function call.
(Sorry for the double-post: meant to append the following but hit Ctrl+Enter instead of Enter. That happens when you write slack in another window at the same time.) There are even more evil situations possible when the offending name is defined in a used module, for example:
(I will neither confirm nor deny that such beauty exists in a, say, operational code base.) Any ideas how to overcome such situations? I can also put this in a separate issue, since it might not even be mitigated by a symbol table alone but could require parsing the used module first.
The operational code base that you may or may not be referring to would not be the only one. Another, more liquid-oriented model also has such a lovely example. Actually, I happen to be writing some (minimal) documentation that explains the general problem of matching multiple rules, but that does not help solve the problem.

The current plan is to keep symbol and context information as we go along and then use that in rules to check constraints. This will sort out many of the ambiguities. In your first ibits example this will sort the problem out, as we will be able to determine that ibits is actually an array and therefore not match it as an intrinsic.

Your second example is a general problem which compilers solve using .mod files. If we keep symbol and context information we should know what has not been defined. At this point we are thinking of having various options

At the moment we do 4. I don't like 3. unless a user explicitly asks for this to happen. I think we should try to do 1. and allow an option for 2., and in fact 1. could actually produce a file for 2. for future reference (i.e. our simple version of a mod file).
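As a rough illustration of how kept symbol and context information could drive the intrinsic check, the Python sketch below consults local declarations before the intrinsic list and reports "unknown" when a used module has not been parsed. Everything in it is an assumption made for illustration: `Scope`, `resolve_call_like` and the three-name `INTRINSICS` subset are hypothetical and are not part of fparser2.

```python
# Small illustrative subset of Fortran intrinsic names (not the full list).
INTRINSICS = {"ibits", "sin", "max"}


class Scope:
    """Hypothetical scoping unit: local declarations plus unparsed uses."""

    def __init__(self):
        self.local_arrays = set()
        # Modules named in `use` statements whose contents are not known.
        self.unparsed_modules = []

    def declare_array(self, name):
        # Fortran names are case-insensitive.
        self.local_arrays.add(name.lower())

    def add_unparsed_use(self, module_name):
        self.unparsed_modules.append(module_name.lower())


def resolve_call_like(name, scope):
    """Decide what `name(...)` is, given what the scope knows so far."""
    name = name.lower()
    if name in scope.local_arrays:
        # A local declaration shadows any intrinsic of the same name.
        return "array_access"
    if scope.unparsed_modules:
        # An unparsed module could define this name, so we cannot tell.
        return "unknown"
    if name in INTRINSICS:
        return "intrinsic_call"
    return "unknown"


local = Scope()
local.declare_array("ibits")
print(resolve_call_like("IBITS", local))

plain = Scope()
print(resolve_call_like("ibits", plain))

with_use = Scope()
with_use.add_unparsed_use("some_module")
print(resolve_call_like("ibits", with_use))
```

The first case mirrors the ibits example where the name is declared as an array in the same scope. The last case mirrors the used-module example: the sketch only detects that the name cannot be resolved, which is the point at which one of the options discussed here would have to be chosen.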
Thanks for the quick reply! The ibits example was indeed intended as one of those cases that can be fixed fairly easily once symbol information is kept, and it might be worth testing against. I agree with your opinion on the second example. In our downstream tool we also do a very simplistic version of 1. to fill in type information.
A certain liquid-orientated model has code that redefines
Another example is the false matching of a structure constructor as an array access (designator). In general we would need to know whether the name was the name of a structure or the name of an array.
Add basic symbol table functionality (towards #201)
Some of the things we need to do:
use statements