Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discard reference -- SIMICS-21584 #237

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

lwaern-intel
Copy link
Contributor

@lwaern-intel lwaern-intel commented Nov 9, 2023

I don't like the way the compatibility feature is implemented, but doing it at AST building level was the best I could do. Any attempt to try to do it at parser level fell apart.

@syssimics
Copy link
Contributor

Verification #12479725: fail

@syssimics
Copy link
Contributor

Verification #12484192: pass

@syssimics
Copy link
Contributor

Verification #12510780: pass

Comment on lines 1210 to 1215
# TODO/HACK: The discard ref is exposed like this to allow it to be as
# keyword-like as possible while still allowing it to be shadowed.
# Once we remove support for discard_ref_shadowing the discard ref
# should become a proper keyword and its codegen be done via dedicated
# dispatch
if name == '_' and tree.site.dml_version() != (1, 2):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not suffice as a reminder, since the comment is not adjacent to the code that will be touched when the compat is removed. One may argue that you mention discard_ref_shadowing in the comment so grep will find it, but you may consider a # when removing, remember to also fix .. comment near the enabled_compat-checking code.

Copy link
Contributor Author

@lwaern-intel lwaern-intel Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the test? Not next to the declaration of the compatibility feature itself?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of the code in dmlparse that checks enabled_compat, but I suppose compat.py makes even more sense.

else:
init = eval_initializer(site, tgt.ctype(), src_ast, location,
scope, False)
name = 'tmp%d' % (i,)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_tmp%d? (because what if extern int tmp0(void); exists and the initializer of a subsequent assignment calls it)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, the problem was not introduced by you. Still valid but not blocking then.

Copy link
Contributor Author

@lwaern-intel lwaern-intel Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to note that any issue this could only possibly cause would be incredibly niche. In particular, this could only present an issue if -g is switched on, because otherwise the C variable name will get uniquified. (Notice that all codegen involved here uses scope and not lscope, so DML resolution will never take these temporary variables into account; the only possible issue is if the wrong thing happens at C level).

But yes; if -g is switched on, and the initializer references some C literal tmp0 (made accessible via extern), then this could present an issue. But it's rather far-fetched.

py/dml/compat.py Outdated
@feature
class discard_ref_shadowing(CompatFeature):
'''This compatibility feature allows declarations (within methods or
objects) to be named '_'. This will cause the discard reference `_` to be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/'_'/'\_'/, or backtick. Unescaped underscores in markdown are scary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right

@@ -2632,6 +2659,12 @@ def ident(t):
def ident(t):
t[0] = t[1]

def ident_enforce_not_discard_ref(site, ident):
if (str(ident) == '_'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

str should be redundant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably. I did it because reserved also does it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, that's probably python2 legacy, the unicode vs str vs bytes discrepancy. Everything is strings now (unicode errors are caught in toplevel), so this can be removed in reserved too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

t[0] = ast.foreach(site(t), t[2], t[5], t[7])

@prod_dml14
def hashforeach(t):
'statement_except_hashif : HASHFOREACH ident IN LPAREN expression RPAREN statement'
ident_enforce_not_discard_ref(site(t, 2), t[2])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of these calls. Perhaps easier to make a single call in the ident production rule? And create a separate ident_or_underscore production rule, which would be like ident but without that call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. I tried, but ran into reduce/reduce conflicts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, but there should be some middle path: There should be no reduce/reduce after HASHFOREACH, so there you can use ident, and in other cases (the start token of statement, I suppose) there will be reduce/reduce so for those can do ident_or_underscore with a manual ident_enforce_not_discard_ref. Could make sense if it's not a vast majority of all cases?

Copy link
Contributor Author

@lwaern-intel lwaern-intel Nov 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such ad-hoc approaches would be very fragile to parser changes (i.e. a change could force rewriting an existing ident to ident_or_underscore) not to mention would introduce an inconsistency (either ident_or_underscore is used or a manual ident_enforce_not_discard_ref is.) This would only serve to make the parser even more difficult to understand. So I feel it'd be worse.

I agree that the ident_enforce_not_discard_ref is sub-par, and would welcome any superior approach which is at least as general and consistently applicable. ident_decl (i.e. ident with the check and warning) would be preferrable, if LALR didn't ruin our day.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue the opposite wrt fragility: your current approach requires that any new rule that uses ident remembers to call ident_enforce_not_discard_ref, if not the deprecation will silently be incomplete.

But we probably just misunderstand each other. More efficient if I try to code up what I mean, perhaps I will understand better when I see the R/R conflicts myself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an attempt, now pushed to this PR.

Copy link
Contributor

@mandolaerik mandolaerik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved if you fix or dismiss remaining comments.

py/dml/ctree.py Outdated
rt = safe_realtype_shallow(typ)
# There is a reasonable implementation for this case (memcpy), but it
# never occurs today
assert not isinstance(typ, TArray)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sloppy. should use rt instead, as should isinstance(typ, TExternStruct).

@@ -5,6 +5,8 @@
dml 1.4;
device test;

/// DMLC-FLAG --no-compat=discard_ref_shadowing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this, the test passed in t126 before I added objident_or_underscore.

However, I suppose the error would still have been captured in the 7 build, so one may argue that adding the flag reduces test coverage in CI (the file is never compiled without the flag). So perhaps this commit should be dropped.

@syssimics
Copy link
Contributor

Verification #12518623: fail

@syssimics
Copy link
Contributor

Verification #12518616: pass

if compat.discard_ref_shadowing not in dml.globals.enabled_compat:
if (compat.discard_ref_shadowing not in dml.globals.enabled_compat
# forgive the `param _ auto` declaration
and not site(t).filename().endswith('dml-builtins.dml'):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't have a param _ auto declarqtion!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, sorry, I was so sure we had that I didn't even bother to check. So the top commit is rubbish, now removed.

raise ESYNTAX(site(t, 1), str(t[1]), "reserved word")

@prod_dml14
@lex.TOKEN(ident_rule('ident', ['_']))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just do

ident : _

also in my WIP I called the token DISCARDREF.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_ follows the convention that identifier-formed tokens use that as their name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And thanks for spotting my stupid copy-paste

@mandolaerik mandolaerik force-pushed the lw/21584 branch 2 times, most recently from 8cfdcb9 to 9c1b160 Compare November 17, 2023 09:12
Allowed for method-local bindings and index variables.
Method-local bindings named '_' are not added to scope
Index variables named '_' do not get a parameter created for them
@syssimics
Copy link
Contributor

Verification #12598423: fail

@syssimics
Copy link
Contributor

Verification #12598439: fail


"\_" may be used as an identifier for local variables, as well as other
method-local bindings such as the method parameters, and the bound identifier
in `foreach`/`#foreach`/`#select` statements, and message component paramters of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/paramters/parameters

Copy link
Contributor

@mandolaerik mandolaerik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Two remarks:

  • I would vote for also adding struct memebers named _. This can just as well be added in a separate PR, though.
  • Looks like you preserve old behaviour in 1.2, thus introducing a discrepancy between 1.2 and 1.4. Doing the same in 1.2 and 1.4 would possibly give less code, but OTOH this doesn't really matter (dying code) and the code is already written.

@@ -91,6 +91,9 @@ def lookup(self, name, local = False):
def add(self, sym):
if not isinstance(sym, Symbol):
raise TypeError(repr(sym) + " is not a Symbol")
if (sym.name == '_' and sym.site is not None
and sym.site.dml_version() != (1, 2)):
raise ICE(sym.site, "Symbol with name '_' added to Symbtab")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

symbtab

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants