Add an "Unsigned" page to gotchas section #484

Closed
pmetras opened this issue Jan 10, 2022 · 8 comments

Comments

@pmetras

pmetras commented Jan 10, 2022

For users who already know other programming languages where unsigned integers don't exist, or even for newcomers who have never touched a programming language but have a mathematical background, Pony's use of unsigned integers as the default type in numerous functions can lead to surprises.

This section should explain that some assumptions that are true with signed integers are not true with unsigned integers. For instance, in usual ranges of variables, x > y implies that x - 10 > y - 10. This is frequently not true with unsigned integers.

I wrote "usual ranges of variables" and "frequently" because this logical implication is as right with unsigned integers as with signed ones, but it becomes false at the limits of the integer definition interval. Unfortunately, for unsigned integers, developers tend to forget that this interval is [0, max_value], and as they more frequently manipulate low values, it's very easy to write expressions that fall outside that interval. With the previous example, the implication is false if x = 15 and y = 5, for instance: x - 10 is 5, while y - 10 wraps around to a value close to max_value.
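The x = 15, y = 5 case can be sketched in Pony as follows. This is a minimal illustration, assuming both values are typed as USize and that Pony's default wrapping arithmetic applies; the variable names are only illustrative.

```pony
actor Main
  new create(env: Env) =>
    let x: USize = 15
    let y: USize = 5

    // x > y holds, as expected.
    env.out.print("x > y: " + (x > y).string())

    // x - 10 is 5, but y - 10 cannot be represented in a USize and wraps
    // around to a value near USize.max_value(), so this comparison is false.
    env.out.print("x - 10 > y - 10: " + ((x - 10) > (y - 10)).string())
```

Declaring both values as ISize instead should make the second comparison true again, since 5 - 10 is then simply -5.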

This type of gotcha can be avoided if the developer remembers never to subtract unsigned integers.
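Short of never subtracting, another option is to make the underflow explicit. The following is a minimal sketch, assuming the partial operator -? and the checked method subc described in the arithmetic section of the Pony tutorial; the values are illustrative.

```pony
actor Main
  new create(env: Env) =>
    let y: USize = 5

    // Partial subtraction raises an error instead of wrapping around.
    try
      let d = y -? 10
      env.out.print("difference: " + d.string())
    else
      env.out.print("y - 10 would underflow")
    end

    // Checked subtraction returns the wrapped result plus an overflow flag.
    (let wrapped, let underflowed) = y.subc(10)
    if underflowed then
      env.out.print("underflow detected")
    else
      env.out.print("difference: " + wrapped.string())
    end
```

Both forms force the caller to decide what happens at the lower bound, at the cost of more verbose code than a plain subtraction.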

But it can also happen without the developer writing a subtraction, when the operation is done by a Pony stdlib function that uses an unsigned integer as its default. For instance, even with an increment of -1, the following code gives the expected result of not executing the loop, as an integer cursor can't go up with a negative increment:

use "collections"

actor Main
  new create(env: Env) =>
    for i in Range[ISize](1, 5, -1) do
      env.out.print("i=" + i.string())
    end
    env.out.print("OK")

But if the user does not specify the type of the Range and uses the default instead, which is USize, writing the for line as for i in Range(1, 5, -1) do, then this loop is executed twice and prints the values 1 and 0... This is correct, or at least explainable to the baffled developer, once we consider that -1 can be assigned to an unsigned integer variable, which is the case in the Pony language.

There are many traps that a developer not used to unsigned integer wraparound may fall into, like testing that a variable is negative, and I think it would be worth a gotcha page.

@SeanTAllen
Member

SeanTAllen commented Jan 10, 2022

This doesn't really have anything to do with being signed or unsigned. Integer overflow and underflow are applicable to both signed and unsigned integer math.

This is documented in the tutorial already.

https://tutorial.ponylang.io/expressions/arithmetic.html#ponys-default-integer-arithmetic

I don't think adding a gotcha for something that is already documented and is familiar from many other languages is required. There's nothing novel about this behavior in Pony. The gotchas section loses value with each entry that is added, as more and more items are likely to be familiar to readers, potentially causing them to miss truly novel issues they might hit.

I think it could be reasonable to explain overflow and underflow more explicitly, with an example or two, for those who aren't familiar with them, in the previously linked tutorial section where they are discussed.

@rhagenson
Member

This seems like more of a wraparound gotcha than an unsigned integer gotcha. Nothing about unsigned integers changes the need to consider wraparound, but it does change the value at which the wraparound happens -- that being 0. The default type, signed or unsigned, is based on the expected use-case of a function/class. For example, for Range it is more commonly needed to increment from positive to positive, so it uses unsigned as the default.

@pmetras
Author

pmetras commented Jan 10, 2022

I know that's a wraparound problem, but for newcomers coming from languages where unsigned integers don't exist, like Java or Python, this type of behaviour can be surprising.

That's not really a gotcha for those who already know C/C++ or other languages where unsigned integers exist.

Yes, the wraparound behaviour of all types of integers has to be well explained in the page of the Tutorial that Sean mentioned, but that does not prevent showing examples where it occurs when the user does not expect it.

I think that the gotcha page is justified because unsigned integers are default types in Pony in many situations. Writing code that underflows is very easy with unsigned types. That's what I tried to explain in the 2 examples when writing the issue. For sure, when you are accustomed to signed/unsigned integers, that's no longer a gotcha. But gotchas are for Pony learners, aren't they? For how long have you been using Pony? Are you best placed to evaluate what counts as a gotcha in that context?

@rhagenson
Member

I have been using Pony for a short time and took it upon myself to become something of "The Tutorial Guy", so you are making quite the assumption that I have been using Pony too long to see the difficulties in learning it.

I have also taught programming in various capacities for the better part of 10 years, at skill levels from "never programmed" to "founded multiple startups", so although I know I am imperfect in my understanding of the contexts everyone brings to Pony, I do have a decent gauge for generalizability.

Gotchas are for Pony learners, but the concepts that trip up some are not the same concepts that trip up others.

Your response here, in my opinion, goes against the CoC, specifically its social rules. Neither Sean nor I disagreed that the topic at hand can be a gotcha for some Pony learners; we did disagree that the focus should be on wraparound versus the original focus on unsigned integers.

Do you understand how your response, suggesting that Sean and I, who are trying to identify the best route forward to solve the problem at hand, are somehow unqualified to comment on that route, makes the Pony community less positive?

@pmetras
Author

pmetras commented Jan 10, 2022

Sorry, I did not imply that you are unqualified, but that someone with long experience with Pony can be blind to the problems that newcomers face, and that the level of surprise that Pony code can give you is related to how long you have been using it and been exposed to it. I am sorry that you took my writing personally, as that was not my intent. It was only to question how one can say whether something is a gotcha or not. Perhaps I should have linked to the Wikipedia definition, which explains that a gotcha must be counter-intuitive.

Regarding the Range example, I tried to write it as simply as possible while making the problem visible for a beginner, by writing explicitly that the increment is negative. But when the increment and the interval bounds are given by the content of variables, it is less intuitive to determine when the default type USize will underflow. Perhaps Range is too advanced an example, as Range has a complex behaviour depending on the type, which could have its own gotcha page. But in that example, the user has no control over the arithmetic operations done by the class: is it using safe arithmetic or the checked one? So she does not expect to be surprised by its use.

The reason for this issue is that I wanted to emphasise that the ubiquitous unsigned integers in Pony, unlike in other languages where this type does not exist, can lead beginners to assume that they behave like normal integers (signed ones). And that's not the case in many situations where the use case underflows a value without the user expecting it. But also, for instance, writing:

  if x < 0 then
    // Do something
  end

and forgetting that this test will never trigger when x is unsigned. That's not a runtime underflow but a use of x outside of its definition interval, and the compiler neither prevents it nor warns about the possibly unreachable code.
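A minimal sketch of that situation, assuming x and y are USize values chosen for illustration: the test on the difference can never succeed, whereas comparing the operands before subtracting expresses the intended check.

```pony
actor Main
  new create(env: Env) =>
    let x: USize = 5
    let y: USize = 10

    // Wrapping subtraction: the result is a large positive number,
    // never a negative one.
    let diff = x - y
    if diff < 0 then
      // Never reached when diff is unsigned.
      env.out.print("negative")
    else
      env.out.print("not negative: " + diff.string())
    end

    // Comparing the operands first expresses the intended test.
    if x < y then
      env.out.print("x - y would underflow")
    end
```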

I think that the gotcha aspect is larger than the closed algebra of numbers wrapping around: when using unsigned integers in code, you are working at the lower bound of a definition range, and it is easy to fall on the other side if one is not cautious about what one writes.

@pmetras
Author

pmetras commented Jan 11, 2022

Or another example in the vein of the previous one, where the "Do something" is never executed, is

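let x = get_value_from_somewhere()  // returns an unsigned integer
if x < 0 then                       // never true for an unsigned x
  do_something()
end
if x <= -1 then                     // -1 wraps to the maximum value, so always true
  do_something_else()
end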

where do_something is never called while do_something_else is always called. If the reader did not notice that get_value_from_somewhere returns an unsigned integer, because she is accustomed to another language where they don't exist, she will be surprised.

@rhagenson
Member

Following from this issue and discussion on Sync Call 2022-01-11 around 3:00, we decided to reserve Gotchas for unique problems in Pony. This decision is to avoid a "sea of gotchas" where what is unique to Pony gets overlooked because we either do not have a dedicated section for these unique problems, or we allow such a section to involve too many definitions of "unique" -- rendering the term meaningless.

We are closing this thread as it stands to allow the Tutorial improvement to continue over on #486 where an expanded plan for explaining possibly unfamiliar topics around numerics -- namely number widths, signed vs unsigned integers, and overflow/underflow -- has been detailed.

@pmetras
Author

pmetras commented Jan 14, 2022

As you want. Perhaps you did not understand that the gotcha I wanted to explain is not about what an unsigned integer is, but about the impact of Pony choosing unsigned integers as its base type and the consequences of this choice. None of the examples that I tried to construct in this issue would occur if Pony had chosen to use signed integers instead, like many other languages do (even C or Rust!), and that could surprise newcomers. For a language that claims to be secure, that's a strange design decision. There are so many pages on the Web about keeping away from unsigned integers in other languages that it is weird that Pony favours them in so many places, as they can be potential pitfalls for newcomers...

Nevertheless, improve the section about underflows/overflows, bit ranges, etc. in #486. I don't think that many users would start learning programming with Pony as a first language without knowing about these concepts, which are not Pony-specific either, but why not? I'll find other subjects or projects where I can better help.

3 participants