CHIP-0033: Add additional partial headers #114
Thanks, @felixbrucker! This CHIP has been assigned CHIP-33. It is now a |
Let me add a few use cases where this would greatly help the pool operator day-to-day:
(If it makes sense to add any of that directly into the CHIP, anyone please feel free to re-use the above as you see fit.) |
Don't you get the IP address of the farmer when it contacts the pool - either in the originating source address or maybe X-Forwarded-For? One problem is that the harvester may be harvesting plots that are not associated with your pool at all - so the various |
We do, but that's usually not helpful, as they typically all share the same outgoing public IP.
Yes, that's true for the capacity/plot count headers and is also addressed in the CHIP: "It will make sense to also add capacities and plot counts scoped to the plotnft of the proof of the partial". It'd probably make sense only sending these stats scoped to the plotnft/launcherid of the partial. |
X-Forwarded-For should have the original internal IP address in most cases - it does for me (although in my case it is an IPv6 addy). At least this is my understanding... |
I might suggest using some RFC language - perhaps saying something like implementations SHOULD (MUST?) only send |
X-Forwarded-For is added by HTTP proxies in the chain, and they use the IP they see. If you have a private/NATed IP (as is typical for IPv4 end user sites), all a proxy (or the final HTTP server) will see is the public IP you were NATed to. (The NAT itself operates on a lower layer and doesn't add HTTP headers.) For IPv6 the situation can be slightly different, if you have a full public subnet (like a /64 or /56) delegated to your end user site and your internal devices actually use that. (I assume that your IPv6 address being visible externally falls under this case.) |
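To make the NAT point concrete, here is a minimal sketch of what a proxy does with X-Forwarded-For. The function names are hypothetical and only model the behaviour described above; this is not code from any real proxy or from the pool software.

```python
from typing import Optional


def append_forwarded_for(existing: Optional[str], peer_ip: str) -> str:
    """Model of a proxy: append the address of the peer it received the request from."""
    return peer_ip if not existing else f"{existing}, {peer_ip}"


def header_seen_by_pool(internal_ip: str, nat_public_ip: str) -> str:
    """Model of a farmer behind NAT talking to a pool through one proxy.

    The NAT rewrites the source address below the HTTP layer and adds no
    header, so internal_ip never reaches the proxy (assumption of this sketch).
    """
    _ = internal_ip  # deliberately unused: it is invisible to the proxy
    return append_forwarded_for(None, nat_public_ip)


# Two machines on the same site produce the same header value.
farm_a = header_seen_by_pool("192.168.1.10", "203.0.113.7")
farm_b = header_seen_by_pool("192.168.1.11", "203.0.113.7")
```

Both `farm_a` and `farm_b` end up with only the shared public address, which is why the header does not help to tell farmers on the same site apart.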
Yes, as earl already said, the farmer ip is neither suitable nor available for identification of the farmer (i'm assuming that's the point of bringing it up?) |
In the chip i just documented the current PR's state, earl has code for scoped capacity/plot counts based on the PR iirc, which can and should be integrated. Whether they are additive (both scoped and unscoped are included) or replacing (unscoped headers are not present) i'll leave up for debate, i'm fine with either. They should have distinctive names tho, indicating that it is the scoped capacity/plot count (which is the case for earl's code). Maybe it makes sense to link it in here as well, and merge into the PR based on what the community prefers (see above). |
My concern here is that by providing the farmer peer id, this proposal makes it harder for me as a farmer to protect my privacy. Not all farmers want to "show their hand", and make public the total amount of space they are farming. As it is, with the harvester id in the header, I have to run different harvesters to mask my total space: otherwise that value can be used to correlate me across different plotnfts and even across different pools. Running multiple harvesters is well-documented, so that is not an insurmountable burden, and large farmers generally do that anyway. But if I now also have to run multiple farmers to avoid being correlated, that is a whole additional level of effort. If this proposal goes through, then people running a normal single-farmer setup will suddenly be completely exposed for the amount of space they have. I think privacy should be the default. We shouldn't be causing people to automatically give out information that can be used to identify them. Maybe there are other approaches that could accomplish the same thing on an opt-in basis? Something as simple as a user-settable farmer "name" that is only returned if set? Then people can choose to allow their farmers to be identified if they want to. Ideally we'd also replace harvester id with a similar setup, but I suspect that's no longer possible at this point. |
Is this referring to the estimated space on pool A and B with different plot nfts but same farmer being able to be matched, assuming the farmer id is publicly accessible through both pools?
To be fair users of a normal setup generally do not split their harvesters to mask their capacity on different pools, and as such are already "completely exposed".
We can already identify the pseudonym by launcher id and harvester id, farmer id is just another part of the already established chain of software components in a farming setup which is not identified yet.
Having a name is confusing, because a name is not an identifier. If the concern is that the farmer id is matchable across pools why not combine it with a part of the plotnft, then it is scoped per plotnft but unique per farmer. We just need to make sure some part of it is still easily identifiable/matching the farmer peer id for humans, so users can match a pool side farmer to a physical farmer and give it an appropriate name.
That would be a breaking change in the pooling protocol |
Let me recap, just to make sure I understand the scenario correctly: a person farms a certain amount of space and wants to mask the total amount. So this person splits this space over several different plotnfts (to circumvent the obvious matching by launcher id) and also uses multiple different harvesters (to circumvent matching by harvester id). That person may also distribute the different plotnfts over different pools. For this case, the proposed change would clearly be a change in identifiability. The person would have to adapt their setup to also use multiple different farmers, in analogue to the harvesters, to circumvent matching by farmer id. Did I get that right? |
Yes, exactly. Thank you for being so much more succinct. |
Yes.
True, small farmers probably don't care. But multiple harvesters is very common on >100TiBe farms.
Right! The last part of that chain of software components is the node id, at which point the farmer would be completely identified. And to avoid identification, the farmer would then have to run completely separate farms, which would be a real pain.
I think the idea you're proposing is to concatenate the plotnft id and the farmer id, then hash that? I agree that would work.
I know. And that makes it impossible. I don't know what requirement prompted it to be included originally, but I would have proposed not including it, if I had been aware at the time. |
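The hashing idea discussed above (concatenate the plotnft id and the farmer id, then hash that) could look roughly like the following. This is only a sketch: the function name is hypothetical, SHA-256 is an assumed choice of hash, and both ids are assumed to be fixed-length byte strings. It is not what the CHIP or the reference PR specifies.

```python
import hashlib


def scoped_farmer_id(launcher_id: bytes, farmer_peer_id: bytes) -> str:
    """Derive a per-plotnft identifier for a farmer (hypothetical scheme).

    Assumes both inputs are fixed-length byte strings, so plain
    concatenation is unambiguous.
    """
    return hashlib.sha256(launcher_id + farmer_peer_id).hexdigest()


# Placeholder ids, for illustration only.
farmer = bytes.fromhex("aa" * 32)
plotnft_1 = bytes.fromhex("01" * 32)
plotnft_2 = bytes.fromhex("02" * 32)

id_on_plotnft_1 = scoped_farmer_id(plotnft_1, farmer)
id_on_plotnft_2 = scoped_farmer_id(plotnft_2, farmer)
```

The same farmer gets a stable id within one plotnft but unrelated-looking ids across plotnfts. Note that a plain hash like this does not keep any human-recognisable part of the farmer peer id, which was one of the wishes raised earlier in the thread.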
Yes, but the reasoning for multiple harvesters is generally not to mask their capacity on different pools but to split resources/hdd access |
Agreed! I think we're saying the same thing on this point. :-) |
Thanks for clarifying, @fizpawiz. So a suggested way forward:
|
I like this idea, personally. Would love to have the option to do the same with the harvester id.
I'm still confused why anyone wants the stats. Self-reported values can be manipulated, especially when the code reporting them is open-source. What happens when someone tweaks their rig to report more space than they have, then falsely claim the pool is ripping them off, pointing to the statistics as "proof"? Or tweaks to under-report space, and then claims to have found some "weird trick" to improve returns? "I'll tell you the trick for just $99!" The partials are the proof of space, which can be trusted. If the reported space is used for ranking farmers on the pool leaderboard, you can be sure people will "adjust" their reported space to move up. Now the pools will need to add code to catch people faking their reported space, and decide what to do when they detect it. And what if it is close, and so it is hard to be sure? Most pools take 1%, so a small tweak by an unscrupulous large farmer can really make the pool look bad, or good, at the whim of the farmer. And accusations would be difficult, or impossible, to prove one way or the other, making such accusations in public potentially very sticky from a PR perspective.
Yes, that would be appropriate and important if sending statistics. |
There are many reasons, for example users want to see how their farm performs compared to its actual capacity, users want to see/get notified if their plot count changes/drops, users want to monitor (re)plotting progress .. etc If you don't need it you can disable it, not a problem.
Nothing, nothing happens
Nobody said anything of the like, as you said it's a self reported metric that can not be trusted, why would anyone use it for leaderboards
pools use effort to track their performance, not unverifiable user reported metrics, so no he can not, he can only make his own farm look good or bad
Nothing changes to how it is right now with the exception that the farmer in question would screenshot his pool harvester page instead of his chia gui to show his "real" capacity |
What are the next steps for this? It has been a while since the last update. Can we review/merge the PR in chia-blockchain now? |
...
Makes sense to me. I worry about pools using this self-reported info, but that's up to them to protect from users gaming the pool by tweaking the stats. I think we landed on a plan forward, @xearl4 so clearly described above. Would love to also have the option to mask the harvester id too. I don't actually know the CHIP process very well. What is the next step, and who takes it? |
At this point it sounds like we have general consensus from the community (other than potentially masking the harvester ID), and we have an implementation in place. I'll move the CHIP to I'll leave the CHIP in Assuming no additional changes are needed after two weeks in Best case scenario, this CHIP is finalized and the implementation is in |
This CHIP is now in |
How does requiring the users who requested this change to change the config option to true and leaving the "small subset of people" you allude to alone, not also accomplish the same thing? Overall I think this is a good change, and if you guys say it's an often requested change for pool farmers then we should get it into the code base, but the sticking point for me is defaulting to on with new options 1 and 2, i.e. defaulting on new options that change privacy that users might be relying on. |
A summary of this CHIP's status:
Feel free to continue the discussion here. If necessary, we also could try to set up a Zoom call for anyone interested to discuss these settings. |
If a user wants to opt in for the ability to name their harvesters (which based on this message seems that will only be a possibility if this integration is in place) then wouldn't the pool be able to provide a GIF on the naming page showing how to unmask their harvester IDs for easier identification? The options here seem to be: Obfuscate the ids, negative being that users might need to remove obfuscation to more easily work with pool support and name their harvesters (both of these activities are initiated by the user actively seeking information). OR Do not obfuscate the ids, negative being that users might need to add obfuscation if they are concerned about security. These users would need to be aware of the potential security concerns and need to take action based on this. Seeing how the first option's negative is focused on what users are actively doing (seeking pool support or trying to name their harvesters), documentation explaining the solution can be provided where the user is actively engaging already, while the second option's negatives are entirely passive; I personally would like to see option 1 exercised. Furthermore, to just obfuscate the farmer ID would not be sufficient to resolve the negatives in option 2, so I also highly recommend that the CHIP and associated reference client PR be updated to also obfuscate the harvester ID. |
Substitute harvester for farmer here and everywhere below, users can already name their harvesters as partials include the harvester id
As stated before, this is out of scope for this CHIP and PR (additional partial headers). Instead i would recommend the concerned parties to create a new CHIP for (harvester) peer id obfuscation. |
I am not sure if this is meant to negate any of the other logic in my previous statement but it seems that if users are already naming their systems then they are already looking at the dashboard and a notice could be added that they need to remove obfuscation when X version releases to maintain the functionality? Whereas for security concerned individuals there is no active location where they are checking for this information.
I respectfully disagree, as you and xearl have stated harvester IDs are already not obfuscated BUT this chip and associated PR make it so that lack of obfuscation becomes a security concern (as it can now be tied to a farmer ID). If this chip and associated PR are not adopted then there is no change to security but if adopted there is a decrease in security. To resolve this decrease in security it seems imperative to obfuscate both the farmer and harvester IDs. At the very least obfuscating the farmer ID should most certainly be the default |
I think that's a misunderstanding. The harvester IDs are already a security concern, as fizpawiz' comments also clearly confirm. A farmer reusing the same harvester(s) to farm plots belonging to different plotnfts already allows pools to match multiple plotnfts to the same farmer. The security concern is already there, it's part of the Chia Pool Protocol 1.0 specification. |
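As an illustration of the matching described above, a pool could group the plotnfts it sees by the harvester id that submitted partials for them. The data shape and function name below are hypothetical; the sketch only assumes that each partial can be reduced to a (launcher id, harvester id) pair.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def launcher_ids_sharing_a_harvester(
    partials: Iterable[Tuple[str, str]],
) -> Dict[str, Set[str]]:
    """Map each harvester id to the launcher ids it farmed for.

    Only harvesters seen with more than one launcher id are returned,
    since those are the ones that link plotnfts together.
    """
    seen: Dict[str, Set[str]] = defaultdict(set)
    for launcher_id, harvester_id in partials:
        seen[harvester_id].add(launcher_id)
    return {h: ids for h, ids in seen.items() if len(ids) > 1}


# Placeholder observations: (launcher_id, harvester_id)
observed = [
    ("launcher-A", "harvester-1"),
    ("launcher-B", "harvester-1"),
    ("launcher-C", "harvester-2"),
]
linked = launcher_ids_sharing_a_harvester(observed)
```

Here `linked` shows that launcher-A and launcher-B were farmed by the same harvester, which is the correlation that already exists today without any farmer id.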
Agree to disagree, for me obfuscating the harvester id which is part of the pooling protocol and the json payload is out of scope for a pr adding additional headers. |
It accomplishes a similar thing, but -- in my opinion -- with a lot of wasted effort. The original PR which led to this CHIP implemented a few separate things:
Regarding 2 and 3, I get the feeling that users at this level of concern about privacy would generally prefer not to send any new information to the pool, not even the obfuscated IDs which still leak information that was previously not visible (that a farmer is using multiple different farmer processes; even though no one has complained about this information leakage yet.) And this is the unnecessary effort I mentioned initially: why implement obfuscation and whatnot for something that no one will ever use or want to use? If the UX is going to be that we have to tell pool users who like to have this useful information to first enable config options anyway, I'd personally rather have two simple "include farmer peer id" (default false, because newly introduced privacy concern) and "include harvester stats" (default false, because newly introduced privacy concern) config options instead. |
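A rough sketch of how two such opt-in options could gate the extra headers is shown below. Everything here is an assumption for illustration: the option names mirror the wording above, and the header names are placeholders, not the names used by the CHIP or the chia-blockchain PR.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PartialHeaderOptions:
    # Both default to False so nothing new is sent unless the user opts in.
    include_farmer_peer_id: bool = False
    include_harvester_stats: bool = False


def build_extra_headers(
    options: PartialHeaderOptions, farmer_peer_id: str, plot_count: int
) -> Dict[str, str]:
    """Return only the extra headers the user has opted into.

    Header names are placeholders for this sketch.
    """
    headers: Dict[str, str] = {}
    if options.include_farmer_peer_id:
        headers["X-Example-Farmer-Peer-Id"] = farmer_peer_id
    if options.include_harvester_stats:
        headers["X-Example-Plot-Count"] = str(plot_count)
    return headers


default_headers = build_extra_headers(PartialHeaderOptions(), "abc123", 420)
opted_in_headers = build_extra_headers(
    PartialHeaderOptions(include_farmer_peer_id=True, include_harvester_stats=True),
    "abc123",
    420,
)
```

With the defaults, `default_headers` is empty, so an existing setup sends nothing new; a user who wants the pool-side features flips the options explicitly.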
What about default obfuscation for the farmer peer ID? |
I think this is an excellent solution that allows for the best of both worlds (defaults to current security levels and allows for users to easily add the extra reporting with the seamless and direct farmer id to use for lookups)! |
@felixbrucker given the latest feedback, would you like me to move the |
I personally disagree with the default value being false for the peer id option |
I will update the chip to reflect the new config options as that seems like a better solution than obfuscation |
OK, then I will move the CHIP to |
@felixbrucker has made some updates to this CHIP, so I am moving it back to |
It has been over a week, i'd like to see this implemented in its current form |
I can put the CHIP into |
I'm just responding to this, i don't know what the proper next state for this chip is, as nobody else commented |
Yup, I'll move it to |
This CHIP is now in |
This CHIP is now |
This is the corresponding CHIP for Chia-Network/chia-blockchain#17788