Blacklist should include homographs of ICANN TLDs #255

Closed
0xhaven opened this issue Sep 26, 2019 · 12 comments

Comments

@0xhaven

0xhaven commented Sep 26, 2019

We expect Handshake names to be resolved, trusted, and displayed in an equivalent way to regular ICANN TLDs.
We also support registration of IDNA (punycode) names.

Because of this, simply blacklisting exact string matches of ICANN TLDs is not sufficient. It's extremely easy to conduct IDN homograph attacks by registering a punycode domain name that visually resembles (or is identical to) an ICANN TLD.

Depending on how you configure the "confusable" homoglyph database, this would involve blacklisting at least hundreds of millions of names (my broad search found ~3 billion, or ~92 GiB of hash data). Initially I hoped we could store everything in a static data structure committed to in the genesis block, but it's probably necessary to just commit to a description of the blacklist grammar and use that for dynamic detection of homograph collisions when handling (OPEN) TXOs.

My current work on enumerating the homographs is here: https://github.com/namebasehq/idn-homographs

I'll also write some much simpler code to detect homographs, but I need guidance on exactly how we should encode that algorithm in consensus.
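
For illustration, detection can be much cheaper than enumeration: map each character of a candidate name to a canonical look-alike and compare the result against the reserved list. The following is only a minimal sketch. The `CONFUSABLES` table, the three-entry `RESERVED_TLDS` set, and the function names are hypothetical stand-ins, not the contents of the idn-homographs repository or any consensus rule.

```python
import unicodedata

# Hypothetical, tiny confusables table for illustration only. A real table
# would be far larger and its exact contents are a policy decision.
CONFUSABLES = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u03bf": "o",  # Greek small omicron
    "\u04cf": "l",  # Cyrillic small palochka
}

# Stand-in for the full ICANN TLD list (assumption).
RESERVED_TLDS = {"com", "net", "org"}


def skeleton(name):
    """Fold a name to a canonical form by replacing known look-alikes."""
    folded = unicodedata.normalize("NFKC", name).lower()
    return "".join(CONFUSABLES.get(ch, ch) for ch in folded)


def is_tld_homograph(name):
    """True if the name is not a reserved TLD itself but folds to one."""
    return name not in RESERVED_TLDS and skeleton(name) in RESERVED_TLDS


print(is_tld_homograph("c\u03bfm"))  # Latin c, Greek omicron, Latin m
print(is_tld_homograph("example"))
```

Exact matches such as `com` are deliberately left to the existing exact-string blacklist, so this check only flags the look-alikes.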

Out-of-scope:

  1. Homographs of other (non-TLD) reserved names.
  2. Similar attacks on Handshake names that have been registered.
@0xhaven
Author

0xhaven commented Sep 26, 2019

@boymanjor

@tynes
Contributor

tynes commented Sep 26, 2019

I agree that blacklisting homographs is a way to protect users, but I am not sure if doing it at the level of blockchain consensus is the correct way to do it. That means that changing the blacklist is likely a hardfork, which adds a lot of risk to the system as a contentious hardfork will split the namespace and make the system much less usable as a whole.

I think that blacklists should be implemented at the policy level, which means that different versions of the software can implement different blacklists and the blacklists can change over time. The cost for a user to change blacklists should be very low, so that competition drives the best blacklist to the top. If a user cannot convince their software or service provider to change the blacklist, then they should be able to easily run their own node with a new blacklist or change to a different service provider.

Since the blockchain itself is a data availability layer and DNS is built on top of it at the application layer, adding DNS logic to consensus doesn't really make sense. I agree that it isn't ideal that users can purchase a homograph of com and then not have it served because it is blacklisted, but this is experimental blockchain software and users should DYOR before participating. A user can purchase a homograph of com and use the 512 bytes of data availability for non-DNS uses. Having incentivized availability is not something that is easy to come by.

The blacklist will need to be updated if a new ICANN TLD is created. That is already going to be a time of governance and would add more complexity to the decision making process. The points of governance should be as simple as possible, otherwise it becomes hard to scale socially.

@kilpatty
Contributor

Just following up from the Telegram chat here.

Agree w/ Mark when it comes to having the blacklist be a policy blacklist. While it may not be perfectly ideal when it comes to using HNS for DNS/Root Zone Resolution, that is not the sole use of Handshake as a potential naming system - only 1 specific use case.

I think we should be super careful about enforcing the preferences of specific use cases of Handshake "permanently" in consensus.

@tynes
Contributor

tynes commented Sep 28, 2019

For transparency purposes, the telegram chat can be found here: https://t.me/hns_tech

Please join us and voice your opinion.

@pinheadmz
Member

I agree that this should not be a consensus-layer feature, but an application-level feature.

Firefox has punycode detection: http://kb.mozillazine.org/Network.IDN_show_punycode
As does Chrome: https://www.chromium.org/developers/design-documents/idn-in-google-chrome

There are also several Chrome extensions to help mitigate: https://chrome.google.com/webstore/search/punycode

...and I found a great test case from this blog post: https://www.xudongz.com/blog/2017/idn-phishing/

You can test it out with your own browser right now:

⚠️ DANGER ATTACK URL TESTING PURPOSES ONLY ⚠️

www.аррӏе.com

When I click that evil link, my browser (Chrome, default settings) displays the punycode:
https://www.xn--80ak6aa92e.com/

You can also test it on a TLD with nothing actually registered:

⚠️ DANGER ATTACK URL TESTING PURPOSES ONLY ⚠️

www.GitHub.cοm

My browser exposes this as http://www.github.xn--cm-jbc/
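
The punycode form can be reproduced outside a browser. The sketch below assumes the simplest possible display policy (show the `xn--` form for every label that contains non-ASCII characters), which is closer to Firefox's `network.IDN_show_punycode` setting than to Chrome's more selective rules. The function names are hypothetical, and the code skips the full IDNA mapping and validation steps.

```python
def to_ascii_label(label):
    """Return an ASCII label unchanged, otherwise its xn-- punycode form."""
    label = label.lower()
    if label.isascii():
        return label
    # Python's built-in "punycode" codec implements the RFC 3492 encoding.
    return "xn--" + label.encode("punycode").decode("ascii")


def display_hostname(hostname):
    """Convert every label of a hostname for display (simplified policy)."""
    return ".".join(to_ascii_label(part) for part in hostname.split("."))


# The fake TLD is Latin c, Greek omicron (U+03BF), Latin m.
print(display_hostname("www.GitHub.c\u03bfm"))  # www.github.xn--cm-jbc
```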

@stp-ip

stp-ip commented Oct 7, 2019

Generally we have two opposing things.
One is flexibility/accessibility in terms of internationalization of TLDs and domains.
The other one is security by providing fewer loopholes for people to defraud others.

In my personal rough and short view the following arguments could be made:

  • blacklisting should either be enforced globally (consensus mechanism) or not at all
  • blacklisting should include ICANN as well as handshake names or else we are enforcing more security on legacy infra than handshake's
  • introducing policy for a blacklist sounds censorship problematic

Further considerations: similar-looking names are a huge problem, and Google's Chrome team has been working on it for quite some time.
In general, CAs or EV certs are usually the trust signal that people rely on, and with Let's Encrypt that becomes less of a good signal.

The specific implementation would have a lot of issues, but Handshake is here to support a more secure root and, with that, a more secure DNS system. Enabling identical-looking registrations does seem counterproductive to this goal.

TL;DR:
Consensus-based blacklist, with huge open questions on an accessible and fair implementation (fair in terms of not focusing on English characters only).

@tynes
Contributor

tynes commented Oct 8, 2019

On Consensus Blacklisting

I think that the best redeeming property of blacklisting at the protocol level is that it will reduce the maximum size of the tree. When a name is blacklisted at the protocol level, it means that a UTXO can never represent that name, and there will never be a namestate associated with it. This means that attempting to OPEN such a name would result in an invalid transaction.

Attempting to come up with a single blacklist that will work forever is not an easy task. I look at it from this perspective because updating such a blacklist would be a hardfork. As the attackers get better at attacking, they will exploit the system and then the blacklist would need to be updated to take into account the new attacks. This is an iterated "cat and mouse" type of game. I do not think it is possible to come up with a list that will last forever. There is also a lot of risk of introducing bias based on who creates such a list. This is particularly dangerous because every hard fork introduces a "point of governance" where the risk of a contentious chain split is higher as well as potentially introducing economic changes and changes in power dynamics. Read about Ethereum's ProgPow if you are curious to see how hardforks can impact a community.


On Policy Blacklisting

Blocking names at the policy level allows for the UTXO of the name to exist, but the associated data cannot be read by the application layer on top of the tree.
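
The difference between the two placements can be sketched in a few lines. Everything here is hypothetical: the set contents, the function names, and the dictionary standing in for the name tree are illustrative assumptions and do not reflect hsd's actual code or data structures.

```python
# Hypothetical: fixed by protocol rules, changing it would be a hard fork.
CONSENSUS_BLACKLIST = {"com"}

# Hypothetical: local to one node or resolver, replaceable at any time.
POLICY_BLACKLIST = {"c\u03bfm"}


def validate_open(name):
    """Consensus level: an OPEN for a blacklisted name is simply invalid,
    so no UTXO or name state can ever exist for it."""
    return name not in CONSENSUS_BLACKLIST


def resolve(name, tree, policy_blacklist=POLICY_BLACKLIST):
    """Policy level: the name may exist in the tree, but this resolver
    refuses to serve its data."""
    if name in policy_blacklist:
        return None
    return tree.get(name)


# A dictionary stands in for the name tree (assumption).
tree = {"c\u03bfm": b"resource data"}
print(validate_open("com"))                             # rejected by consensus
print(resolve("c\u03bfm", tree))                        # hidden by local policy
print(resolve("c\u03bfm", tree, policy_blacklist=set()))  # another node may serve it
```

The last call is the point of the policy approach: swapping the blacklist changes what one node serves without touching what the chain accepts.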

> introducing policy for a blacklist sounds censorship problematic

I do not understand how a policy-based blacklist could result in a censorship problem, because the cost of exit is relatively low. If your favorite service provider is censoring particular names, then you can either switch to another service provider or just run a full node. Running an hsd full node is cheap enough that an Infura-like situation is unlikely to happen. Censorship by service providers can still happen regardless of consensus rules.

A policy-based blacklist could also be used to introduce Pi-hole-like ad-blocking features. I see this as a huge plus, because Pi-hole-like ad-blocking is nice: all machines on the network get it, and it's something that people actually want and use today. This is orthogonal to this particular discussion because it should be considered either way.


On Homographs

If homographs of ICANN names are blocked, then that puts ICANN names in a privileged position over Handshake names on the network, as Handshake names have homographs too. h/t @kilpatty

Many people have been scammed by cryptocurrency projects and I think that we should do everything that we can to prevent people from scamming. I think that part of the blockchain movement (in the West at least) is about empowering individuals, so I think that education around domains for user safety could help community members. People have been saying that for years and the situation has always been the same - users prefer convenience and would rather fill their minds with entertainment instead of precaution, but the cryptocurrency world requires precaution to navigate safely. This is a tradeoff between censorship and usability, and there is no easy answer. I'm sure that different community members will have differing opinions over time.

To fight scamming, I think it's possible to develop an algorithm to calculate all of the punycode-based homographs of a set of names. I think that it makes sense to create an open source project that can do this, and then allow users to configure it (like what @jacobhaven posted above). Most users will not want to do all of this, so community members with good reputation can create blacklists and make them public. The cost of creating a new blacklist needs to be extremely low.
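
As a rough sketch of what such an enumeration could look like, the snippet below expands a name into every combination of look-alike characters. The `LOOKALIKES` table is a hypothetical two-entry example; which characters count as confusable is exactly the part users would configure.

```python
from itertools import product

# Hypothetical look-alike table: each Latin letter maps to itself plus
# visually similar characters from other scripts.
LOOKALIKES = {
    "c": ["c", "\u0441"],            # Latin c, Cyrillic es
    "o": ["o", "\u043e", "\u03bf"],  # Latin o, Cyrillic o, Greek omicron
}


def homographs(name):
    """Return every look-alike spelling of name, excluding name itself."""
    choices = [LOOKALIKES.get(ch, [ch]) for ch in name]
    return {"".join(combo) for combo in product(*choices)} - {name}


variants = homographs("com")
print(len(variants))  # 2 * 3 * 1 combinations, minus the original
```

The number of variants is the product of the per-character alternatives, so it grows very quickly with name length and table size, which is why a published blacklist may be better expressed as a rule than as a list.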


ENS Blacklist

ENS introduced a blacklist to help prevent scammers and squatters. You can see in their architecture diagram that the blacklist is decoupled from the ENS contracts. This leads me to believe that anybody can implement their own blacklist as long as it implements the correct interface.

See https://medium.com/the-ethereum-name-service/the-ens-blacklist-406016319e67

I think that we should be learning from the ENS community and collaborating when it makes sense.

@rsolari
Contributor

rsolari commented Oct 8, 2019

> algorithm to calculate all of the punycode based homographs of a set of names

The Chrome team has done good work to mitigate punycode attacks. This page describes their rules for when to show punycode vs. unicode:

https://www.chromium.org/developers/design-documents/idn-in-google-chrome

@tynes
Contributor

tynes commented Oct 9, 2019

Potentially slightly offtopic but related as it was mentioned above:

> While it may not be perfectly ideal when it comes to using HNS for DNS/Root Zone Resolution, that is not the sole use of Handshake as a potential naming system - only 1 specific use case.
> I think we should be super careful about enforcing the preferences of specific use cases of Handshake "permanently" in consensus.

Note: please correct any poor logic

There are 680 million HNS to mine out of 2.04 billion in the total supply, with 2000 HNS created every block and a halving interval of about 3.25 years (170000 blocks). Since Handshake difficulty adjusts every block, using time to calculate won't be as inaccurate as using time to calculate such things on Bitcoin (halvings came early).

2000 * 6 * 24 * 365 * 3.25 = ~341,640,000 newly created HNS by the first halving.

680,000,000 (initial supply)
341,640,000 (emissions before halving 1)
170,820,000 (emissions between halvings 1 and 2)
85,410,000 (emissions between halvings 2 and 3)

1,277,870,000 total HNS at ~10 years into the project, or roughly 63% of the total supply. Depending on the miner economics, transaction fees may or may not be a large part of their income because of the 3.25 year halvings. Miner income pays for the security of the network. If the security budget is low, then it puts the DNS application at risk. I think it's a good idea to really focus on Handshake as a naming system, but if that single application does not drive enough miner revenue, then other applications that want to use Handshake as a data availability layer may help to keep the network as a whole secure. 512 bytes of incentivized availability is pretty nice.
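
The estimate is easy to re-run. The sketch below uses the figures from this comment as given (680 million initial supply, 2000 HNS per block, 10-minute blocks, 3.25-year halvings) and takes each interval's emission as exactly half of the previous one. The constant and function names are hypothetical, and the inputs are this comment's assumptions rather than values checked against the protocol.

```python
BLOCKS_PER_YEAR = 6 * 24 * 365    # 10-minute blocks, as in the estimate above
HALVING_YEARS = 3.25              # approximate halving interval
INITIAL_REWARD = 2000             # HNS per block before the first halving
INITIAL_SUPPLY = 680_000_000      # figure used in the comment above
TOTAL_SUPPLY = 2_040_000_000


def emitted(intervals):
    """HNS created over the given number of halving intervals."""
    blocks = BLOCKS_PER_YEAR * HALVING_YEARS
    return sum(INITIAL_REWARD / 2**i * blocks for i in range(intervals))


total = INITIAL_SUPPLY + emitted(3)
print(f"{emitted(1):,.0f} HNS emitted before the first halving")
print(f"{total:,.0f} HNS after three intervals")
print(f"{total / TOTAL_SUPPLY:.1%} of the total supply")
```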

Some good posts on the topic:
https://www.truthcoin.info/blog/security-budget/
http://randomwalker.info/publications/mining_CCS.pdf

@turbomaze
Contributor

I'm leaning towards a policy blacklist because:

  1. Resolvers can always add their own blacklist regardless of what we decide here.
  2. The attacker cat/mouse game discussed above requires repeated hardforks if the blacklist is consensus.

@pinheadmz
Member

See #301 (comment)

Like the gTLD reservation-expiration issue, this sort of problem must be solved at the application layer, not the consensus or even resolver layer (see how Chrome handles punycode, etc.).

@pinheadmz
Member

Please see https://github.com/pinheadmz/holdmyhand which requires update to hsd (review requested!) #558
