Blacklist should include homographs of ICANN TLDs #255

0xhaven · 2019-09-26T08:06:54Z

We expect Handshake names to be resolved, trusted, and displayed in an equivalent way to regular ICANN TLDs.
We also support registration of IDNA (punycode) names.

Because of this, simply blacklisting exact string matches of ICANN TLDs is not sufficient. It's extremely easy to conduct IDN homograph attacks by registering a punycode domain name that visually resembles (or is identical to) an ICANN TLD.
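
To make the gap concrete, here is a minimal Python sketch of why an exact-match check misses a lookalike. It uses only the standard-library `punycode` codec. The helper `to_ace` and the three-entry sample set are assumptions for illustration, not part of hsd.

```python
# Illustrative only: an exact-match blacklist does not catch a lookalike of "com".

def to_ace(label: str) -> str:
    """Return the ASCII-compatible ("xn--") form of a single label."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")

icann_tlds = {"com", "org", "net"}   # tiny sample of an exact-match blacklist
lookalike = "c\u03bfm"               # 'c', GREEK SMALL LETTER OMICRON, 'm'

print(lookalike in icann_tlds)       # False: the string is not literally "com"
print(to_ace(lookalike))             # xn--cm-jbc
```

The name is stored and registered as `xn--cm-jbc`, but a client that renders the Unicode form shows something visually indistinguishable from `com`.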

Depending on how you configure the "confusable" homoglyph database, this would involve blacklisting at least hundreds of millions of names (my broad search found ~3 billion, or ~92 GiB of hash data). Initially I hoped we could store everything in a static data structure committed to in the genesis block, but it's probably necessary to commit only to a description of the blacklist grammar and use that for dynamic detection of homograph collisions when handling (OPEN) TXOs.
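
As a rough idea of what dynamic detection could look like, the sketch below folds each name to a "skeleton" by replacing lookalike characters with their ASCII counterparts, then checks the skeleton against the reserved set. The `CONFUSABLES` table is a hypothetical, deliberately tiny sample, and `skeleton` and `is_homograph` are made-up names. A real rule would need a much larger, carefully specified table.

```python
import unicodedata

# Hypothetical sample table (lookalike code point -> ASCII letter).
CONFUSABLES = {
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u04cf": "l",  # CYRILLIC SMALL LETTER PALOCHKA
}

def skeleton(label: str) -> str:
    """Normalize a label and fold known lookalikes to ASCII."""
    normalized = unicodedata.normalize("NFKC", label).lower()
    return "".join(CONFUSABLES.get(ch, ch) for ch in normalized)

def is_homograph(label: str, reserved: set) -> bool:
    """True if the label is not itself reserved but folds to a reserved name."""
    return label not in reserved and skeleton(label) in reserved

reserved = {"com", "apple"}
print(is_homograph("c\u03bfm", reserved))                        # True
print(is_homograph("\u0430\u0440\u0440\u04cf\u0435", reserved))  # True
print(is_homograph("com", reserved))                             # False (exact match)
```

The appeal of this shape is that only the table and the folding rule need to be committed to, not billions of precomputed hashes.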

My current work on enumerating the homographs is here: https://github.com/namebasehq/idn-homographs

I'll also write some much simpler code to detect homographs, but I need guidance on exactly how we should encode that algorithm in consensus.

Out-of-scope:

Homographs of other (non-TLD) reserved names.
Similar attacks on Handshake names that have been registered.

0xhaven · 2019-09-26T08:07:04Z

@boymanjor

tynes · 2019-09-26T19:32:09Z

I agree that blacklisting homographs is a way to protect users, but I am not sure if doing it at the level of blockchain consensus is the correct way to do it. That means that changing the blacklist is likely a hardfork, which adds a lot of risk to the system as a contentious hardfork will split the namespace and make the system much less usable as a whole.

I think that blacklists should be implemented at the policy level, which means that different versions of the software can implement different blacklists and the blacklists can change over time. The cost for a user to change blacklists should be very low, so that competition drives the best blacklist to the top. If a user cannot convince their software or service provider to change the blacklist, then they should be able to easily run their own node with a new blacklist or change to a different service provider.

Since the blockchain itself is a data availability layer and DNS is built on top of it at the application layer, adding DNS logic to consensus doesn't really make sense. I agree that it isn't ideal that users can purchase a homograph of com and then not have it be served because it is blacklisted, but this is experimental blockchain software and users should DYOR before participating. A user can purchase a homograph of com and use the 512 bytes of data availability for non-DNS uses. Having incentivized availability is not something that is easy to come by.

The blacklist will need to be updated if a new ICANN TLD is created. That is already going to be a time of governance and would add more complexity to the decision making process. The points of governance should be as simple as possible, otherwise it becomes hard to scale socially.

kilpatty · 2019-09-26T20:30:24Z

Just following up from the Telegram chat here.

Agree w/ Mark when it comes to having the blacklist be a policy blacklist. While it may not be perfectly ideal when it comes to using HNS for DNS/Root Zone Resolution, that is not the sole use of Handshake as a potential naming system - only 1 specific use case.

I think we should be super careful about enforcing the preferences of specific use cases of Handshake "permanently" in consensus.

tynes · 2019-09-28T00:05:12Z

For transparency purposes, the telegram chat can be found here: https://t.me/hns_tech

Please join us and voice your opinion.

pinheadmz · 2019-09-30T16:17:50Z

I agree that this should not be a consensus-layer feature, but an application-level feature.

Firefox has punycode detection: http://kb.mozillazine.org/Network.IDN_show_punycode
As does Chrome: https://www.chromium.org/developers/design-documents/idn-in-google-chrome

There are also several chrome extensions to help mitigate: https://chrome.google.com/webstore/search/punycode

...and I found a great test case from this blog post: https://www.xudongz.com/blog/2017/idn-phishing/

You can test it out with your own browser right now:

⚠️ DANGER ATTACK URL TESTING PURPOSES ONLY ⚠️

www.аррӏе.com

When I click that evil link, my browser (Chrome, default settings) displays the punycode:
https://www.xn--80ak6aa92e.com/

You can also test it on a TLD with nothing actually registered:

⚠️ DANGER ATTACK URL TESTING PURPOSES ONLY ⚠️

www.GitHub.cοm

My browser exposes this as http://www.github.xn--cm-jbc/
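
For anyone who wants to see what is hiding inside a punycode label like the one above, here is a small Python sketch that decodes it and names each code point. `describe` is a hypothetical helper, and using the first word of the Unicode character name as a stand-in for "script" is a crude assumption, not how Chrome decides what to display.

```python
import unicodedata

def describe(ace_label: str) -> list:
    """Decode an "xn--" label and return the Unicode name of each character."""
    text = ace_label[len("xn--"):].encode("ascii").decode("punycode")
    return [unicodedata.name(ch) for ch in text]

names = describe("xn--cm-jbc")
for name in names:
    print(name)
# LATIN SMALL LETTER C
# GREEK SMALL LETTER OMICRON
# LATIN SMALL LETTER M

# Crude script proxy: first word of each character name.
scripts = {name.split()[0] for name in names}
print(len(scripts) > 1)  # True: the label mixes scripts
```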

stp-ip · 2019-10-07T15:01:07Z

Generally we have two opposing things.
One is flexibility/accessibility in terms of internationalization of TLDs and domains.
The other one is security by providing fewer loopholes for people to defraud others.

In my personal rough and short view the following arguments could be made:

blacklisting should either be enforced globally (consensus mechanism) or not at all
blacklisting should include ICANN as well as handshake names or else we are enforcing more security on legacy infra than handshake's
introducing policy for a blacklist sounds censorship problematic

Further considerations: similar-looking names are a huge problem, and Google's Chrome team has been working on it for quite some time.
In general CAs or EV certs are usually the separating/trusting factor for people to judge, which with Let's Encrypt becomes less of a good signal.

The specific implementation would have a lot of issues, but handshake is here to support a more secure root and with that a more secure DNS system. Enabling identically looking registrations does seem counterproductive to this goal.

TL;DR:
Consensus-based blacklist, with huge open questions on an accessible and fair implementation (fair in terms of not focusing on English characters only).

tynes · 2019-10-08T18:59:14Z

On Consensus Blacklisting

I think that the best redeeming property of blacklisting at the protocol level is that it will reduce the maximum size of the tree. When a name is blacklisted at the protocol level, it means that a UTXO can never represent that name, and there will never be a namestate associated with it. This means that attempting to OPEN such a name would result in an invalid transaction.

Attempting to come up with a single blacklist that will work forever is not an easy task. I look at it from this perspective because updating such a blacklist would be a hardfork. As the attackers get better at attacking, they will exploit the system and then the blacklist would need to be updated to take into account the new attacks. This is an iterated "cat and mouse" type of game. I do not think it is possible to come up with a list that will last forever. There is also a lot of risk of introducing bias based on who creates such a list. This is particularly dangerous because every hard fork introduces a "point of governance" where the risk of a contentious chain split is higher as well as potentially introducing economic changes and changes in power dynamics. Read about Ethereum's ProgPow if you are curious to see how hardforks can impact a community.

On Policy Blacklisting

Blocking names at the policy level allows for the UTXO of the name to exist, but the associated data cannot be read by the application layer on top of the tree.
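
To illustrate the difference between the two layers, here is a toy Python sketch. Every name in it (`consensus_accepts_open`, `policy_resolve`, the dictionary standing in for the name tree) is hypothetical and does not correspond to hsd code.

```python
# Toy model only: a dict stands in for the on-chain name state.

def consensus_accepts_open(name: str, consensus_blacklist: set) -> bool:
    """Consensus rule: an OPEN for a blacklisted name is an invalid transaction,
    so no name state for it can ever exist."""
    return name not in consensus_blacklist

def policy_resolve(name: str, name_state: dict, policy_blacklist: set):
    """Policy rule: the name state exists on chain, but this node declines
    to hand its data to the application layer."""
    if name in policy_blacklist:
        return None
    return name_state.get(name)

name_state = {"xn--cm-jbc": b"resource data"}

# Same chain, two nodes with different local policy:
print(policy_resolve("xn--cm-jbc", name_state, {"xn--cm-jbc"}))  # None
print(policy_resolve("xn--cm-jbc", name_state, set()))           # b'resource data'
```

Changing the set passed to `policy_resolve` is a local configuration change. Changing the set used by `consensus_accepts_open` changes which blocks are valid.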

introducing policy for a blacklist sounds censorship problematic

I do not understand how a policy-based blacklist could result in a censorship problem, because the cost of exit is relatively low. If your favorite service provider is censoring particular names, then you can either switch to another service provider or just run a full node. Running an hsd full node is cheap enough that an Infura-like situation is unlikely to happen. Censorship by service providers can still happen regardless of consensus rules.

A policy-based blacklist could also be used to introduce Pi-hole-like ad-blocking features. I see this as a huge plus, because Pi-hole-like ad blocking is nice: all machines on the network get it, and it's something that people actually want and use today. This is orthogonal to this particular discussion because it should be considered either way.

On Homographs

If homographs of ICANN names are blocked, then that puts ICANN names in a privileged position over Handshake names on the network, as Handshake names have homographs too. h/t @kilpatty

Many people have been scammed by cryptocurrency projects and I think that we should do everything that we can to prevent people from scamming. I think that part of the blockchain movement (in the West at least) is about empowering individuals, so I think that education around domains for user safety could help community members. People have been saying that for years and the situation has always been the same - users prefer convenience and would rather fill their minds with entertainment instead of precaution, but the cryptocurrency world requires precaution to navigate safely. This is a tradeoff between censorship and usability, and there is no easy answer. I'm sure that different community members will have differing opinions over time.

To fight scamming, I think it's possible to develop an algorithm to calculate all of the punycode-based homographs of a set of names. I think that it makes sense to create an open source project that can do this, and then allow users to configure it (like what @jacobhaven posted above). Most users will not want to do all of this, so community members with good reputation can create blacklists and make them public. The cost of creating a new blacklist needs to be extremely low.
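
A minimal Python sketch of such an enumeration, assuming a user-configurable table of lookalikes. The `LOOKALIKES` table here is a hypothetical two-letter sample and `homographs` is a made-up name.

```python
from itertools import product

# Hypothetical sample table: ASCII letter -> lookalike code points.
LOOKALIKES = {
    "c": ["\u0441"],            # CYRILLIC SMALL LETTER ES
    "o": ["\u03bf", "\u043e"],  # GREEK SMALL LETTER OMICRON, CYRILLIC SMALL LETTER O
}

def homographs(name: str) -> set:
    """Return the "xn--" form of every lookalike spelling of `name`."""
    choices = [[ch] + LOOKALIKES.get(ch, []) for ch in name]
    variants = {"".join(combo) for combo in product(*choices)}
    variants.discard(name)  # the exact name belongs to the ordinary blacklist
    return {"xn--" + v.encode("punycode").decode("ascii") for v in variants}

blacklist = homographs("com")
print(len(blacklist))               # 5
print("xn--cm-jbc" in blacklist)    # True
```

The number of variants is the product of the alternatives per character, so longer names and richer tables grow the list very quickly.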

ENS Blacklist

ENS introduced a blacklist to help prevent scammers and squatters. You can see in their architecture diagram that the blacklist is decoupled from the ENS contracts. This leads me to believe that anybody can implement their own blacklist as long as it implements the correct interface.

See https://medium.com/the-ethereum-name-service/the-ens-blacklist-406016319e67

I think that we should be learning from the ENS community and collaborating when it makes sense.

rsolari · 2019-10-08T19:16:52Z

algorithm to calculate all of the punycode based homographs of a set of names

The Chrome team has done good work to mitigate punycode attacks. This page describes their rules for when to show punycode vs. unicode:

https://www.chromium.org/developers/design-documents/idn-in-google-chrome

tynes · 2019-10-09T01:47:25Z

Potentially slightly offtopic but related as it was mentioned above:

While it may not be perfectly ideal when it comes to using HNS for DNS/Root Zone Resolution, that is not the sole use of Handshake as a potential naming system - only 1 specific use case.
I think we should be super careful about enforcing the preferences of specific use cases of Handshake "permanently" in consensus.

Note: please correct any poor logic

There are 680 million HNS to mine out of 2.04 billion in the total supply, with 2000 HNS created every block and a halving interval of about 3.25 years (170000 blocks). Since Handshake difficulty adjusts every block, using time to calculate won't be as inaccurate as using time to calculate such things on Bitcoin (halvings came early).

2000 * 6 * 24 * 365 * 3.25 = ~341,640,000 newly created HNS by the first halving.

680,000,000 (initial supply)
341,640,000 (halving 1 emissions)
170,820,000 (halving 2 emissions)
85,410,000 (halving 3 emissions)

1,277,870,000 total HNS at ~10 years into the project, or a bit over 60% of the total supply. Depending on the miner economics, transaction fees may or may not be a large part of their income because of the 3.25 year halvings. Miner income pays for the security of the network. If the security budget is low, then it puts the DNS application at risk. I think it's a good idea to really focus on Handshake as a naming system, but if that single application does not drive enough miner revenue, then other applications that want to use Handshake as a data availability layer may help to keep the network as a whole secure. 512 bytes of incentivized availability is pretty nice.
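
For anyone who wants to check the arithmetic, this short Python sketch recomputes the figures purely from the inputs stated in this comment (2000 HNS per block, 6 blocks per hour, 3.25-year periods, a 680 million starting figure, 2.04 billion total). The constant and function names are made up for the example, and it does not consult the actual chain parameters.

```python
BLOCK_REWARD = 2000                            # HNS per block, as stated above
BLOCKS_PER_PERIOD = int(6 * 24 * 365 * 3.25)   # 170,820 blocks at 6 per hour
STARTING_FIGURE = 680_000_000                  # starting figure used in this comment
TOTAL_SUPPLY = 2_040_000_000

def emissions(periods: int) -> list:
    """HNS emitted in each period, with the reward halving after every period."""
    return [(BLOCK_REWARD // 2**i) * BLOCKS_PER_PERIOD for i in range(periods)]

per_period = emissions(3)
total = STARTING_FIGURE + sum(per_period)

print(per_period)                       # [341640000, 170820000, 85410000]
print(total)                            # 1277870000
print(round(total / TOTAL_SUPPLY, 3))   # 0.626
```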

Some good posts on the topic:
https://www.truthcoin.info/blog/security-budget/
http://randomwalker.info/publications/mining_CCS.pdf

turbomaze · 2019-11-01T00:13:43Z

I'm leaning towards a policy blacklist because:

Resolvers can always add their own blacklist regardless of what we decide here.
The attacker cat/mouse game discussed above requires repeated hardforks if the blacklist is consensus.

pinheadmz · 2020-01-13T18:15:06Z

See #301 (comment)

Like the gTLD reservation-expiration issue, this sort of problem must be solved at the application layer, not the consensus or even resolver layer (see how chrome handles punycode, etc).

pinheadmz · 2021-02-10T20:49:08Z

Please see https://github.com/pinheadmz/holdmyhand which requires an update to hsd (review requested!) #558

0xhaven mentioned this issue Oct 10, 2019

[consensus] gTLDs are always reserved #256

Closed

pinheadmz closed this as completed Jan 13, 2020

dnsguru mentioned this issue Feb 4, 2021

dns/server: blacklist rfc2606 domains in the resolver #544

Closed

pinheadmz mentioned this issue Feb 9, 2021

Make Root nameserver more flexible for plugins #558

Merged
