TLDExtractor

C# Library

TLDExtractor is similar to the Python tldextract library. With TLDExtractor, you can accurately extract the subdomain, domain, and domain suffix (effective TLD) of a given domain name.

It is a common mistake to split a domain name on the '.' character and treat the last part as the domain suffix. For example, in 'www.yahoo.co.uk' the suffix is 'co.uk', not 'uk'.
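The difference can be seen in a minimal, self-contained sketch. It is not TLDExtractor's implementation: the class `SuffixSketch`, its method names, and the tiny hard-coded suffix set are hypothetical and stand in for a real suffix list, and the wildcard and exception rules that a real list contains are ignored.

```csharp
using System;
using System.Collections.Generic;

public static class SuffixSketch
{
    // Hypothetical stand-in for a real suffix list (assumption: three entries only).
    private static readonly HashSet<string> KnownSuffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "com", "uk", "co.uk" };

    // Naive approach: treat the last label as the suffix.
    public static string NaiveSuffix(string host)
    {
        string[] labels = host.Split('.');
        return labels[labels.Length - 1];
    }

    // Suffix-aware approach: try the longest trailing run of labels first
    // and return the first one found in the known set.
    public static string LongestKnownSuffix(string host)
    {
        string[] labels = host.Split('.');
        for (int start = 0; start < labels.Length; start++)
        {
            string candidate = string.Join(".", labels, start, labels.Length - start);
            if (KnownSuffixes.Contains(candidate))
            {
                return candidate;
            }
        }
        return string.Empty; // no known suffix matched
    }
}
```

For 'www.yahoo.co.uk', `NaiveSuffix` yields 'uk', while `LongestKnownSuffix` yields 'co.uk' because the longer entry is tried before the shorter one.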

TLDExtractor uses the Public Suffix List to correctly identify the suffix of a given domain name. To extract the subdomain, domain, and domain suffix, use the TLDExtractor.Extract methods:

var result = TLDExtractor.Extract("www.yahoo.co.uk");
// result.ToString() ->
//    {ExtractResult(subdomain='www', domain='yahoo', suffix='co.uk', suffix type='ICANN')}

// nom.ae is a private suffix submitted by Dave McCormack <dave.mccormack@nymnom.com>
var privateResult = TLDExtractor.Extract("www.test.nom.ae");
// privateResult.ToString() ->
//    {ExtractResult(subdomain='www', domain='test', suffix='nom.ae', suffix type='Private')}

// you can also pass a Uri
Uri guestUrl = new Uri("http://www.example.com/mine.html");
var uriResult = TLDExtractor.Extract(guestUrl);
// uriResult.ToString() ->
//    {ExtractResult(subdomain='www', domain='example', suffix='com', suffix type='ICANN')}

If the domain name is not valid, TLDExtractor.Extract throws a TLDExtractorException with a message explaining the problem. According to RFC 1035, a domain name must be at most 255 characters long, and each label (such as yahoo or www) must be at most 63 characters long. Labels also cannot be empty; for example, test..com is not valid.
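The three rules above can be sketched as a standalone check. This is only an illustration, not the library's validation code: `DomainNameCheck`, `IsValid`, and the wording of the messages are hypothetical, and the sketch counts characters, which matches the octet limits in RFC 1035 only for ASCII names.

```csharp
using System;

public static class DomainNameCheck
{
    public const int MaxNameLength = 255;  // whole name, per RFC 1035
    public const int MaxLabelLength = 63;  // single label, per RFC 1035

    // Returns true when the name passes all three checks; otherwise returns
    // false and sets 'problem' to a short description of the first failure.
    public static bool IsValid(string name, out string problem)
    {
        problem = string.Empty;

        if (string.IsNullOrEmpty(name))
        {
            problem = "domain name is empty";
            return false;
        }
        if (name.Length > MaxNameLength)
        {
            problem = "domain name is longer than 255 characters";
            return false;
        }

        // Note: a trailing dot ("example.com.") produces an empty last label
        // and is therefore rejected by this simplified check.
        foreach (string label in name.Split('.'))
        {
            if (label.Length == 0)
            {
                problem = "domain name contains an empty label";
                return false;
            }
            if (label.Length > MaxLabelLength)
            {
                problem = "a label is longer than 63 characters";
                return false;
            }
        }
        return true;
    }
}
```

With this sketch, 'www.yahoo.co.uk' passes, while 'test..com' fails with the empty-label message.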

In addition to TLDExtractor.Extract, you can use TLDExtractor.TryExtract to extract the parts of a domain name or URL. The only difference between the two is that TLDExtractor.TryExtract does not throw a TLDExtractorException; it returns false if the given domain name is not valid.

ExtractResult result;
bool isExtracted = TLDExtractor.TryExtract("www.yahoo.co.uk", out result);

if(isExtracted)
{
    // do something
}
else
{
    // something went wrong
}

By default, TLDExtractor downloads the Public Suffix List from the Internet if the file does not exist in the working directory or is too old (created more than 30 days ago). You can override these defaults:

TLDExtractor.SuffixListFilePath = "change_the_default_file_name.txt";

TLDExtractor.RenewAfterNDays = 60; 

TLDExtractor.RenewAfterNDays = -1; // to disable renewal mechanism
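The renewal rule described above can be sketched as a small decision function. This is a guess at the logic, not the library's code: `RenewalSketch` and `NeedsDownload` are hypothetical names, and the sketch assumes that any negative value disables renewal and that a missing file is still downloaded even when renewal is disabled.

```csharp
using System;
using System.IO;

public static class RenewalSketch
{
    // Decides whether the suffix list file should be downloaded (again).
    public static bool NeedsDownload(string path, int renewAfterNDays, DateTime utcNow)
    {
        // Assumption: a missing file is always fetched.
        if (!File.Exists(path))
        {
            return true;
        }

        // Assumption: a negative value (such as -1) turns renewal off.
        if (renewAfterNDays < 0)
        {
            return false;
        }

        // The file is stale once it was created more than N days ago.
        DateTime createdUtc = File.GetCreationTimeUtc(path);
        return (utcNow - createdUtc).TotalDays > renewAfterNDays;
    }
}
```

Passing the current time as a parameter keeps the function easy to exercise with different dates, for example `RenewalSketch.NeedsDownload("change_the_default_file_name.txt", 60, DateTime.UtcNow)`.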

DissectMalware/TLDExtractor

Folders and files

Latest commit

History

Repository files navigation

TLDExtractor

C# Library

About

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages