Mantis skips discovery phase for TLDs that are reserved for government entities #43

Open
0xbharath opened this issue Sep 4, 2024 · 4 comments
Labels: bug, good first issue, hackathon, hacktoberfest, help wanted

Comments

@0xbharath
Collaborator

Describe the bug
Mantis seems to skip the discovery phase when the target is a suffix reserved for country or government entities (for example, gov.my).

To Reproduce

mantis onboard -o gov:my -t gov.my

[2024-09-04 15:36:36,773] --> INFO: MANTIS Workflow - STARTED
[2024-09-04 15:36:36,773] --> INFO: Executing workname workflowName='default' schedule='daily between 00:00 and 04:00' cmd=[] scanNewOnly=False workflowConfig=[Module(moduleName='discovery', tools=['Subfinder', 'Amass'], order=1), Module(moduleName='prerecon', tools=['FindCDN', 'Naabu'], order=2), Module(moduleName='activehostscan', tools=['HTTPX_Active', 'HTTPX'], order=3), Module(moduleName='activerecon', tools=['Wafw00f'], order=4), Module(moduleName='scan', tools=['DNSTwister', 'Nuclei', 'Corsy'], order=5), Module(moduleName='secretscanner', tools=['SecretScanner'], order=6)]
[2024-09-04 15:36:36,793] --> INFO: Inserting user input into database

0it [00:00, ?it/s]

PRERECON: 100%|

ACTIVEHOSTSCAN: 100%|

System (please complete the following information):

Docker based setup on Ubuntu 24.04.

Additional context

This seems to happen due to the library that is used to categorize the input provided.

@0xbharath added the bug, good first issue, and help wanted labels on Sep 4, 2024
@0xbharath
Collaborator Author

The issue seems to be in how the tldextract library is used in the file mantis/utils/asset_type.py.

>>> import tldextract
>>> tldextract.extract("example.com").registered_domain
'example.com'
>>> tldextract.extract("nic.in").registered_domain
''

tldextract uses the Public Suffix List to parse TLDs: https://publicsuffix.org/list/public_suffix_list.dat
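To illustrate why a suffix-only input yields an empty registered domain, here is a minimal sketch of longest-suffix matching. It is a simplified stand-in and not tldextract's actual implementation; `SAMPLE_SUFFIXES` and `split_host` are hypothetical names, and the suffix set is a tiny hand-picked subset assumed to mirror entries in the real list.

```python
# Simplified illustration of public-suffix matching; NOT tldextract's code.
# SAMPLE_SUFFIXES is a hand-picked subset standing in for the real list.
SAMPLE_SUFFIXES = {"com", "in", "nic.in", "my", "gov.my"}


def split_host(host: str) -> tuple[str, str]:
    """Return (registered_domain, suffix) using longest-suffix matching."""
    labels = host.lower().strip(".").split(".")
    # Try the longest candidate suffix first, e.g. "nic.in" before "in".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in SAMPLE_SUFFIXES:
            if i == 0:
                # The whole input is a suffix: nothing is registered under it.
                return "", candidate
            # Registered domain = one label to the left of the suffix.
            return ".".join(labels[i - 1:]), candidate
    return "", ""


print(split_host("example.com"))
print(split_host("nic.in"))
print(split_host("example.nic.in"))
```

Under this model, "nic.in" and "gov.my" match a suffix entry in full, so there is no label left over to form a registered domain.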

@0xbharath added the hackathon and hacktoberfest labels on Sep 12, 2024
@dmdhrumilmistry

Shouldn't this issue be fixed at the source?

@0xbharath
Collaborator Author

Ideally, yes. It would be tricky to get the library to incorporate these changes. We are trying to see if we can find a workaround or use a different library to fix this issue.

@dmdhrumilmistry

dmdhrumilmistry commented Oct 5, 2024

After thinking about it, I don't think there's anything wrong with the library. nic.in is supposed to be used as a TLD, so if you use the library to extract the registered domain from a string consisting only of a TLD, it should return an empty string.

>>> import tldextract
# querying str with TLD only
>>> tldextract.extract("com").registered_domain
''
>>> tldextract.extract("nic.in").registered_domain
''

# querying str with labels + tld
>>> tldextract.extract("example.com").registered_domain
'example.com'
>>> tldextract.extract("subdomain.example.com").registered_domain
'example.com'
>>> tldextract.extract("example.nic.in").registered_domain # works since it has label + TLD
'example.nic.in'
>>> tldextract.extract("subdomain.example.nic.in").registered_domain
'example.nic.in'
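If the library's behaviour is accepted as correct, one possible workaround on the Mantis side is to fall back to treating a suffix-only input as a domain. The sketch below is an assumption about how that could look, not the code in mantis/utils/asset_type.py: `classify_target`, `Parts`, `fake_extract`, and the returned labels are all hypothetical. The extractor is passed in as a callable so the example runs without the library or network access; the field names `subdomain`, `domain`, and `suffix` follow tldextract's result object.

```python
from typing import Any, Callable, NamedTuple


class Parts(NamedTuple):
    """Stand-in for a parsed result; field names follow tldextract's."""
    subdomain: str
    domain: str
    suffix: str


def classify_target(target: str, extract: Callable[[str], Any]) -> str:
    """Hypothetical classifier; labels are illustrative, not Mantis's own."""
    parts = extract(target)
    if parts.domain and parts.suffix:
        return "subdomain" if parts.subdomain else "domain"
    if parts.suffix:
        # Suffix-only input such as "gov.my" or "nic.in": the registered
        # domain is empty, so fall back to treating the input as a domain.
        return "domain"
    return "unknown"


def fake_extract(target: str) -> Parts:
    """Hard-coded stub so the example runs without the real library."""
    table = {
        "gov.my": Parts("", "", "gov.my"),
        "example.nic.in": Parts("", "example", "nic.in"),
        "subdomain.example.com": Parts("subdomain", "example", "com"),
    }
    return table.get(target, Parts("", "", ""))


for t in ("gov.my", "example.nic.in", "subdomain.example.com"):
    print(t, "->", classify_target(t, fake_extract))
```

In real use, `tldextract.extract` could be passed in place of `fake_extract`. Whether a suffix-only target should be scanned as a domain at all is a design decision for the maintainers, since it could widen the scan scope considerably.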


3 participants