parser for fqdn #2459

Open
Mzack9999 wants to merge 2 commits into dev from improve-fqdn-extraction

Conversation

@Mzack9999
Member

@Mzack9999 Mzack9999 commented Mar 21, 2026

Proposed changes

Close #1773

Checklist

  • Pull request is created against the dev branch
  • All checks passed (lint, unit/integration/regression tests etc.) with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Summary by CodeRabbit

  • Bug Fixes

    • Improved domain extraction by parsing HTML attributes, meta tags, and embedded scripts for more accurate results.
    • Better normalization and validation of discovered domains, and safer handling of large or complex script contents to avoid hangs.
  • Tests

    • Added extensive test coverage and benchmarks covering many HTML/JS patterns, edge cases, false-positive rejection, and performance scenarios.

@neo-by-projectdiscovery-dev

neo-by-projectdiscovery-dev bot commented Mar 21, 2026

Neo - PR Security Review

No security issues found

Highlights

  • Adds HTML parsing with goquery to extract domains from URL-bearing attributes (href, src, action, formaction, poster, etc.)
  • Adds JavaScript AST parsing with goja to extract domains from string literals and template literals in inline scripts
  • Implements comprehensive domain validation including TLD verification, package name rejection, and file extension filtering
  • Includes 512KB size limit for JS parsing and panic recovery to prevent DoS
Hardening Notes
  • Consider adding a timeout for HTML/JS parsing operations in domains.go to prevent resource exhaustion on extremely large or deeply nested documents
  • The regex pattern [a-z0-9.-]* in potentialDomainRegex at line 19 is safe due to Go's RE2 engine, but consider documenting this explicitly in a comment
  • In extractDomainsFromHTML (line 70), consider adding a maximum depth limit for nested HTML structures as an additional safeguard


@coderabbitai

coderabbitai bot commented Mar 21, 2026

Walkthrough

This PR enhances BodyDomainGrab by parsing HTML (goquery) and JavaScript (goja AST) when HTML is detected, extracting candidate hostnames from element attributes, srcset, meta tags, and script contents, then normalizing/validating domains and using regex as a fallback.

Changes

  • Core Domain Extraction (common/httpx/domains.go): Rewrote BodyDomainGrab to detect HTML bodies, parse with goquery, extract domains from href, src, action, srcset, meta[content], data-* attributes and inline <script> contents; added JS AST parsing via goja (skip >512KB). Introduced helpers addDomainCandidate and hostnameFromURL, moved regex matching to fallback.
  • Tests & Fixtures (common/httpx/domains_test.go, common/httpx/test-data/sample_with_js.html): Added extensive unit tests and benchmarks covering HTML/JS extraction, many edge cases (empty/broken bodies, deduplication, false-positive patterns), helper tests for hostnameFromURL/addDomainCandidate, and an embedded HTML fixture exercising varied URL patterns.
  • Dependencies (go.mod): Added direct dependency github.com/dop251/goja (JS parser) and introduced transitive github.com/go-sourcemap/sourcemap in the module graph.

Sequence Diagram(s)

sequenceDiagram
    participant Client as HTTP Response Body
    participant Detector as Content Detector
    participant HTMLParser as HTML Parser (goquery)
    participant JSParser as JS Parser (goja)
    participant Regex as Regex Fallback
    participant Validator as Domain Normalizer/Validator

    Client->>Detector: provide response bytes
    alt body starts with '<' (HTML)
        Detector->>HTMLParser: parse HTML
        HTMLParser->>HTMLParser: extract href/src/srcset/action/data-*/meta
        HTMLParser->>JSParser: send inline script texts
        JSParser->>JSParser: parse JS AST (skip >512KB) and extract string/template literals
        JSParser->>Validator: emit domain candidates
    end
    Client->>Regex: run regex fallback on raw body
    Regex->>Validator: emit domain candidates
    Validator->>Validator: normalize, validate, publicsuffix, dedupe -> populate domains/fqdns
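
The final normalize/validate/dedupe step in the diagram might look like this simplified sketch; the real code also consults a public-suffix list and more filters, and `addCandidate` is a hypothetical stand-in for addDomainCandidate:

```go
// Simplified sketch of candidate normalization and deduplication.
package main

import (
	"fmt"
	"sort"
	"strings"
)

// addCandidate lowercases and trims a raw hostname candidate, rejects
// obvious non-domains, and records the survivor in the dedup set.
func addCandidate(raw string, seen map[string]struct{}) {
	d := strings.ToLower(strings.TrimSpace(raw))
	d = strings.Trim(d, ".")
	if d == "" || !strings.Contains(d, ".") || strings.ContainsAny(d, " /\\") {
		return
	}
	seen[d] = struct{}{}
}

func main() {
	seen := map[string]struct{}{}
	for _, c := range []string{"Example.COM", "example.com.", "not a domain", "cdn.example.org"} {
		addCandidate(c, seen)
	}
	var out []string
	for d := range seen {
		out = append(out, d)
	}
	sort.Strings(out)
	fmt.Println(out) // [cdn.example.org example.com]
}
```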

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped through tags and scripts today,

found hosts that used to hide away.
With goquery's eyes and goja's sight,
domains now dance into the light.
A tiny hop for code — a joyous spray! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 15.71%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check ✅ Passed: Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check ✅ Passed: The title 'parser for fqdn' directly references the main change: refactoring FQDN extraction to use actual parsers (goquery for HTML, goja for JavaScript) instead of regex-only heuristics.
  • Linked Issues check ✅ Passed: The PR fully implements issue #1773's requirements: HTML parsing via goquery, JavaScript AST parsing via goja, domain extraction from specific contexts (attributes, srcset, scripts), and regex fallback to reduce false positives.
  • Out of Scope Changes check ✅ Passed: All changes are directly related to improving FQDN extraction through parsers: BodyDomainGrab refactoring, helper functions, test coverage for HTML/JS parsing, and required dependencies (goja). No unrelated changes detected.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@common/httpx/domains.go`:
- Line 151: The deferred anonymous recover in common/httpx/domains.go currently
calls recover() without using its return value which fails lint; update the
defer to explicitly handle the result (e.g., capture into a variable and ignore
or check for nil) so the return value is used — locate the defer func() {
recover() }() and change it to call recover into a variable (for example r :=
recover()) and either assign it to the blank identifier (_ = r) or wrap it with
if r != nil { /* no-op or minimal handling */ } to satisfy errcheck.
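
The requested fix reduces to the following pattern; `safeParse` is a hypothetical stand-in for the function wrapping the goja parse call:

```go
// Sketch of the errcheck-friendly recover pattern: capture recover()'s
// return value into a variable instead of discarding it.
package main

import "fmt"

// safeParse runs fn and reports whether it panicked, swallowing the
// panic so the caller can fall back gracefully.
func safeParse(fn func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	fn()
	return false
}

func main() {
	fmt.Println(safeParse(func() {}))                // false
	fmt.Println(safeParse(func() { panic("boom") })) // true
}
```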

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 42d71ac6-683a-49d5-a24a-a7beed6f8603

📥 Commits

Reviewing files that changed from the base of the PR and between 9836829 and 090f0c2.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • common/httpx/domains.go
  • common/httpx/domains_test.go
  • common/httpx/test-data/sample_with_js.html
  • go.mod

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@common/httpx/domains.go`:
- Around line 41-53: The current flow always runs extractDomainsFromRegex after
the HTML branch, causing JS-only bodies to miss extractDomainsFromJS and
reintroducing parser false positives; change the control flow around the
looksLikeHTML check to a single conditional switch on r.Data: if
looksLikeHTML(r.Data) use extractDomainsFromHTML and then extractDomainsFromJS
for inline scripts (respecting maxInlineScriptSize), else if
looksLikeJavaScript(r.Data) (or Content-Type if available) and len(r.Data) <=
maxInlineScriptSize call extractDomainsFromJS(string(r.Data), domains, fqdns,
r.Input), otherwise call extractDomainsFromRegex(r.Raw, domains, fqdns,
r.Input); update or add a looksLikeJavaScript helper if needed and remove the
unconditional extractDomainsFromRegex call after the HTML branch.
- Around line 58-64: The looksLikeHTML function fails for UTF-8 BOM-prefixed
documents because bytes.TrimSpace doesn't remove the BOM; update looksLikeHTML
to strip a leading UTF-8 BOM (bytes.TrimPrefix(..., []byte{0xEF,0xBB,0xBF}) or
equivalent) from the trimmed prefix before testing the first byte, so inputs
like "\ufeff<!DOCTYPE html>" are correctly detected as HTML.
- Around line 194-198: Normalize resp.Input's host portion before doing equality
checks: extract hostname from the existing input value (e.g., using
net.SplitHostPort or strings.Split on ':' into an inputHost) and then use
inputHost in the comparisons instead of raw input; replace the checks around
domains[val] and fqdns[d] so they read "if inputHost != val { domains[val] =
struct{}{} }" and "if d != val && d != inputHost { fqdns[d] = struct{}{} }" and
add the appropriate import (net or strings) so the normalization occurs before
the comparisons that involve input, d, and val.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6d51af93-2eb2-4a8a-8815-9829a7bfd1ce

📥 Commits

Reviewing files that changed from the base of the PR and between 090f0c2 and c3b06b5.

📒 Files selected for processing (1)
  • common/httpx/domains.go

Comment on lines +41 to +53
// Only run HTML/JS parsers if the body looks like HTML
if len(r.Data) > 0 && looksLikeHTML(r.Data) {
	inlineScripts := extractDomainsFromHTML(r.Data, domains, fqdns, r.Input)

	for _, script := range inlineScripts {
		if len(script) <= maxInlineScriptSize {
			extractDomainsFromJS(script, domains, fqdns, r.Input)
		}
	}
}

// Regex fallback on the raw response (catches anything the parsers miss)
extractDomainsFromRegex(r.Raw, domains, fqdns, r.Input)

⚠️ Potential issue | 🟠 Major

Don’t run the legacy regex after the parser path.

looksLikeHTML limits the AST path to inline <script> tags, and Line 53 then scans the whole response with the old regex anyway. That means standalone JavaScript bodies never use extractDomainsFromJS, and HTML responses still accumulate the false positives this parser work is supposed to remove.

Control-flow sketch
if len(r.Data) > 0 {
	switch {
	case looksLikeHTML(r.Data):
		inlineScripts := extractDomainsFromHTML(r.Data, domains, fqdns, r.Input)
		for _, script := range inlineScripts {
			if len(script) <= maxInlineScriptSize {
				extractDomainsFromJS(script, domains, fqdns, r.Input)
			}
		}
	case looksLikeJavaScript(r.Data): // or Content-Type based detection if that is already available here
		if len(r.Data) <= maxInlineScriptSize {
			extractDomainsFromJS(string(r.Data), domains, fqdns, r.Input)
		}
	default:
		extractDomainsFromRegex(r.Raw, domains, fqdns, r.Input)
	}
}
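
If a looksLikeJavaScript helper is added as suggested, one possible heuristic is the following sketch; the token list is illustrative only, and Content-Type sniffing would be preferable where the header is available:

```go
// Sketch: sniff the first KB for common JS tokens while rejecting
// anything that already looks like markup.
package main

import (
	"bytes"
	"fmt"
)

func looksLikeJavaScript(data []byte) bool {
	prefix := data
	if len(prefix) > 1024 {
		prefix = prefix[:1024]
	}
	trimmed := bytes.TrimSpace(prefix)
	if len(trimmed) == 0 || trimmed[0] == '<' {
		return false
	}
	for _, tok := range [][]byte{
		[]byte("function"), []byte("var "), []byte("let "),
		[]byte("const "), []byte("=>"), []byte("window."),
	} {
		if bytes.Contains(trimmed, tok) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksLikeJavaScript([]byte("const url = 'https://a.example';"))) // true
	fmt.Println(looksLikeJavaScript([]byte("<!DOCTYPE html>")))                  // false
}
```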
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/httpx/domains.go` around lines 41 - 53, The current flow always runs
extractDomainsFromRegex after the HTML branch, causing JS-only bodies to miss
extractDomainsFromJS and reintroducing parser false positives; change the
control flow around the looksLikeHTML check to a single conditional switch on
r.Data: if looksLikeHTML(r.Data) use extractDomainsFromHTML and then
extractDomainsFromJS for inline scripts (respecting maxInlineScriptSize), else
if looksLikeJavaScript(r.Data) (or Content-Type if available) and len(r.Data) <=
maxInlineScriptSize call extractDomainsFromJS(string(r.Data), domains, fqdns,
r.Input), otherwise call extractDomainsFromRegex(r.Raw, domains, fqdns,
r.Input); update or add a looksLikeJavaScript helper if needed and remove the
unconditional extractDomainsFromRegex call after the HTML branch.

Comment on lines +58 to +64
func looksLikeHTML(data []byte) bool {
	prefix := data
	if len(prefix) > 1024 {
		prefix = prefix[:1024]
	}
	trimmed := bytes.TrimSpace(prefix)
	return len(trimmed) > 0 && trimmed[0] == '<'
}

⚠️ Potential issue | 🟡 Minor

Treat BOM-prefixed documents as HTML.

bytes.TrimSpace does not strip a UTF-8 BOM, so a valid body like \ufeff<!DOCTYPE html> returns false here and skips the parser path.

Small fix
 func looksLikeHTML(data []byte) bool {
 	prefix := data
 	if len(prefix) > 1024 {
 		prefix = prefix[:1024]
 	}
 	trimmed := bytes.TrimSpace(prefix)
+	trimmed = bytes.TrimPrefix(trimmed, []byte{0xEF, 0xBB, 0xBF})
 	return len(trimmed) > 0 && trimmed[0] == '<'
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

func looksLikeHTML(data []byte) bool {
	prefix := data
	if len(prefix) > 1024 {
		prefix = prefix[:1024]
	}
	trimmed := bytes.TrimSpace(prefix)
	trimmed = bytes.TrimPrefix(trimmed, []byte{0xEF, 0xBB, 0xBF})
	return len(trimmed) > 0 && trimmed[0] == '<'
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/httpx/domains.go` around lines 58 - 64, The looksLikeHTML function
fails for UTF-8 BOM-prefixed documents because bytes.TrimSpace doesn't remove
the BOM; update looksLikeHTML to strip a leading UTF-8 BOM
(bytes.TrimPrefix(..., []byte{0xEF,0xBB,0xBF}) or equivalent) from the trimmed
prefix before testing the first byte, so inputs like "\ufeff<!DOCTYPE html>" are
correctly detected as HTML.

Comment on lines +194 to +198
if input != val {
	domains[val] = struct{}{}
}
if d != val && d != input {
	fqdns[d] = struct{}{}
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf 'Response struct and relevant fields:\n'
rg -n -C2 'type Response struct|Input string|Raw string|Data \[\]byte' --type go

printf '\nPotential writes to Response.Input (excluding tests):\n'
rg -n -C3 '\bInput:\s|\.Input\s*=' --type go -g '!**/*_test.go' -g '!**/vendor/**'

printf '\nBodyDomainGrab call sites:\n'
rg -n -C3 '\bBodyDomainGrab\s*\(' --type go

Repository: projectdiscovery/httpx

Length of output: 25591


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Examine the parseURL function to see if it normalizes hostnames
printf 'parseURL function:\n'
rg -n -A15 'func.*parseURL' --type go common/httpx/ runner/ 2>/dev/null | head -50

# Check NewRequestWithContext to understand how Host is set
printf '\n\nChecking if req.Host is normalized:\n'
rg -n -B2 -A2 'req.Host|\.Host' --type go common/httpx/httpx.go | head -40

# Look at the addDomainCandidate function more closely (context around lines 194-198)
printf '\n\naddDomainCandidate function logic:\n'
sed -n '150,220p' common/httpx/domains.go

Repository: projectdiscovery/httpx

Length of output: 3365


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Trace where origInput comes from - find all calls to analyze function
printf 'Calls to analyze() function to trace origInput source:\n'
rg -n 'analyze\(' --type go runner/runner.go | head -20

# Check where the input to BodyDomainGrab is passed
printf '\n\nInput parameter flow to BodyDomainGrab:\n'
sed -n '37,50p' common/httpx/domains.go

# Check what BodyDomainGrab receives as input
printf '\n\nBodyDomainGrab signature and input usage:\n'
sed -n '37,45p' common/httpx/domains.go

# Check how input is passed to extraction functions
printf '\n\nHow input is passed to extraction functions:\n'
rg -n 'extractDomainsFrom|addDomainCandidate' --type go common/httpx/domains.go | head -30

# Verify if req.Host can be non-canonical
printf '\n\nNew request creation and Host setting:\n'
sed -n '230,250p' common/httpx/httpx.go

# Check publicsuffix.Domain behavior
printf '\n\nPublicsuffix Domain function to understand what val is:\n'
rg -n 'publicsuffix.Domain' --type go common/httpx/domains.go -B2 -A2

Repository: projectdiscovery/httpx

Length of output: 3572


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where analyze() is called and trace the input parameter
printf 'Trace calls to analyze() - where does t come from:\n'
sed -n '1690,1750p' runner/runner.go

# Check if req.Host can contain a port or other non-canonical data
printf '\n\nCheck how req.Host is set:\n'
rg -n 'NewRequest|req.Host' --type go common/httpx/httpx.go -B3 -A3 | head -40

# Check the types and parseURL implementation more carefully
printf '\n\nparseURL to understand Host vs Hostname:\n'
rg -n -A10 'type.*URL.*struct|func.*URL.*Host' --type go | grep -A10 'urlutil' | head -20

# Look at line 238 context - how is req.Host set?
printf '\n\nContext around resp.Input assignment:\n'
sed -n '225,245p' common/httpx/httpx.go

# Check if there's any normalization of the input before calling BodyDomainGrab
printf '\n\nCheck Response struct to see all fields:\n'
sed -n '1,30p' common/httpx/response.go

Repository: projectdiscovery/httpx

Length of output: 3318


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Look at the full addDomainCandidate function again with more context
printf 'Full addDomainCandidate function:\n'
sed -n '176,200p' common/httpx/domains.go

# Check if Go http.Request.Host includes port
printf '\n\nLook at what req.Host value contains (from Go docs behavior):\n'
sed -n '410,425p' common/httpx/httpx.go

# Check test case to see if input can have port
printf '\n\nTest case examining input parameter:\n'
sed -n '450,460p' common/httpx/domains_test.go

# Look at where origInput is created in analyze function
printf '\n\nanalyze function signature and origInput handling:\n'
sed -n '1827,1860p' runner/runner.go

# Check how target.Host is constructed 
printf '\n\nTarget structure and Host field:\n'
rg -n 'type Target struct' --type go -A10

Repository: projectdiscovery/httpx

Length of output: 3030


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if req.Host includes port by looking at how it's used and what Go's http.Request does
printf 'Check NewRequestFromURLWithContext signature and req.Host behavior:\n'
rg -n 'retryablehttp.NewRequestFromURLWithContext' --type go -A5 -B5 | head -30

# Check if there's any explicit handling of ports in the comparison
printf '\n\nCheck how input is normalized or processed before addDomainCandidate:\n'
sed -n '37,60p' common/httpx/domains.go

# Look for test cases that show input with ports
printf '\n\nSearch for test cases with input containing ports:\n'
rg -n 'Input:.*:' --type go common/httpx/domains_test.go

# Check how resp.Input gets set in all cases
printf '\n\nAll assignments to resp.Input or Input field:\n'
rg -n 'resp\.Input\s*=' --type go common/httpx/

# Verify if ports are ever included in input
printf '\n\nCheck URLs passed to NewRequestWithContext:\n'
sed -n '1880,1890p' runner/runner.go

# Look at req.String() to see if it includes port
printf '\n\nCheck how req is used after being created:\n'
sed -n '1956,1965p' runner/runner.go

Repository: projectdiscovery/httpx

Length of output: 1583


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Let's verify if req.Host includes port by checking Go's behavior
printf 'Check URL parsing to understand Host field:\n'
python3 << 'EOF'
# Go's http.Request.Host behavior from URL
# URL.Host includes port, URL.Hostname() does not
# When creating request from URL, req.Host = URL.Host

test_cases = [
    ("http://example.com/path", "example.com"),
    ("http://example.com:443/path", "example.com:443"),
    ("http://EXAMPLE.COM/path", "EXAMPLE.COM"),
]

print("Go http.Request behavior:")
print("- URL.Host includes port: example.com:443")
print("- URL.Hostname() excludes port: example.com")
print("- req.Host comes from URL.Host, so it can include port")
print()
print("Issue: if input='example.com:443' and d='example.com'")
print("  val = publicsuffix.Domain('example.com') = 'example.com'")
print("  Check: input != val => 'example.com:443' != 'example.com' => TRUE")
print("  Result: example.com gets added to domains (BUG!)")
EOF

# Check if there's any normalization of input before BodyDomainGrab is called
printf '\n\nCheck if input is normalized before passing to BodyDomainGrab:\n'
sed -n '2635,2655p' runner/runner.go

# Look at how the URL is constructed to understand if port is included
printf '\n\nCheck what fullURL contains at line 2644:\n'
sed -n '2600,2650p' runner/runner.go | head -50

Repository: projectdiscovery/httpx

Length of output: 2853


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find where the loop variable t comes from (the input parameter to analyze)
printf 'Find where input t originates (look for loop context):\n'
sed -n '1680,1700p' runner/runner.go

# Check the process function to see where it's called from
printf '\n\nCheck process() function which is called initially:\n'
rg -n 'func.*process.*Target' --type go runner/runner.go -A15 | head -40

# Look for the main entry point where inputs are read
printf '\n\nLook for where scan inputs are read:\n'
rg -n 'func.*Scan.*process' --type go runner/runner.go | head -5

# Check what origInput can contain by finding the input source
printf '\n\nFind where input strings originate from:\n'
rg -n 'for.*range.*input|readInputFile|parseInput' --type go runner/runner.go | head -20

Repository: projectdiscovery/httpx

Length of output: 879


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for the process function definition 
printf 'Find process function definition:\n'
rg -n 'func.*\(r \*Runner\) process' --type go runner/runner.go -A10

# Look for where t is defined in the context
printf '\n\nLook for the process function and where it gets called:\n'
rg -n 'r\.process\(' --type go runner/runner.go | head -10

# Find the main entry point
printf '\n\nLook for the main scanning loop:\n'
rg -n 'func.*RunEnumeration\|func.*Scan' --type go runner/runner.go | head -10

Repository: projectdiscovery/httpx

Length of output: 1349


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check the test case that explicitly sets Input to verify the expected behavior
printf 'Test case showing Input field usage:\n'
sed -n '450,465p' common/httpx/domains_test.go

# Look at resp.Input assignment again carefully
printf '\n\nWhere resp.Input is set in httpx.go:\n'
sed -n '235,242p' common/httpx/httpx.go

# Check if Go http.Request.Host includes port by testing with a simple case
printf '\n\nCheck req.Host behavior - does it include port?\n'
python3 << 'EOF'
# Go's net/http behavior:
# When creating a request from a URL like "http://example.com:8080/path":
# - url.Host = "example.com:8080"  (includes port)
# - url.Hostname() = "example.com"  (excludes port)
# - req.Host = url.Host (comes from URL.Host)
# So resp.Input = req.Host will include the port

print("CRITICAL: resp.Input = req.Host can include port number")
print()
print("Example scenario:")
print("  Input URL: http://example.com:8080/")
print("  req.Host = 'example.com:8080'")
print("  resp.Input = 'example.com:8080'")
print()
print("In addDomainCandidate with d='example.com':")
print("  val = publicsuffix.Domain('example.com') = 'example.com'")
print("  Line 194: if input != val")
print("  if 'example.com:8080' != 'example.com'  => TRUE (BUG!)")
print("  Adds 'example.com' to domains when it should be filtered")
EOF

# Verify this is actually how it works by checking the test
printf '\n\nTest to confirm the issue:\n'
sed -n '450,465p' common/httpx/domains_test.go

Repository: projectdiscovery/httpx

Length of output: 1821


Normalize input to extract hostname before comparing against d and val on lines 194-198.

resp.Input is set to req.Host (line 238 of httpx.go), which in Go's net/http includes the port number (e.g., "example.com:8080"). However, d and val are hostname-only after normalization, so the equality checks misfire: if input != val compares "example.com:8080" against "example.com" and evaluates to true, allowing the input host to leak back into Domains and Fqdns.

Extract the hostname from input before the comparisons—for example:

inputHost := strings.Split(input, ":")[0]
if inputHost != val {
    domains[val] = struct{}{}
}
if d != val && d != inputHost {
    fqdns[d] = struct{}{}
}

This matches the test expectation (domains_test.go:459) that the input domain should be excluded from results.
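
As a side note, strings.Split(input, ":")[0] mishandles IPv6 literals; a net.SplitHostPort-based variant is safer. A sketch, with `hostOnly` as a hypothetical helper name:

```go
// Sketch: strip an optional port from a host value, handling IPv6
// literals that strings.Split on ':' would break.
package main

import (
	"fmt"
	"net"
	"strings"
)

// hostOnly returns the hostname portion, falling back to the raw value
// when there is no port (net.SplitHostPort errors in that case).
func hostOnly(input string) string {
	if host, _, err := net.SplitHostPort(input); err == nil {
		return host
	}
	return strings.Trim(input, "[]") // bare IPv6 literal without port
}

func main() {
	fmt.Println(hostOnly("example.com:8080"))  // example.com
	fmt.Println(hostOnly("example.com"))       // example.com
	fmt.Println(hostOnly("[2001:db8::1]:443")) // 2001:db8::1
}
```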

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@common/httpx/domains.go` around lines 194 - 198, Normalize resp.Input's host
portion before doing equality checks: extract hostname from the existing input
value (e.g., using net.SplitHostPort or strings.Split on ':' into an inputHost)
and then use inputHost in the comparisons instead of raw input; replace the
checks around domains[val] and fqdns[d] so they read "if inputHost != val {
domains[val] = struct{}{} }" and "if d != val && d != inputHost { fqdns[d] =
struct{}{} }" and add the appropriate import (net or strings) so the
normalization occurs before the comparisons that involve input, d, and val.



Development

Successfully merging this pull request may close these issues.

improve fqdn extraction from response body using parsers
