-
Notifications
You must be signed in to change notification settings - Fork 1.5k
add onhype #1647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add onhype #1647
Conversation
WalkthroughAdds ONYPHE as a new passive/subscraping source. Registers it in passive sources and tests. Implements Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant Subfinder
participant Passive as Passive Manager
participant Onyphe as Onyphe Source
participant API as ONYPHE API
User->>Subfinder: run subfinder <domain>
Subfinder->>Passive: enumerate(domain)
Passive->>Onyphe: Run(ctx, domain, session)
rect rgba(200,230,255,0.3)
note over Onyphe: Initialize / pick API key
loop Paginate until no results or max_page
Onyphe->>API: GET /api/v2/search?q=category:resolver domain:<domain>
API-->>Onyphe: JSON response
Onyphe->>Onyphe: Unmarshal (tolerant types)
Onyphe-->>Passive: subscraping.Result{Type: Subdomain}
end
end
Passive-->>Subfinder: stream results
Subfinder-->>User: output subdomains
note over Onyphe,Passive: Track errors, results, time, skipped
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Nitpick comments (2)
pkg/subscraping/sources/onhype/onhype.go (2)
91-113: Unused Result fieldsHostandDomainare never emitted.The
Resultstruct definesHostandDomainfields (lines 30-31), and the customUnmarshalJSONmethod populates them (lines 275-297), but the Run method only emitsSubdomains,Hostname,Forward, andReverse.If
HostandDomainfields contain useful subdomain data, emit them as well:if record.Reverse != "" { results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Reverse} s.results++ } + + if record.Host != "" { + results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Host} + s.results++ + } + + if record.Domain != "" { + results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Domain} + s.results++ + } }Alternatively, if these fields are not needed, remove them from the
Resultstruct and theUnmarshalJSONmethod to reduce complexity.
157-228: LGTM with optional refactor suggestion.The custom
UnmarshalJSONforOnypheResponsecorrectly handles ONYPHE API responses where numeric fields (page,page_size,total,max_page) may be returned as either numbers or strings.The normalization logic is repetitive. Consider extracting it into a helper function:
func parseIntField(raw json.RawMessage) (int, error) { if len(raw) == 0 { return 0, nil } // Try direct parse if val, err := strconv.Atoi(string(raw)); err == nil { return val, nil } // Try unquoted string var str string if err := json.Unmarshal(raw, &str); err == nil { if val, err := strconv.Atoi(str); err == nil { return val, nil } } return 0, fmt.Errorf("cannot parse int from %s", string(raw)) }Then use it in
UnmarshalJSON:o.Page, _ = parseIntField(raw.Page) o.PageSize, _ = parseIntField(raw.PageSize) o.Total, _ = parseIntField(raw.Total) o.MaxPage, _ = parseIntField(raw.MaxPage)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
pkg/passive/sources.go(2 hunks)pkg/passive/sources_test.go(2 hunks)pkg/subscraping/sources/onhype/onhype.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/passive/sources.go (1)
pkg/subscraping/sources/onhype/onhype.go (1)
Source(34-40)
pkg/subscraping/sources/onhype/onhype.go (2)
pkg/subscraping/types.go (2)
Session(71-78)Statistics(29-34)pkg/subscraping/utils.go (1)
PickRandom(12-20)
🪛 GitHub Actions: 🔨 Build Test
pkg/subscraping/sources/onhype/onhype.go
[error] 268-268: Error return value of json.Unmarshal is not checked (errcheck)
🪛 GitHub Check: Lint Test
pkg/subscraping/sources/onhype/onhype.go
[failure] 272-272:
Error return value of json.Unmarshal is not checked (errcheck)
[failure] 268-268:
Error return value of json.Unmarshal is not checked (errcheck)
🔇 Additional comments (11)
pkg/subscraping/sources/onhype/onhype.go (11)
16-23: LGTM!The
OnypheResponsestruct correctly models the paginated API response with error tracking and result sets.
25-32: LGTM!The
Resultstruct appropriately captures subdomain and hostname data from ONYPHE resolver records.
34-40: LGTM!The
Sourcestruct follows the established pattern for tracking API keys and statistics.
43-58: LGTM with a note on API key handling.The setup, defer cleanup, and API key selection follow the established pattern. The source correctly skips when no API key is available.
60-64: LGTM!Authorization header and pagination parameters are correctly configured for ONYPHE Search API v2.
80-90: LGTM!JSON decoding and response cleanup are correctly implemented. The error handling discards the response body and aborts pagination, which is consistent with the HTTP error handling pattern in this source.
115-121: LGTM!Pagination termination logic correctly checks for empty results or reaching
MaxPage.
132-146: LGTM!The interface methods correctly indicate that ONYPHE is a default source requiring an API key, without recursive support. API key management follows the standard pattern.
148-155: LGTM!Statistics reporting follows the established pattern.
230-300: LGTM after fixing error handling.The custom
UnmarshalJSONforResultcorrectly normalizes fields that may be strings or arrays. Once the error handling forForwardandReverseis fixed, the implementation will be complete.
69-78: Keep abort-on-error for pagination (consistent with other sources). Aborting the loop on any HTTP error aligns with existing implementations—no source currently skips failed pages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
pkg/subscraping/sources/onyphe/onyphe.go (1)
267-273: Address error handling as flagged in previous review.The previous review flagged ignoring
json.Unmarshalerrors on lines 268 and 272 as a critical issue. While using the blank identifier_bypasses the linter, it still silently discards malformed JSON.For consistency with the error handling pattern used for other fields (Hostname, Host, Domain), consider handling these errors or documenting why they can be safely ignored.
Apply this diff to handle errors consistently:
if len(raw.Forward) > 0 { - _ = json.Unmarshal(raw.Forward, &r.Forward) + var forwardStr string + if err := json.Unmarshal(raw.Forward, &forwardStr); err == nil { + r.Forward = forwardStr + } } if len(raw.Reverse) > 0 { - _ = json.Unmarshal(raw.Reverse, &r.Reverse) + var reverseStr string + if err := json.Unmarshal(raw.Reverse, &reverseStr); err == nil { + r.Reverse = reverseStr + } }
🧹 Nitpick comments (2)
pkg/subscraping/sources/onyphe/onyphe.go (2)
91-113: Consider deduplication for subdomain results.The code emits subdomains from multiple fields (Subdomains array, Hostname, Forward, Reverse) without deduplication. If ONYPHE returns the same subdomain in different fields, it will be emitted multiple times, inflating the result count.
Consider adding deduplication logic or verify that the upstream consumer handles duplicates.
Example implementation with a map to track seen subdomains:
+ seen := make(map[string]bool) + for _, record := range respOnyphe.Results { for _, subdomain := range record.Subdomains { - if subdomain != "" { + if subdomain != "" && !seen[subdomain] { + seen[subdomain] = true results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain} s.results++ } } - if record.Hostname != "" { + if record.Hostname != "" && !seen[record.Hostname] { + seen[record.Hostname] = true results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Hostname} s.results++ } - if record.Forward != "" { + if record.Forward != "" && !seen[record.Forward] { + seen[record.Forward] = true results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Forward} s.results++ } - if record.Reverse != "" { + if record.Reverse != "" && !seen[record.Reverse] { + seen[record.Reverse] = true results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Reverse} s.results++ } }
166-228: Consider extracting repeated parsing logic.Lines 175-225 repeat identical parsing logic for four numeric fields. Extract this into a helper function to reduce duplication and improve maintainability.
func parseIntField(raw json.RawMessage) int { if len(raw) == 0 { return 0 } // Try parsing as raw number string if val, err := strconv.Atoi(string(raw)); err == nil { return val } // Try parsing as JSON-quoted string var strVal string if err := json.Unmarshal(raw, &strVal); err == nil { if val, err := strconv.Atoi(strVal); err == nil { return val } } return 0 } func (o *OnypheResponse) UnmarshalJSON(data []byte) error { var raw OnypheResponseRaw if err := json.Unmarshal(data, &raw); err != nil { return err } o.Error = raw.Error o.Results = raw.Results o.Page = parseIntField(raw.Page) o.PageSize = parseIntField(raw.PageSize) o.Total = parseIntField(raw.Total) o.MaxPage = parseIntField(raw.MaxPage) return nil }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
pkg/passive/sources.go(2 hunks)pkg/passive/sources_test.go(2 hunks)pkg/subscraping/sources/onyphe/onyphe.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/passive/sources_test.go
- pkg/passive/sources.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/subscraping/sources/onyphe/onyphe.go (2)
pkg/subscraping/types.go (2)
Session(71-78)Statistics(29-34)pkg/subscraping/utils.go (1)
PickRandom(12-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test Builds (macOS-latest)
- GitHub Check: Test Builds (ubuntu-latest)
- GitHub Check: Test Builds (windows-latest)
- GitHub Check: Analyze (go)
🔇 Additional comments (4)
pkg/subscraping/sources/onyphe/onyphe.go (4)
132-134: No change required: ONYPHE’s IsDefault() returning true is consistent with other API-key sources (e.g., Virustotal, Shodan).
69-70: No update needed for API endpoint
ONYPHE Search API v2 uses https://www.onyphe.io/api/v2/search/ according to the official documentation.
115-117: Use>instead of>=to include the final page.Change the pagination exit check to:
- if len(respOnyphe.Results) == 0 || page >= respOnyphe.MaxPage { + if len(respOnyphe.Results) == 0 || page > respOnyphe.MaxPage {[max_page is inclusive last page]
60-60: Remove bearer token case check Lowercase “bearer” is valid per ONYPHE’s docs and RFC 7235’s case-insensitive auth-scheme.Likely an incorrect or invalid review comment.
Mzack9999
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation: lgtm
closes #1645
Summary by CodeRabbit
New Features
Tests