@dogancanbakir
Member

@dogancanbakir dogancanbakir commented Sep 22, 2025

closes #1645

$ go run . -d hackerone.com -s onyphe

               __    _____           __         
   _______  __/ /_  / __(_)___  ____/ /__  _____
  / ___/ / / / __ \/ /_/ / __ \/ __  / _ \/ ___/
 (__  ) /_/ / /_/ / __/ / / / / /_/ /  __/ /    
/____/\__,_/_.___/_/ /_/_/ /_/\__,_/\___/_/

                projectdiscovery.io

[INF] Current subfinder version v2.9.0 (latest)
[INF] Loading provider config from /Users/dogancanbakir/Library/Application Support/subfinder/provider-config.yaml
[INF] Enumerating subdomains for hackerone.com
mta-sts.hackerone.com
ns.hackerone.com
b.ns.hackerone.com
a.ns.hackerone.com
www.hackerone.com
support.hackerone.com
[INF] Found 6 subdomains for hackerone.com in 2 seconds 367 milliseconds

Summary by CodeRabbit

  • New Features

    • Added Onyphe as a new passive source for subdomain discovery, expanding coverage and improving result reliability.
    • Enabled by default (uses your API key if provided).
    • Supports paginated queries to retrieve more results efficiently.
  • Tests

    • Updated test suite to include Onyphe in the recognized and default source sets.

@dogancanbakir dogancanbakir self-assigned this Sep 22, 2025
@coderabbitai

coderabbitai bot commented Sep 22, 2025

Walkthrough

Adds ONYPHE as a new passive/subscraping source. Registers it in passive sources and tests. Implements pkg/subscraping/sources/onyphe/onyphe.go with API-key handling, paginated querying of ONYPHE Search API v2 for resolver data, custom JSON unmarshalling, result streaming, and statistics.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Passive source registration: pkg/passive/sources.go, pkg/passive/sources_test.go | Import and register the onyphe source in NameSourceMap and AllSources; update tests to include "onyphe" in expected sources and defaults. |
| New ONYPHE source implementation: pkg/subscraping/sources/onyphe/onyphe.go | Add Source with API key management, Run loop calling ONYPHE Search API v2 (category:resolver domain:<target>), pagination, robust JSON unmarshalling, streaming subdomain results, and statistics reporting. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Subfinder
  participant Passive as Passive Manager
  participant Onyphe as Onyphe Source
  participant API as ONYPHE API

  User->>Subfinder: run subfinder <domain>
  Subfinder->>Passive: enumerate(domain)
  Passive->>Onyphe: Run(ctx, domain, session)

  rect rgba(200,230,255,0.3)
    note over Onyphe: Initialize / pick API key
    loop Paginate until no results or max_page
      Onyphe->>API: GET /api/v2/search?q=category:resolver domain:<domain>
      API-->>Onyphe: JSON response
      Onyphe->>Onyphe: Unmarshal (tolerant types)
      Onyphe-->>Passive: subscraping.Result{Type: Subdomain}
    end
  end

  Passive-->>Subfinder: stream results
  Subfinder-->>User: output subdomains
  note over Onyphe,Passive: Track errors, results, time, skipped

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • ehsandeep

Poem

A twitch of whiskers, keys in paw, I hop into the flow,
Query moonlit resolvers where the hidden subdomains grow.
Page by page I nibble bytes, unmarshalling the night,
Each hostname like a clover leaf—crisp, delicious, right.
Onyphe’s path now burrowed in—results begin to show! 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title Check | ⚠️ Warning | The title is too short and misspells the new source, failing to clearly convey that the change adds full Onyphe subscraping and passive integration. | Update the title to accurately reflect the addition of the Onyphe source, for example “Add Onyphe subscraping and passive source integration,” and correct the spelling. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues Check | ✅ Passed | The PR satisfies issue #1645 by implementing a new Onyphe source under pkg/subscraping, integrating the Search API v2 with the specified OQL query, wiring the source into passive sources, and updating tests to include “onyphe.” |
| Out of Scope Changes Check | ✅ Passed | All modifications, including the new subscraping implementation, JSON handling, passive sources integration, and test updates, directly support the ONYPHE addition and introduce no unrelated changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

@dogancanbakir dogancanbakir marked this pull request as ready for review October 2, 2025 10:28

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (2)
pkg/subscraping/sources/onhype/onhype.go (2)

91-113: Unused Result fields Host and Domain are never emitted.

The Result struct defines Host and Domain fields (lines 30-31), and the custom UnmarshalJSON method populates them (lines 275-297), but the Run method only emits Subdomains, Hostname, Forward, and Reverse.

If Host and Domain fields contain useful subdomain data, emit them as well:

 			if record.Reverse != "" {
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Reverse}
 				s.results++
 			}
+
+			if record.Host != "" {
+				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Host}
+				s.results++
+			}
+
+			if record.Domain != "" {
+				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Domain}
+				s.results++
+			}
 		}

Alternatively, if these fields are not needed, remove them from the Result struct and the UnmarshalJSON method to reduce complexity.


157-228: LGTM with optional refactor suggestion.

The custom UnmarshalJSON for OnypheResponse correctly handles ONYPHE API responses where numeric fields (page, page_size, total, max_page) may be returned as either numbers or strings.

The normalization logic is repetitive. Consider extracting it into a helper function:

func parseIntField(raw json.RawMessage) (int, error) {
	if len(raw) == 0 {
		return 0, nil
	}
	
	// Try direct parse
	if val, err := strconv.Atoi(string(raw)); err == nil {
		return val, nil
	}
	
	// Try unquoted string
	var str string
	if err := json.Unmarshal(raw, &str); err == nil {
		if val, err := strconv.Atoi(str); err == nil {
			return val, nil
		}
	}
	
	return 0, fmt.Errorf("cannot parse int from %s", string(raw))
}

Then use it in UnmarshalJSON:

o.Page, _ = parseIntField(raw.Page)
o.PageSize, _ = parseIntField(raw.PageSize)
o.Total, _ = parseIntField(raw.Total)
o.MaxPage, _ = parseIntField(raw.MaxPage)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1464ef0 and 648f1e1.

📒 Files selected for processing (3)
  • pkg/passive/sources.go (2 hunks)
  • pkg/passive/sources_test.go (2 hunks)
  • pkg/subscraping/sources/onhype/onhype.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
pkg/passive/sources.go (1)
pkg/subscraping/sources/onhype/onhype.go (1)
  • Source (34-40)
pkg/subscraping/sources/onhype/onhype.go (2)
pkg/subscraping/types.go (2)
  • Session (71-78)
  • Statistics (29-34)
pkg/subscraping/utils.go (1)
  • PickRandom (12-20)
🪛 GitHub Actions: 🔨 Build Test
pkg/subscraping/sources/onhype/onhype.go

[error] 268-268: Error return value of json.Unmarshal is not checked (errcheck)

🪛 GitHub Check: Lint Test
pkg/subscraping/sources/onhype/onhype.go

[failure] 272-272:
Error return value of json.Unmarshal is not checked (errcheck)


[failure] 268-268:
Error return value of json.Unmarshal is not checked (errcheck)

🔇 Additional comments (11)
pkg/subscraping/sources/onhype/onhype.go (11)

16-23: LGTM!

The OnypheResponse struct correctly models the paginated API response with error tracking and result sets.


25-32: LGTM!

The Result struct appropriately captures subdomain and hostname data from ONYPHE resolver records.


34-40: LGTM!

The Source struct follows the established pattern for tracking API keys and statistics.


43-58: LGTM with a note on API key handling.

The setup, defer cleanup, and API key selection follow the established pattern. The source correctly skips when no API key is available.


60-64: LGTM!

Authorization header and pagination parameters are correctly configured for ONYPHE Search API v2.


80-90: LGTM!

JSON decoding and response cleanup are correctly implemented. The error handling discards the response body and aborts pagination, which is consistent with the HTTP error handling pattern in this source.


115-121: LGTM!

Pagination termination logic correctly checks for empty results or reaching MaxPage.
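
The termination condition can be checked in isolation with a small sketch. It assumes a 1-indexed page counter that is compared against `MaxPage` after the current page has been processed; under that assumption, `pageNum >= MaxPage` stops once the last page has been consumed. The `page` type, `fetchFunc`, and `collect` are hypothetical names for the example, and the fetch function replaces the real HTTP call.

```go
package main

import "fmt"

// page is a hypothetical, simplified API response.
type page struct {
	Results []string
	MaxPage int
}

// fetchFunc stands in for the HTTP call that retrieves one page.
type fetchFunc func(pageNum int) (page, error)

// collect walks pages starting at 1 and stops on an error, an empty page,
// or after the page numbered MaxPage has been processed.
func collect(fetch fetchFunc) ([]string, error) {
	var all []string
	for pageNum := 1; ; pageNum++ {
		resp, err := fetch(pageNum)
		if err != nil {
			return all, err // abort pagination on the first error
		}
		all = append(all, resp.Results...)
		if len(resp.Results) == 0 || pageNum >= resp.MaxPage {
			break
		}
	}
	return all, nil
}

func main() {
	data := map[int][]string{1: {"a.example.com"}, 2: {"b.example.com"}}
	calls := 0
	got, err := collect(func(n int) (page, error) {
		calls++
		return page{Results: data[n], MaxPage: 2}, nil
	})
	fmt.Println(got, calls, err) // prints "[a.example.com b.example.com] 2 <nil>"
}
```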


132-146: LGTM!

The interface methods correctly indicate that ONYPHE is a default source requiring an API key, without recursive support. API key management follows the standard pattern.


148-155: LGTM!

Statistics reporting follows the established pattern.


230-300: LGTM after fixing error handling.

The custom UnmarshalJSON for Result correctly normalizes fields that may be strings or arrays. Once the error handling for Forward and Reverse is fixed, the implementation will be complete.


69-78: Keep abort-on-error for pagination (consistent with other sources). Aborting the loop on any HTTP error aligns with existing implementations—no source currently skips failed pages.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
pkg/subscraping/sources/onyphe/onyphe.go (1)

267-273: Address error handling as flagged in previous review.

The previous review flagged ignoring json.Unmarshal errors on lines 268 and 272 as a critical issue. While using the blank identifier _ bypasses the linter, it still silently discards malformed JSON.

For consistency with the error handling pattern used for other fields (Hostname, Host, Domain), consider handling these errors or documenting why they can be safely ignored.

Apply this diff to handle errors consistently:

 	if len(raw.Forward) > 0 {
-		_ = json.Unmarshal(raw.Forward, &r.Forward)
+		var forwardStr string
+		if err := json.Unmarshal(raw.Forward, &forwardStr); err == nil {
+			r.Forward = forwardStr
+		}
 	}

 	if len(raw.Reverse) > 0 {
-		_ = json.Unmarshal(raw.Reverse, &r.Reverse)
+		var reverseStr string
+		if err := json.Unmarshal(raw.Reverse, &reverseStr); err == nil {
+			r.Reverse = reverseStr
+		}
 	}
🧹 Nitpick comments (2)
pkg/subscraping/sources/onyphe/onyphe.go (2)

91-113: Consider deduplication for subdomain results.

The code emits subdomains from multiple fields (Subdomains array, Hostname, Forward, Reverse) without deduplication. If ONYPHE returns the same subdomain in different fields, it will be emitted multiple times, inflating the result count.

Consider adding deduplication logic or verify that the upstream consumer handles duplicates.

Example implementation with a map to track seen subdomains:

+		seen := make(map[string]bool)
+		
 		for _, record := range respOnyphe.Results {
 			for _, subdomain := range record.Subdomains {
-				if subdomain != "" {
+				if subdomain != "" && !seen[subdomain] {
+					seen[subdomain] = true
 					results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: subdomain}
 					s.results++
 				}
 			}
 
-			if record.Hostname != "" {
+			if record.Hostname != "" && !seen[record.Hostname] {
+				seen[record.Hostname] = true
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Hostname}
 				s.results++
 			}
 
-			if record.Forward != "" {
+			if record.Forward != "" && !seen[record.Forward] {
+				seen[record.Forward] = true
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Forward}
 				s.results++
 			}
 
-			if record.Reverse != "" {
+			if record.Reverse != "" && !seen[record.Reverse] {
+				seen[record.Reverse] = true
 				results <- subscraping.Result{Source: s.Name(), Type: subscraping.Subdomain, Value: record.Reverse}
 				s.results++
 			}
 		}

166-228: Consider extracting repeated parsing logic.

Lines 175-225 repeat identical parsing logic for four numeric fields. Extract this into a helper function to reduce duplication and improve maintainability.

func parseIntField(raw json.RawMessage) int {
	if len(raw) == 0 {
		return 0
	}
	
	// Try parsing as raw number string
	if val, err := strconv.Atoi(string(raw)); err == nil {
		return val
	}
	
	// Try parsing as JSON-quoted string
	var strVal string
	if err := json.Unmarshal(raw, &strVal); err == nil {
		if val, err := strconv.Atoi(strVal); err == nil {
			return val
		}
	}
	
	return 0
}

func (o *OnypheResponse) UnmarshalJSON(data []byte) error {
	var raw OnypheResponseRaw
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}

	o.Error = raw.Error
	o.Results = raw.Results
	o.Page = parseIntField(raw.Page)
	o.PageSize = parseIntField(raw.PageSize)
	o.Total = parseIntField(raw.Total)
	o.MaxPage = parseIntField(raw.MaxPage)

	return nil
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d236bba and 7823d41.

📒 Files selected for processing (3)
  • pkg/passive/sources.go (2 hunks)
  • pkg/passive/sources_test.go (2 hunks)
  • pkg/subscraping/sources/onyphe/onyphe.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/passive/sources_test.go
  • pkg/passive/sources.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/subscraping/sources/onyphe/onyphe.go (2)
pkg/subscraping/types.go (2)
  • Session (71-78)
  • Statistics (29-34)
pkg/subscraping/utils.go (1)
  • PickRandom (12-20)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Builds (macOS-latest)
  • GitHub Check: Test Builds (ubuntu-latest)
  • GitHub Check: Test Builds (windows-latest)
  • GitHub Check: Analyze (go)
🔇 Additional comments (4)
pkg/subscraping/sources/onyphe/onyphe.go (4)

132-134: No change required: ONYPHE’s IsDefault() returning true is consistent with other API-key sources (e.g., Virustotal, Shodan).


69-70: No update needed for API endpoint
ONYPHE Search API v2 uses https://www.onyphe.io/api/v2/search/ according to the official documentation.


115-117: Use > instead of >= to include the final page.

Change the pagination exit check to:

-   if len(respOnyphe.Results) == 0 || page >= respOnyphe.MaxPage {
+   if len(respOnyphe.Results) == 0 || page > respOnyphe.MaxPage {

[max_page is inclusive last page]


60-60: Remove bearer token case check. Lowercase “bearer” is valid per ONYPHE’s docs and RFC 7235’s case-insensitive auth-scheme.

Likely an incorrect or invalid review comment.

Member

@Mzack9999 Mzack9999 left a comment

Implementation: lgtm

@dogancanbakir dogancanbakir merged commit 7396f0d into dev Oct 20, 2025
10 checks passed
@dogancanbakir dogancanbakir deleted the add_onhype branch October 20, 2025 07:50

Development

Successfully merging this pull request may close these issues.

Add ONYPHE

3 participants