Conversation

@vzamanillo
Contributor

API traffic is limited to 2,000 requests per minute per user, which I think should be enough for a single recon run. A token rotation policy could be implemented in the future.

@forgedhallpass
Contributor

Token rotation should already work if the user provides multiple secrets in the provider configuration file.

@vzamanillo
Contributor Author

I mean without having to run subfinder again: like the GitHub source does, detecting a rate-limit-reached response and asking the token manager for a new token. I don't know if this is even necessary given the random API key mechanism. Thoughts?

@forgedhallpass
Copy link
Contributor

I don't know if this is even necessary due to the random api key mechanism, thoughts?

This is what I meant, but yeah, if someone needs it, we can also add a token manager later on, I guess.
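A minimal sketch of the rotation idea discussed above: when a request reports a rate limit, retry it with the next key instead of restarting the run. The names keyPool, withRotation and errRateLimited are hypothetical and are not subfinder's API; the request itself is simulated by a callback.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errRateLimited is a hypothetical sentinel standing in for a
// "rate limit reached" response from the API.
var errRateLimited = errors.New("rate limited")

// keyPool hands out API keys round-robin (illustrative only).
type keyPool struct {
	mu   sync.Mutex
	keys []string
	next int
}

// get returns the next key; the pool must not be empty.
func (p *keyPool) get() string {
	p.mu.Lock()
	defer p.mu.Unlock()
	k := p.keys[p.next%len(p.keys)]
	p.next++
	return k
}

// withRotation calls do with successive keys while it reports a rate
// limit, trying each key at most once.
func withRotation(p *keyPool, do func(key string) error) error {
	if len(p.keys) == 0 {
		return errors.New("no API keys configured")
	}
	var err error
	for i := 0; i < len(p.keys); i++ {
		if err = do(p.get()); !errors.Is(err, errRateLimited) {
			return err
		}
	}
	return err
}

func main() {
	pool := &keyPool{keys: []string{"key-a", "key-b"}}
	err := withRotation(pool, func(key string) error {
		if key == "key-a" {
			return errRateLimited // pretend the first key is exhausted
		}
		fmt.Println("request succeeded with", key)
		return nil
	})
	fmt.Println("error:", err)
}
```

If every key is rate limited, the last rate-limit error is returned and the caller can decide whether to wait or give up.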

@vzamanillo
Contributor Author

vzamanillo commented Nov 7, 2022

Looks like sonar is down, this is why the checks are failing

tarunKoyalwar previously approved these changes Nov 30, 2022
Member

@tarunKoyalwar tarunKoyalwar left a comment

lgtm !

@tarunKoyalwar
Member

./subfinder -d hackerone.com -s gitlab -t 250                  

               __    _____           __         
   _______  __/ /_  / __(_)___  ____/ /__  _____
  / ___/ / / / __ \/ /_/ / __ \/ __  / _ \/ ___/
 (__  ) /_/ / /_/ / __/ / / / / /_/ /  __/ /    
/____/\__,_/_.___/_/ /_/_/ /_/\__,_/\___/_/ v2.5.5

		projectdiscovery.io

Use with caution. You are responsible for your actions
Developers assume no liability and are not responsible for any misuse or damage.
By using subfinder, you also agree to the terms of the APIs used.

[INF] Loading provider config from '/Users/tarun/.config/subfinder/provider-config.yaml'
[INF] Enumerating subdomains for 'hackerone.com'
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x10 pc=0x1049d72fc]

goroutine 9 [running]:
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x140003cc207?, {0x105243448, 0x1400012c420}, 
{0x140003cc207, 0x56}, 0x14000420000?, 0x140000bfce8?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:87 +0x51c
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x14000182207?, {0x105243448, 0x1400012c420}, 
{0x14000182207, 0x56}, 0x1400046e1e0?, 0x140000c0048?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x140003cc547?, {0x105243448, 0x1400012c420}, 
{0x140003cc547, 0x55}, 0x14000190d20?, 0x140000c03a8?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x14000428207?, {0x105243448, 0x1400012c420}, 
{0x14000428207, 0x55}, 0x14000190700?, 0x140000c0708?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x140001823a7?, {0x105243448, 0x1400012c420}, 
{0x140001823a7, 0x55}, 0x140002d4340?, 0x140000c0a68?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x140004283a7?, {0x105243448, 0x1400012c420}, 
{0x140004283a7, 0x55}, 0x140001902a0?, 0x140000c0dc8?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x140003cc067?, {0x105243448, 0x1400012c420}, 
{0x140003cc067, 0x55}, 0x140000c8000?, 0x140000c1128?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x14000428067?, {0x105243448, 0x1400012c420}, 
{0x14000428067, 0x55}, 0x14000524380?, 0x140004fb488?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x14000428547?, {0x105243448, 0x1400012c420}, 
{0x14000428547, 0x55}, 0x140000a03c0?, 0x140004fb7e8?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x140001f4002?, {0x105243448, 0x1400012c420}, 
{0x140001f4002, 0x55}, 0x140002120d0?, 0x1400046a930?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate(0x140000a93a0?, {0x105243448, 0x1400012c420}, 
{0x1400017a000, 0x4e}, 0x0?, 0x0?, 0x0?, 0x0?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:112 +0x350
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).Run.func1()
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:44 +0x1a8
created by github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).Run
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:33 +0x120

@tarunKoyalwar
Member

@vzamanillo

It seems we can speed up this source.

GitLab rate limit = 2000 req/min
Subfinder using GitLab = < 100 req/min

Searching hackerone.com returns around 1000 results. It takes approx. 1000 requests to fetch the results from GitLab (i.e. > 10 minutes). We are not even hitting the rate limit. It will take too much time just to fetch subdomains for 10 domains.

We can probably speed it up by:

  1. Fetching all search results (all pages 1->10)
  2. Creating a fixed number of worker goroutines based on the search results and parsing them (or any other idea, depending on architecture and complexity)

@tarunKoyalwar tarunKoyalwar removed the request for review from ehsandeep November 30, 2022 09:33
@vzamanillo
Contributor Author

vzamanillo commented Nov 30, 2022

Hey @tarunKoyalwar,

thank you for your review,

Regarding your proposal:

  • "Fetch All Search results (all pages 1->10)": the maximum number of search API results per request is 100, which is why we have to check the response headers after each request is done.
  • "Create a fixed amount of worker goroutines based on search results and parse them (or any other Idea depends on architecture and complexity)": this could work, but I think we are going to reach the request rate limit very soon. If that happens, we will wait at least about 60 seconds before making a new request, so I don't know if there is a real improvement here.
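As a rough illustration of the header check mentioned above, the sketch below pulls the rel="next" target out of a Link header value. The function name nextPageURL and the gitlab.example URLs are made up for the example, and the parsing is deliberately naive: it assumes the URLs contain no commas or semicolons.

```go
package main

import (
	"fmt"
	"strings"
)

// nextPageURL returns the URL tagged rel="next" in a Link header value,
// or "" when there is no further page.
func nextPageURL(linkHeader string) string {
	for _, link := range strings.Split(linkHeader, ",") {
		parts := strings.Split(link, ";")
		if len(parts) < 2 {
			continue
		}
		for _, param := range parts[1:] {
			if strings.TrimSpace(param) == `rel="next"` {
				// The URL is wrapped in angle brackets: <https://...>
				return strings.Trim(strings.TrimSpace(parts[0]), "<>")
			}
		}
	}
	return ""
}

func main() {
	// Hypothetical header value; the host and paths are placeholders.
	header := `<https://gitlab.example/search?page=2>; rel="next", ` +
		`<https://gitlab.example/search?page=1>; rel="first"`
	fmt.Println(nextPageURL(header))
}
```

An empty return value is the signal to stop paginating.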

@tarunKoyalwar
Member

@vzamanillo

It looks like my proposal was not clear. Please find the details below.

Current Methodology

  • Search using the API
  • Read the search response
    • Parse the links/refs (items)
    • Fetch each item, run the regex and get the results
    • If a Link header is available, fetch the next page (recursion)

The easiest way to speed this up would be to fetch the items and run the regex in parallel instead of serially.

My Proposal

  • Search using the API
  • chan Item: a new channel for items
  • Goroutine 1
    • Read the search response
    • Parse the items and send them to chan Item
    • Fetch the next page and repeat (recursion)
  • Worker goroutines (say 25, depending on performance), each of which:
    • Receives an Item
    • Runs the regex and sends the results

If possible, you can further improve it.
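The proposal above can be sketched as follows. This is not the code from the PR: the item type, the extract function and the in-memory pages are assumptions made to keep the example self-contained, and fetching an item over HTTP is replaced by reading a string field.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"sync"
)

// item stands in for one search hit; body simulates the fetched content.
type item struct{ body string }

// extract feeds items from all pages into a channel, lets a fixed number
// of workers run the regex on them, and returns the sorted unique matches.
func extract(pages [][]item, pattern string, workers int) []string {
	if workers < 1 {
		workers = 1
	}
	re := regexp.MustCompile(pattern)
	items := make(chan item)
	results := make(chan string)

	// Producer ("goroutine 1"): walk the pages and send every item.
	go func() {
		defer close(items)
		for _, page := range pages {
			for _, it := range page {
				items <- it
			}
		}
	}()

	// Workers: receive an item, run the regex, send the results.
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for it := range items {
				for _, m := range re.FindAllString(it.body, -1) {
					results <- m
				}
			}
		}()
	}

	// Close results only after every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	seen := map[string]struct{}{}
	for r := range results {
		seen[r] = struct{}{}
	}
	out := make([]string, 0, len(seen))
	for s := range seen {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

func main() {
	pages := [][]item{
		{{body: "see api.example.com and www.example.com"}},
		{{body: "api.example.com again"}},
	}
	fmt.Println(extract(pages, `[a-z]+\.example\.com`, 25))
}
```

The worker count is the main tuning knob; with a real HTTP fetch in each worker it also bounds the number of requests in flight.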

@tarunKoyalwar
Member

@vzamanillo

This idea currently has one limitation: the rate limit is global rather than per source, as you mentioned in #718. Since that is a great idea, it will be implemented soon, but for now I think we can use the new ratelimit (https://github.com/projectdiscovery/ratelimit) before fetching each response. However, this solution is good only if the global rate limit is unlimited.

The easiest solution is to implement #718.

Any suggestions, @ehsandeep @Mzack9999?

@vzamanillo feel free to share if you have a better solution for this.
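A per-source limiter placed before each fetch could look roughly like the sketch below. It uses only the standard library rather than the projectdiscovery/ratelimit package linked above, whose API is not shown in this thread; limiter, newLimiter and interval are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// interval converts a requests-per-minute budget into the gap between
// two requests. perMinute must be positive.
func interval(perMinute int) time.Duration {
	return time.Minute / time.Duration(perMinute)
}

// limiter releases at most one caller per interval.
type limiter struct{ ticker *time.Ticker }

func newLimiter(perMinute int) *limiter {
	return &limiter{ticker: time.NewTicker(interval(perMinute))}
}

// take blocks until the next tick. It can be called from many
// goroutines; each tick wakes exactly one of them.
func (l *limiter) take() { <-l.ticker.C }

// stop releases the ticker's resources.
func (l *limiter) stop() { l.ticker.Stop() }

func main() {
	fmt.Println(interval(2000)) // 30ms between requests at 2000 req/min

	l := newLimiter(2000)
	defer l.stop()
	for i := 0; i < 3; i++ {
		l.take() // call right before each HTTP request
		fmt.Println("request", i)
	}
}
```

Because a ticker drops ticks that nobody is waiting for, this variant never allows bursts; a token bucket would be the usual choice if short bursts are acceptable.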

Concurrent item processing
Hardcoded delay before parsing every 100 item block
@vzamanillo
Contributor Author

@tarunKoyalwar, please check the latest commit. It is a proof of concept and something still has to be fixed (the hardcoded delay), but the results are acceptable.

go run main.go -d hackerone.com -s gitlab -t 250 -v

takes:

[screenshot]

@tarunKoyalwar
Member

@vzamanillo, it fails when multiple domains are passed

....
[INF] Found 20 subdomains for 'hackerone.com' in 3 minutes 17 seconds
[INF] Enumerating subdomains for 'inspectiv.com'
panic: send on closed channel

goroutine 44993 [running]:
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).proccesItems(0x0?, {0x10167f748, 0x1400007e480}, {0x14000256000?, 0x31, 0x140005d2e50?}, 0x0?, 0x100ad1094?, 0x14000927200?, 0x140003b3290?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:106 +0x274
created by github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:75 +0x2d4
panic: send on closed channel

goroutine 43767 [running]:
github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).proccesItems(0x1400063c420?, {0x10167f748, 0x1400007e480}, {0x140006da000?, 0x63, 0x14000657e50?}, 0x0?, 0x100ad1094?, 0x140000010e0?, 0x140006bd290?)
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:106 +0x274
created by github.com/projectdiscovery/subfinder/v2/pkg/subscraping/sources/gitlab.(*Source).enumerate
	/Users/tarun/reviews/subfinder/v2/pkg/subscraping/sources/gitlab/gitlab.go:75 +0x2d4
exit status 2
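A "send on closed channel" panic means some goroutine was still sending after the results channel had been closed. One common way to avoid that, sketched here under the assumption that one goroutine is started per block of items (as the "created by ... enumerate" frames suggest), is to close the channel only after a WaitGroup confirms every sender has returned. The run function and its string pages are illustrative, not the PR's code.

```go
package main

import (
	"fmt"
	"sync"
)

// run starts one sender goroutine per page and closes the results
// channel only after all of them have finished.
func run(pages [][]string) <-chan string {
	results := make(chan string)
	go func() {
		// The deferred close runs after wg.Wait returns, so no sender
		// can still be alive when the channel is closed.
		defer close(results)
		var wg sync.WaitGroup
		for _, page := range pages {
			wg.Add(1)
			go func(page []string) {
				defer wg.Done()
				for _, s := range page {
					results <- s
				}
			}(page)
		}
		wg.Wait()
	}()
	return results
}

func main() {
	n := 0
	for range run([][]string{{"a", "b"}, {"c"}}) {
		n++
	}
	fmt.Println(n) // prints 3
}
```

Creating a fresh channel and WaitGroup for each domain keeps a late sender from one run from touching the channel of the next.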

Víctor Zamanillo and others added 2 commits November 30, 2022 22:19
* digitorus source

* Sorted AllSources

* Fixed TestSourceFiltering after merging dev branch
Member

@tarunKoyalwar tarunKoyalwar left a comment

lgtm!

@tarunKoyalwar
Member

@ehsandeep,

GitLab has a rate limit of 2000 req/min, and the gitlab source does on average > 90 req/min.
The existing subfinder architecture slows down source execution speed. The current implementation creates a new waitgroup to speed up the source, but it is not stable. The PR can be merged into dev, but it requires completion of #718 and other planned major refactoring before being merged into master/main.

@ehsandeep ehsandeep merged commit 27d4087 into projectdiscovery:dev Dec 15, 2022
@vzamanillo vzamanillo deleted the gitlab-search branch December 15, 2022 11:22
4 participants