Description
This issue is for tracking the ongoing development of BBOT's web spider capabilities.
BBOT's web spider is already solid, easily crawling websites and extracting URLs, JS links, etc. How it performs relative to ProjectDiscovery's Katana has not yet been measured.
Here are some things we can do to build on BBOT's feature set, to make it a best-in-class web spider:
- Test both BBOT and Katana against the same targets to identify strengths, weaknesses, and blind spots.
- Improve BBOT's Custom YARA Rules documentation to include useful, real examples instead of the "AAAAA" placeholders.
- Create web-spidering presets for more custom use cases, e.g. for when a user wants to extract and display all links.
- Replace Gowitness with a more native headless solution that integrates cleanly with the web spider.
Also, consolidating excavate and rewriting it in Rust, along with its custom rule integration, will enable us to spider at scale with the highest possible performance.
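As a starting point for the side-by-side comparison in the first item above, the URL lists produced by each tool could be normalized and diffed. This is only a rough sketch: it assumes each tool's output has already been reduced to a plain list of URL strings, and the `normalize` and `compare` helpers are hypothetical names, not part of BBOT or Katana.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop the fragment, and default an empty path to '/'."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def compare(bbot_urls, katana_urls):
    """Return the URLs found only by BBOT, only by Katana, and by both."""
    a = {normalize(u) for u in bbot_urls if u.strip()}
    b = {normalize(u) for u in katana_urls if u.strip()}
    return {
        "bbot_only": sorted(a - b),
        "katana_only": sorted(b - a),
        "both": sorted(a & b),
    }


# Made-up sample data; real input would come from each tool's own output.
result = compare(
    ["https://Example.com", "https://example.com/a#x"],
    ["https://example.com/", "https://example.com/b"],
)
```

The `bbot_only` and `katana_only` buckets are where blind spots would show up; the normalization step is deliberately conservative so that trivially different spellings of the same URL are not counted as misses.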
Why not a Katana module?
While a Katana module would be easy to write, it wouldn't be ideal for two main reasons:
- BBOT is already recursive, and introducing another recursive tool is likely to have unintended side effects. Examples include infinite recursion bugs, visiting the same URL multiple times, or putting heavy stress on the target.
- Many of Katana's features are already included in BBOT, including configurable web spider settings, URL extraction, and custom rules to search HTTP responses.
Therefore, the best approach is to polish BBOT's existing spider feature set to make it more effective and user-friendly.
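To illustrate the recursion concern in the first point, here is a minimal sketch of the two guards a single recursive crawler needs: a set of already-seen URLs and a depth limit. It is not BBOT's actual implementation; `crawl`, `get_links`, and the `SITE` link graph are hypothetical stand-ins for real HTTP fetching. Nesting a second recursive tool inside the first would mean a second, independent copy of this state, which is how repeat visits and runaway recursion creep in.

```python
from collections import deque


def crawl(start, get_links, max_depth=2):
    """Breadth-first crawl that visits each URL at most once and stops at max_depth."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:
            continue  # depth limit: do not follow links any further
        for link in get_links(url):
            if link not in seen:  # dedup: skip URLs already queued or visited
                seen.add(link)
                queue.append((link, depth + 1))
    return order


# Hypothetical link graph with a cycle (/a -> /b -> /a) in place of real requests.
SITE = {
    "/a": ["/b", "/c"],
    "/b": ["/a", "/d"],
    "/c": [],
    "/d": ["/e"],
}

visited = crawl("/a", lambda u: SITE.get(u, []), max_depth=2)
```

With this graph, the cycle between `/a` and `/b` is followed only once, and `/e` is never requested because it lies beyond the depth limit.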
Relevant: