Analysis

The crawl phase of a scan involves navigating the target web application, following links and submitting forms, to catalog its content and navigational paths and build an accurate map of the application.

Configure the crawler

The next step is to set the configuration the crawler will run with.

This is a crucial step, as the performance and success of the crawler depend on the configuration you set. Take time to study the effect of each configuration option.

Click the (config action) icon on the toolbar to open the configuration dialog, where you can set your preferred configuration.

The configurations that affect the crawler are:

  • Limits - Sets all the limits for the crawler.

  • Crawler - Sets all the essential crawler configurations.

  • Headers - Sets the HTTP request headers used by the crawler's requests.

  • Input Fields - Sets the values used to automatically fill and submit a page's input fields.

  • Exclusion - Sets the paths, cookies, file extensions and URL parameters to be excluded from the crawl.

  • Authentication - Sets the values used for automatic authentication by the crawler.

  • Proxy - Sets the proxy address and port through which all crawl requests will pass.

SpiderSuite can crawl an entire target web application from a single link, which acts as the root entry page for the target.

Starting a crawl requires only a few simple steps.

  • Input the target link

Enter the target URL.

A few things to note:

Always make sure you input a valid URL with its protocol/scheme; the sketch after the examples below illustrates these rules.

- Valid Links are:
     https://example.com
     https://example.com/
     https://example.com/path1/path2
     https://example.com:443/path1
     https://127.0.0.1:80

- Invalid Links are:
    example.com
    https://example/
    https://example.com?param1=value1&param2=value2
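
The validity rules above (scheme required, full hostname, no query string) can be summed up in a short sketch. This is a minimal illustration in Python, not SpiderSuite's actual validator:

```python
# Minimal illustration of the target-link rules above; not SpiderSuite's validator.
from urllib.parse import urlparse

def is_valid_target(url: str) -> bool:
    """Check a candidate target link against the rules listed above."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):  # protocol/scheme is required
        return False
    host = parsed.hostname or ""
    if "." not in host:                         # must be a dotted name or IP address
        return False
    if parsed.query:                            # query parameters are not accepted
        return False
    return True

assert is_valid_target("https://example.com/path1/path2")
assert is_valid_target("https://127.0.0.1:80")
assert not is_valid_target("example.com")                        # no scheme
assert not is_valid_target("https://example/")                   # not a full hostname
assert not is_valid_target("https://example.com?param1=value1")  # query string
```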
  • Start Crawler

Start the crawler by clicking the Crawl button; the crawler will immediately start crawling the target web application.

After starting the crawler, you can observe the crawler's progress in the far right corner, which shows the number of pages crawled out of all the pages discovered so far:

progress = pages crawled / total pages discovered

After the crawler has started, you have the option to pause or stop it.

  • Pause Crawler

SpiderSuite allows you to pause the crawler at any point during crawling, and for any duration of time, by pressing the Pause button.

After you press the Pause button, the crawler immediately stops sending requests to the server and processing new replies, but pages that have already been fetched and processed will still be added to the sitemap. It is therefore not uncommon to pause the crawler and still see a few pages being added to the sitemap.

  • Resume Crawler

After pausing the crawler, you can resume crawling by pressing the Resume button; the crawler will immediately resume crawling the target web pages where it left off.

  • Stop Crawler

You can stop the crawler at any point in time by pressing the Stop button. Stopping the crawler terminates it: you can no longer resume that particular crawl, and all allocated resources are cleaned up, so you can only start afresh from there.

After you press the Stop button, the crawler immediately stops sending requests to the server, but it waits for all already-sent requests to be processed and added to the sitemap before it kills the crawler threads. So it is not uncommon to stop the crawler and still see a few pages being added to the sitemap while the remaining responses from the target server are processed.
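
The behaviour described above amounts to a drain-before-halt pattern: stop handing out new work, but let requests already in flight finish. Below is a minimal Python sketch of that pattern, an illustrative model only and not SpiderSuite's implementation:

```python
# Illustrative model of the Pause/Resume/Stop semantics described above;
# not SpiderSuite's actual implementation.
import queue
import threading

class MiniCrawler:
    def __init__(self, seed_urls):
        self.pending = queue.Queue()
        for url in seed_urls:
            self.pending.put(url)
        self.running = threading.Event()    # cleared by pause(), set by resume()
        self.running.set()
        self.stopping = threading.Event()   # set once by stop(); final
        self.sitemap = []

    def fetch(self, url):
        return {"url": url}                 # stand-in for a real HTTP request

    def worker(self):
        while not self.stopping.is_set():
            if not self.running.wait(timeout=0.5):
                continue                    # paused: send no new requests
            try:
                url = self.pending.get_nowait()
            except queue.Empty:
                break                       # queue drained: nothing left to crawl
            page = self.fetch(url)          # a request already sent is still
            self.sitemap.append(page)       # processed and added to the sitemap,
                                            # even if Pause/Stop arrives meanwhile

    def pause(self):
        self.running.clear()

    def resume(self):
        self.running.set()

    def stop(self):
        self.stopping.set()                 # stop is final: the crawl cannot resume
        self.running.set()                  # unblock paused workers so they can exit
```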

After stopping the crawler, you may be prompted to save the remaining target links that had not been crawled yet.

If you accept, all the pending links will be added to the passive crawler tool.

If you decline, all the pending links will be discarded.

Advanced Crawling

SpiderSuite has advanced crawling options.

These advanced crawling options can be accessed by clicking the [...] button.

With SpiderSuite you have the ability to provide an initial list of links (seed links) for the particular target; they will be added to the crawler's queue of links to be crawled.

  • Enter the target link
  • Add the seed links

Note: The provided links should all relate to the main target link (have the same hostname), as the crawler uses the main target link as the reference point for all the links to be crawled; see the sketch after these steps.

  • Click on Crawl to begin crawling
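
As a rough illustration of the same-hostname rule in the note above, here is a minimal Python sketch (the function name is hypothetical, not part of SpiderSuite):

```python
# Hypothetical illustration of the same-hostname rule for seed links.
from urllib.parse import urlparse

def filter_seed_links(target: str, seeds: list[str]) -> list[str]:
    """Keep only seed links that share the main target link's hostname."""
    target_host = urlparse(target).hostname
    return [s for s in seeds if urlparse(s).hostname == target_host]

seeds = [
    "https://example.com/login",
    "https://example.com/blog/post1",
    "https://other.org/page",            # different hostname: dropped
]
print(filter_seed_links("https://example.com", seeds))
# ['https://example.com/login', 'https://example.com/blog/post1']
```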

With SpiderSuite you also have the ability to fetch a provided list of links.

This type of crawling (fetching) does not crawl any additional links extracted from the crawled pages; only the provided links will be crawled.

  • You can provide a file containing the list of links to be fetched. This is ideal for fetching a very long list of links, as the file is not loaded into memory (a streaming API reads the file one link at a time and fetches each one); see the sketch after this list.
  • Or you can input the list of links to be fetched directly. This is ideal for fetching a small to medium list of links, as the list is stored in memory.
  • Start the crawl by clicking on the Fetch button.
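
As a rough sketch of the streaming approach in the first item above, the following Python snippet reads the file lazily, one link per line, so the whole list never sits in memory. It is illustrative only; urllib stands in for SpiderSuite's own HTTP client, and the file name is hypothetical:

```python
# Illustrative streaming fetch: the link file is read line by line,
# never loaded into memory as a whole. Not SpiderSuite's code.
from urllib.request import urlopen

def fetch_from_file(path: str):
    with open(path) as links:        # iterating a file object streams it lazily
        for line in links:
            url = line.strip()
            if not url:
                continue             # skip blank lines
            try:
                with urlopen(url, timeout=10) as response:
                    print(url, response.status)
            except OSError as err:   # URLError/HTTPError both derive from OSError
                print(url, "failed:", err)

# fetch_from_file("links.txt")       # hypothetical file, one URL per line
```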

Bruteforcing pages/directories

Lastly, SpiderSuite also has the ability to bruteforce a target site's directories (pages). This is a useful feature for discovering directories and files that may be hidden.

  • Set the target link (URL).
  • You can provide a file containing the wordlist of pages/directories to be used for bruteforcing. This is ideal for bruteforcing with a very long wordlist, as the file is not loaded into memory (a streaming API reads one page name at a time, appends it to the target link and fetches it); see the sketch after this list.
  • Or you can input the wordlist for the bruteforce directly. This is ideal for a small to medium wordlist.
  • Start bruteforcing by clicking on the Bruteforce button.
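
A minimal Python sketch of the append-and-fetch loop described above (illustrative only; the wordlist file name and the status handling are assumptions, not SpiderSuite's code):

```python
# Illustrative append-and-fetch bruteforce loop; not SpiderSuite's code.
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

def bruteforce(target: str, wordlist_path: str):
    base = target.rstrip("/")
    with open(wordlist_path) as words:    # streamed line by line
        for line in words:
            name = line.strip()
            if not name:
                continue
            url = f"{base}/{name}"        # append the page name to the target
            try:
                with urlopen(url, timeout=10) as response:
                    print("found:", url, response.status)
            except HTTPError as err:
                if err.code != 404:       # 401/403 etc. can still be interesting
                    print("hit:", url, err.code)
            except URLError:
                pass                      # unreachable; skip

# bruteforce("https://example.com", "wordlist.txt")  # hypothetical wordlist file
```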

Sitemap

All the results from Crawling, Fetching and Bruteforcing are displayed in SpiderSuite's Sitemap.

The actual pages are saved in SpiderSuite's current project database (.sspd) file.

To view the content of any page on the sitemap, simply click on it; all its content will be displayed in the Structure and Source tabs.

You can browse the Structure and Source tabs to view all the content extracted from the particular page you clicked on.

You can perform actions on a page in the sitemap by right-clicking it and choosing the action you want to perform.

Or you can perform actions on the entire sitemap list by clicking the Actions button. Please note that the Actions button is only enabled if there are links on the sitemap.

Graph

SpiderSuite can visualize the links on the sitemap as a graph, and gives you the ability to manipulate the graph to your liking.

  • Visualize the entire sitemap on a graph.

Simply click the Actions button and click the Show Graph action to visualize the graph.

  • Visualize a sitemap branch on a graph.

First, change the sitemap's view to Tree view.

Then simply right-click the branch you want to visualize and click Show Graph. This will only show the graph for that particular chosen branch.

  • The Graph

You can manipulate the graph to your liking by clicking the (config action) icon on the graph menu bar and setting your desired configurations.