Spider Speed and Turbo Crawl

The Spider Speed option lets the user increase the speed of the crawler. By default, the application creates 5 threads that scrape the website data in parallel. If more speed is required (and the computer hardware and internet connection are capable of handling it), the thread count can be raised with the Spider Speed option in the Spider menu. See the image below.

spider speed 1

The image below shows how this option looks.

spider speed 2
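
The idea of a configurable thread count can be illustrated with a short Python sketch. This is not Webbee's actual implementation; the names `crawl_in_parallel`, `fetch`, and `spider_speed` are hypothetical, and only the default of 5 threads comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_SPIDER_SPEED = 5  # default thread count mentioned in the text


def crawl_in_parallel(urls, fetch, spider_speed=DEFAULT_SPIDER_SPEED):
    """Fetch every URL using `spider_speed` worker threads.

    `fetch` is any callable that takes a URL and returns a result. It is
    passed in (an assumption of this sketch) so the example does not
    depend on a particular HTTP client.
    """
    if spider_speed < 1:
        raise ValueError("spider_speed must be at least 1")
    with ThreadPoolExecutor(max_workers=spider_speed) as pool:
        # pool.map preserves input order, so results line up with urls.
        results = list(pool.map(fetch, urls))
    return dict(zip(urls, results))
```

Raising `spider_speed` only helps while the machine and the network can keep up, which is why the option is left to the user.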

The ‘Connection Timeout’ option sets how many seconds the application waits when connecting to a URL. It can be set from 1 to 15 seconds. If the webpage does not respond within that time, the application throws a ‘Connection Timeout Exception’, which is handled properly. This option also lets the user gauge the website’s response time.
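
As a rough illustration of how such a timeout behaves, the Python sketch below validates the 1 to 15 second range and measures how long a request takes. It uses the standard library's `urllib` and is not Webbee's code; the function names and the default of 5 seconds are assumptions made for the example.

```python
import socket
import time
import urllib.error
import urllib.request

MIN_TIMEOUT, MAX_TIMEOUT = 1, 15  # seconds, the range described above


def validate_timeout(seconds):
    """Return `seconds` if it lies in the allowed range, else raise."""
    if not MIN_TIMEOUT <= seconds <= MAX_TIMEOUT:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT} and {MAX_TIMEOUT} seconds"
        )
    return seconds


def timed_fetch(url, timeout=5):  # default of 5 is an assumption
    """Return (status, elapsed_seconds).

    `status` is the HTTP status code, or None if the connection failed
    or timed out. The failure is caught here rather than propagated.
    """
    timeout = validate_timeout(timeout)
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with an error status
    except (urllib.error.URLError, socket.timeout):
        status = None  # no usable answer within the timeout
    return status, time.monotonic() - start
```

The elapsed time returned alongside the status is what gives a sense of how quickly the website responds.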

Turbo Crawl

There is also a ‘Turbo Crawl’ option. It saves the user time by crawling the website at turbo speed without a manual Spider Speed setting; the spider speed is set to turbo automatically.

Requirements for Turbo Crawl

Turbo Crawl requires good hardware. Make sure you have at least a ‘Core i’ series processor and a fast internet connection before enabling this option.

Note: While Turbo Crawl is enabled, all filters and search options are inaccessible in order to preserve data authenticity. They become accessible again once the crawl is completed and/or finalized by the user. The option can be found in the Spider menu. Below is a snapshot.

turbo crawl

Scenarios where Turbo Crawl is Recommended

Below are some scenarios where Turbo Crawl is worth using.

  • When a deep crawl is required: make sure the system has enough memory to crawl the maximum number of URLs from a website.
  • When big data is required: Webbee crawls every bit of a webpage, then sorts and schedules it into different files for the user. Read the “Download Options and Advance Reports Download” section for more details.
  • When sitemap extraction is required and the target website lists large sitemaps and a large number of sitemaps in its robots.txt. (If robots.txt does not contain the sitemap URLs, manual input can be used for sitemap crawling. Read the Custom Robots section.)
  • When only header status codes are required for a website, i.e. how many of its webpages return 200 OK, redirects, and/or 404 Not Found.
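
The last two scenarios can be sketched in Python with the standard library. This is a sketch under stated assumptions rather than a description of how Webbee works internally: `sitemaps_from_robots` and `summarize_statuses` are hypothetical helper names, and the "Other" bucket is added only so that every status code is counted.

```python
from collections import Counter
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


def summarize_statuses(status_codes):
    """Count status codes in the buckets named in the last scenario."""
    summary = Counter()
    for code in status_codes:
        if code == 200:
            summary["200 OK"] += 1
        elif 300 <= code < 400:
            summary["Redirects"] += 1
        elif code == 404:
            summary["404 Not Found"] += 1
        else:
            summary["Other"] += 1  # bucket assumed for this sketch
    return dict(summary)
```

When `sitemaps_from_robots` returns an empty list, that corresponds to the case where the sitemap URLs have to be supplied manually.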

Ahmad Ali

About Ahmad Ali

Ahmad is the co-founder and CEO of Webbee Inc. He has been working as a digital marketer for the past few years and has worked with some notable names across different industries. He is also the creator of Webbee SEO Spider, one of the most advanced SEO spider tools on the internet.