An XML sitemap helps crawlers identify changes to your website, which in turn helps with indexing and rankings. An XML sitemap also lists all the pages of a website that are intended to be crawled and ranked by search engines.
The application includes several sitemap features. The details are below.
By default, if a website lists a sitemap URL in its robots.txt, that sitemap is crawled. A separate tab shows the details of the website's sitemaps. See the image below.
Clicking ‘View Sitemap’ shows all the information collected while a sitemap is being crawled, i.e. sitemap URL, page URL, last modified date, change frequency and priority. Clicking the ‘Download’ button downloads the sitemap being viewed. See the image below.
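For readers who want to see where these fields come from, here is a minimal Python sketch that reads the same values (page URL, last modified date, change frequency and priority) out of a sitemap that follows the sitemaps.org format. It is an illustration only, not the application's own code; the function name `parse_sitemap`, the sample XML and the example.com URLs are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# XML namespace used by sitemaps that follow the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, used only for illustration.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://www.example.com/about</loc>
  </url>
</urlset>"""


def parse_sitemap(xml_text):
    """Return one dict per <url> entry; optional fields are None when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        entries.append({
            "loc": url.findtext("sm:loc", default=None, namespaces=NS),
            "lastmod": url.findtext("sm:lastmod", default=None, namespaces=NS),
            "changefreq": url.findtext("sm:changefreq", default=None, namespaces=NS),
            "priority": url.findtext("sm:priority", default=None, namespaces=NS),
        })
    return entries


if __name__ == "__main__":
    for entry in parse_sitemap(SAMPLE_SITEMAP):
        print(entry)
```

Only the page URL is required in a sitemap entry; the other three fields are optional, which is why the sketch returns None for them when they are missing.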
The default crawling option crawls the sitemap in parallel with the website. If you only want to crawl sitemaps, there is another option under the ‘Spider’ menu that lets you do this. See the image below.
With this option checked, the application will only crawl the sitemaps.
There may be cases where you want to exclude the sitemap from crawling, since by default it is crawled along with the website. For such cases there is an option to ignore the sitemap. The option can be found in the ‘Configuration Panel’ under the ‘Spider’ menu. See the image below.
Website without Sitemap URL in Robots.txt
By default, the crawler fetches robots.txt and searches it for a sitemap URL. If it finds one, it crawls that sitemap; otherwise it shows a notification that the sitemap URL was not found. If a website has a sitemap but its URL is not listed in robots.txt, ‘Custom Robots’ can be used to crawl the sitemap. Below is a snapshot for better understanding.
The above customization makes the crawler ignore the website's actual robots.txt and crawl the sitemap specified in the custom robots instead.
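The robots.txt lookup described in this section can be sketched in a few lines of Python. This is only an illustration of the general idea, not the application's implementation: the function name `find_sitemap_urls`, both robots.txt samples and the example.com URL are assumptions. The second sample shows what a custom robots text with an added Sitemap line might look like.

```python
def find_sitemap_urls(robots_txt):
    """Return every URL declared on a 'Sitemap:' line of a robots.txt text."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split on the first colon only, so the URL's own "https:" stays intact.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt that does not mention a sitemap.
ACTUAL_ROBOTS = """User-agent: *
Disallow: /admin/
"""

# Hypothetical custom robots text that adds the missing Sitemap line.
CUSTOM_ROBOTS = """User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemap.xml
"""

if __name__ == "__main__":
    for name, text in (("actual", ACTUAL_ROBOTS), ("custom", CUSTOM_ROBOTS)):
        found = find_sitemap_urls(text)
        if found:
            print(name, "robots.txt lists:", found)
        else:
            print(name, "robots.txt: sitemap URL not found")
```

The field name is matched case-insensitively because robots.txt files in the wild write it as both "Sitemap" and "sitemap".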