Using multiple sitemaps to analyse indexation on large sites
One of the easy wins in improving search traffic to a large site is to improve indexation. Indexation isn’t about the raw number of pages indexed; it’s about increasing the percentage of real, high-value pages that are indexed.
Forcing Google to index useless pages that won’t get any traffic isn’t going to help things.
Indexation is quite a straightforward issue. Every site has an indexation cap based on a number of factors, including:
- PageRank
- Trust
- Site / server speed
- Duplicate content
The last one is hard to explain, but basically if Google sees loads of pages that are the same, it probably won’t crawl the site as deeply as it would if it found a lot of high-value, unique pages.
Monitoring indexation with the site: command every month is good, and looking at the number of pages that receive at least one visitor each month is better, but both methods only look at the site as a whole. What we need is a way of breaking the numbers down so we can see which pages are not indexed and figure out how to improve things.
Multiple sitemaps
This is where multiple sitemaps come in. Rather than using one giant sitemap, we like to use a separate sitemap for each type of page on the site.
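As a rough sketch of what this could look like in practice, the Python below builds one sitemap per page type plus a sitemap index that points at them, using the standard sitemaps.org XML format. The page types, the example.com URLs and the sitemap-*.xml filenames are all made up for illustration; your own site will have its own groupings.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a <urlset> document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document pointing at each per-type sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = url
    return ET.tostring(index, encoding="unicode")


# Hypothetical page types and URLs, for illustration only.
pages_by_type = {
    "products": [
        "https://www.example.com/product/1",
        "https://www.example.com/product/2",
    ],
    "categories": ["https://www.example.com/category/shoes"],
    "paginated-categories": ["https://www.example.com/category/shoes/page/2"],
}

# One sitemap document per page type, keyed by an assumed filename.
sitemaps = {
    f"sitemap-{name}.xml": build_sitemap(urls)
    for name, urls in pages_by_type.items()
}

# The index lists the public URL of every per-type sitemap.
index_xml = build_sitemap_index(
    f"https://www.example.com/{filename}" for filename in sitemaps
)
```

Each string in `sitemaps` would be written to its own file, and each file submitted separately so that it gets its own indexed count.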
That way we can look at the number of pages indexed for each page type and immediately see, for example, that 76% of product pages are indexed but only 43% of the lower-level paginated category pages are.
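The arithmetic is simple enough to do in a spreadsheet, but here is a minimal Python sketch, assuming you have copied the submitted and indexed counts for each sitemap by hand (for example from the sitemap report in Webmaster Tools). The sitemap names and figures are invented, chosen only to reproduce the percentages above.

```python
def indexation_rates(counts):
    """Map each sitemap name to the percentage of submitted URLs that are indexed."""
    rates = {}
    for name, (submitted, indexed) in counts.items():
        # Guard against an empty sitemap to avoid dividing by zero.
        rates[name] = round(100 * indexed / submitted, 1) if submitted else 0.0
    return rates


# Made-up (submitted, indexed) figures, for illustration only.
counts = {
    "sitemap-products.xml": (12500, 9500),
    "sitemap-paginated-categories.xml": (4000, 1720),
}

rates = indexation_rates(counts)

# Print the worst-indexed page types first, since those need attention.
for name, rate in sorted(rates.items(), key=lambda item: item[1]):
    print(f"{name}: {rate}% indexed")
```

Sorting by the lowest rate puts the page types that most need work at the top of the list.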

Once you can diagnose exactly which types of page Google doesn’t want to index, you can fix the issue by improving PageRank flow to those pages and adding more unique content.
Some ideas for the types of page you might like to look at separately:
- New products this month
- Top selling products
- Pages in French/English/German etc
- Products that have not been selling
- Blog posts from a particular month/year
- Product pages
- Category pages
- Paginated category pages (page 2 of 10 etc)
- Products in a certain category
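For groupings that follow the URL structure (product, category, paginated category, blog posts by date), one way to split a URL list into buckets is a small set of pattern rules. The sketch below is only an illustration: the URL patterns are hypothetical and would need to match your own site, and groupings such as top sellers or new products would have to come from your database instead.

```python
import re
from collections import defaultdict

# Hypothetical URL patterns; the first match wins, so specific rules go first.
RULES = [
    ("paginated-categories", re.compile(r"^/category/[^/]+/page/\d+/?$")),
    ("categories", re.compile(r"^/category/[^/]+/?$")),
    ("products", re.compile(r"^/product/[^/]+/?$")),
    ("blog", re.compile(r"^/blog/\d{4}/\d{2}/")),
]


def classify(path):
    """Return the page type for a URL path, or "other" if no rule matches."""
    for page_type, pattern in RULES:
        if pattern.search(path):
            return page_type
    return "other"


def group_by_type(paths):
    """Group URL paths into lists keyed by page type."""
    groups = defaultdict(list)
    for path in paths:
        groups[classify(path)].append(path)
    return dict(groups)


# Example paths, invented for illustration.
paths = [
    "/product/blue-widget",
    "/category/widgets",
    "/category/widgets/page/2",
    "/blog/2011/03/some-post",
    "/about",
]
groups = group_by_type(paths)
```

Each list in `groups` can then become its own sitemap.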
Thanks to John from web development leeds for the screenshot.