Having Problems to Index all URLs on Sitemap
-
Hi all again ! Thanks in advance ! My client's site is having problems to index all its pages. I even bought the full extension of XML Sitemaps and the number of urls increased, but we still have problems to index all of them.
What are the reasons? The robots.txt is open for all robots, we only prohibit users and spiders to enter our Intranet. I've read that duplicate content and 404's can be the reason. Anything else?
-
Like already answered above it's quite hard to get to the 100% indexation rate for your webpages. What's your current indexation rate? If it's below 90% you might still have some issues somewhere.
-
Hi there
According to Google:
"...we don't guarantee that we'll crawl all of the pages of a particular site. Google doesn't crawl all the pages on the web, and we don't index all the pages we crawl. It's perfectly normal for not all the pages on a site to be indexed."
Google also provides tips and resources to help your site being indexed properly and, possibly, more fully. You can check that resource out here. Kissmetrics has a few other tips.
To Andy's point, Google indexes what it wants - don't be discouraged if your entire site isn't indexed in WMT. If you know how many pages are on your site (which you definitely should), I would try the "site:" function and get a better idea of what actually is indexed in Google.
Hope this helps! Good luck!
-
I have yet to see Google index every page of a site. They tend not to index pages that they don't think meet the criteria, so unless it was something like 90% weren't being indexed, I wouldn't worry about it. There are so many reasons that Google won't index a page.
You will also find that over time, if Google attributes more trust to the site, that more pages will be indexed.
Of course, you can do things to improve your changes, such as making sure Google can crawl all pages, check to see there are no bottlenecks anywhere and the big one - make sure your content is amazing. As long as the site is the best it can be, over time the number of indexed pages will increase.
Remember - the sitemap is not a guarantee that pages will be indexed. It just helps Google crawl your site.
-Andy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Sitemap: Linking horizontal pages on a sitemap that has a vertical hierarchy structure
I'm currently in the process of revamping a website and creating a sitemap for it so that all pages get indexed by search engines. The site is divided into two websites that share the same root domain. The marketing site is on example.com and the application is on go.example.com. To get to go.example.com from example.com, you need to go through one of three “action pages”. The action pages are accessed from every page on example.com where we have a CTA button on the site (that’s pretty much every page). These action pages do not link back to any other page on the site though, nor are they a necessary step to navigate to other webpages. These action pages are only viewed when a user is ready to be taken to the application site. My question is, how should these pages be set up in a vertical sitemap since these three pages have a horizontal structure? Any insight would be much appreciated!
Technical SEO | | RallyUp0 -
Sitemap
Hi, I am setting up a new sitemap for our website. the website contains about 8000 - 10.000 pages. Of wich are 6000 productpages. I have 10 categories, about 80 sub-catagories and about 400 sub-sub categories ( these ar my most important landingpages) At this moment our sitemap is only 1 MB. From that point of view 1 sitemap will be enough. But can i take SEO advantage by splitting this sitemap in 10 categories? Or are there other ways to set it up for a better SEO? Thanks!
Technical SEO | | Leonie-Kramer0 -
Special characters in URL
Will registered trademark symbol within a URL be bad? I know some special characters are unsafe (#, >, etc.) but can not find anything that mentions registered trademark. Thanks!
Technical SEO | | bonnierSEO0 -
Add selective URLs to an XML Sitemap
Hi! Our website has a very large no of pages. I am looking to create an XML Sitemap that contains only the most important pages (category pages etc). However, on crawling the website in a tool like Xenu (the others have a 500 page limit), I am unable to control which pages get added to the XML Sitemap, and which ones get excluded. Essentially, I only want pages that are upto 4 clicks away from my homepage to show up in the XML Sitemap. How should I create an XML sitemap, and at the same time control which pages of my site I add to it (category pages), and which ones I remove (product pages etc). Thanks in advance! Apurv
Technical SEO | | AB_Newbie0 -
Question about construction of our sitemap URL in robots.txt file
Hi all, This is a Webmaster/SEO question. This is the sitemap URL currently in our robots.txt file: http://www.ccisolutions.com/sitemap.xml As you can see it leads to a page with two URLs on it. Is this a problem? Wouldn't it be better to list both of those XML files as separate line items in the robots.txt file? Thanks! Dana
Technical SEO | | danatanseo0 -
Dynamic Parameters in URL
I have received lots of warnings because of long urls. Most of them are because my website has many Attributes to FILTER out products. And each time the user clicks on one, its added to the URL. pls see my site here: www.theprinterdepo.com The warning is here: Although search engines can crawl dynamic URLs, search engine representatives have warned against using over 2 parameters in any given URL. The question to the community is: -What should I do? These attributes really help the user to find easier the products. I could remove some of the attributes, I am not sure if my ecommerce solution (MAGENTO), allows to change the behavior of this so that this does not use querystring parameters.
Technical SEO | | levalencia10 -
Strange duplicate url
From your csv report I have this strange issue. This url: elettrodomestici.yeppon.it/climatizzatori/condizionatori-fissi/prodotti/condizionatori-fissi-comfee/ it's a duplicate of this elettrodomestici.yeppon.it/climatizzatori/condizionatori-fissi/prodotti/condizionatori-fissi-comfee/ but the only url that I can see in the website is this one. Why the "-" is transalted some times in "%2D" referrer obviously is elettrodomestici.yeppon.it/climatizzatori/condizionatori-fissi/prodotti/condizionatori-fissi-comfee/solo-disponibili/ I have many duplicate url...Can you help me? Thanks
Technical SEO | | yeppon0 -
URL restructure and phasing out HTML sitemap
Hi SEOMozzies, Love the Q&A resource and already found lots of useful stuff too! I just started as an in-house SEO at a retailer and my first main challenge is to tidy up the complex URL structures and remove the ugly sub sitemap approach currently used. I already found a number of suggestions but it looks like I am dealing with a number of challenges that I need to resolve in a single release. So here is the current setup: The website is an ecommerce site (department store) with around 30k products. We are using multi select navigation (non Ajax). The main website uses a third party search engine to power the multi select navigation, that search engine has a very ugly URL structure. For example www.domain.tld/browse?location=1001/brand=100/color=575&size=1&various other params, or for multi select URL’s www.domain.tld/browse?location=1001/brand=100,104,506/color=575&size=1 &various other non used URL params. URL’s are easily up to 200 characters long and non-descriptive at all to our users. Many of these type of URL’s are indexed by search engines (we currently have 1.2 million of those URL’s indexed including session id’s and all other nasty URL params) Next to this the site is using a “sub site” that is sort of optimized for SEO, not 100% sure this is cloaking but it smells like it. It has a simplified navigation structure and better URL structure for products. Layout is similair to our main site but all complex HTMLelements like multi select, large top navigations menu's etc are all removed. Many of these links are indexed by search engines and rank higher than links from our main website. The URL structure is www.domain.tld/1/optimized-url .Currently 64.000 of these URL’s are indexed. We have links to this sub site in the footer of every page but a normal customer would never reach this site unless they come from organic search. Once a user lands on one of these pages we try to push him back to the main site as quickly as possible. My planned approach to improve this: 1.) Tidy up the URL structure in the main website (e.g. www.domain.tld/women/dresses and www.domain.tld/diesel-red-skirt-4563749. I plan to use Solution 2 as described in http://www.seomoz.org/blog/building-faceted-navigation-that-doesnt-suck to block multi select URL’s from being indexed and would like to use the URL param “location” as an indicator for search engines to ignore the link. A risk here is that all my currently indexed URL (1.2 million URL’s) will be blocked immediately after I put this live. I cannot redirect those URL’s to the optimized URL’s as the old URL’s should still be accessible. 2.) Remove the links to the sub site (www.domain.tld/1/optimized-url) from the footer and redirect (301) all those URL’s to the newly created SEO friendly product URL’s. URL’s that cannot be matched since there is no similar catalog location in the main website will be redirected (301) to our homepage. I wonder if this is a correct approach and if it would be better to do this in a phased way rather than the currently planned big bang? Any feedback would be highly appreciated, also let me know if things are not clear. Thanks! Chris
Technical SEO | | eCommerceSEO0