Having Problems to Index all URLs on Sitemap
-
Hi all again ! Thanks in advance ! My client's site is having problems to index all its pages. I even bought the full extension of XML Sitemaps and the number of urls increased, but we still have problems to index all of them.
What are the reasons? The robots.txt is open for all robots, we only prohibit users and spiders to enter our Intranet. I've read that duplicate content and 404's can be the reason. Anything else?
-
Like already answered above it's quite hard to get to the 100% indexation rate for your webpages. What's your current indexation rate? If it's below 90% you might still have some issues somewhere.
-
Hi there
According to Google:
"...we don't guarantee that we'll crawl all of the pages of a particular site. Google doesn't crawl all the pages on the web, and we don't index all the pages we crawl. It's perfectly normal for not all the pages on a site to be indexed."
Google also provides tips and resources to help your site being indexed properly and, possibly, more fully. You can check that resource out here. Kissmetrics has a few other tips.
To Andy's point, Google indexes what it wants - don't be discouraged if your entire site isn't indexed in WMT. If you know how many pages are on your site (which you definitely should), I would try the "site:" function and get a better idea of what actually is indexed in Google.
Hope this helps! Good luck!
-
I have yet to see Google index every page of a site. They tend not to index pages that they don't think meet the criteria, so unless it was something like 90% weren't being indexed, I wouldn't worry about it. There are so many reasons that Google won't index a page.
You will also find that over time, if Google attributes more trust to the site, that more pages will be indexed.
Of course, you can do things to improve your changes, such as making sure Google can crawl all pages, check to see there are no bottlenecks anywhere and the big one - make sure your content is amazing. As long as the site is the best it can be, over time the number of indexed pages will increase.
Remember - the sitemap is not a guarantee that pages will be indexed. It just helps Google crawl your site.
-Andy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Folders in url structure?
Hello, Revamping an out-of-date website and am wondering if I need to include the folders (categories) in the url structure? The proposed structure has 8 main folders. I've been reading that Google is ok if the folder is not included in the url, but is it really? The hesitation I have is that the urls are getting long and the main folder only has only a sub folder beneath it. So, /folder-name/facility-name/treatment-overview. This looks too long, doesn't it? Thanks!
Technical SEO | | lfrazer1230 -
Google Appending Blog URL inbetween my homepage and product page is it issue with base url?
Hi All, Google Appending Blog URL inbetween my homepage and product page. Is it issue or base url or relative url? Can you pls guide me? Looking to both tiny url you will get my point what i am saying. Please help Thanks!
Technical SEO | | amu1230 -
Desktop & Mobile XML Sitemap Submitted But Only Desktop Sitemap Indexed On Google Search Console
Hi! The Problem We have submitted to GSC a sitemap index. Within that index there are 4 XML Sitemaps. Including one for the desktop site and one for the mobile site. The desktop sitemap has 3300 URLs, of which Google has indexed (according to GSC) 3,000 (approx). The mobile sitemap has 1,000 URLs of which Google has indexed 74 of them. The pages are crawlable, the site structure is logical. And performing a Landing Page URL search (showing only Google/Organic source/medium) on Google Analytics I can see that hundreds of those mobile URLs are being landed on. A search on mobile for a longtail keyword from a (randomly selected) page shows a result in the SERPs for the mobile page that judging by GSC has not been indexed. Could this be because we have recently added rel=alternate tags on our desktop pages (and of course corresponding canonical ones on mobile). Would Google then 'not index' rel=alternate page versions? Thanks for any input on this one. PmHmG
Technical SEO | | AlisonMills0 -
"Url blocked by robots.txt." on my Video Sitemap
I'm getting a warning about "Url blocked by robots.txt." on my video sitemap - but just for youtube videos? Has anyone else encountered this issue, and how did you fix it if so?! Thanks, J
Technical SEO | | Critical_Mass0 -
URL structure
Hello Guys, Quick Question regarding URL strucutre One of our client is an hotel chain, thye have a group site www.example.com and each property is located in a subfolder: www.example.com/example-boston.html , www.example.com/example-ny.html etc. My quesion is : where is better to place the language extension at a subfolder level?
Technical SEO | | travelclickseo
Should i go for www.example.com/en/example-ny.html or it is preferable to specify the language after the property name www.example.com/example-ny/en/accommodation.html? Thanks and Regards, Alessio0 -
Why is robots.txt blocking URL's in sitemap?
Hi Folks, Any ideas why Google Webmaster Tools is indicating that my robots.txt is blocking URL's linked in my sitemap.xml, when in fact it isn't? I have checked the current robots.txt declarations and they are fine and I've also tested it in the 'robots.txt Tester' tool, which indicates for the URL's it's suggesting are blocked in the sitemap, in fact work fine. Is this a temporary issue that will be resolved over a few days or should I be concerned. I have recently removed the declaration from the robots.txt that would have been blocking them and then uploaded a new updated sitemap.xml. I'm assuming this issue is due to some sort of crossover. Thanks Gaz
Technical SEO | | PurpleGriffon0 -
Marketing URL
Hi, I need a bit of advice on marketing URL's. The destinations URL is http://www.website.com/by-development.php?area=Isle Of Wight&development=developmentname. If we wanted to use www.website.com/developmentname on literature to send people to the ugly URL above, what would we do? Would we need to rewrite the ugly URL to the neat and then 301 the ugly to the neat? Currently, the team are using a new domain of neatandrelevant.info and 301 redirecting it to ugly URL but there are lots of different developments they want to send people to so a new domain is bought for each development which seems a bit unnecessary. They point to different pages on the ugly URL website. Assuming canonical tag would not be needed then because the ugly URL page would be redirected. Also, as the website has ugly URL's anyway, would it not be best practice to use rewrites anyway so that the URL's read www.mywebsite.com/region/development? Would it confuse things to then have extra short marketing URL's missing out /region? Hope that makes sense....
Technical SEO | | Houses0 -
Crawling and indexing content
If a page element (div, e.g.) is initially hidden and shown only by a hover descriptor or Javascript call, will Google crawl and index it’s content?
Technical SEO | | Mont0