Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How long after https migration that google shows in search console new sitemap being indexed?
-
We migrated 4 days ago to https and followed best practices..
In search console now still 80% of our sitemaps appear as "pending" and among those sitemaps that were processed only less than 1% of submitted pages appear as indexed?Is this normal ?
How long does it take for google to index pages from sitemap?
Before https migration nearly all our pages were indexed and I see in the crawler stats that google has crawled a number of pages each day after migration that corresponds to number of submitted pages in sitemap.Sitemap and crawler stats show no errors.
-
thanks Stephan.
It took nearly a month for search console to display the majority of our pages in sitemap as indexed, even though pages showed up much earler in SERPs. We had it split down into 30 different sitemaps. Later we published also a sitemap index and saw a nice increase a few days later in indexed pages which may have been related.
Finally google now is indexing 88% of our sitemap.
Do you think in general that 88% is for a site of this size a somehow normal percentage or would you normally expect a higher percentage of indexed sitemap page and investigate deeper for potential pages that google may consider thin content? Navigation I can rule out as a reason. -
Did the "pending" message go away in the end? Unfortunately you're fairly limited in what you can do with this. The message likely indicates/indicated that one of the following was true:
- Google had difficulty accessing the sitemap (though you did say no errors)
- It was taking a long time to do it because of the large number of links
You could try splitting your sitemap up into several smaller ones, and using a sitemap index. Or have you done this already? By splitting it into several sitemaps, you can at least see whether some index and some don't, whether there do turn out to be issues with some of the URLs listed there, etc.
You can also prioritise the most important pages by putting them into their own sitemap (linked to from the sitemap index, of course), and submitting that one first. So at least if everything else takes longer you'll get your most important landing pages indexed.
-
Update. now 10 days passed since our migration to https and upload of sitemap, still same situation.
-
Google has been crawling all our pages during the last days. I see it in the crawling stats.
My concern is that
- majority of my sitemaps are still showing up as "pending" 3 days after I originally submited the sitemaps.
- those sitemaps that are processed show as indexed only less than 1% of my submitted pages.
We do have around 170.000 pages in our sitemap.
So I wonder wheher this is unusual or normal delay from google search console.
-
Its difficult to say. It depends on many factors like (importance of your site in Google's eyes, when they crawled your site the last time, relevance of the topic in general, etc.) BUT you can speed up the process a lot, i.e. initiate it on your own. You don't have to wait until Google recrawls your site at random. Did you know?
Go to Search Console - Crawl - Fetch as Google - Add your site's URL or URL of a particular sub page. Press Fetch
Google will recrawl that page again very quickly. When I do that with a particular page (not the entire domain) it usually takes 1-2 days at most to recrawl and index it again.
Hope this helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Search console validation taking a long time?
Hello! I did something dumb back in the beginning of September. I updated Yoast and somehow noindexed a whole set of custom taxonomy on my site. I fixed this and then asked Google to validate the fixes on September 20. Since then they have gotten through only 5 of the 64 URLS.....is this normal? Just want to make sure I'm not missing something that I should be doing. Thank you! ^_^
Intermediate & Advanced SEO | | angelamaemae0 -
How to stop URLs that include query strings from being indexed by Google
Hello Mozzers Would you use rel=canonical, robots.txt, or Google Webmaster Tools to stop the search engines indexing URLs that include query strings/parameters. Or perhaps a combination? I guess it would be a good idea to stop the search engines crawling these URLs because the content they display will tend to be duplicate content and of low value to users. I would be tempted to use a combination of canonicalization and robots.txt for every page I do not want crawled or indexed, yet perhaps Google Webmaster Tools is the best way to go / just as effective??? And I suppose some use meta robots tags too. Does Google take a position on being blocked from web pages. Thanks in advance, Luke
Intermediate & Advanced SEO | | McTaggart0 -
Should I use noindex or robots to remove pages from the Google index?
I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed. From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else. I can add the noindex tag to the review pages but they wont be crawled. https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have?
Intermediate & Advanced SEO | | Tylerj0 -
Google News Sitemap in Different Languages
Thought I'd ask this question to confirm what I already think. I'm curious that if we're publishing something in two language and both are verified by the publishing center if the group would recommend publishing two separate Google News Sitemaps (one in each language) or publishing one in each language.
Intermediate & Advanced SEO | | mattdinbrooklyn0 -
Mass Removal Request from Google Index
Hi, I am trying to cleanse a news website. When this website was first made, the people that set it up copied all kinds of articles they had as a newspaper, including tests, internal communication, and drafts. This site has lots of junk, but this kind of junk was on the initial backup, aka before 1st-June-2012. So, removing all mixed content prior to that date, we can have pure articles starting June 1st, 2012! Therefore My dynamic sitemap now contains only articles with release date between 1st-June-2012 and now Any article that has release date prior to 1st-June-2012 returns a custom 404 page with "noindex" metatag, instead of the actual content of the article. The question is how I can remove from the google index all this junk as fast as possible that is not on the site anymore, but still appears in google results? I know that for individual URLs I need to request removal from this link
Intermediate & Advanced SEO | | ioannisa
https://www.google.com/webmasters/tools/removals The problem is doing this in bulk, as there are tens of thousands of URLs I want to remove. Should I put the articles back to the sitemap so the search engines crawl the sitemap and see all the 404? I believe this is very wrong. As far as I know this will cause problems because search engines will try to access non existent content that is declared as existent by the sitemap, and return errors on the webmasters tools. Should I submit a DELETED ITEMS SITEMAP using the <expires>tag? I think this is for custom search engines only, and not for the generic google search engine.
https://developers.google.com/custom-search/docs/indexing#on-demand-indexing</expires> The site unfortunatelly doesn't use any kind of "folder" hierarchy in its URLs, but instead the ugly GET params, and a kind of folder based pattern is impossible since all articles (removed junk and actual articles) are of the form:
http://www.example.com/docid=123456 So, how can I bulk remove from the google index all the junk... relatively fast?0 -
How to Submit My new Website in All Search Engines
Hello Everyone, Can Any body help to suggest Good software, or Any other to easily Submit my website , to All Search Engines ? ? Any expert Can help please, Thanx in Advance
Intermediate & Advanced SEO | | falguniinnovative0 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Our login pages are being indexed by Google - How do you remove them?
Each of our login pages show up under different subdomains of our website. Currently these are accessible by Google which is a huge competitive advantage for our competitors looking for our client list. We've done a few things to try to rectify the problem: - No index/archive to each login page Robot.txt to all subdomains to block search engines gone into webmaster tools and added the subdomain of one of our bigger clients then requested to remove it from Google (This would be great to do for every subdomain but we have a LOT of clients and it would require tons of backend work to make this happen.) Other than the last option, is there something we can do that will remove subdomains from being viewed from search engines? We know the robots.txt are working since the message on search results say: "A description for this result is not available because of this site's robots.txt – learn more." But we'd like the whole link to disappear.. Any suggestions?
Intermediate & Advanced SEO | | desmond.liang1