After Server Migration - Crawling Is Slow and Content Changes on Dynamic Pages Are Not Getting Updated
-
Hello,
I performed a server migration 2 days ago.
All is well, with traffic moved to the new servers.
But with the previous host, a newly submitted article was indexed within minutes. Now, even after submitting a page for indexing, it takes a while to appear in search engines, and on some pages where the content is updated daily, the changes are not reflected despite submitting them for indexing.
Site name is - http://www.mycarhelpline.com
I have checked robots.txt, meta tags, and URL structure; all remain intact. No unknown errors are reported in Google Webmaster Tools.
Could someone advise: is this normal after a name server and IP address change, and can I expect it to correct itself automatically, or am I missing something?
Kindly advise. Thanks
-
Thanks all for the inputs.
I searched Google and found this note from Google about what may happen after a server migration:
https://support.google.com/webmasters/answer/6033412?hl=en&ref_topic=6033383
A note about Googlebot’s crawl rate
It’s normal to see a temporary drop in Googlebot’s crawl rate immediately after the launch, followed by a steady increase over the next few weeks, potentially to rates that may be higher than from before the move.
This fluctuation occurs because we determine crawl rate for a site based on many signals, and these signals change when your hosting changes. As long as Googlebot does not encounter any serious problems or slowdowns when accessing your new serving infrastructure, it will try to crawl your site as fast as necessary and possible.
Adding on, @Thompson Paul: appreciated. Yes, it's a good suggestion; I will look into adding a sitemap.
-
The one thing you haven't mentioned, which is likely to be most critical for this issue, is your XML sitemap. I couldn't find it at any of the standard URLs (/sitemap.xml and /sitemap_index.xml both lead to generic 404 pages). Also, there's no directive to the sitemap in your robots.txt.
Given that the sitemap.xml is the clearest and fastest way for you to help Google to discover new content, I'd strongly recommend you get a clean, dynamically updated sitemap.xml implemented for the site, submit it through both Google and Bing webmaster tools, and place the proper pointer to it in your robots.txt file.
Once it's been submitted to the webmaster tools, you'll be able to see exactly how frequently it's being discovered/crawled.
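As a rough illustration of the "dynamically updated sitemap.xml" idea, here is a minimal Python sketch that builds a sitemap from a list of pages and produces the matching robots.txt pointer. The function names, the example article URL, and the dates are all hypothetical placeholders; a real site would pull the URLs and last-modified dates from its CMS or database and regenerate the file whenever content changes.

```python
from datetime import date
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for an iterable of (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod is written as an ISO date, e.g. 2020-01-15
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def robots_sitemap_line(sitemap_url):
    """Return the robots.txt line that points crawlers at the sitemap."""
    return "Sitemap: " + sitemap_url


# Hypothetical pages and placeholder dates, for illustration only.
example_pages = [
    ("http://www.mycarhelpline.com/", date(2020, 1, 15)),
    ("http://www.mycarhelpline.com/example-article", date(2020, 1, 14)),
]
print(build_sitemap(example_pages))
print(robots_sitemap_line("http://www.mycarhelpline.com/sitemap.xml"))
```

The XML output would be saved as /sitemap.xml on the site, and the printed "Sitemap:" line added to robots.txt. Keeping lastmod accurate for the daily-updated pages is the part most relevant to this thread.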
Hope that helps?
Paul
-
The good news is, this actually sounds pretty normal. 24 hours to reflect changes in content is better than many sites. I can't account for why it went from 4 hours to 24, but I'd say this is still in the range of "good".
-
@ Cyrus
Certain pages.
Earlier it took less than 4 hrs, but now it takes around 24 hrs, based on data I found updated in the search results only today. I had thought it was not getting updated at all.
Fetch & Render: no issues, it is submitting. No errors in GWT.
I ran a speed test: there is no noticeable improvement in loading time, but page size and load time have not increased unnecessarily either.
I was wondering whether this can be a temporary phenomenon, where crawl speed is slow and will later return to normal. It has been less than 72 hrs since the server was migrated.
Google Search Console Crawl Stats was last updated for 16th June, so I am unable to figure it out from there. No errors in Webmaster Tools.
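While Crawl Stats lags behind, the new server's own access log can give a rough picture of Googlebot activity per day. Below is a minimal Python sketch, assuming the log is in the common Apache/nginx "combined" format; the function name and the sample lines are made up for illustration. It only matches on the user-agent string, which can be spoofed, so treat the counts as an approximation.

```python
import re
from collections import Counter

# Captures the date part of a combined-format timestamp,
# e.g. "[17/Jun/2020:10:12:01 +0000]" gives "17/Jun/2020".
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def googlebot_hits_per_day(log_lines):
    """Count log lines per day whose text mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = DATE_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


# Made-up sample lines; a real run would iterate over the server's log file.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_lines = [
    '66.249.66.1 - - [15/Jun/2020:08:00:01 +0000] "GET / HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT_UA,
    '66.249.66.1 - - [15/Jun/2020:09:30:44 +0000] "GET /news HTTP/1.1" 200 7301 "-" "%s"' % GOOGLEBOT_UA,
    '203.0.113.9 - - [15/Jun/2020:09:31:02 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '66.249.66.1 - - [16/Jun/2020:07:12:10 +0000] "GET / HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT_UA,
]
for day, count in googlebot_hits_per_day(sample_lines).items():
    print(day, count)
```

For a real log, the same function can be fed an open file object (the path depends on the server setup). Comparing the daily counts from before and after the migration would show whether the crawl rate dropped and whether it is recovering.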
-
Howdy,
A couple of questions:
1. Are there certain pages that aren't getting updated, or is it your entire site?
2. How often are changes in the pages reflected in Google's cache? Is it a case where Google simply displays old/outdated information all the time? Finally, have you done a "Fetch and Render" check in Google Webmaster Tools?
-
@Anirban
Thanks, no errors in GWT. Loading time: I could not observe a change. Tested with GTmetrix, it is well within limits. Pages with dynamic content are not getting updated in search engines, which earlier happened immediately.
-
It should not be. Check your page load time. If pages take longer to load, then Googlebot may bounce off. Check your webmaster tools and see if there are any server errors showing.
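For a quick spot check of load time and server errors from the new host, something like the following Python sketch could help. It uses only the standard library; the function name check_page and the injectable opener parameter are my own choices for illustration, not part of any SEO tool. It measures a single fetch of the raw HTML, so it will not match what GTmetrix or Googlebot's renderer reports.

```python
import time
import urllib.error
import urllib.request


def check_page(url, opener=urllib.request.urlopen, timeout=10):
    """Fetch url once and return (status_code, seconds_elapsed).

    HTTP error responses (4xx/5xx) are reported through the status code.
    Connection-level failures (DNS, refused, timeout) are not caught and
    will raise, which is itself a useful signal after a migration.
    """
    start = time.perf_counter()
    try:
        with opener(url, timeout=timeout) as response:
            response.read()  # include the body download in the timing
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code
    return status, time.perf_counter() - start


# Example usage (performs a real network request, so it is left commented out):
# status, seconds = check_page("http://www.mycarhelpline.com/")
# print(status, round(seconds, 2))
```

Running this a few times a day against the homepage and one of the daily-updated pages would show whether the new server returns intermittent 5xx responses or slow replies that a one-off speed test could miss.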