Google crawl index issue with our website...
-
Hey there. We've run into a mystifying issue with Google's crawl index of one of our sites. When we do a "site:www.burlingtonmortgage.biz" search in Google, we're seeing lots of 404 Errors on pages that don't exist on our site or seemingly on the remote server.
In the search results, Google is showing nonsensical folders off the root domain and then the actual page is within that non-existent folder.
An example:
Google shows this in its index of the site (as a 404 Error page): www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp
The actual page on the site is: www.burlingtonmortgage.biz/idaho-mortgage-rates.asp
Google is showing the folder MQnjO that doesn't exist anywhere on the remote. Other pages they are showing have different folder names that are just as wacky.
We called our hosting company who said the problem isn't coming from them...
Has anyone had something like this happen to them?
Thanks so much for your insight!
Megan
-
Hi Keri. Thanks for following up. This turned out to be an issue with an auto-generated breadcrumbs script. I don't know what the intricacies of that were but we were able to remove it and get this issue straightened out.
Thanks again!
Megan
-
Hi Megan,
I'm following up on older questions that are marked unanswered. Did you ever get this figured out?
-
Megan,
Please check with your hosting company about adding this line to your .htaccess file:
ErrorDocument 404 /404.shtml
Here /404.shtml is the path to your 404 page.
-
Thanks for your help on this, Wissam. Is this something that we need to have the hosting company set up on the server to ensure that these pages get returned as 404s?
-
Megan,
See here
http://markup.io/v/fyd9w4w9wmjr
When Googlebot crawls this page, your remote server tells it that the page is live and exists.
Solving the problem above might help you fix the actual problem.
If the pages with the mystery folder do not exist, your remote server should return a 404 Not Found HTTP status to Googlebot.
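One way to see what status the server actually sends for one of these URLs is a small script. This is only a rough sketch using Python's standard library: the `fetch_status` name and its `opener` parameter are made up for illustration, `urlopen` follows redirects (so you see the final status), and some servers do not answer HEAD requests properly.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code the server sends for `url`.

    `opener` is a hypothetical injection point so the function can be
    tried without a network connection; by default it is urlopen.
    """
    # HEAD asks for the headers only, which is enough to read the status.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; the code is on the exception.
        return err.code


# Usage (needs network access), with a URL from this thread:
# fetch_status("http://www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp")
```

If that returns 200 for a page that should not exist, the server is reporting a missing page as live.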
-
Are we talking about one problem or two?
http://www.burlingtonmortgage.biz/contact.htm does not exist on the remote server (it was removed over a year ago). I see that there are similar errors for other old pages which were also previously removed. Should we have redirected those to the 404 page, since there are no related pages on the existing site?
I am not sure if the two problems have anything to do with one another. The pages with the "mystery folders" are existing pages. They just exist in the root. Why would Google be looking at them as if they are inside a subfolder?
-
Megan,
I also noticed something. For example, this page: http://www.burlingtonmortgage.biz/contact.htm. Its title and content show a 404 error, but the HTTP header returns 200 OK. You need to fix that.
I would guess that may be why Google started indexing weird URLs generated from your site. And if it truly is a 404 page, Google is not treating it as one, because the server reports it as a live page (200 OK).
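To make the mismatch concrete, here is a minimal sketch of how a page like that could be flagged. The `classify_response` function and the list of title markers are assumptions for illustration only; they are not how Google or any crawler actually decides.

```python
def classify_response(status_code, page_title):
    """Label a response as 'ok', 'not_found', 'soft_404', or 'other'.

    The title markers below are illustrative assumptions, not a list
    used by Google or any other crawler.
    """
    error_markers = ("404", "not found", "page cannot be found")
    title_says_missing = any(m in page_title.lower() for m in error_markers)

    if status_code in (404, 410):
        # The header itself says the page is gone: this is what we want.
        return "not_found"
    if status_code == 200 and title_says_missing:
        # Error page content served with a "live page" header.
        return "soft_404"
    if status_code == 200:
        return "ok"
    return "other"
```

A page whose title reads like an error but whose header is 200 comes back as "soft_404", which is the situation described for contact.htm.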
-
We use Dreamweaver.
-
Which CMS are you using?