Google crawl index issue with our website...
-
Hey there. We've run into a mystifying issue with Google's crawl index of one of our sites. When we do a "site:www.burlingtonmortgage.biz" search in Google, we're seeing lots of 404 errors on pages that don't exist on our site or, seemingly, anywhere on the remote server.
In the search results, Google is showing nonsensical folders off the root domain and then the actual page is within that non-existent folder.
An example:
Google shows this in its index of the site (as a 404 Error page): www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp
The actual page on the site is: www.burlingtonmortgage.biz/idaho-mortgage-rates.asp
Google is showing the folder MQnjO, which doesn't exist anywhere on the remote server. Other pages it's showing have different folder names that are just as wacky.
We called our hosting company who said the problem isn't coming from them...
Has anyone had something like this happen to them?
Thanks so much for your insight!
Megan -
Hi Keri. Thanks for following up. This turned out to be an issue with an auto-generated breadcrumbs script. I don't know what the intricacies of that were but we were able to remove it and get this issue straightened out.
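For anyone who finds this thread later: one way an auto-generated breadcrumbs or navigation script can produce phantom folders like this is by emitting relative links, which a crawler resolves against whatever path it arrived on. A minimal sketch of that resolution behavior (the MQnjO folder is just the made-up example from the question above):

```python
from urllib.parse import urljoin

# If the crawler believes a page lives under a phantom directory,
# a *relative* href resolves into that directory.
base = "http://www.burlingtonmortgage.biz/MQnjO/some-page.asp"
print(urljoin(base, "idaho-mortgage-rates.asp"))
# -> http://www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp

# A root-relative href avoids the problem entirely.
print(urljoin(base, "/idaho-mortgage-rates.asp"))
# -> http://www.burlingtonmortgage.biz/idaho-mortgage-rates.asp
```

This is only an illustration of the general mechanism; the actual breadcrumbs script may have generated the bad paths differently.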
Thanks again!
Megan
-
Hi Megan,
I'm following up on older questions that are marked unanswered. Did you ever get this figured out?
-
Megan,
Please check with your hosting company about adding this directive to your .htaccess file:
ErrorDocument 404 /404.shtml
Here /404.shtml is your custom 404 page.
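Worth noting: the ErrorDocument directive is Apache-specific, and what actually matters for Google is that the error page is served with a real 404 status line rather than 200 OK. A minimal sketch of the behavior you want from the server, using Python's built-in http.server (the page paths here are just examples from this thread):

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    # Stand-in for the real site's pages.
    PAGES = {"/idaho-mortgage-rates.asp": b"<h1>Idaho Mortgage Rates</h1>"}

    def do_GET(self):
        body = self.PAGES.get(self.path)
        if body is None:
            # Custom error-page content, but with a true 404 status.
            body = b"<h1>Page Not Found</h1>"
            self.send_response(404)
        else:
            self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep output quiet

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# A real page answers 200 OK...
with urllib.request.urlopen(f"http://127.0.0.1:{port}/idaho-mortgage-rates.asp") as r:
    print(r.status)  # 200

# ...and a phantom-folder URL answers 404, so crawlers drop it.
try:
    urllib.request.urlopen(f"http://127.0.0.1:{port}/MQnjO/idaho-mortgage-rates.asp")
except urllib.error.HTTPError as e:
    print(e.code)  # 404
```

On Apache, the ErrorDocument directive above achieves the same status-code behavior without any custom code.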
-
Thanks for your help on this, Wissam. Is this something that we need to have the hosting company set up on the server to ensure that these pages get returned as 404s?
-
Megan,
See here: http://markup.io/v/fyd9w4w9wmjr
When Googlebot crawls that page, your remote server is telling it that the page is live and exists. Fixing that may also help with the original problem: if the pages with the mystery folder don't exist, your remote server should return a 404 Not Found HTTP header to Googlebot.
-
Are we talking about one problem or two?
http://www.burlingtonmortgage.biz/contact.htm does not exist on the remote server (it was removed over a year ago). I see similar errors for other old pages that were also removed. Should we have redirected those to the 404 page, since there are no related pages on the existing site?
I'm not sure whether the two problems have anything to do with one another. The pages with the "mystery folders" are existing pages; they just live in the root. Why would Google be treating them as if they were inside a subfolder?
-
Megan,
I noticed something else. For example, this page http://www.burlingtonmortgage.biz/contact.htm shows a 404 error in its title and content, but the HTTP header returns 200 OK. You need to fix that.
I would also assume that may be why Google started indexing weird URLs generated from your site: even if a page really is a 404, Google won't treat it as one while the server reports it as a live page (200 OK).
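This mismatch is what's usually called a "soft 404": the body says "not found" but the status line says 200 OK, so the crawler keeps the URL in the index. A small illustrative classifier showing the distinction (the text-matching heuristic here is invented for the sketch; Google uses far more signals):

```python
def classify(status_code: int, body: str) -> str:
    """Rough classification of a crawled response (illustrative only)."""
    looks_like_error = "not found" in body.lower() or "404" in body
    if status_code == 404:
        return "hard 404"   # correct: the crawler drops the URL
    if status_code == 200 and looks_like_error:
        return "soft 404"   # body says error, header says live
    return "live page"

print(classify(404, "<h1>Page Not Found</h1>"))        # hard 404
print(classify(200, "<h1>Page Not Found</h1>"))        # soft 404 (the contact.htm case)
print(classify(200, "<h1>Idaho Mortgage Rates</h1>"))  # live page
```

The fix is always on the server side: make the error page go out with a genuine 404 status, as described in the .htaccess answer above.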
-
We use Dreamweaver.
-
Which CMS are you using?