Google crawl index issue with our website...
-
Hey there. We've run into a mystifying issue with Google's crawl index of one of our sites. When we do a "site:www.burlingtonmortgage.biz" search in Google, we're seeing lots of 404 Errors on pages that don't exist on our site or seemingly on the remote server.
In the search results, Google is showing nonsensical folders off the root domain and then the actual page is within that non-existent folder.
An example:
Google shows this in its index of the site (as a 404 Error page): www.burlingtonmortgage.biz/MQnjO/idaho-mortgage-rates.asp
The actual page on the site is: www.burlingtonmortgage.biz/idaho-mortgage-rates.asp
Google is showing the folder MQnjO that doesn't exist anywhere on the remote. Other pages they are showing have different folder names that are just as wacky.
We called our hosting company who said the problem isn't coming from them...
Has anyone had something like this happen to them?
Thanks so much for your insight!
Megan -
Hi Keri. Thanks for following up. This turned out to be an issue with an auto-generated breadcrumbs script. I don't know what the intricacies of that were but we were able to remove it and get this issue straightened out.
Thanks again!
Megan
-
Hi Megan,
I'm following up on older questions that are marked unanswered. Did you ever get this figured out?
-
Megan ,
Please check with your hosting company,
about this code to be included in htaccess
ErrorDocument 404 /404.shtml
/404.shtml its your 404 page
-
Thanks for your help on this Wissam. Is this something that we need to have the hosting company set-up on the server to ensure that these pages get returned as 404s?
-
Megan,
See here
http://markup.io/v/fyd9w4w9wmjr
Googlebot when It crawls this page, you remote server is telling Google Bot that its a Live page and this page Exists
The solution to the upper problem, might help you in fixing the actual problem.
If the Pages with the mystery folder Does not Exist .. your remote server should show google bot a 404 not found (http header).
-
Are we talking about one problem or two?
http://www.burlingtonmortgage.biz/contact.htm does not exist on the remote server (as it was removed over a year ago). I see that there are similar errors for other old pages which were also previously removed. Should we have redirected those to the 404 page since there are not related pages on the existing site?
I am not sure if the two problems have anything to do with one another. The pages with the "mystery folders" are existing pages. They just exist in the root. Why would google be looking at them as if they are inside sub folder?
-
Megan,
noticed something also for example this page http://www.burlingtonmortgage.biz/contact.htm . its showing a 404 error from title and content ... but the HTTP header is showing 200 ok. u need to fix that.
and would assume maybe thats why google started indexing weird URLs generating from your site... and if its true is a 404 page ..google is not picking it up because its showing its a Live page (200ok)
-
We use Dreamweaver.
-
Which CMS are you using?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Website indexing issues
My website is being indexed with both https - https with www. and no leader at all. example. https//www.example.com and https//example.com and example.com 3 different versions are being indexed. How would I begin resolving this? Hosting?
Technical SEO | | DigitalRipples0 -
Submitted URL has crawl issue - Submitted URL seems to be a Soft 404 - but all looks fine
Google Search Console is showing some pages up as "Submitted URL has crawl issue" but they look fine to me. I have set them as fixed but after a month they were finally re-crawled and google states the issue persists. Examples are: https://www.rscpp.co.uk/counselling/175809/psychology-alcester-lanes-end.html
Technical SEO | | TommyNewmanCEO
https://www.rscpp.co.uk/browse/location-index/889/index-of-therapy-in-hanger-lane.html
https://www.rscpp.co.uk/counselling/274646/psychology-waltham-forest-sexual-problems.html There's also some "Submitted URL seems to be a Soft 404": https://www.rscpp.co.uk/counselling/112585/counselling-moseley-depression.html I also have more which are "pending", but again I couldn't see a problem with them in the first place. I'm at a bit of a loss as to what to do next. Any advice? Thanks in advance.0 -
No Longer Indexed in Google (Says Redirected)
Just recently my page, http:/www./waikoloavacationrentals.com/mauna-lani-terrace, was no longer indexed by google. The sub pages from it still are. I have not done anything sketchy with the page. When I went into the google fetch it says that it is redirected. Any ideas what is this all about? Here is what it says for the fetch: Http/1.1 301 moved permanently
Technical SEO | | RobDalton
Server: nginx
Date: Tue, 07 Mar 2017 00:43:26GMT
Content-Type: text/html
Content-Length: 178
Connection: keep-alive
Keep-Alive: timeout=20
Location: http://waikoloavacationrentals.com/mauna-lani-terrace <title>301 moved permanently</title> <center> 301 moved permanently </center> <center>nginx</center>0 -
Product Duplicate Content Issue with Google Shopping
I have a site with approx 20,000 products. These products are resold to hundreds of other companies and are fed from one database therefore the content is duplicated many many times. To overcome this, we are launching the site with noindex meta tags on all product pages. (In phase 2 we will begin adding unique content for every product eek) However, we still want them to appear in Google Shopping. Will this happen or will it have to wait until we remove the noindex tags?
Technical SEO | | FGroup0 -
Google Webmaster tools vs SeoMOZ Crawl Diagnostics
Hi Guys I was just looking over my weekly report and crawl diagnostics. What I've noticed is that the data gathered on SeoMoz is different from Google Webmaster diagnostics. The number of errors, in particular duplicate page titles, content and pages not found is much higher that what google webmaster tools is represents. I'm a bit confused and don't know which data is more accurate. Please Help
Technical SEO | | Tolod0 -
How to de-index the server location of my website
Somehow my website is indexed by it's server location. So in addition to www.[example].com, it's also indexed like this: server123.[server name].com I have no idea how that happened. But, I was wondering, does anyone know how to de-index all the urls like this? I have a lot of urls indexed using my server's address. Thanks.
Technical SEO | | webtarget0 -
Crawling and indexing content
If a page element (div, e.g.) is initially hidden and shown only by a hover descriptor or Javascript call, will Google crawl and index it’s content?
Technical SEO | | Mont0 -
Will using http ping, lastmod increase our indexation with Google?
If Google knows about our sitemaps and they’re being crawled on a daily basis, why should we use the http ping and /or list the index files in our robots.txt? Is there a benefit (i.e. improving indexability) to using both ping and listing index files in robots? Is there any benefit to listing the index sitemaps in robots if we’re pinging? If we provide a decent <lastmod>date is there going to be any difference in indexing rates between ping and the normal crawl that they do today?</lastmod> Do we need to all to cover our bases? thanks Marika
Technical SEO | | marika-1786190