Why is our site not being indexed by Google, and not showing on a crawl test?
-
On a site we developed, where the .com is forwarded to the .net domain, Google stopped crawling us around the 20th of Feb. Now when we try to run a crawl test on either URL, we get:
"There was an error fetching this page. Error description: For some reason the page returned did not describe itself as an html page. It could be possible that the url is serving an image, rss feed, pdf, or xml file of some sort. The crawl tool does not currently report metrics on this type of data."
Our other sites are fine, and this one was fine up to that date. We took out noodp and noydir today, as that was the only thing we could think of. The site is on WordPress.
-
The site was last cached on 2nd March.
Your site is indexed.
The headers are returning 200 codes.
The site can be crawled fine; Xenu finds about 27 pages.
Lynxviewer gets through the page all right.
The only thing I can think of is that robots.txt looks needlessly complicated, though it should be all right. I would consider stripping it all out and re-running the test. If you get the same error, then robots.txt is not the cause; if the error goes away, add the rules back a few at a time to narrow down which one is responsible.
If no joy, let me know and I'll have another look.
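
To make the robots.txt check above a bit less trial and error, here is a minimal sketch using Python's standard urllib.robotparser. It tests a robots.txt body against a given crawler and URL without touching the live site. The SAMPLE_ROBOTS rules, the example.com URLs, and the is_allowed helper are all hypothetical and only there for illustration; they are not the actual robots.txt of the site in question, and the user-agent tokens are assumptions you would swap for whichever crawler you care about.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, roughly what a simple WordPress robots.txt might contain.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /wp-admin/
"""

# Check the homepage and a blocked path for two crawler tokens (assumed names).
for agent in ("Googlebot", "rogerbot"):
    print(agent, is_allowed(SAMPLE_ROBOTS, agent, "http://example.com/"))
    print(agent, is_allowed(SAMPLE_ROBOTS, agent, "http://example.com/wp-admin/"))
```

Pasting the real robots.txt into SAMPLE_ROBOTS and removing rules a few at a time follows the same narrowing-down idea, just offline.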
-
The site is www.innerloophomesreport.net (the .com forwards to it). Thanks.
-
Probably going to need the URL on this one.
I presume you can access the site as a user? What's in your robots.txt file? Are you using the SEOmoz tools?
-
Hi Robert Fisher,
This problem probably comes from the headers of the file and not from the content itself. You might want to look at the headers returned by your URL using one of the following tools:
http://www.seoconsultants.com/tools/headers
http://www.rexswain.com/httpview.html
http://web-sniffer.net/
http://www.g-force.ca/referencement/entetes
When you have the headers, I suggest you post them here so we can look into it.
Best regards,
Guillaume Voyer.
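
If you would rather check the headers from a script than from one of the online tools, here is a rough sketch using only Python's standard library. The crawl test error says the page "did not describe itself as an html page", which suggests the Content-Type response header is the one to look at, though that is an assumption about how the tool decides. The function names and the default User-Agent string are made up for this example.

```python
from urllib.request import Request, urlopen


def describes_itself_as_html(content_type):
    """Return True if a Content-Type header value names an HTML media type."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=UTF-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")


def fetch_content_type(url, user_agent="header-check/0.1"):
    """Fetch url and return (status_code, Content-Type header value or None).

    urlopen follows redirects, so the result describes the final page.
    It raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.status, response.headers.get("Content-Type")
```

Calling fetch_content_type on both the .net and the .com address and passing the second value to describes_itself_as_html shows whether either one answers with something other than an HTML type. Trying a few different user_agent values is also worth a go, since some servers and plugins respond differently to crawlers than to browsers.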