Cannot crawl website with a redirect installed on the subdomain URL
-
Hi!
I want to crawl this website: http://www.car-moderne.ch.
I tried, and the crawl came back with just that one URL (not all the pages of the website). This single-line CSV says that the status of http://www.car-moderne.ch is 200, but in fact it is a 301 redirect to http://www.car-moderne.ch/fr, where the live home page is (the MozBar actually sees the 301, not the 200 that the single-line crawl reports).
How can I proceed in this case (a 301 redirect being installed on the subdomain URL) so I can still get a full-fledged, juicy CSV with all the broken links, duplicate content, and so on?
Thank you for your help!
Pascal Hämmerli
-
So glad to help, Pascal!
-
Dear Chiaryn,
Thank you for your very helpful reply.
This website is hosted by a partner agency that created the site; I only act as an SEO consultant for them. What you say is very helpful, because it means their home-made CMS should be corrected to provide better 301 redirection.
I wish you a good day,
Pascal
-
Hey Pascal,
Sorry for the confusion here! It looks like the subdomain, www.car-moderne.ch, returns a 200 HTTP status to our crawler and to other crawlers, such as the hurl.it tool. In the body of the screenshot I attached from the hurl.it tool, the only code there is the number 404, so basically the site is serving a page with no crawlable data. The page isn't redirecting and it doesn't return any real source code, so there is no data for us to include in the crawl. I would recommend working with your webmaster to resolve this issue and to get the page to correctly serve a 301 redirect to the /fr version of the site to all crawlers.
I can see that the site is correctly responding with a 301 redirect for some crawlers, such as this test I ran as Googlebot, but the response doesn't seem to be consistent. One thing you will want to have your webmaster check is how the site responds to user agents hosted on Amazon Web Services, since some of our crawlers, as well as the hurl.it crawl, are hosted through AWS.
Once the issue of the HTTP response is resolved, you should be able to get much better data from the crawl test tool.
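In the meantime, if you want to verify the fix yourself, you can request the bare subdomain with a few different User-Agent strings and compare the raw responses. The sketch below is only a rough, hypothetical illustration (it assumes Python with the requests library and uses made-up user-agent strings, not the exact ones real crawlers send): once the fix is in place, every agent should receive a 301 with a Location header pointing at the /fr home page.

import requests  # assumes the third-party requests library is installed

URL = "http://www.car-moderne.ch"
USER_AGENTS = [
    "rogerbot-test/1.0",        # illustrative only; see Moz's docs for rogerbot's real user agent
    "googlebot-test/2.1",       # illustrative only; not Googlebot's real user agent
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # a browser-style user agent
]

for agent in USER_AGENTS:
    # allow_redirects=False shows the raw status code instead of the final destination
    response = requests.get(URL, headers={"User-Agent": agent}, allow_redirects=False, timeout=10)
    print(agent)
    print("  status:  ", response.status_code)
    print("  location:", response.headers.get("Location", "(none)"))

If one agent gets a 200 with an empty or 404-style body while another gets the 301 to /fr, that confirms the inconsistent, user-agent-dependent behaviour described above.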
I hope this helps! Please let me know if I can help you with anything else.
Chiaryn
Related Questions
-
How to turn off automated site crawls
Hi there! Is there a way to turn off the automated site crawl feature for an individual campaign? Thanks
-
Calling all 301 .htaccess gurus - www to non-www, then to https, plus redirect homepage to an inner page
I have tried searching, multiple opinions, and multiple things that supposedly work. What I have now seems to work from an end-user perspective, but Roger tells me otherwise: a redirect chain issue (a redirect, which redirects, which redirects, etc.). First, we need to redirect all www to non-www. Second, we need to redirect everything to https. Third, we need to redirect the homepage to an inner page. (Got to love bogus DMCA complaints!) So far we have:
RewriteEngine on
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
RewriteCond %{HTTP_HOST} ^mydomain.com.au$ [OR]
RewriteCond %{HTTP_HOST} ^www.mydomain.com.au$
RewriteRule ^/?$ "https://mydomain.com.au/inner-page-here" [R=301,L]
Plus, further down the file, there are the usual WordPress settings:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
So why does it seem to work for the end user, while Roger has his knickers in a knot reporting redirect to redirect to redirect? Namaste, and many thank-yous in advance 🙂
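One way to see exactly what Roger is complaining about is to walk the redirect chain hop by hop instead of letting the browser jump straight to the end. The following is a rough, hypothetical Python sketch (it assumes the requests library and keeps the placeholder domain from the rules above), not anything Moz ships: it requests a URL without auto-following redirects and prints each intermediate status and Location header, so a clean setup shows one 301 straight to https://mydomain.com.au/inner-page-here, while a chain shows several 3xx hops in a row.

import requests                    # assumes the third-party requests library
from urllib.parse import urljoin   # resolves relative Location headers

def trace_redirects(url, max_hops=10):
    """Print every hop in a redirect chain rather than only the final page."""
    for _ in range(max_hops):
        response = requests.get(url, allow_redirects=False, timeout=10)
        print(response.status_code, url)
        location = response.headers.get("Location")
        if not location:
            break                  # no further redirect; this is the final URL
        url = urljoin(url, location)

trace_redirects("http://www.mydomain.com.au/")  # placeholder domain from the question

If the output shows more than one 3xx line before the final 200, the rules are still chaining; the usual goal is for each starting URL (http, www, bare homepage) to issue a single 301 directly to the final https inner-page URL.
-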
Why has my website's DA decreased even as the number of linking domains in Google Webmaster Tools has grown?
My website is www.naturesouq.com, and its Domain Authority has not increased over the last 5 months, even though SEO has been done on it at regular intervals and everything else is going well. More than 150 domains point to the website, but its DA has decreased from 9.6 to 6. Why has this happened? What do I have to do to improve my site's performance and decrease its spam score? Please help me.
-
On-Page Grader "Sorry, but that URL is inaccessible."
We have a new client with a Squarespace site, http://www.mountainhouseestate.com. The Moz On-Page Grader returns the error "Sorry, but that URL is inaccessible." for all pages. Possibly related: Google seems to hate their site. Even a search for "mountain house estate" returns lousy results, while Bing and Yahoo have no problem with it.
-
On-Page Grader: URL is inaccessible
Hi everybody. I'm trying to use the On-Page Grader for https://www.upscaledinnerclub.com and get "Sorry, but that URL is inaccessible." The robots.txt is empty, and another thread on Moz mentioned a DNS check, which all looks good, so I can't figure out why this is happening. I am also trying the same thing for another website, https://www.regexseo.com, with the same story. The common thing is that they are both on Google App Engine, and at first I thought that was the problem, but then I checked https://www.logitinc.com/ and it's working, even though that website is on GAE as well. None of these websites have a robots.txt or any differences in setup or settings. Any thoughts?
-
Sorry, but that URL is inaccessible?
Hi, I am trying to grade some pages and keywords using the On-Page Grader tool, but for each URL that I try, the tool returns "Sorry, but that URL is inaccessible". The thing is that I have previously used some of these URLs without any problem. In fact, I have just realized that while the same URL (for example, www.lacasadelaaldea.com) works in the on-page optimization tool, it doesn't work in the On-Page Grader right now. I have looked to see whether anyone else has experienced the same issue and found some other threads talking about it, so I have checked with my hosting provider that there is no firewall or anything else causing this problem, but they can't find anything. How do you make the call to the server? What could be happening? Thanks in advance, Juan
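When a grader reports a page as inaccessible even though it loads fine in a browser, two quick things to rule out are a robots.txt rule and a server that answers differently to non-browser clients. The snippet below is a rough, hypothetical sketch using only Python's standard library (the URL comes from the question above and the user-agent string is made up); it is not how Moz's tools actually make the call, only a way to approximate a generic crawler request.

from urllib import request, robotparser

url = "http://www.lacasadelaaldea.com/"  # example URL from the question above

# 1) Check whether robots.txt disallows a generic crawler
parser = robotparser.RobotFileParser()
parser.set_url(url.rstrip("/") + "/robots.txt")
parser.read()
print("Allowed for '*':", parser.can_fetch("*", url))

# 2) Fetch the page with a non-browser user agent and report what comes back
req = request.Request(url, headers={"User-Agent": "generic-crawler-test/1.0"})  # made-up agent
try:
    with request.urlopen(req, timeout=10) as response:
        print("Status:", response.status)
except Exception as exc:  # HTTPError, URLError, timeouts, etc.
    print("Request failed:", exc)

If the page is allowed in robots.txt and the generic request still fails or returns something other than a 200, a firewall or CDN rule that filters unfamiliar user agents (or whole IP ranges) is a likely culprit and worth raising with the host again.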
-
Why is 410 (Gone) being classed as a high priority issue in crawl diagnostics?
Our high-priority issues have suddenly soared by over 100 because Moz is classing 410s as high priority. Google doesn't treat these as so serious, so we were wondering if anyone knows why Moz does?
-
Ajax #! URL support?
Hi Moz, My site currently follows the convention outlined here: https://support.google.com/webmasters/answer/174992?hl=en Basically, since pages are generated via Ajax, we are set up so that bots replace the #! in a URL with ?escaped_fragment and are directed to cached versions of the Ajax-generated content. For example, if the bot sees this URL: http://www.discoverymap.com/#!/California/Map-of-Carmel/73 it will instead access the page: http://www.discoverymap.com/?escaped_fragment=/California/Map-of-Carmel/73 in which case my server serves the cached HTML instead of the live page. This is all per Google's direction and is indexing fine. However, the Moz bot does not do this. It seems like a fairly straightforward feature to support: rather than ignoring the hash, you look to see if it is a #! and then try to spider the URL with that replaced by ?escaped_fragment. Our server does the rest. If this is something Moz plans on supporting in the future, I would love to know. If there is other information available, that would be great. Also, pushState is not practical for everyone due to limited browser support, etc. Thanks, Dustin
Updates: I am editing my question because it won't let me respond to my own question. It says I need to sign up for Moz Analytics. I was signed up for Moz Analytics?! Now I am not? I responded to my invitation weeks ago? Anyway, you are misunderstanding how this process works. There is no sitemap involved. The bot reads this URL on the page: http://www.discoverymap.com/#!/California/Map-of-Carmel/73 and when it is ready to spider the page for content, it spiders this URL instead: http://www.discoverymap.com/?escaped_fragment=/California/Map-of-Carmel/73 The server does the rest; it is simply telling Roger to recognize the #! format and replace it with ?escaped_fragment. I obviously do not know how Roger is coded, but it is a simple string replacement. Thanks.
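For anyone reading along, the substitution described above is mechanical, and Google's AJAX-crawling documentation names the query parameter _escaped_fragment_ (the question abbreviates it to escaped_fragment). Below is a small, hypothetical Python sketch of the transform a crawler that supports the scheme would apply before fetching; it is only an illustration of the string replacement, not Moz's crawler code.

from urllib.parse import quote

def escaped_fragment_url(pretty_url):
    """Rewrite a #! ("hashbang") URL into its ?_escaped_fragment_= equivalent."""
    if "#!" not in pretty_url:
        return pretty_url                    # nothing to rewrite
    base, fragment = pretty_url.split("#!", 1)
    separator = "&" if "?" in base else "?"  # append to an existing query string if present
    return f"{base}{separator}_escaped_fragment_={quote(fragment, safe='/')}"

print(escaped_fragment_url("http://www.discoverymap.com/#!/California/Map-of-Carmel/73"))
# -> http://www.discoverymap.com/?_escaped_fragment_=/California/Map-of-Carmel/73

The server then maps that _escaped_fragment_ request to the pre-rendered HTML snapshot, exactly as the question describes.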