Slash at end of URL causing Google crawler problems
-
Hello,
We are having some problems with a few of our pages being crawled by Google and it looks like the slash at the end of the URL is causing the problem. Would appreciate any pointers on this.
We have a redirect in place that redirects the "no slash" URL to the "slash" URL for all pages. The obvious solution would be to try turning this off; however, we're unable to figure out where this redirect is coming from. There doesn't appear to be an instruction in our .htaccess file doing this, and we've also tried using "DirectorySlash Off" in the .htaccess file, but that doesn't work either. (If it makes a difference, it is a 302 redirect doing this, not a 301.)
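For anyone wanting to confirm exactly what the server returns, the "no slash" URL can be requested once without following the redirect. This is only a minimal sketch using the Python 3 standard library; the function names `is_slash_redirect` and `fetch_redirect` are made up for illustration, and `www.example.com` is a placeholder, not our real domain.

```python
import http.client
from urllib.parse import urlsplit


def is_slash_redirect(requested_url, status, location):
    """Return True if the response redirects to the same path plus a trailing slash."""
    if status not in (301, 302, 303, 307, 308) or not location:
        return False
    requested = urlsplit(requested_url)
    target = urlsplit(location)
    # A relative Location header (e.g. "/products/") has no host; treat it as same host.
    same_host = target.netloc in ("", requested.netloc)
    return same_host and target.path == requested.path + "/"


def fetch_redirect(url):
    """Request `url` once, without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


# Hypothetical usage (placeholder URL, needs network access):
#   url = "http://www.example.com/products"
#   status, location = fetch_redirect(url)
#   print(status, location, is_slash_redirect(url, status, location))
```

Running the same check against a static file and against a CMS page should show whether the slash is added by Apache or by the application, and whether the status is really a 302.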
If we can't get the above to work, then the other solution would be to somehow reconfigure the page so that it is recognizable with the slash at the end by Google. However, we're not sure how this would be done.
I think the quickest solution would be to turn off the "add slash" redirect. Any ideas on where this command might be hiding, and how to turn it off, would be greatly appreciated. Any tips or workarounds from people who have had similar crawl problems with Google would be great too!
Thanks!
-
Satchmo does this automatically - http://www.satchmoproject.com/docs/dev/configuration.html?highlight=trailing slash - however, as far as I can see from the documentation and forums, there's no way to disable it.
I'm unfamiliar with Satchmo though, hit up the Google Group - http://groups.google.com/group/satchmo-users/topics - and ask there.
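One thing that may be worth checking while you ask there: Satchmo is built on Django, and Django itself has an `APPEND_SLASH` setting that makes its `CommonMiddleware` redirect unmatched no-slash URLs to the slash version. The fragment below is only a sketch of where that switch lives. Whether Satchmo's own URL patterns keep working with it turned off is an assumption I have not tested, and as far as I know Django's redirect is a 301 rather than the 302 you are seeing, so it may not be the only source.

```python
# settings.py (fragment) -- a hedged sketch, not a confirmed Satchmo fix.
# APPEND_SLASH is a standard Django setting (default: True). It only has an
# effect when django.middleware.common.CommonMiddleware is installed.
# Assumption: the store's URL patterns still resolve once this is disabled;
# try it on a staging copy first.
APPEND_SLASH = False
```

If URLs without the slash start returning 404s after this change, the URL patterns themselves require the slash and the setting should be reverted.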
-
Thanks, Ryan -- we're taking a look into this right now, and will let you know how it goes!
-
I think we should rule out the possibility that your CMS, an SEO extension, or another add-on for your CMS is adjusting your URLs.
Can you add a page to your site at your root that is not part of your CMS? Drop in a test.html file and see what happens.
-
Hi Ryan -- thanks for your help.
We're hosted on a VPS, running Linux/Apache. We use Satchmo as our CMS/shopping engine. As far as I know, we haven't put explicit redirect instructions into the CMS. Do you think the CMS may be adding the slash?
-
What type of server is your site hosted on? Is it Windows or Apache? Is it shared hosting, VPS or dedicated?
What type of site do you have? Is there a CMS or other software which may modify or rewrite URLs?