Crawl Diagnostics bringing 20k+ errors as duplicate content due to session ids
-
Signed up to the trial version of Seomoz today just to check it out as I have decided I'm going to do my own SEO rather than outsource it (been let down a few times!). So far I like the look of things and have a feeling I am going to learn a lot and get results.
However I have just stumbled on something. After Seomoz dones it's crawl diagnostics run on the site (www.deviltronics.com) it is showing 20,000+ plus errors. From what I can see almost 99% of this is being picked up as erros for duplicate content due to session id's, so i am not sure what to do!
I have done a "site:www.deviltronics.com" on google and this certainly doesn't pick up the session id's/duplicate content. So could this just be an issue with the Seomoz bot. If so how can I get Seomoz to ignore these on the crawl?
Can I get my developer to add some code somewhere.
Help will be much appreciated. Asif
-
Hello Tom and Asif,
First of all Tom thanks for the excellent blog post re google docs.
We are also using the Jshop platform for one of our sites. And am not sure whether it is working correctly in terms of SEO. I just ran an seomoz crawl of the site and found that every single link in the list has a rel canonical in it, even the ones with session id's.
Here is an example:
www.strictlybeautiful.com/section.php/184/1/davines_shampoo/d112a41df89190c3a211ec14fdd705e9
www.strictlybeautiful.com/section.php/184/1/davines_shampoo
As Asif has pointed out the Jshop people say they have programmed it so that google cannot pick up the session ids, firstly is that even possible? And if I assume thats not an issue then what about the fact that every single page on the site has a rel canonical link on it?
Any help would be much appreciated.
<colgroup><col width="1074"></colgroup>
| |
| | -
Asif, here's the page with the information on the SEOmoz bot.
-
Thanks for the reply Tom. Spoke to our developer he has told me that the website platform (Jshop) does not show session ID's to the search engines so we are ok on that side. However as it doesn't recognise the Seomoz bot it shows it the session ID's. Do you know where I can find info on the Seomoz bot so we can see what it identifies itself as so it can be added to the list of recognised spiders?
Thanks
-
Hi Asif!
Firstly - I'd suggest that as soon as possible you address the core problem - the use of session ids in the URL. There are not many upsides to the approach and there are many downsides.That it doesn't show up with the site: command doesn't mean it isn't having a negative impact.
In the meantime, you should add a rel=canonical tag to all the offending pages pointing to the URL without the session id. Secondly, you could use robots.txt to block the SEOmoz bot from crawling pages with session ids, but it may affect the bots ability to crawl the site if all the links it is presented with are with session ids - which takes us back around to fixing the core problem.
Hope this helps a little!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How are you using Moz Content?
Hey people, I just subscribed for Moz Content and I am wondering how are other professionals using it as a strategy tool. Today I just released a blog post talking about how larger content impacts PA. Not a big deal. I would appreciate some ideas and insights.
Moz Pro | | amirfariabr0 -
Increase in authorization permission errors (Access Denied - Error 403)
Hi MOZ community, Since last week when I changed my theme in a WP installation I noticed (in WMT and MOZ tool) that I have increased number in authorization permission errors (error 403-forbidden). What happens is that I received a 403 error for almost every single URL of my site. All these URLs are not "real" ones but they all have my email in the end. i.e. I get an 403 error for the "/contact/support@fantasylogic.com" whilst the real URL is just "/contact/" This happens, as I said, for almost every single page of my site. I have no other crawling or indexation issues, all URLs are correctly indexed. All new pages are correctly indexed as well. URIs ending with "support@fantasylogic.com" are not indexed off course. WP and all installed plugins & theme are on the latest available release. For SEO purposes I use Yoast SEO WP plugin. The site in questions is: fantasylogic.com Any suggestions would be highly appreciated. Thank you in advance
Moz Pro | | gpapatheodorou0 -
Duplicate Page Content, Indexing and Rel Canonical Just DOUBLED! Need Advice to Fix
Last Friday (Penguin 5/2.1) my website shot way off the grid and I noticed in my MOZ PRO Campaign dashboard that all of the following just doubled in numbers on my website: duplicate page content, Google indexing, and rel canonicals. I also noticed that some of my pages, images, tags and categories now added a /page/2/ or a -2. I just changed noindex for tags, but indexing for media, pages, posts, and categories. I'm currently using All In One SEO for a plugin. Any advice would be much appreciated as I'm stuck on the issue. relconical.png Duplicate-Page-Content.png [Duplicate Content II](Duplicate Content II) index1.png
Moz Pro | | CelebrityPersonalTrainer0 -
Duplicate Page Title error for an eCommerce store !!
I currently launched my eCommerce startup hosted in Shopify and linked with MOZ. From my first Crawl Report I am getting 580 Duplicate Page Title i.e. all my Collection page have the same title. I have googled and have been checking the MOZ community but cannot find a fix to it. Some of the URL's are - http://www.onlypetstore.com/collections/all http://www.onlypetstore.com/collections/all?page=10 http://www.onlypetstore.com/collections/all?page=100 http://www.onlypetstore.com/collections/all?page=101 http://www.onlypetstore.com/collections/all?page=102 http://www.onlypetstore.com/collections/all?page=103 I am new to SEO and any suggestions will be a great help to me.
Moz Pro | | OnlyPetStore2 -
5xx (Server Errors)-in Wordpress
Since going to a wordpress platform in November, I have seen many 501 server errors in the crawl report. When I click on the link in the report however, the link shows the actual page with no errors. I reviewed all the Q&A but didn't see anything related to this issue. Does anyone have an idea as to why the actual link works when I click on it but the SEOMOZ crawl bot is showing a 5XX error. Thanks for any ideas or feedback you may have.
Moz Pro | | FidelityOne0 -
"Issue: Duplicate Page Content " in Crawl Diagnostics - but sample pages are not related to page indicated with duplicate content
In the crawl diagnostics for my campaign, the duplicate content warnings have been increasing, but when I look at the sample pages that SEOMoz says have duplicate content, they are completely different pages from the page identified. They have different Titles, Meta Descriptions and HTML content and often are different types of pages, i.e. product page appearing as having duplicate content vs. a category page. Anyone know what could be causing this?
Moz Pro | | EBCeller0 -
I have another Duplicate page content Question to ask.Why does my blog tags come up as duplicates when my page gets crawled,how do I fix it?
I have a blog linked to my web page.& when rogerbot crawls my website it considers tags for my blog pages duplicate content.is there any way I can fix this? Thanks for your advice.
Moz Pro | | PCTechGuy20120 -
Why are these pages considered duplicate page content?
A recent crawl diagnostic for a client's website had several new duplicate page content errors. The problem is, I'm not sure where the error comes from since the content in the webpage is different from one another. Here's the pages that SEOMOZ reported to have duplicate page content errors: http://www.imaginet.com.ph/wireless-internet-service-providers-term http://www.imaginet.com.ph/antivirus-term http://www.imaginet.com.ph/berkeley-internet-name-domain http://www.imaginet.com.ph/customer-premises-equipment-term The only thing similar that I see is the headline which says "Glossary Terms Used in this Site" - I hope that the one sentence is the reason for the error. Any input is appreciated as I want to find out the best solution for my client's website errors. Thanks!
Moz Pro | | TheNorthernOffice790