Does Rogerbot recognize rel="alternate" hreflang="x"?
-
Rogerbot just completed its first crawl and is reporting all kinds of duplicate content - both page content and meta title/description.
The pages it flags as duplicates are annotated with rel="alternate" hreflang="x", but they are still being labeled as dupes.
The titles and descriptions are usually exactly the same, so I am working on getting at least those translated into the different languages.
I think it's getting tripped up because the product pages it's crawling are only in English, while the chrome of the site is in the translated languages. The URLs look like this:
Original: site.com/product
Detected duplicates: site.com/fr/product, site.com/de/product, site.com/zh-hans/product
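For reference, here is a rough sketch of the rel="alternate" hreflang annotations being described for these URLs. The https scheme, the helper name hreflang_links, and treating "en" as the language of the original URL are assumptions for illustration, not details taken from the actual site.

```python
from html import escape


def hreflang_links(base, path, langs, default_lang="en"):
    """Build one <link rel="alternate" hreflang="..."> tag per language.

    Assumption: the default language is served at the root path and every
    other language under /<lang>/, matching the URL pattern in the question.
    """
    links = []
    for lang in langs:
        prefix = "" if lang == default_lang else f"/{lang}"
        href = f"{base}{prefix}{path}"
        links.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(href)}" />'
        )
    return links


# Hypothetical base URL; the language list comes from the URLs in the question.
tags = hreflang_links("https://site.com", "/product", ["en", "fr", "de", "zh-hans"])
print("\n".join(tags))
```

Every language version of the page would normally carry the full set of tags, including one that points to itself.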
-
Hey there,
Rogerbot doesn't look for rel="alternate" annotations. The bot does follow meta robots, rel canonical (the more common way to control duplicate content), and 301 redirects. Sorry about the confusion.
Best,
Nick
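To see which of those on-page signals a given page actually exposes, a minimal sketch along these lines could help. It only inspects markup for meta robots and rel canonical (301 redirects happen at the HTTP level and are not covered), the class and function names are made up for illustration, and the sample page is hypothetical rather than a recommendation on where the canonical should point.

```python
from html.parser import HTMLParser


class DuplicateSignalParser(HTMLParser):
    """Collect the rel canonical URL and meta robots value from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.meta_robots = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel_values = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values:
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.meta_robots = attrs.get("content")


def duplicate_signals(html):
    """Return the signals found in the given HTML string."""
    parser = DuplicateSignalParser()
    parser.feed(html)
    return {"canonical": parser.canonical, "meta_robots": parser.meta_robots}


# Hypothetical markup for site.com/fr/product; the hreflang tag is ignored.
sample = """<html><head>
<link rel="alternate" hreflang="fr" href="https://site.com/fr/product" />
<link rel="canonical" href="https://site.com/product" />
</head><body></body></html>"""

signals = duplicate_signals(sample)
print(signals)
```

A page that returns None for both values carries neither of the on-page signals mentioned above.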