Canonical URLs and screen scraping
-
I have a quick question. I was looking into a module to help implement canonical URLs on a certain CMS and came across a snarky comment about relative vs. absolute URLs. The commenter insisted that relative URLs are fine and that absolute URLs are only for people who don't know what they are doing.
My question is: doesn't using relative URLs make it easier to have your content scraped? After all, if your content does get scraped, absolute URLs would at least point back to your site, right? Am I missing something, or is my thinking OK on this?
Any feedback is much appreciated!
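To make the question concrete, here is a minimal sketch of how the same link resolves once a page has been copied to another host. It uses Python's standard `urllib.parse.urljoin`; the hosts `www.example.com` and `scraper.example.net` and the path `/products/widget` are made-up placeholders, not real sites from this thread.

```python
from urllib.parse import urljoin

# Hypothetical URLs, for illustration only.
ORIGINAL_PAGE = "https://www.example.com/blog/post.html"
SCRAPER_PAGE = "https://scraper.example.net/copied/post.html"


def resolve(page_url: str, href: str) -> str:
    """Resolve an href against the URL of the page that contains it."""
    return urljoin(page_url, href)


relative_href = "/products/widget"
absolute_href = "https://www.example.com/products/widget"

# On the original site, both forms point at the same place.
relative_on_original = resolve(ORIGINAL_PAGE, relative_href)
absolute_on_original = resolve(ORIGINAL_PAGE, absolute_href)

# On a verbatim copy, the relative link now resolves to the scraper's host,
# while the absolute link still points back to the original site.
relative_on_copy = resolve(SCRAPER_PAGE, relative_href)
absolute_on_copy = resolve(SCRAPER_PAGE, absolute_href)

print("relative on copy:", relative_on_copy)
print("absolute on copy:", absolute_on_copy)
```

This only models a scraper that copies the markup unchanged. A scraper that rewrites links or strips tags would behave differently.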
-
Thanks for your reply, Alan. I also considered that a screen scraper might remove the canonical tag, but screen scraping seemed lazy to me in the first place, so maybe they wouldn't bother in most cases. I guess best practice with canonicals is really situation-dependent.
-
Thanks, Robert. Your rationale for using relative links makes sense. I appreciate you helping me sort through the noise on this issue.
John
-
People don’t resort to abuse when they have the facts on their side; it reminds me of the "you don’t believe in global warming because you’re uneducated" argument.
Just in the last few weeks I have seen a case where using an absolute URL got me a link. I wrote a YouMoz article with a link to my website; it has been copied, and the copy still has the link in it. Of course, being on SEOmoz, I have to use an absolute URL to link back to myself.
I don’t usually use absolute links on my own site; I think search engines almost always know who copied whom.
I agree with Rob, but I will add this: a good screen scraper will remove a canonical tag, but removing absolute links is not so easy, as you are then left with broken links. I also believe that if you have images in the article linking back to you, search engines will know who the real owner is; the same goes for CSS, JS and a number of other references. Screen scrapers rarely get credit for these reasons, and also because if a site has a lot of duplicate content, it is obvious that it is the one copying. Either the one site has copied from many locations, or many locations have copied from the one site.
-
John
You can use either, and the web is full of people who go back and forth on this issue. My guess is that any really good scraper software can likely deal with absolute URLs today. The advantage we like with relative URLs is all about page load speed: the file size is smaller.
So, you will get arguments both ways. If scraping is a huge issue for you, maybe you go with absolute. We know people will scrape content, and we continue with relative for the above reason and because it is easier to make certain changes, links and redirects within a CMS.
Oh, as to people who use absolutes not knowing what they are doing... that is bunk. They have other priorities, maybe.
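As a rough way to gauge the file-size point above, the sketch below builds the same block of links twice, once with relative hrefs and once with absolute ones, and compares the byte counts. The site root, the paths and the count of 200 links are invented for illustration, and the comparison is of uncompressed markup only; with gzip or similar compression enabled, the repeated prefix would likely cost much less.

```python
# Hypothetical site root and paths, for illustration only.
SITE_ROOT = "https://www.example.com"
paths = [f"/category/page-{i}" for i in range(200)]


def render_links(link_paths, prefix=""):
    """Render one anchor tag per path, optionally prefixing each href."""
    return "\n".join(f'<a href="{prefix}{p}">link</a>' for p in link_paths)


relative_html = render_links(paths)
absolute_html = render_links(paths, SITE_ROOT)

relative_bytes = len(relative_html.encode("utf-8"))
absolute_bytes = len(absolute_html.encode("utf-8"))
saved = absolute_bytes - relative_bytes

print(f"relative: {relative_bytes} bytes")
print(f"absolute: {absolute_bytes} bytes")
print(f"saved by relative URLs: {saved} bytes")
```

The saving is simply the length of the site root multiplied by the number of links, so whether it matters depends on how link-heavy the page is.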