How was cdn.seomoz.org configured?
-
The SEOmoz CDN appears to have a "pull zone" that is set to the root of the domain, such that any static file can be addressed from either subdomain:
http://www.seomoz.org/q/moz_nav_assets/images/logo.png
http://cdn.seomoz.org/q/moz_nav_assets/images/logo.png
The risk of this configuration is that web pages (not just images/CSS/JS) also get cached and served by the CDN. I won't put the URL here for fear of Google indexing it, but if you replace the 'www' in the URL below with 'cdn', you'll see a cached copy of the original:
http://www.seomoz.org/ugc/the-greatest-attribution-ever-graphed
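To check a handful of pages this way without editing URLs by hand, the subdomain swap can be scripted. This is only a minimal sketch: `swap_subdomain` is a made-up helper name, and `example.com` stands in for the real domain so that no CDN URL is spelled out here.

```python
from urllib.parse import urlsplit, urlunsplit


def swap_subdomain(url, old="www", new="cdn"):
    """Return url with its leading `old.` host label replaced by `new.`.

    URLs whose host does not start with `old.` are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.netloc
    prefix = old + "."
    if host.startswith(prefix):
        host = new + "." + host[len(prefix):]
    return urlunsplit(parts._replace(netloc=host))


# Placeholder domain, not the real site.
print(swap_subdomain("http://www.example.com/ugc/some-post"))
```

Fetching the resulting URL and looking at the status code shows whether the CDN serves a cached copy of the page or a redirect.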
The worst-case scenario is that the homepage gets indexed. But that doesn't happen here: the homepage URL on the cdn subdomain issues a 301 redirect back to the canonical www subdomain, as it should.
Here's my question: how was that done?
Because maxcdn.com can't do it. If you set a "pull zone" to your entire domain, they'll cache your homepage and everything else. Googlebot has a field day with that; it will reindex your entire site off the CDN.
Maybe the SEOmoz CDN provider (CloudFront) allows specific URLs to be blocked? Or do you detect the CloudFront IPs and serve them a 301 (which they'd proxy out to anyone requesting cdn.seomoz.org)?
One solution is to create a pull zone that points to a folder, like example.com/images... but this doesn't help a complex site that has cacheable content in multiple places (do you WordPress users really store ALL your static content under /wp-content/?).
Or, as suggested above, dynamically detect requests from the CDN's proxy servers, and give them a 301 for any HTML-page request. This gets complex quickly, and is both prone to breakage and very difficult to regression-test.
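For what it's worth, the detection idea might look roughly like the sketch below. Everything in it is an assumption on my part, not how SEOmoz or any CDN actually does it: the IP range is a documentation-only placeholder (a real setup would have to load and keep refreshing the provider's published edge ranges), `www.example.com` is a stand-in host, and "HTML page" is approximated as "anything without a known static file extension".

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical CDN edge range (a documentation-only address block).
CDN_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
# Placeholder canonical host.
CANONICAL_HOST = "www.example.com"
# Assumed list of extensions the CDN is allowed to cache.
STATIC_EXTENSIONS = (".png", ".jpg", ".gif", ".css", ".js", ".ico")


def is_cdn_request(remote_ip):
    """True if the requesting IP falls inside a known CDN edge range."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in CDN_NETWORKS)


def cdn_redirect(remote_ip, path):
    """Return (status, location) for a 301, or None to serve the request normally."""
    if not is_cdn_request(remote_ip):
        return None  # ordinary visitor hitting the origin directly
    if urlsplit(path).path.lower().endswith(STATIC_EXTENSIONS):
        return None  # static asset: let the CDN cache it
    # Anything else is treated as a page; send it back to the canonical host.
    return ("301 Moved Permanently", "http://" + CANONICAL_HOST + path)


print(cdn_redirect("203.0.113.7", "/ugc/some-post"))
print(cdn_redirect("203.0.113.7", "/q/moz_nav_assets/images/logo.png"))
```

Even this toy version shows the fragility: it depends on an up-to-date IP list and on an extension list that has to match everything the site serves statically.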
Properly retrofitting a complex site to use a CDN, without creating a half-dozen new CDN subdomains, does not appear to be easy.
-
It's a SEOmoz secret...