Stop google indexing CDN pages
-
Just when I thought I'd seen it all, google hits me with another nasty surprise!
I have a CDN to deliver images, js and css to visitors around the world. I have no links to static HTML pages on the site, as far as I can tell, but someone else may have - perhaps a scraper site?
Google has decided the static pages they were able to access through the CDN have more value than my real pages, and they seem to be slowly replacing my pages in the index with the static pages.
Anyone got an idea on how to stop that?
Obviously, I have no access to the static area, because it is in the CDN, so there is no way I know of that I can have a robots file there.
It could be that I have to trash the CDN and change it to only allow the image directory, and maybe set up a separate CDN subdomain for content that only contains the JS and CSS?
Have you seen this problem and beat it?
(Of course the next thing is Roger might look at google results and start crawling them too, LOL)
P.S. The reason I am not asking this question in the google forums is that others have asked this question many times and nobody at google has bothered to answer, over the past 5 months, and nobody who did try, gave an answer that was remotely useful. So I'm not really hopeful of anyone here having a solution either, but I expect this is my best bet because you guys are always willing to try.
-
Thank you Edward.
I don't have quite that problem, but I think you are right too.
My CDN is set up to be Origin Pull.
That means there is no need to FTP - the system just fetches content as requested.
- you should check that out if you have to ftp everything.
But what you said that helped me is this - that I should have had one CNAME for images and anotehr CNAME for content and the content should be limited to a folder called content, so I can put the CSS files and the JS files in it and that way, the plain HTML pages at teh root level will never be affected.
I also realized, while checking the system, that I wasn't using a canonical tag in the intermediate pages, as I was in the story pages. So I just added code to add canonical tags for all the intermediate pages and the front page.
I do have a few other types of pages, so I will handle the code for them next.
I think adding the canonical tag might fix the problem, but I will also work on reconfiguring the CDN and change over when the action is not too busy, in case it takes a while to propagate.
-
It sounds like you have set up your CDN slightly wrong.
After setting up a few like you have I realised that I was actually making a complete duplicate of the site rather than just the images or assets
I imagine you have your origin directory for the CDN in the public html folder.
Create a subdomain, set that as the origin.
Eg.. I'm working on this site at the moment: http://looksfishy.co.uk/
I have a subdomain called assets: http://assets.looksfishy.co.uk/
The cdn content: http://cdn.looksfishy.co.uk/
Files uploaded here:
http://assets.looksfishy.co.uk/species/holder/pike.jpg
Displayed here:
http://cdn.looksfishy.co.uk/species/holder/pike.jpg
Check the ip address on them.
It does make uploading images by ftp a bit of a faff, but does make your site better
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
One of our top visited page (login page) missing primary keyword, does this makes ranking drop of our homepage for same keyword?
Hi all, So, I have removed the "primary keyword" from login page, which is most visited page on our website to avoid keywords in non related pages. I noticed our homepage ranking dropped for same "primary keyword". Visitors of this login page directly land without searching with "primary keyword". Then how removing it from such page drops our ranking? Thanks
Algorithm Updates | | vtmoz0 -
Pages fluctuating +/- 70 positions in Google SERPs?
I've got some pages that appear somewhere around #25 in Google. Every now and then, it just goes away from the top 100 results for a few days (even up to a week) and then it comes back. I've got other pages that rank around #8 which falls down to about #75 for a while and then it comes back. But while a page may be gone from the top 100 results in the US, it still ranks at about the same place everywhere else in the world (+/- 10 positions). I've seen this happen in the past but never it happened so often. What gives?!?
Algorithm Updates | | sbrault740 -
Could Retail Price Be A Google Ranking Factor???
I have not done any detailed studies on this but it seems that Google might be using low retail prices for specific items as a ranking factor in their organic SERPs. Does anyone else suspect this? Just askin' to hear your thoughts. Thanks!
Algorithm Updates | | EGOL0 -
Best practice for cleaning up multiple Google Places listings and multiple Google accounts when logins were lost.
We are an inbound marketing agency, most of our clients are not relying on local seo. I have a pretty good understanding of it when starting fresh but not so much in joining a "movie in progress" kind of scenario. Recently we've brought on two clients who have had their websites in place for awhile, have made small attempts at marketing themselves online over the years and its resulted in multiple Google places listings, variations of the company names (one of them changed their name), worried there are yet more accounts out there they aren't aware of, etc (analytics, and others from well intentioned employees and past service providers - no internal leadership at the company level). In reading Google help forums I'm seeing some recently having their accounts suspended when they try to clean things up - in one case a person setup a new Google account thinking he would start fresh and in trying to claim listings, get rid of duplicates, etc. his account was suspended. What is the CURRENT recommended course of action in situations like these? With all the changes going on with Google, I don't know which route to take and have combed the Internet reading articles about this (including Google's resources) - would like some current real world advise.
Algorithm Updates | | rhgraves651 -
Why Am I Ranking in Bing but Not Google
My website is ranking is ranking in Bing, but it's nowhere to be found on Google? What can be some causes for this?
Algorithm Updates | | locallyrank0 -
In the body of index page i want to be able to add text that can be picked up by crawlers but I do not want these text to be visible? How can I code this?
in the body of index page i want to be able to add text that can be picked up by crawlers but I do not want these text to be visible? How can I code this?
Algorithm Updates | | FinindDesign0 -
Does google have the worst site usability?
Google tells us to make our sites better for our readers, which we are doing, but do you think google has horrible site usabilty? For example, in webmaster tools, I'm always being confused by their changes and the way they just drop things. In the HTML suggestions area, they don't tell you when the data was last updated, so the only way to tell is to download the files and check. In the URL removals, they used to show you the URLs they had removed. Now that is gone and the only way you can check is to try adding one. We don't have any URL parameters, so any parameters are as a result of some other site tacking on stuff at the end of our URL and there is no way to tell them that we don't have any parameters, so ignore them all. Also, they add new parameters they find on the end of the list, so the only way to check is to click through to the end of the list.
Algorithm Updates | | loopyal0 -
Why google index ip address instead of the domain name?
I have a website ,now google index ip address of it instead of the domain name,I have used 301 redirected to the domain name,but how to change the index IP to its domain name? And why google index the IP address?
Algorithm Updates | | frankfans1170