Stop Google indexing CDN pages
-
Just when I thought I'd seen it all, Google hits me with another nasty surprise!
I have a CDN to deliver images, JS and CSS to visitors around the world. As far as I can tell, I have no links to static HTML pages on the site, but someone else may have, perhaps a scraper site?
Google has decided the static pages they were able to access through the CDN have more value than my real pages, and they seem to be slowly replacing my pages in the index with the static pages.
Anyone got an idea on how to stop that?
Obviously, I have no access to the static area, because it is in the CDN, so there is no way I know of that I can have a robots file there.
It could be that I have to trash the CDN and change it to only allow the image directory, and maybe set up a separate CDN subdomain for content that only contains the JS and CSS?
Have you seen this problem and beaten it?
(Of course the next thing is Roger might look at Google results and start crawling them too, LOL)
P.S. The reason I am not asking this question in the Google forums is that others have asked it there many times over the past 5 months, nobody at Google has bothered to answer, and nobody who did try gave an answer that was remotely useful. So I'm not really hopeful of anyone here having a solution either, but I expect this is my best bet because you guys are always willing to try.
-
Thank you Edward.
I don't have quite that problem, but I think you are right too.
My CDN is set up to be Origin Pull.
That means there is no need to FTP - the system just fetches content as requested.
You should check that out if you have to FTP everything.
But what you said that helped me is this: I should have had one CNAME for images and another CNAME for content, and the content should be limited to a folder called content, so I can put the CSS files and the JS files in it. That way, the plain HTML pages at the root level will never be affected.
I also realized, while checking the system, that I wasn't using a canonical tag in the intermediate pages, as I was in the story pages. So I just added code to add canonical tags for all the intermediate pages and the front page.
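For anyone wanting to do the same, here is a minimal sketch of building that tag so a copy of a page served from the CDN hostname still points back at the real page. The hostnames, the function name, and the choice to force HTTPS and drop the query string are all assumptions for illustration, not my actual site code.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumption: hostname of the real site (placeholder, not from the thread).
CANONICAL_HOST = "www.example.com"


def canonical_link(request_url: str, canonical_host: str = CANONICAL_HOST) -> str:
    """Return a <link rel="canonical"> tag pointing at the main site.

    Whatever hostname the page was requested on (for example the CDN),
    the tag always names the same path on the canonical host.
    """
    parts = urlsplit(request_url)
    # Assumption: canonical URLs use HTTPS and carry no query string or fragment.
    canonical = urlunsplit(("https", canonical_host, parts.path or "/", "", ""))
    return f'<link rel="canonical" href="{escape(canonical, quote=True)}">'


print(canonical_link("http://cdn.example.com/stories/page.html?x=1"))
```

The tag goes in the head of every HTML page, so a duplicate fetched through the CDN tells the crawler which URL is the original.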
I do have a few other types of pages, so I will handle the code for them next.
I think adding the canonical tag might fix the problem, but I will also work on reconfiguring the CDN and switch over when the site is not too busy, in case it takes a while to propagate.
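One more idea that may be worth trying with Origin Pull: since the CDN fetches everything from the origin, it would fetch robots.txt from there as well, so the origin could hand out a different robots file for the CDN hostname. This is only a sketch and rests on assumptions: that the origin can see which hostname the request was made for (that depends on how the CDN is configured), that the CDN hostname is `cdn.example.com`, and that images live in `/images/`. None of those come from my real setup.

```python
# Assumption: hostname the CDN is reached on (placeholder).
CDN_HOST = "cdn.example.com"


def robots_txt_for(host: str, cdn_host: str = CDN_HOST) -> str:
    """Return the robots.txt body to serve for the given Host header value."""
    # Strip an optional port and normalise case before comparing.
    hostname = host.split(":")[0].strip().lower()
    if hostname == cdn_host:
        # On the CDN hostname, allow only the asset folders and block the rest,
        # so HTML pages pulled through the CDN are not crawled.
        return "\n".join([
            "User-agent: *",
            "Allow: /content/",   # CSS and JS folder
            "Allow: /images/",    # assumption: image folder name
            "Disallow: /",
        ]) + "\n"
    # On the main site, an empty Disallow permits everything.
    return "User-agent: *\nDisallow:\n"


print(robots_txt_for("cdn.example.com"))
```

The asset folders stay crawlable so that pages can still be rendered with their CSS, JS and images.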
-
It sounds like you have set up your CDN slightly wrong.
After setting up a few like you have, I realised that I was actually making a complete duplicate of the site rather than just the images or assets.
I imagine you have your origin directory for the CDN in the public html folder.
Create a subdomain, set that as the origin.
E.g. I'm working on this site at the moment: http://looksfishy.co.uk/
I have a subdomain called assets: http://assets.looksfishy.co.uk/
The CDN content: http://cdn.looksfishy.co.uk/
Files uploaded here:
http://assets.looksfishy.co.uk/species/holder/pike.jpg
Displayed here:
http://cdn.looksfishy.co.uk/species/holder/pike.jpg
Check the IP address on them.
It does make uploading images by FTP a bit of a faff, but does make your site better.
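To show the mapping between the two hostnames, here is a rough sketch of rewriting an asset URL to its CDN equivalent when building page markup. The function name is made up for illustration; the hostnames are the ones from my example above.

```python
from urllib.parse import urlsplit, urlunsplit

ORIGIN_HOST = "assets.looksfishy.co.uk"  # where files are uploaded
CDN_HOST = "cdn.looksfishy.co.uk"        # where visitors fetch them


def to_cdn_url(url: str) -> str:
    """Swap the assets hostname for the CDN hostname; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.hostname != ORIGIN_HOST:
        return url
    return urlunsplit((parts.scheme, CDN_HOST, parts.path, parts.query, parts.fragment))


print(to_cdn_url("http://assets.looksfishy.co.uk/species/holder/pike.jpg"))
```

Because only the assets subdomain is the origin, the CDN can never serve a copy of the main site's HTML pages.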