Is there a way to prevent Google Alerts from picking up old press releases?
-
I have a client who wants a lot of old press releases (PDFs) added to their news page, but they don't want these to show up in Google Alerts. Is there a way for me to prevent this?
-
Thanks for the post, Keri.
Yep, the OCR capability would still make the image option for hiding them "moot"
-
Harder, but certainly not impossible. I've had Google Alerts come up on scanned PDF copies of newsletters from the 1980s and 1990s that were pure images.
The files recently moved and aren't showing up for the query anymore, but I did see something else interesting. When I went to view one of the newsletters (https://docs.google.com/file/d/0B2S0WP3ixBdTVWg3RmFadF91ek0/edit?pli=1), it said "extracting text" for a few moments, then presented a search box where I could search the document. Google was doing OCR work on the fly, and it seemed reasonably accurate in the couple of tests I ran. There's a whole bunch of these newsletters at http://www.modelwarshipcombat.com/howto.shtml#hullbusters if you want to mess around with it at all.
-
Well, that is how to exclude them from an alert that they set up themselves, but I think they are talking about anyone who might set up an alert that would surface the PDFs.
One other idea I had that may help: if you create the PDFs as images rather than text, it would be harder for Google to "read" them and catalog them properly for an alert, but that has much the same net effect as not having the PDFs in the index at all.
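If you want to experiment with that approach, here is a rough sketch using Ghostscript (a newer build that includes the pdfimage output devices; the file names are just placeholders):

    # Rasterize every page to a 150 dpi image-only PDF, discarding the text layer
    gs -o press-release-flat.pdf -sDEVICE=pdfimage24 -r150 press-release.pdf

After that there is no text layer left for a crawler to extract, though as discussed above, on-the-fly OCR can still recover the words.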
Danielle, my other question would be: why do they give a crap about Google Alerts specifically? There have been all kinds of issues with the service, and if someone is really interested in finding out info on the company, there are other ways to monitor a website than Google Alerts. I used to use services that simply monitor a page (say, the news release page) and let me know when it is updated; this was often faster than Google Alerts, and I would find things on a page before people who relied on Google Alerts alone. I think they are being kind of myopic about the whole approach, and blocking Google Alerts may not help them as much as they think. Way more people simply search on Google than use Alerts.
-
The easiest thing to do in this situation would be to add negative keywords or advanced operators to your Google Alert so that the new pages don't trigger it. You can do this by adding advanced operators that exclude an exact-match phrase, a file type, the client's domain, or just a specific directory. If all the new PDF files will sit in the same directory or share a common URL structure, you can exclude them with the -inurl: operator.
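For example, alert queries along these lines (the phrase, domain, and directory are placeholders, and Alerts' support for some operators can be inconsistent) would still catch mentions of the company while skipping the archive:

    "Client Name" -site:client-domain.com
    "Client Name" -filetype:pdf -inurl:press-archive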
-
That also presumes Google Alerts is anywhere near accurate. I've had it come up with things that have been on the web for years and that, for whatever reason, Google thinks are new.
-
That was what I was thinking would have to be done... It's a little complicated as to why they don't want them showing up in Alerts. They do want them showing up on the web, just not as an alert. I'll let them know they can't have it both ways!
-
Use robots.txt to exclude those files. Note that this takes them out of the web index in general, so they will not show up in searches either.
You need to ask your client why they are putting things on the web if they do not want them to be found. If they do not want them found, don't put them on the web.
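A minimal sketch of such a robots.txt, assuming the press releases live in a directory of their own (the path is made up):

    User-agent: *
    Disallow: /press-releases/archive/

Once Google honors this, the PDFs drop out of crawling and, over time, out of the index, which means out of anyone's alerts as well.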