Is there a way to prevent Google Alerts from picking up old press releases?
-
I have a client who wants a lot of old press releases (PDFs) added to their news page, but they don't want these to show up in Google Alerts. Is there a way for me to prevent this?
-
Thanks for the post, Keri.
Yep, the OCR option would make the image option for hiding them moot.
-
Harder, but certainly not impossible. I had Google Alerts come up on scanned PDF copies of newsletters from the 1980s and 1990s that were images.
The files recently moved and aren't showing up for the query, but I did see something else interesting. When I went to view one of the newsletters (https://docs.google.com/file/d/0B2S0WP3ixBdTVWg3RmFadF91ek0/edit?pli=1), it said "extracting text" for a few moments, then had a search box where I could search the document. On the fly, Google was doing OCR work and seemed decently accurate in the couple of tests I had done. There's a whole bunch of these newsletters at http://www.modelwarshipcombat.com/howto.shtml#hullbusters if you want to mess around with it at all.
-
Well, that is how to exclude them from an alert that they set up themselves, but I think they are talking about anyone who might set up an alert that would find the PDFs.
One other idea I had that I think may help: if you set up the PDFs as images rather than text, it would be harder for Google to "read" the PDFs and therefore harder to catalog them properly for the alert. But this would have the same net effect as not having the PDFs in the index at all.
Danielle, my other question would be: why do they give a crap about Google Alerts specifically? There have been all kinds of issues with the service, and if someone is really interested in finding out info on the company, there are other ways to monitor a website than Google Alerts. I used to use services that simply monitor a page (say, the news release page) and let me know when it is updated. This was often faster than Google Alerts, and I would find stuff on a page before others who only used Google Alerts. I think they are being kind of myopic about the whole approach, and blocking Google Alerts may not help them as much as they think. Way more people simply search on Google than use Alerts.
-
The easiest thing to do in this situation would be to add negative keywords or advanced operators to your Google Alert that prevent the new pages from triggering the alert. You can do this by adding advanced operators that exclude an exact-match phrase, a file type, the client's domain, or just a specific directory. If all the new PDF files will be in the same directory or share a common URL structure, you can exclude them using the "-inurl:" operator.
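To make the operator idea concrete, here is a minimal sketch in Python that assembles an alert query with exclusions. The company name, directory name, and the helper `build_alert_query` are all hypothetical, and it assumes Google Alerts honors the same minus-prefixed operators as regular Google search, which is worth testing before relying on it.

```python
def build_alert_query(term, phrases=(), filetypes=(), sites=(), url_parts=()):
    """Return a query string: the quoted term followed by exclusions.

    Every exclusion is written with a leading minus sign, e.g. -filetype:pdf.
    """
    parts = [f'"{term}"']
    parts += [f'-"{phrase}"' for phrase in phrases]      # exclude exact phrases
    parts += [f"-filetype:{ext}" for ext in filetypes]   # exclude file types
    parts += [f"-site:{site}" for site in sites]         # exclude whole domains
    parts += [f"-inurl:{part}" for part in url_parts]    # exclude URL fragments
    return " ".join(parts)


# Hypothetical example: skip PDFs and anything under a "press-archive" directory.
query = build_alert_query(
    "Example Corp",
    filetypes=["pdf"],
    url_parts=["press-archive"],
)
print(query)  # "Example Corp" -filetype:pdf -inurl:press-archive
```

The resulting string is what you would paste into the alert's search box. As noted elsewhere in this thread, this only changes alerts you control, not alerts set up by other people.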
-
That also presumes Google Alerts is anything near accurate. I've had it come up with things that have been on the web for years and for whatever reason, Google thinks they are new.
-
That was what I was thinking would have to be done... The reason they don't want them showing up in Alerts is a little complicated. They do want them showing up on the web, just not as an Alert. I'll let them know they can't have it both ways!
-
Use robots.txt to exclude those files. Note that this takes them out of the web index in general, so they will not show up in searches.
You need to ask your client why they are putting things on the web if they do not want them to be found. If they do not want them found, don't put them up on the web.
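As a rough sketch of the robots.txt approach, the Python below uses the standard library's `urllib.robotparser` to check which URLs a rule would block. The `/news/archive/` directory, the example.com URLs, and the helper `is_crawlable` are assumptions for illustration; the real rule depends on where the PDFs actually live.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of the archive directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/archive/
"""


def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text allows `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The archived PDF is blocked; the main news page is still crawlable.
print(is_crawlable(ROBOTS_TXT, "https://www.example.com/news/archive/1998-release.pdf"))
print(is_crawlable(ROBOTS_TXT, "https://www.example.com/news/"))
```

Checking the rule like this before deploying helps confirm that only the old releases are excluded and that the current news page stays visible to crawlers.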