Is there a way to prevent Google Alerts from picking up old press releases?
-
I have a client who wants a lot of old press releases (PDFs) added to their news page, but they don't want these to show up in Google Alerts. Is there a way for me to prevent this?
-
Thanks for the post, Keri.
Yep, the OCR option would still make the image option for hiding moot.
-
Harder, but certainly not impossible. I had Google Alerts come up on scanned PDF copies of newsletters from the 1980s and 1990s that were images.
The files recently moved and aren't showing up for the query, but I did see something else interesting. When I went to view one of the newsletters (https://docs.google.com/file/d/0B2S0WP3ixBdTVWg3RmFadF91ek0/edit?pli=1), it said "extracting text" for a few moments, then showed a search box where I could search the document. Google was doing OCR on the fly, and it seemed decently accurate in the couple of tests I ran. There's a whole bunch of these newsletters at http://www.modelwarshipcombat.com/howto.shtml#hullbusters if you want to mess around with it at all.
-
Well, that is how to exclude them from an alert that they set up themselves, but I think they are talking about anyone who sets up an alert that might find the PDFs.
One other idea I had that I think may help: if you publish the PDFs as images instead of text, it would be harder for Google to "read" them and therefore harder to catalog them properly for the alert. But this would have the same net effect as not having the PDFs in the index at all.
Danielle, my other question would be: why do they give a crap about Google Alerts specifically? There have been all kinds of issues with the service, and if someone is really interested in finding info on the company, there are other ways to monitor a website than Google Alerts. I used to use services that simply monitor a page (say, the news release page) and let me know when it is updated. This was often faster than Google Alerts, and I would find stuff on a page before others who only used Google Alerts. I think they are being kind of myopic about the whole approach, and blocking Google Alerts may not help them as much as they think. Way more people simply search on Google than use Alerts.
-
The easiest thing to do in this situation would be to add negative keywords or advanced operators to your Google Alert that prevent the new pages from triggering the alert. You can do this by adding advanced operators that exclude an exact-match phrase, a file type, the client's domain, or just a specific directory. If all the new PDF files will be in the same directory or share a common URL structure, you can exclude them using the "-inurl:" operator.
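As a rough sketch of what such a query could look like, here is a small Python helper that assembles an alert query string with exclusions. The function name, the company name "Example Corp", and the directory "press-archive" are made up for illustration, and I have not confirmed how consistently Google Alerts honors each operator, so treat this as a starting point. Note that the minus sign goes in front of the operator.

```python
def build_alert_query(phrase, excluded_domain=None,
                      excluded_url_part=None, excluded_filetype=None):
    """Assemble a search query with exclusion operators (hypothetical helper)."""
    # Quote the phrase so it is matched exactly.
    parts = [f'"{phrase}"']
    if excluded_domain:
        # Leave out everything hosted on this domain.
        parts.append(f"-site:{excluded_domain}")
    if excluded_url_part:
        # Leave out URLs containing this text, e.g. a shared directory name.
        parts.append(f"-inurl:{excluded_url_part}")
    if excluded_filetype:
        # Leave out a file type such as pdf.
        parts.append(f"-filetype:{excluded_filetype}")
    return " ".join(parts)


# "Example Corp" and "press-archive" are placeholder values.
query = build_alert_query("Example Corp",
                          excluded_url_part="press-archive",
                          excluded_filetype="pdf")
print(query)  # "Example Corp" -inurl:press-archive -filetype:pdf
```

Paste the resulting string into the alert's search box. Keep in mind this only changes alerts you create yourself, not alerts other people have set up.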
-
That also presumes Google Alerts is anything near accurate. I've had it come up with things that have been on the web for years and for whatever reason, Google thinks they are new.
-
That was what I was thinking would have to be done... It's a little complicated why they don't want them showing up in Alerts. They do want them showing up on the web, just not as an Alert. I'll let them know they can't have it both ways!
-
Use robots.txt to exclude those files. Note that this blocks them from being crawled at all, so they will generally drop out of the web index and not show up in searches either.
You need to ask your client why they are putting things on the web if they do not want them to be found. If they do not want them found, don't put them up on the web.
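If you do go the robots.txt route, it is worth checking the rules before publishing them. Below is a minimal sketch using Python's standard urllib.robotparser module. The directory /news/archive/ and the example.com URLs are assumptions for illustration, not the client's real paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks one directory holding the old PDFs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/archive/
"""

parser = RobotFileParser()
# Parse the rules from a string instead of fetching them from a live site.
parser.parse(ROBOTS_TXT.splitlines())

# A PDF inside the blocked directory should not be fetchable.
archive_ok = parser.can_fetch(
    "Googlebot", "https://www.example.com/news/archive/2009-release.pdf")
# The news page itself should still be fetchable.
news_ok = parser.can_fetch("Googlebot", "https://www.example.com/news/")

print(archive_ok, news_ok)  # False True
```

Putting all the old releases in one directory keeps the rule to a single Disallow line and leaves the rest of the news section crawlable.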