Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?
-
Will blocking the Wayback Machine (archive.org) by adding the code they give have any impact on Google crawl and indexing/SEO?
Anyone know?
Thanks!
~Brett
-
I have blocked the Wayback Machine for a client and not allowed them to index the site. I blocked them via the robots.txt and not Meta NoIndex, and while blocking Wayback Machine it did NOT impact the positions within the targeted Google results.
Hope this helps.
-
Brett,
I am not sure what code you are referring to but what archive.org suggests is blocking their crawler through robots.txt:
User-agent: ia_archiver
Disallow: /The robots.txt file should be in your root directory.
It's explained here: http://archive.org/about/exclude.php
Doing this will not impact your search results or crawl on Google.
V-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Menu impact on SEO
we have a single page web application for an ecommerce website. I think it is built in angular. One UX features we are exploring is the use of a "Products" Item on the menu with the categories showing on a menu rather than directly present on the header. The aim being to keep the header nice and clean. The result of this is that the categories which would typically sit in the header will now not be immediately visible until the menu is opened. Let's say I want to rank well for "building materials". Traditionally the view would be that this word would need to be in the header and marked up with the appropriate h tag. Will moving "building materials" into a product menu be detrimental for SEO? My initial thought is that as long as it is coded correctly there shouldn't be any impact on SEO. Can anyone give me their expert SEO view?
Technical SEO | | built_bot0 -
Tough SEO problem, Google not caching page correctly
My web site is http://www.mercimamanboutique.com/ Cached version of French version is, cache:www.mercimamanboutique.com/fr-fr/ showing incorrectly The German version: cache:www.mercimamanboutique.com/de-de/ is showing correctly. I have resubmitted site links, and asked Google re-index the web site many times. The German version always gets cached properly, but the French version never does. This is frustrating me, any idea why? Thanks.
Technical SEO | | ss20160 -
Why seomoz.org still in Google index?
I searched in Google, the number of URLs indexed left in the seomoz.org domain since it changed to moz.comI am surprised that after all this time more than 15,000 URLs indexed:https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=site%3Aseomoz.org%20inurl%3Aseomoz.org If I clicked on any of the results it will be redirect (301) to the new domain, so it is working, but Google still keep these URLs in the index.
Technical SEO | | Yosef
What could be the reason?Will not cause duplicated content issue on moz.com?0 -
How can I tell Google not to index a portion of a webpage?
I'm working with an ecommerce site that has many product descriptions for various brands that are important to have but are all straight duplicates. I'm looking for some type of tag tht can be implemented to prevent Google from seeing these as duplicates while still allowing the page to rank in the index. I thought I had found it with Googleoff, googleon tag but it appears that this is only used with the google appliance hardware.
Technical SEO | | bradwayland0 -
How long after disallowing Googlebot from crawling a domain until those pages drop out of their index?
We recently had Google crawl a version of the site we that we had thought we had disallowed already. We have corrected the issue of them crawling the site, but pages from that version are still appearing in the search results (the version we want them to not index and serve up is our .us domain which should have been blocked to them). My question is this: How long should I expect that domain (the .us we don't want to appear) to stay in their index after disallowing their bot? Is this a matter of days, weeks, or months?
Technical SEO | | TLM0 -
Http:// to https:// 301 or 302 redirect
I've read over the Q & A in the Community, but am wondering the reasoning behind this issue. I know - 301's are permanent and pass links, and 302s are temporary (due to cache) and don't pass links. But, I've run across two sites now that 302 redirect http:// to https://. Is there a valid reason behind this? From my POV and research, the redirect should 301 if it's permanent, but is there a larger issue I am missing?
Technical SEO | | FOTF_DigitalMarketing1 -
Having some weird crawl issues in Google Webmaster Tools
I am having a large amount of errors in the not found section that are linked to old urls that haven't been used for 4 years. Some of the ulrs being linked to are not even in the structure that we used to use for urls. Never the less Google is saying they are now 404ing and there are hundreds of them. I know the best way to attack this is to 301 them, but I was wondering why all of these errors would be popping up. I cant find anything in the google index searching for the link in "" and in webmaster tools it shows unavailable as where these are being linked to from. Any help would be awesome!
Technical SEO | | Gordian1 -
Some site pages are removed from Google Index
Hello, Some pages of my clients website are removed from Google Index. We were in top 10 position for some keywords but now I cannot find those pages neither in top 1000. Any idea what to do in order to get these pages back? thank you
Technical SEO | | besartbajrami0