Will blocking the Wayback Machine (archive.org) have any impact on Google crawl and indexing/SEO?
-
Will blocking the Wayback Machine (archive.org) by adding the code they give have any impact on Google crawl and indexing/SEO?
Anyone know?
Thanks!
~Brett
-
I have blocked the Wayback Machine for a client so that it could not archive the site. I did this via robots.txt, not a meta noindex tag, and blocking the Wayback Machine did NOT affect the site's positions in the targeted Google results.
Hope this helps.
-
Brett,
I am not sure what code you are referring to but what archive.org suggests is blocking their crawler through robots.txt:
User-agent: ia_archiver
Disallow: /
The robots.txt file should be in your root directory.
It's explained here: http://archive.org/about/exclude.php
Doing this will not impact your search results or crawl on Google.
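If you want to sanity-check the rule before deploying it, here is a minimal sketch using Python's standard urllib.robotparser. It is only an approximation of how real crawlers read robots.txt, not Google's or archive.org's actual parser. The helper name is_allowed and the www.example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above: block only the archive.org crawler.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text.

    Hypothetical helper for illustration; it wraps the standard-library parser.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # www.example.com is a placeholder; substitute a URL from your own site.
    for agent in ("ia_archiver", "Googlebot"):
        allowed = is_allowed(ROBOTS_TXT, agent, "https://www.example.com/")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Because the file has no rule group that matches Googlebot (and no "User-agent: *" group), the parser treats Googlebot as allowed everywhere, which is consistent with the point that this block targets only the archive.org crawler.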
V-