Negative impact on crawling after upload robots.txt file on HTTPS pages
-
I experienced negative impact on crawling after upload robots.txt file on HTTPS pages. You can find out both URLs as follow.
Robots.txt File for HTTP: http://www.vistastores.com/robots.txt
Robots.txt File for HTTPS: https://www.vistastores.com/robots.txt
I have disallowed all crawlers for HTTPS pages with following syntax.
User-agent: *
Disallow: /Does it matter for that? If I have done any thing wrong so give me more idea to fix this issue.
-
Hi CP,
If you wish to use robots.txt to block crawlers, then your two robots.txt files should be as follows:
For your http protocol (http://vistastores.com/robots.txt
User-agent: * Allow: /
For the https protocol (https://vistastores.com/robots.txt
User-agent: * Disallow: / Personally, I prefer to use the noindex meta tag for page blocking because it is a more reliable way of ensuring that the pages are not indexed. (Never try to use both at once) This link explains the difference between the two: [Google Webmaster Tools Help.](http://www.google.com/support/webmasters/bin/answer.py?answer=35302 "Robots blocking crawlers") Hope that helps, Sha ```You can use a robots.txt file to request that search engines remove your site and prevent robots from crawling it in the future. (It's important to note that if a robot discovers your site by other means - for example, by following a link to your URL from another site - your content may still appear in our index and our search results. To entirely prevent a page from being added to the Google index even if other sites link to it, use a [noindex meta tag](http://www.google.com/support/webmasters/bin/answer.py?answer=61050).)
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why is my home page ranking much higher than my collection page?
Hi everyone, Why is my client's home page ranking high for a certain keyword phrase rather than a collection page I have which is well optimised for this keyword? The collection page is on the 10th SERPs page. I did see there were keywords used in the footer of page and the keyword was also used in some intro text on the home page so I removed the keyword from these two places nearly 2 weeks ago and requested google to reindex both the collection page and home page and I've not seen any improvement of the collection page's ranking in SERPs. I also changed the meta description and meta title as the ctr was poor but there wasn''t that many impressions either. It is a competitive keyword organically so maybe the collection page's authority is just not good enough compared to the competitors hence why they are choosing the home page as it has higher page authority however this still is not helpful to searchers who land on home page. Does anyone have any ideas of what else I can do to get google to rank the ocllection page higher for the keyword instead of home page?
Intermediate & Advanced SEO | | TZ19820 -
Robots.txt Syntax
I have been having a hard time finding any decent information regarding the robots.txt syntax that has been written in the last few years and I just want to verify some things as a review for myself. I have many occasions where I need to block particular directories in the URL, parameters and parameter values. I just wanted to make sure that I am doing this in the most efficient ways possible and thought you guys could help. So let's say I want to block a particular directory called "this" and this would be an example URL: www.domain.com/folder1/folder2/this/file.html
Intermediate & Advanced SEO | | DRSearchEngOpt
or
www.domain.com/folder1/this/folder2/file.html In order for me to block any URL that contains this folder anywhere in the URL I would use: User-agent: *
Disallow: /this/ Now lets say I have a parameter "that" I want to block and sometimes it is the first parameter and sometimes it isn't when it shows up in the URL. Would it look like this? User-agent: *
Disallow: ?that=
Disallow: &that= What about if there is only one value I want to block for "that" and the value is "NotThisGuy": User-agent: *
Disallow: ?that=NotThisGuy
Disallow: &that=NotThisGuy My big questions here are what are the most efficient ways to block a particular parameter and block a particular parameter value. Is there a more efficient way to deal with ? and & for when the parameter and value are either first or later? Secondly is there a list somewhere that will tell me all of the syntax and meaning that can be used for a robots.txt file? Thanks!0 -
SEO structure question: Better to add similar (but distinct) content to multiple unique pages or make one unique page?
Not sure which approach would be more SEO ranking friendly? As we are a music store, we do instrument repairs on all instruments. Currently, I don't have much of any content about our repairs on our website... so I'm considering a couple different approaches of adding this content: Let's take Trumpet Repair for example: 1. I can auto write to the HTML body (say, at the end of the body) of our 20 Trumpets (each having their own page) we have for sale on our site, the verbiage of all repairs, services, rates, and other repair related detail. In my mind, the effect of this may be that: This added information does uniquely pertain to Trumpets only (excludes all other instrument repair info), which Google likes... but it would be duplicate Trumpet repair information over 20 pages.... which Google may not like? 2. Or I could auto write the repair details to the Trumpet's Category Page - either in the Body, Header, or Footer. This definitely reduces the redundancy of the repeating Trumpet repair info per Trumpet page, but it also reduces each Trumpet pages content depth... so I'm not sure which out weighs the other? 3. Write it to both category page & individual pages? Possibly valuable because the information is anchoring all around itself and supporting... or is that super duplication? 4. Of course, create a category dedicated to repairs then add a subcategory for each instrument and have the repair info there be completely unique to that page...- then in the body of each 20 Trumpets, tag an internal link to Trumpet Repair? Any suggestions greatly appreciated? Thanks, Kevin
Intermediate & Advanced SEO | | Kevin_McLeish0 -
Recovering from robots.txt error
Hello, A client of mine is going through a bit of a crisis. A developer (at their end) added Disallow: / to the robots.txt file. Luckily the SEOMoz crawl ran a couple of days after this happened and alerted me to the error. The robots.txt file was quickly updated but the client has found the vast majority of their rankings have gone. It took a further 5 days for GWMT to file that the robots.txt file had been updated and since then we have "Fetched as Google" and "Submitted URL and linked pages" in GWMT. In GWMT it is still showing that that vast majority of pages are blocked in the "Blocked URLs" section, although the robots.txt file below it is now ok. I guess what I want to ask is: What else is there that we can do to recover these rankings quickly? What time scales can we expect for recovery? More importantly has anyone had any experience with this sort of situation and is full recovery normal? Thanks in advance!
Intermediate & Advanced SEO | | RikkiD220 -
Why is my XML sitemap ranking on the first page of google for 100s of key words versus the actual relevant page?
I still need this question answerd and I know it's something I must have changed. But google is ranking my sitemap for 100s of key terms versus the actual page. It's great to be on the first page but not my site map...... Geeeez.....
Intermediate & Advanced SEO | | ursalesguru0 -
Googlebot Can't Access My Sites After I Repair My Robots File
Hello Mozzers, A colleague and I have been collectively managing about 12 brands for the past several months and we have recently received a number of messages in the sites' webmaster tools instructing us that 'Googlebot was not able to access our site due to some errors with our robots.txt file' My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows: User-agent: * Disallow: /cgi-bin/ After creating the robots and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site, only difference being that this time it was for a different site that I manage. I repeated the process and everything, aesthetically looked correct, however, I continued receiving these messages for each of the other sites I manage on a daily-basis for roughly a 10-day period. Do any of you know why I may be receiving this error? is it not possible for me to block the Googlebot from crawling the 'cgi-bin'? Any and all advice/insight is very much welcome, I hope I'm being descriptive enough!
Intermediate & Advanced SEO | | NiallSmith1 -
202 error page set in robots.txt versus using crawl-able 404 error
We currently have our error page set up as a 202 page that is unreachable by the search engines as it is currently in our robots.txt file. Should the current error page be a 404 error page and reachable by the search engines? Is there more value or is it a better practice to use 404 over a 202? We noticed in our Google Webmaster account we have a number of broken links pointing the site, but the 404 error page was not accessible. If you have any insight that would be great, if you have any questions please let me know. Thanks, VPSEO
Intermediate & Advanced SEO | | VPSEO0 -
Is there any delay between crawling a page by google and displaying of the ratings in rich snippet of the results in google?
Is there any delay between crawling a page by google and displaying of the ratings in rich snippet of the results in google?
Intermediate & Advanced SEO | | NEWCRAFT0