Robots.txt file issue.
-
Hi,
This is my third thread here, and I have created many like it on other webmaster communities. I know many pros are here, so I badly need help.
Robots.txt blocked 2k important URLs of my blogging site,
http://Muslim-academy.com/, especially in my blog area, which brings a good number of visitors daily. My organic traffic declined from 1k daily to 350.
I have removed the robots.txt file, resubmitted the existing sitemap, used all of the Fetch to index options, and used the 50-URL submission option in Bing Webmaster Tools.
What can I do now to get these blocked URLs back into Google's index?
1. Create a NEW sitemap and submit it again in Google Webmaster Tools and Bing Webmaster Tools?
2. Bookmark, build links to, or share the URLs? I have already done a lot of bookmarking for the blocked URLs.
I fetched the list of blocked URLs using Bing Webmaster Tools.
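For anyone who wants to check locally which URLs a given robots.txt blocks, here is a minimal sketch using only Python's standard library. The Disallow rule shown is hypothetical, since the original file has been removed; substitute whatever the live file actually contained.

    # Check sample URLs against a robots.txt using the standard library.
    from urllib.robotparser import RobotFileParser

    # Hypothetical rule; the real pre-removal file may have differed.
    robots_txt = """\
    User-agent: *
    Disallow: /blog/
    """

    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())

    for url in [
        "http://muslim-academy.com/blog/some-post/",
        "http://muslim-academy.com/about/",
    ]:
        status = "blocked" if not rp.can_fetch("Googlebot", url) else "allowed"
        print(url, "->", status)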
-
Robert, some good signs of life. The new sitemap shows 5,080 pages submitted and 4,817 indexed.
The remaining pages are surely the blocked ones, right, Robert? There is also some improvement in impressions and clicks. Thanks a lot for staying with me this long while solving this issue.
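One rough way to confirm whether the remaining ~263 pages (5,080 submitted minus 4,817 indexed) match the blocked list is to intersect the sitemap with the Bing export. A sketch, assuming hypothetical local files "sitemap.xml" and "blocked_urls.txt":

    # Intersect the sitemap URLs with the blocked-URL export from Bing WMT.
    import xml.etree.ElementTree as ET

    NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

    tree = ET.parse("sitemap.xml")
    sitemap_urls = {loc.text.strip() for loc in tree.iter(NS + "loc")}

    with open("blocked_urls.txt") as f:
        blocked_urls = {line.strip() for line in f if line.strip()}

    overlap = sitemap_urls & blocked_urls
    print(len(overlap), "sitemap URLs were on the blocked list")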
-
Christopher,
Have you looked at indexing in GWMT to see whether the pages have been indexed, how many pages, etc.?
-
Got your point, but I resubmitted it and its status is still pending.
I tested it and it was working, but since I submitted it 2 days ago, its status has remained pending.
-
No. When you resubmit or submit a "new" sitemap, it just tells Google that this is the sitemap now. There is no duplicate content issue with a sitemap.
Best,
Robert
-
Just one last question, Robert. Wouldn't a duplicate sitemap create duplicate pages in the search results?
Sorry, my question may look crazy to you, but while I am applying every possible fix, I do not want to mess up and make things even worse.
-
Given that the only issue was the robots.txt error, I would resubmit. That said, I do think it would not hurt to generate a new sitemap and submit that, in case there is something you are missing.
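If you do generate a fresh sitemap, here is a minimal sketch of producing a sitemaps.org-compliant file from a plain list of URLs; the input file name, "urls.txt", is hypothetical.

    # Build a minimal sitemap from a text file of one URL per line.
    import xml.etree.ElementTree as ET

    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    with open("urls.txt") as f:
        for line in f:
            url = line.strip()
            if url:
                entry = ET.SubElement(urlset, "url")
                ET.SubElement(entry, "loc").text = url

    ET.ElementTree(urlset).write(
        "sitemap.xml", encoding="utf-8", xml_declaration=True
    )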
Best
-
Robert, the question is: do I need to create a new sitemap, or resubmit the existing one?
-
Hello Christopher
It appears you have done a good deal to remediate the situation already. I would resubmit a sitemap to Google as well. Have you looked in WMT to see what is now indexed? I would look at the graph of indexed pages versus URLs blocked by robots.txt and see if you are moving the needle upward again.
That raises a second question: "How did it happen?" You stated, "Robots.txt blocked 2k important URLs of my blogging site," and that sounds like it just occurred out of the ether. I would want to know that I had found the reason, and make sure I have a way to keep it from happening going forward (just a suggestion). Lastly, the Index Status report in WMT is a great way to learn how effective your fixes have been. I like knowing that type of data and storing it somewhere retrievable for the future.
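As a sketch of what "storing it somewhere retrievable" could look like, a few lines of Python that append each snapshot to a CSV log; the counts are read manually out of the WMT reports, and the numbers shown are just the ones from this thread.

    # Append one Index Status snapshot per run to a local CSV log.
    import csv
    import datetime

    row = {
        "date": datetime.date.today().isoformat(),
        "submitted": 5080,  # sitemap report: pages submitted
        "indexed": 4817,    # sitemap report: pages indexed
    }

    with open("index_status_log.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if f.tell() == 0:  # empty file: write the header first
            writer.writeheader()
        writer.writerow(row)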
Best to you,
Robert