How to get a list of URLs blocked by robots.txt
-
This is my site. It's built on WordPress. I just want to know: is there any way I can get the list of URLs blocked by robots.txt? Google Webmaster Tools doesn't show the list itself, just the number of blocked URLs. Is there any plugin or software to extract the list of blocked URLs?
-
If you use Bing Webmaster Tools you can see a complete list of all URLs blocked by robots.txt. You can export the file and then filter it. Just go to Reports & Data > Crawl Information within your Bing Webmaster account. I am not aware of this feature in Google Webmaster Tools. Hope this helps.
-
simon_realbuzz, buddy, if I use /classifieds/ it means I am blocking every URL that starts with it. I want to get a list of all the blocked URLs on the site.
Example
http://muslim-academy.com/classifieds/
How many URLs under this classifieds section are blocked by my robots.txt?
-
I'm sorry, I don't follow. If you go to that URL you will see the rules that determine which URLs are blocked, as I've pasted below.
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /forum/viewtopic.php?p=
Disallow: /forum/viewtopic.php?=&p=
Disallow: /forum/viewtopic.php?t=
Disallow: /forum/viewtopic.php?start=
Disallow: /forum/&view=previous
Disallow: /forum/&view=next
Disallow: /forum/&sid=
Disallow: /forum/&p=
Disallow: /forum/&sd=a
Disallow: /forum/&start=0
Disallow: /forum/memberlist.php
Disallow: /forum/posting.php
Disallow: /classifieds/
Disallow: /forum/index.php
Disallow: /forum/ucp
Disallow: /http://muslim-academy.com/الا�%A..
Disallow: /http://muslim-academy.com/особенн%D
Disallow: /http://muslim-academy.com/ислам-ка%
Disallow: /http://muslim-academy.com/classifieds/ads/
Disallow: /http://muslim-academy.com/значени%D..
Disallow: /.ifieds/
Disallow: /.ifieds/ads/
Disallow: /forum/alternatelogin/al_tw_connect.php?authentication=1
Disallow: /forum/search.php
-
simon_realbuzz, I need a list of the blocked URLs, not the path to the robots.txt file.
-
You can view your robots.txt file simply by appending /robots.txt to your site URL. Go to http://muslim-academy.com/robots.txt and you'll be able to view the full file.
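If you want the actual list of blocked URLs rather than the rules, one option is to take the URLs you already know about (for example, exported from your sitemap) and test each one against the robots.txt rules with a short script. This is just a sketch using Python's standard-library robots.txt parser; the rules and URLs below are abbreviated, hypothetical examples, not your full file:

```python
from urllib.robotparser import RobotFileParser

# An abbreviated stand-in for the site's robots.txt rules.
robots_txt = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /classifieds/
"""

# Hypothetical URLs, e.g. exported from the site's sitemap.
candidate_urls = [
    "http://muslim-academy.com/classifieds/ads/123",
    "http://muslim-academy.com/wp-admin/options.php",
    "http://muslim-academy.com/blog/some-post/",
]

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Keep only the URLs that the rules disallow for all crawlers ("*").
blocked = [url for url in candidate_urls if not rp.can_fetch("*", url)]
for url in blocked:
    print(url)
```

In practice you would fetch the live robots.txt with `rp.set_url(...)` and `rp.read()` instead of pasting the rules inline, and feed in the full URL list from your sitemap or a crawl export.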