What should I block with a robots.txt file?
-
Hi Mozzers,
We're having a hard time getting our site indexed, and I have a feeling my dev team may be blocking too much of our site via our robots.txt file.
They say they have disallowed PHP and Smarty files.
Is there any harm in allowing these pages?
Thanks!
-
Hi Andy, here you go: www.consumerbase.com/robots.txt
I know we want to block the .html files, but I am unsure about the other folders.
I guess I would need to know for certain from my programmers that none of our content is in there?
-
I'm not too hot on Smarty, but doesn't it generate the HTML from templates?
If so, this shouldn't cause a problem: the files being generated are HTML, so as long as your developers have set this up correctly, it should be fine.
Do you want to ping the robots file or its URL over to me, and I will have a look for you?
Andy
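
A quick way to check whether a robots.txt blocks real content is to run a list of your own URLs through a parser. The sketch below uses Python's standard `urllib.robotparser`. The rules and URLs in it are hypothetical examples, not the actual consumerbase.com file, and this parser only does plain prefix matching (no `*` or `$` wildcards), so treat it as a rough first check rather than a replica of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only (NOT the real consumerbase.com
# file): they block template and include folders, nothing else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /templates/
Disallow: /includes/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical URLs; swap in pages you expect to be indexed.
urls = [
    "http://www.example.com/index.html",
    "http://www.example.com/products/list.php",
    "http://www.example.com/templates/header.tpl",
]

print(blocked_urls(ROBOTS_TXT, urls))
```

If a page you want indexed shows up in the returned list, the matching Disallow rule is the one to raise with your programmers.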
Related Questions
-
What do you add to your robots.txt on your ecommerce sites?
We're looking at expanding our robots.txt, as we currently don't have the ability to noindex/nofollow. We're thinking about adding the following: Checkout, Basket. Then possibly: Price, Theme, Sortby, and other misc filters. What do you include?
Intermediate & Advanced SEO | | ThomasHarvey0 -
Meta Robots Tag: Index, Follow, Noodp, Noydir
When should the "Noodp" and "Noydir" meta robots tags be used? I have hundreds of URLs for real estate listings on my site that simply use "Index, Follow" without Noodp and Noydir. Should the listing pages use Noodp and Noydir as well? All major landing pages use Index, Follow, Noodp, Noydir. Is this the best setting in terms of ranking and SEO? Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
Google Indexing Duplicate URLs : Ignoring Robots & Canonical Tags
Hi Moz Community, We have the following robots.txt rule that should prevent URLs with tracking parameters being indexed. Disallow: /*? We have noticed Google has started indexing pages that are using tracking parameters. Example below. http://www.oakfurnitureland.co.uk/furniture/original-rustic-solid-oak-4-drawer-storage-coffee-table/1149.html http://www.oakfurnitureland.co.uk/furniture/original-rustic-solid-oak-4-drawer-storage-coffee-table/1149.html?ec=affee77a60fe4867 These pages are identified as duplicate content yet have the correct canonical tags: https://www.google.co.uk/search?num=100&site=&source=hp&q=site%3Ahttp%3A%2F%2Fwww.oakfurnitureland.co.uk%2Ffurniture%2Foriginal-rustic-solid-oak-4-drawer-storage-coffee-table%2F1149.html&oq=site%3Ahttp%3A%2F%2Fwww.oakfurnitureland.co.uk%2Ffurniture%2Foriginal-rustic-solid-oak-4-drawer-storage-coffee-table%2F1149.html&gs_l=hp.3..0i10j0l9.4201.5461.0.5879.8.8.0.0.0.0.82.376.7.7.0....0...1c.1.58.hp..3.5.268.0.JTW91YEkjh4 With various affiliate feeds available for our site, we effectively have duplicate versions of every page due to the tracking query, which Google seems to be willing to index, ignoring both robots rules & canonical tags. Can anyone shed any light onto the situation?
Intermediate & Advanced SEO | | JBGlobalSEO0 -
File Names
Hi guys, Would it make a difference if I named a URL 2014-ford-fiesta.html or 2014+ford+fiesta.html? Thanks!
Intermediate & Advanced SEO | | oomdomarketing0 -
Is Google indexing MP3 audio and MIDI music files? Can that cause any duplicate problems?
Hello, I own the virtualsheetmusic.com website, and we have several thousand media files (MP3 and MIDI files) that Google can potentially index. If that's the case, I am wondering if that could cause any "duplicate" issues of some sort, since many of these media files have the same file names or the same meta information inside. Any thoughts about this issue are very welcome! Thank you in advance to anyone.
Intermediate & Advanced SEO | | fablau0 -
Moving from a static HTML CSS site with .html files to a WordPress Site while keeping link structure
Mozzers, Hope this finds you well. I need some advice. We have a site built with a Dreamweaver template, and it is lacking in responsiveness and ease of updates, and a lot of the coding is behind current web standards (which I know will start to hurt our rank, if not the user experience). For SEO purposes, we would like to move the existing static site to WordPress so we can update it easily and keep content fresh. Our current site, thriveboston.com, has a lot of page URLs ending in .html. For the transition, it is extremely important for us to keep the link structure. We rank well in the SERPs for Boston Counseling, etc. I found and tested a plugin (offline) that can add a .html extension to WordPress pages, which allows us to keep our current structure, but has anyone had any luck with this live? Has anyone had any luck moving from a static site to a WordPress site while keeping the current link structure, without hurting any rank? We hope to move soon because if the site continues to grow, it will become even harder to migrate. Also, does anyone have any hesitations? Is this a bad move? Should we just stay on the current DWT template (the HTML and CSS) and not migrate? Any suggestions and advice will be heeded. Thanks Mozzers!
Intermediate & Advanced SEO | | _Thriveworks0 -
How to Disallow Tag Pages With Robots.txt
Hi, I have a site which I'm dealing with that has tag pages, for instance http://www.domain.com/news/?tag=choice. How can I exclude these tag pages (about 20+ are being crawled and indexed by the search engines) with robots.txt? Also, sometimes they're created dynamically, so I want something which automatically excludes tag pages from being crawled and indexed. Any suggestions? Cheers, Mark
Intermediate & Advanced SEO | | monster990 -
How to Block Google Preview?
Hi, Our site is very good for Javascript-On users, however many pages are loaded via AJAX and are inaccessible with JS-off. I'm looking to make this content available with JS-off so Search Engines can access them, however we don't have the Dev time to make them 'pretty' for JS-off users. The idea is to make them accessible with JS-off, but when requested by a user with JS-on the user is forwarded to the 'pretty' AJAX version. The content (text, images, links, videos etc) is exactly the same but it's an enormous amount of effort to make the JS-off version 'pretty' and I can't justify the development time to do this. The problem is that Googlebot will index this page and show a preview of the ugly JS-off page in the preview on their results - which isn't good for the brand. Is there a way or meta code that can be used to stop the preview but still have it cached? My current options are to use the meta noarchive or "Cache-Control" content="no-cache" to ask Google to stop caching the page completely, but wanted to know if there was a better way of doing this? Any ideas guys and girls? Thanks FashionLux
Intermediate & Advanced SEO | | FashionLux0