How can I exclude display ads from robots.txt?
-
Google has stated that you can do this to get spiders to the content only, and faster. Our IT guy is saying it's impossible.
Do you know how to exclude display ads using robots.txt? Any help would be much appreciated.
-
You'd want to disallow crawling of the URL paths where the display ads live in your robots.txt, just like any other section of your site. Here are some basics on robots.txt.
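As a rough sketch of what that could look like, the snippet below uses Python's standard urllib.robotparser to check a robots.txt against a couple of URLs. The paths /ads/ and /adserver/ and the domain www.example.com are assumptions made up for illustration; substitute whatever paths your ad code is actually served from. Keep in mind that a robots.txt only governs the host it lives on, so ads loaded from a third-party ad server's domain are not affected by your own file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. "/ads/" and "/adserver/" are assumed
# paths for illustration only; use the paths where your ads really live.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ads/
Disallow: /adserver/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # www.example.com is a placeholder domain.
    for url in (
        "https://www.example.com/ads/banner.js",
        "https://www.example.com/articles/seo-basics",
    ):
        print(url, "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```

Running a check like this before deploying the file is a cheap way to confirm that only the ad paths are blocked and the content pages stay crawlable.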
Hope this helps!
Related Questions
-
Using one robots.txt for two websites
I have two websites that are hosted in the same CMS. Rather than having two separate robots.txt files (one for each domain), my web agency has created one which lists the sitemaps for both websites, like this:
User-agent: *
Disallow:
Sitemap: https://www.siteA.org/sitemap
Sitemap: https://www.siteB.com/sitemap
Is this ok? I thought you needed one robots.txt per website which provides the URL for the sitemap. Will having both sitemap URLs listed in one robots.txt confuse the search engines?
Technical SEO | | ciehmoz0 -
Can you confirm legitimate Google Bot traffic?
We use Cloudflare as a firewall. I noticed a significant number of blocks of bot traffic. One of the things they do is try to block bad bot traffic. But it seems they are mistakenly blocking Google Bot traffic. If you use Cloudflare, you may want to look into this as well. Also, can you confirm if the following IPs are for legitimate Google Bots?
66.249.79.88
66.249.79.65
66.249.79.80
66.249.79.76
Thanks,
Technical SEO | | akin67 1 -
Google and responsive content in display:none CSS
I’m building a WordPress site with Visual Composer and I’ve hit a point where I need to show a totally different section on mobile compared to desktop/tablet. My issue/question comes from the fact that both mobile and desktop rows will have the same content as well as H1/H2/H3 tags. From inspecting the elements I see the mobile-only rows are hidden until the page size shrinks, through being set to 'display: none' in the CSS (the standard Visual Composer way of handling width and responsiveness). How will Google see this in terms of SEO? I don’t want to come across as if I’m cloaking text and H1 tags on the page. (I have emailed Visual Composer support but wanted to get an external opinion.)
Technical SEO | | shloy23-2945840 -
Adding parameters in URLs and linking to a page
Hi, Here's a fairly technical question: We would like to implement a badge feature where websites linking to us with a badge would use URLs such as: domain.com/page?state=texas&city=houston domain.com/page?state=neveda&city=lasvegas Important note: the parameters will change the information and layout of the page: domain.com/page Would those 2 URLs above, along with their extra parameters, be considered the same page as domain.com/page by Google's crawler? We're considering adding the parameters "state" and "city" to the Google WMT URL parameter tool to tell them how to handle those parameters. Any feedback or comments are appreciated! Thanks in advance. Martin
Technical SEO | | MartinH0 -
Robots.txt and joomla
Hello, I use Joomla for my website, and all of the following directories are blocked automatically. Is that good or bad? Should I remove anything, and if so, why?
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /images/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
I also added to my robots.txt file my email address (is that useful? I am afraid Google passes PR to the email address), a javascript: void (0) because I have tabs on my webpage (is that useful?), as well as a .pdf (is that also useful?). Any comments? Does anything need to be changed, or is it OK? Thank you,
Technical SEO | | seoanalytics 0 -
Robots.txt questions...
All, My site is rather complicated, but I will try to break down my question as simply as possible. I have a robots.txt document in the root level of my site to disallow robot access to /_system/, my CMS. This looks like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism
User-agent: *
Disallow: /_system/
I have another robots.txt file in another level down, which is my holiday database - www.mysite.com/holiday-database/ - this is to disallow access to /holiday-database/ControlPanel/, my database CMS. This looks like this:
User-agent: *
Disallow: /ControlPanel/
Am I correct in thinking that this file must also be in the root level, and not in the /holiday-database/ level? If so, should my new robots.txt file look like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism
User-agent: *
Disallow: /_system/
Disallow: /holiday-database/ControlPanel/
Or, like this:
# /robots.txt file for http://webcrawler.com/
# mail webmaster@webcrawler.com for constructive criticism
User-agent: *
Disallow: /_system/
Disallow: /ControlPanel/
Thanks in advance. Matt
Technical SEO | | Horizon 0 -
How to display the exact url of our subsite in Google
Hi, I'm new to SEO and we just recently relaunched our site. Our site consists of 6 hotels, each of which acts as a subsite. We noticed that when we search for one of the hotels, what comes up in Google is the main website. Example: We search for flora grand. We expect Google to display www.florahospitality.com/dubai-flora-grand-hotel.aspx as the first link, but it shows the main site, which is www.florahospitality.com. What am I missing here?
Technical SEO | | shebinhassan0 -
Subdomain Removal in Robots.txt with Conditional Logic??
I would like to see if there is a way to add conditional logic to the robots.txt file so that when we push from DEV to PRODUCTION and the robots.txt file is pushed, we don't have to remember to NOT push the robots.txt file OR edit it when it goes live. My specific situation is this: I have www.website.com, dev.website.com and new.website.com and somehow google has indexed the DEV.website.com and NEW.website.com and I'd like these to be removed from google's index as they are causing duplicate content. Should I: a) add 2 new GWT entries for DEV.website.com and NEW.website.com and VERIFY ownership - if I do this, then when the files are pushed to LIVE won't the files contain the VERIFY META CODE for the DEV version even though it's now LIVE? (hope that makes sense) b) write a robots.txt file that specifies "DISALLOW: DEV.website.com/" is that possible? I have only seen examples of DISALLOW with a "/" in the beginning... Hope this makes sense, can really use the help! I'm on a Windows Server 2008 box running ColdFusion websites.
Technical SEO | | ErnieB0