Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Robots.txt Syntax for Dynamic URLs
-
I want to Disallow certain dynamic pages in robots.txt and am unsure of the proper syntax. The pages I want to disallow all include the string ?Page=
Which is the proper syntax?
Disallow: ?Page=
Disallow: ?Page=*
Disallow: ?Page=
Or something else?
-
Thanks, Alick300. Unfortunately, the slash doesn't appear like that in the URLs on this site. They look like this:
www.domain.com/page.html?Page=...
When I run them through an online robots.txt tester, all three versions in my original question seem to work. Until proven otherwise, I'm using the first one because it's the simplest.
-
Hi Bill,
Disallow: /?Page= will work
Thanks
-
Hi, James. It's not pagination I'm trying to disallow. The site structure has URLs that include things like "Page=give&...", which open a blank form, and those URLs are linked from scores of web pages that we do want spidered. Since the "give" page is an empty form, we're getting tons of duplicate content errors as a result.
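For anyone comparing the patterns in this thread, here is a minimal Python sketch of wildcard matching as described in RFC 9309 and Google's robots.txt documentation: `*` matches any run of characters, a trailing `$` anchors the end, and matching starts at the first character of the path (query string included). The function names and the sample URL are hypothetical, and the strict "match from the start of the path" behaviour is an assumption; the online tester mentioned above evidently behaved more leniently, and individual crawlers may differ.

```python
import re


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the URL.
    Everything else is matched literally.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces, then rejoin them with '.*' for each '*'.
    literal_parts = [re.escape(part) for part in rule_path.split("*")]
    return re.compile(".*".join(literal_parts) + ("$" if anchored else ""))


def is_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if the Disallow pattern matches from the start of url_path."""
    # re.match anchors at position 0, mirroring prefix matching of rules.
    return rule_to_regex(rule_path).match(url_path) is not None


if __name__ == "__main__":
    # Hypothetical URL shaped like the ones described in the thread.
    url = "/page.html?Page=give&id=1"
    for rule in ("?Page=", "?Page=*", "/?Page=", "/*?Page="):
        print(f"Disallow: {rule} blocks {url}: {is_blocked(rule, url)}")
```

Under these assumptions, only `Disallow: /*?Page=` reports a match for that URL. The versions without a leading slash never match because every path starts with `/`, and `/?Page=` matches only when the query string follows the root directly.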