Robots.txt question
-
I noticed something weird in Google's robots.txt Tester.
I have this line
Disallow: display=
in my robots.txt, but whatever URL I test, it says blocked and points to this line in robots.txt.
For example, this line is meant to block pages like
http://www.abc.com/lamps/floorlamps?display=table
but if I test
http://www.abc.com/lamps/floorlamps or any other page,
it shows as blocked due to Disallow: display=
Am I doing something wrong, or is Google just acting strange? I don't think pages without display= are actually blocked.
-
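Yes, there is a bug in your robots.txt. A Disallow path should start with / (or the * wildcard), and to match the parameter on any page it needs a wildcard before the ?. You should write something like:
Disallow: /*?display=table
or:
Disallow: /*?display=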
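If you want to sanity-check a rule before deploying it, here is a minimal Python sketch of the wildcard matching Google documents for robots.txt paths (`*` matches any run of characters, a trailing `$` anchors the end of the URL, everything else is a prefix match). The function name `rule_matches` is my own, and this is a simplified approximation, not Google's actual parser, so treat the results as a rough guide only.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path + query.

    Simplified assumption: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and the rest is a literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, which gives the prefix behaviour.
    return re.match(regex, path) is not None


with_param = "/lamps/floorlamps?display=table"
without_param = "/lamps/floorlamps"

print(rule_matches("/*?display=", with_param))     # True
print(rule_matches("/*?display=", without_param))  # False
print(rule_matches("/?display=", with_param))      # False
```

The last line shows why the wildcard matters: without the `*`, the rule only matches a `?display=` query directly on the site root. Also note that `/*?display=` will not catch URLs where `display` comes after another parameter (for example `?sort=price&display=table`); a second rule such as `/*&display=` would be needed for those.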
Related Questions
-
What to do when there are no questions for your keyword?
Let me give you an example. Say I want to rank for the keyword "title tag". I go to Keyword Explorer, or to the related searches at the bottom of Google, and find many questions people have. I find expressions with the same user intent, such as "title tag length", "title tags generator", and "why are title tags important" (I found this one using the "are questions" drop-down menu in Keyword Explorer). With this in hand, I can create a page that answers all those questions. Each expression becomes an H2, and in the paragraph below it I answer the question using related phrases and context words that I find with Keyword Explorer.
Now let's take one of my keywords, "Sicily bike tours". If I type this expression into Keyword Explorer, the only related phrases with the same user intent are "Sicily bike tour", "Sicily cycling tours", "Sicily bike trips"... (the first thing I noticed is that these are just variations of my main expression, not really questions). If I look at questions, I find "what is the highest elevation in Sicily" or "How safe is Sicily for tourists". I can't imagine a page that sells bike tours in Sicily having H2 tags that answer those questions, and that is not what the pages that rank do: they describe their tour. This is what confuses me.
Now let's take a secondary keyword related to the main keyword, "Sicily cycling tours" (a secondary keyword related to "Sicily bike tours"). Based on Keyword Explorer, the secondary related phrases for "Sicily cycling tours" are "tour of Sicily", "trips to Sicily"... Isn't it going to be boring and look unnatural to use all those expressions? They are all synonyms of my expression and not really different, which is my worry. Or can I use an expression such as "Sicilian villages" or "Sicily maps", even though they don't have the same user intent as my secondary related keyword "Sicily cycling tours"? Thank you,
Intermediate & Advanced SEO | | seoanalytics0 -
Best practice for disallowing URLS with Robots.txt
Hi Everybody, We are currently trying to tidy up the crawling errors which are appearing when we crawl the site. On first viewing, we were very worried to say the least: 17000+. But after looking closer at the report, we found the majority of these errors were being caused by bad URLs featuring:
Currency - For example: "directory/currency/switch/currency/GBP/uenc/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL3dvcmt3ZWFyP3ByaWNlPTUwLSZzdGFuZGFyZHM9NzEx/"
Color - For example: ?color=91
Price - For example: "?price=650-700"
Order - For example: ?dir=desc&order=most_popular
Page - For example: "?p=1&standards=704"
Login - For example: "customer/account/login/referer/aHR0cDovL2NlbnR1cnlzYWZldHkuY29tL2NhdGFsb2cvcHJvZHVjdC92aWV3L2lkLzQ1ODczLyNyZXZpZXctZm9ybQ,,/"
My question now is, as a novice at working with robots.txt, what would be the best practice for disallowing URLs featuring these from being crawled? Any advice would be appreciated!
Intermediate & Advanced SEO | | centurysafety0 -
Should questions and reviews be on a separate page?
Hi, please look at the page below: http://www.powerwale.com/store/exide-xplore-xltz4-3ah-battery/76933 Should the questions and reviews be on a separate page? I think that in the future the comments will amount to keyword stuffing on the product page. Please advise. If yes, please suggest the best URL as well. Thanks
Intermediate & Advanced SEO | | Rahim1191 -
Dilemma about "images" folder in robots.txt
Hi, hope you're doing well. I am sure you are aware that Google has updated its webmaster technical guidelines to say that sites should allow access to their CSS and JavaScript files where possible. Google used to render web pages as text only; now it says it can read CSS and JavaScript. According to its own guidelines, not allowing access to the CSS files can result in suboptimal rankings: "Disallowing crawling of Javascript or CSS files in your site’s robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings." http://googlewebmastercentral.blogspot.com/2014/10/updating-our-technical-webmaster.html
We have allowed access to our CSS files, and Googlebot now sees our web pages more like a normal user would (tested in GWT).
Anyhow, this is my dilemma, and I am sure a lot of other users are facing the same situation. Like any other e-commerce website, we have a lot of images. Our CSS files used to be inside our images folder, so I have allowed access to that. Here's the robots.txt: http://www.modbargains.com/robots.txt
Right now we are blocking the images folder, as it is huge and very heavy, and some of the images are very high-res. We block it because we feel Googlebot might spend almost all of its time crawling the images folder and not have enough time left to crawl other important pages, not to mention the heavy server load on Google's side and ours. We do have good, high-quality, original pictures, and we feel we are losing potential rankings because we are blocking images. I was thinking of allowing ONLY the Google image bot to access it, but I still feel Google might spend a lot of time doing that.
**I was wondering whether Google decides something like "let me spend 10 minutes on the Google image bot and 20 minutes on the Google mobile bot", or whether it has separate time allocations for each of its bot types. I want to unblock the images folder, for now only for the Google image bot, but at the same time I fear it might drastically hamper indexing of our important pages, as I mentioned before, because we have tons and tons of images and Google already spends a lot of time just crawling that folder.**
Any advice, recommendations, suggestions, technical guidance, or plan of action? I'm pretty sure I answered my own question, but I need confirmation from an expert that I am right to allow only the Google image bot access to my images folder. Sincerely, Shaleen Shah
Intermediate & Advanced SEO | | Modbargains1 -
Diversifying anchor text question
Hi, I've seen a new article by Dr. Pete on diversifying links for 2013 (http://www.seomoz.org/blog/top-1-seo-tips-for-2013). My question is this: Dr. Pete talks about mixing up the anchor text for links. Is this so we don't get caught out by Google, or does mixing it actually have a better impact? For example: 1. 20 anchor text links targeting just the target term. 2. 20 anchor text links targeting 4 variations of the target term. Is number 2 recommended so things look natural, or does it actually have a better impact on SEO? Thanks
Intermediate & Advanced SEO | | activitysuper0 -
Can I use a "noindex, follow" command in a robots.txt file for a certain parameter on a domain?
I have a site that produces thousands of pages via file uploads. Users then link to these pages so that others can download what they have uploaded. Naturally, the client has blocked the parameter which precedes these pages in an attempt to keep them from being indexed. What they did not consider was that these pages are attracting hundreds of thousands of links that are not passing any authority to the main domain, because they're being blocked in robots.txt. Can I allow Google to follow, but NOT index, these pages via a robots.txt file, or would this have to be done on a page-by-page basis?
Intermediate & Advanced SEO | | PapaRelevance0 -
Ask a Question
We use DNN and we have case studies run from our CMS. This is so we can have them in lists by category on service/market pages and show specific ones when needed. Then there is the case study detail page (this is where the problem exists), where you read the case study in full detail and see the images and story. We enter our case studies into the CMS, which determines which website they show on, and it creates URLs from the titles. However, on the detail page, the case studies all share the same page, Case Study.aspx, and they resolve to that page with their respective URLs in place, as seen here: http://www.structural.net/case-study/1/new-marlins-stadium.aspx Because they all share the same page, they are being flagged as duplicate pages. They do show in the SERPs with the right title and URL and it all looks great, but they get errors for having duplicate page content and titles. Is there a way to solve this, or is this something I should even worry about?
Intermediate & Advanced SEO | | KJ-Rodgers0 -
Should I robots block site directories with primarily duplicate content?
Our site, CareerBliss.com, primarily offers unique content in the form of company reviews and exclusive salary information. As a means of driving revenue, we also have a lot of job listings in our /jobs/ directory, as well as educational resources (/career-tools/education/). The bulk of this information comes from feeds, which also exist on other websites (duplicate). Does it make sense to go ahead and robots block these portions of our site? My thinking is that doing so will help reallocate our site authority, helping the /salary/ and /company-reviews/ pages rank higher, and this is where most people are finding our site via search anyway. For example: http://www.careerbliss.com/jobs/cisco-systems-jobs-812156/ http://www.careerbliss.com/jobs/jobs-near-you/?l=irvine%2c+ca&landing=true http://www.careerbliss.com/career-tools/education/education-teaching-category-5/
Intermediate & Advanced SEO | | CareerBliss0