Question about robots.txt
-
I just started my own e-commerce website and hosted it on one of the popular e-commerce platforms, Pinnacle Cart. It has a lot of functions, like page sorting, a mobile website, etc. After adjusting the URL parameters in Google Webmaster Tools three weeks ago, I still get the same duplicate errors on meta titles and descriptions from both the Google crawl and the SEOmoz crawl. I am not sure if I made a mistake in choosing Pinnacle Cart, because it is not that flexible in terms of editing the core website pages. There is no way to adjust the canonical, to insert robots.txt on every page, etc. However, it does have a function to submit a single robots.txt file and to edit the .htaccess. The website pages are in PHP format.
For example this URL:
www.mycompany.com has a duplicate title and description with www.mycompany.com/site-map.html (there is no way of editing the title and description of my sitemap)
Another error is
www.mycompany.com has a duplicate title and description with http://www.mycompany.com/brands?url=brands
Is it possible to exclude those URLs with "url=" and my "site-map.html" in robots.txt? Or are the URL parameter settings in Google enough, and it just takes a lot of time?
Can somebody help me with the format of robots.txt, please? Thanks.
-
Thank you for your reply. This surely helps. I will probably edit the .htaccess.
-
That's the problem with most sitebuilder-type programs: they are very limited.
Perhaps look at your site title and page titles. Usually the site title will be included on all of your web pages, followed by the page title, so you could simply name your site www.yourcompany.com and then add an individual page title to each page.
A robots.txt file is not supposed to be added to every page. It is a single file at the root of the site, and it only tells the bots what to crawl and what not to.
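To make the "one file for the whole site" idea concrete, here is a minimal sketch in Python using the standard library's urllib.robotparser. The robots.txt contents and the helper name is_allowed are made up for illustration, and the rule reuses the site-map.html URL from the question only as an example, not as a recommendation to block it. Also note that this parser does plain path-prefix matching as far as I know, so it may not treat wildcard patterns the way Google does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: one file for the whole site,
# one "Disallow" line per path prefix that bots should skip.
ROBOTS_TXT = """\
User-agent: *
Disallow: /site-map.html
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.mycompany.com/",
        "http://www.mycompany.com/site-map.html",
    ):
        print(url, "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked")
```

Every URL on the site is checked against that same file, which is why there is nothing to add on a per-page basis.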
If you can edit the htaccess, you should be able to get to the individual pages and insert or change the code for the titles. Just be aware that doing it manually can work, but sometimes when you go back to make an edit in the builder, it may undo all of your manual changes. If that's the case, get your site perfect first, then make the individual code changes last.
Hope this helps.
-
I have no way of adding those too. Ooops thanks for the warning. I guess I would have to wait for Google to filter out the parameters.
Thanks for your answer.
-
You certainly don't want to block your sitemap file in robots.txt. It takes some time for Google to filter out the parameters and that is the right approach. If there is no way to change the title, I wouldn't be so concerned over a few pages with duplicate titles. Do you have the ability to add a noindex,follow meta tag on these pages?
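If you are not sure whether the platform already puts a robots meta tag on those pages, a rough way to check is to look at the page source. Below is a minimal sketch in Python using the standard library's html.parser. The sample HTML, the class name RobotsMetaParser, and the helper robots_directives are all made up for illustration; in practice you would paste in the actual source of the page you care about.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex,follow".
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives present in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source for illustration only.
SAMPLE_PAGE = """
<html><head>
<title>Site Map</title>
<meta name="robots" content="noindex,follow">
</head><body></body></html>
"""

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
```

An empty result would simply mean no robots meta tag was found in the HTML you passed in.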