Should I check Use noindex for Tag Archives?
-
I have a page indexed (http://mysite.com/mypost) and also http://mysite.com/tag/mypost. The same post shows up twice, once with /tag/ and once without, when I search site:http://mysite.com.
Is this duplicate content? Can I get penalized for this? In the All in One SEO plugin, should I check "Use noindex for Tag Archives" to avoid this, or doesn't it matter?
Thanks -
Very good and well-thought-out answer.
-
Thanks bemcapaz & Marcus for the expert advice. What do you think about controlling how much content is shown on tag pages, like this: http://www.lancelhoff.com/change-the-excerpt-length-wordpress/ Is that OK? I'm actually getting traffic from tag pages too.
-
Most of my sites use WordPress. Here is what I did to avoid duplicate content:
First, I have the following plugins installed:
- .html on PAGES
- All in One SEO Pack
- cbnet Ping Optimizer
- Google XML Sitemaps
Then I added the following to robots.txt:
User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /comment
Disallow: /categoria//
Disallow: */trackback
Disallow: */comments
Disallow: /sem-categoria
Disallow: /pollsarchive
Disallow: /category
Disallow: /?
Disallow: /*?
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Allow: /wp-content/uploads
# Google Image
User-agent: Googlebot-Image
Disallow:
Allow: /*
# Google AdSense
User-agent: Mediapartners-Google
Disallow:
Allow: /
# digg mirror
User-agent: duggmirror
Disallow: /
Sitemap: http://www.YOURSITE.com.br/sitemap.xml
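If you want to sanity-check rules like these before uploading them, here is a minimal Python sketch using the standard library's `urllib.robotparser`. It is only an illustration, not part of the setup above: the `RULES` subset, the `is_allowed` helper and the `www.example.com` host are all made up for the example. As far as I can tell, the standard-library parser only does plain prefix matching, so the wildcard lines (such as `/*?` or `/*.php$`) are left out here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical subset of the rules above: prefix rules only, no wildcards.
RULES = """\
User-agent: *
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /category
Allow: /wp-content/uploads
"""


def is_allowed(path: str, agent: str = "*") -> bool:
    """Return True if `agent` may crawl `path` under RULES."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    # www.example.com is a placeholder host; only the path is matched.
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/mypost.html", "/tag/mypost", "/wp-admin/options.php",
                 "/category/news", "/wp-content/uploads/pic.jpg"):
        print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Note that /tag/ URLs are not blocked by any of these rules, which fits with leaving the tag archives crawlable and indexed.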
In the admin area, on the Permalinks settings tab, under Common Settings, I chose:
- Custom Structure -> /%postname%.html
In Settings > cbnet Ping Optimizer I entered:
http://blogsearch.google.com/ping/RPC2
http://ping.weblogalot.com/rpc.php
http://ping.syndic8.com/xmlrpc.php
http://rpc.technorati.com/rpc/ping
http://rpc.reader.livedoor.com/ping
http://www.blogpeople.net/servlet/weblogUpdates
http://audiorpc.weblogs.com/RPC2
I use this plugin to make sure that WordPress does not ping those services every time I change or update something on a page. I set the plugin to ping only if at least 30 minutes have passed since the last ping.
So if you have just posted something, WordPress pings all those services for fast indexing. However, if you then have to edit the post, the plugin makes sure that saving it does not ping the services again within such a short time.
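The throttling idea can be sketched in a few lines of Python. This is not the plugin's actual code, just an illustration of the 30-minute rule described above; the names `should_ping` and `MIN_PING_INTERVAL` are made up for the example.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed minimum gap between pings, matching the 30 minutes mentioned above.
MIN_PING_INTERVAL = timedelta(minutes=30)


def should_ping(last_ping: Optional[datetime], now: datetime,
                min_interval: timedelta = MIN_PING_INTERVAL) -> bool:
    """Return True if enough time has passed since the last ping."""
    if last_ping is None:
        # Nothing has been pinged yet, so the first save always pings.
        return True
    return now - last_ping >= min_interval


if __name__ == "__main__":
    published = datetime(2024, 1, 1, 12, 0)
    quick_edit = published + timedelta(minutes=5)
    later_edit = published + timedelta(minutes=45)
    print(should_ping(None, published))        # first publish
    print(should_ping(published, quick_edit))  # edit shortly after
    print(should_ping(published, later_edit))  # edit after the interval
```

A quick edit right after publishing is skipped, while an edit made after the interval triggers a new ping.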
In the All in One SEO config I have the following settings:
UNCHECKED
- Use noindex for Categories
- Use noindex for Tag Archives
- Autogenerate Description
I also left Exclude Pages, Additional Post Headers, Additional Page Headers and Additional Home Headers all blank.
With this configuration, anyone who opens a post on my blogs gets a page with the http://domain/post permalink structure, regardless of whether they arrived from a tag, a category or a normal search.
Additionally, the main tag and category pages are indexed, so I ended up using my tags to add relevance to the posts that belong to each tag. In Google search the same post can appear under many related tag filters, but the content of the post lives on a single unique page.
Hope that helps
PS.: Suggestions to improve this config are welcome
-
Hey, I am guessing this is a WordPress site?
You could solve this in a couple of ways:
- 301 redirect the duplicate page (not recommended, as it is a valid page, but it may work)
- Add a canonical link on both pages pointing to the main version of this content
- noindex the tag page
Alternatively, you can combine options 2 and 3, and that will resolve it.
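For reference, options 2 and 3 both come down to one tag in the page's head. Here is a rough Python sketch that just builds those two tags as strings. The helper names `canonical_link` and `robots_meta` are made up for illustration, and in practice an SEO plugin would normally output these tags for you.

```python
from html import escape


def canonical_link(url: str) -> str:
    """Build a <link rel="canonical"> tag pointing at the preferred URL."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


def robots_meta(index: bool = True, follow: bool = True) -> str:
    """Build a robots meta tag, e.g. noindex,follow for a tag archive."""
    directives = ",".join([
        "index" if index else "noindex",
        "follow" if follow else "nofollow",
    ])
    return f'<meta name="robots" content="{directives}" />'


if __name__ == "__main__":
    # Option 2: point at the main version of the post.
    print(canonical_link("http://mysite.com/mypost"))
    # Option 3: keep the tag page out of the index but let links be followed.
    print(robots_meta(index=False))
```

Using noindex,follow on the tag page keeps it out of the index while still letting crawlers follow its links to the posts.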
If it's WordPress, I'm happy to take a look at the actual page if it helps. I spend a lot of time tinkering with WordPress, so there may be another way, and this answer is based on some assumptions without real links.
Marcus