Restricted by robots.txt and soft bounce issues (related).
-
In our web master tools we have 35K (ish) URLs that are restricted by robots.txt and as have 1200(ish) soft 404s. WE can't seem to figure out how to properly resolve these URLs so that they no longer show up this way. Our traffic from SEO has taken a major hit over the last 2 weeks because of this.
Any help?
Thanks, Libby
-
**These are duplicate URLs that we can't figure out how they are getting created. **
I want to be sure we are talking about the same thing here. When I hear "duplicate URL" I am thinking of multiple URLs which point to the same web page. Depending on how your site is set up it is possible to have many different URLs point to the same web page. Possible examples are:
www.mydomain.com/tennis-rackets
www.mydomain.com/tennis-rackets/
mydomain.com/tennis-rackets?sort=asc
Above are three examples of URLs which can all lead to the same page. You can have dozens of URLs all lead to a page with identical content. How these issues get resolved depends upon how they were created.
The best tool to help you figure this out is your crawl report. Use the SEOmoz crawl tool, then examine the crawl report. It can be a bit overwhelming at first, but you can narrow things down real fast if you use Excel.
Select the header row for your data (begins with the URL field), then select Data > Filter > Auto Filter from the menu. Then start by looking at fields such as "Duplicate Page Content", "URLs with duplicate content", etc. Simply choose YES in the drop down menu to filter for that particular data. This will help you uncover the source of these issues.
The URLs in my example should all be 301'd or canonicalized to the primary page to resolve the duplication issue.
-
Well, part of the problem is these are duplicate URLs that we can't figure out how they are getting created. They were supposed to resolve to our 404 page... Should we remove them all?
-
Hi Libby.
How do you intend to resolve these URLs? Ideally you would remove your robots.txt entries and restrict the pages with meta tags such as "noindex follow" or whatever is appropriate. Any links to 404 pages should be updated or removed.
What further direction do you seek?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Website rank 5th position than bounce
Hello A website client won't rank dor branded keyword (site name) for several months, i optimize onpage seo, did some backlinks, the DA jump to 8 (from 1). Yesterday the website rank for branded keyword 5, than 4, tomorrow can't found it.
Technical SEO | | Zidani0 -
Adding your sitemap to robots.txt
Hi everyone, Best practice question: When adding your sitemap to your robots.txt file, do you add the whole sitemap at once or do you add different subcategories (products, posts, categories,..) separately? I'm very curious to hear your thoughts!
Technical SEO | | WeAreDigital_BE0 -
Issues with Duplicates and AJAX-Loader
Hi, On one website, the "real" content is loaded via AJAX when the visitor clicks on a tile (I'll call a page with some such tiles a tile-page here). A parameter is added to the URL at the that point and the content of that tile is displayed. That content is available via an URL of its own ... which is actually never called. What I want to achieve is a canonicalised tile-page that gets all of the tiles' content and is indexed by google - if possible with also recognising that the single-URLs of a tile are only fallback-solutions and the "tile-page" should be displayed instead. The current tile-page leads to duplicate meta-tags, titles etc and minimal differences between what google considers a page of its own (i.e. the same page with different tiles' contents). Does anybody have an idea on what one can do here?
Technical SEO | | netzkern_AG0 -
What are the negative implications of listing URLs in a sitemap that are then blocked in the robots.txt?
In running a crawl of a client's site I can see several URLs listed in the sitemap that are then blocked in the robots.txt file. Other than perhaps using up crawl budget, are there any other negative implications?
Technical SEO | | richdan0 -
Google indexing despite robots.txt block
Hi This subdomain has about 4'000 URLs indexed in Google, although it's blocked via robots.txt: https://www.google.com/search?safe=off&q=site%3Awww1.swisscom.ch&oq=site%3Awww1.swisscom.ch This has been the case for almost a year now, and it does not look like Google tends to respect the blocking in http://www1.swisscom.ch/robots.txt Any clues why this is or what I could do to resolve it? Thanks!
Technical SEO | | zeepartner0 -
Robots.txt | any SEO advantage to having one vs not having one?
Neither of my sites has a robots.txt file. I guess I have never been bothered by any particular bot enough to exclude it. Is there any SEO advantage to having one anyways?
Technical SEO | | GregB1230 -
E-commerce solution and subdomain issues
Hello All,
Technical SEO | | CherieP
In light of Wil Reynold's closing keynote at Portland's Searchfest, I thought I might try posting here to get some advice. We run a family business on the side and we're looking at starting to use volusion.com for our e-commerce solution. The catch is we currently have a wordpress site summitmining.com running on thesis with great SEO. Ranking #1 & #2 for our highest trafficked terms. Ideally, I'd like Summitmining.com to direct to the Volusion store and then summitmining.com/blog to go to our wordpress installation BUT since the volusion site will be hosted with the company and they will not host our wordpress installation we'd have to use a subdomain instead of a subdirectory which I understand will be bad for SEO. Does anyone have any recommendation on how to set this up without totally screwing up our ranking OR any recommendations of an easy to use shopping cart (I've worked on a magento site before and it's too complex for us) that wouldn't require a separate or subdomain? Thank you so much!
-Cherie Prochaska
503-816-3557
cherie@c-squaredassociates.com
@cherieprochaska0 -
Website Ranking Issue
Hi, We have been performing our own onsite of offsite SEO along with external assistance and have ranked well over the years with minimal impact from Google updates. Howevr the last so called Panda update has affected us heavily pushing our main phrase 'web design melbourne' from 2nd to 7th where we have been for almost 2 months now on Google.com.au irrespective of onsite or offsite work. We have been trying to find signs of any onsite, IP, duplicate content, titles or other issues that may be holding us back to no avail. The only flag that Google webmaster tools is showing is a number of bad internal site links, which I think is a glitch with the CMS we are using. Even the SEO MOZ tool gives us a higher ranking compared to most competitors on page 1 of Google.com.au for our main phrase. The biggest difference between us and competitors is we chose to target an internal page specific to the topic rather than our homepage. With this sadi we have also reduced our keyword density and content quantity inline with the other sites homepages. Can anyone help shed some light on this? and perhaps something obvious that we have missed, or where we should be looking? Thanks.
Technical SEO | | paulsid0