What to do with extremely high number of URLs on your site?
-
Here is the situation:
The site has tons of business and personal profiles, the information needed to be categorized as such directories were created in an attempt to keep the URL structure clean - so for example:
www.abc.com/product/um/name-here/city-name/state/lastname:3458765
Each profile has a unique ID#, and for some reason there needed to be a category for a user in this case /um/ stands for user name.
Webmaster tool steps to resolve state to use an rel=canonical which can be done for that directory /um/ but I am concerned about the bot not being able to find the other pages beyond that directory, like the profile name, city, state associated. So I guess my ultimate question is if I use rel=canonical will the rest of the content not get crawled or indexed as well?
-
This is not what the canonical tag is intended for.
The personal profiles will most likely be very low content dupes of each other like these which are indexed and should not be:
if pages deeper in that folder are good content worthy of being indexed then:
a) add noindex,follow to these profile pages
b) add index, follow to the deeper pages
that will keep the bots crawling the profile pages to the deeper folders with content you want indexed.
You can also disallow the /un/ (user name) folder and allow the deeper folders with robots.txt commands. We were just discussing this:
http://www.seomoz.org/q/allow-or-disallow-first-in-robots-txt
-
Does everything need to be indexed? If not, perhaps the personal profiles could be noindexed. Let the search engines crawl all of your content, but only have them index pages that provide value to the SERPs.\
Only use rel=canonical if the content on different URLs is the exact same. Using it incorrectly will cause content to not be indexed.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Content from Another Site
Hi there - I have a client that says they'll be "serving content by retrieving it from another URL using loadHTMLFile, performing some manipulations on it, and then pushing the result to the page using saveHTML()." Just wondering what the SEO implications of this will be. Will search engines be able to crawl the retrieved content? Is there a downside (I'm assuming we'll have some duplicate content issues)? Thanks for the help!!
Technical SEO | | NetStrategies1 -
Why are URLs like www.site.com/#something being indexed?
So, everything after a hash (#) is not supposed to be crawled and indexed. Has that changed? I see a clients site with all sorts of URLs indexed like ... http://www.website.com/#!category/c11f For the above URL, I thought it was the same as simply http://www.website.com/. But they aren't, they're getting indexed and all the content on the pages with these hash tags are getting crawled as well. Thanks!
Technical SEO | | wiredseo0 -
Linking out to authoritive sites from my ecommerce site
Good afternoon SEOmoz community. I was looking for a specific answer or advice or opinion about linking out to other sites. My Site www.tacticalbootstore.com has been undergoing a complete content rewrite. In the process we have been told and read where it can be good to link out to other authoritive sites. One of the pages we have rewritten is here. http://www.tacticalbootstore.com/belleville-boots-sizing-chart-a-97.html We have not added the graphics yet as they are being built now. This is just an informational page about sizing of a particular manufacturers boots. Once you get to the bottom of the text we have added a link to the actual manufacturers page. Is this helpful for us in the SERPS or not? Thank you for your time. Chris
Technical SEO | | scamper0 -
Single URL not indexed
Hi everyone! Some days ago, I noticed that one of our URLs (http://www.access.de/karriereplanung/webinare) is no longer in the Google index. We never had any form of penalty, link warning etc. Our traffic by Google is constantly growing every month. This single page does not have an external link pointing to it - only internal links. The page has been indexed all the time. The HTTP status code is 200, there is no noindex or something in the code. I submitted the URL on GWMT to let Google send it to the index. It was crawled successfully by Google, sent to the index 5 days ago - nothing happened, still not indexed. Do you have any suggestions why this page is no longer indexed? It is well linked internally and one click away from the home page. There is still the PR of 5 showing, I always thought that pages with PR are indexed.......
Technical SEO | | accessKellyOCG0 -
Friendly URLs
Hi, I have an important news site and I am trying to implement user friendly URLs. Now, when you click a news in the homepage, it goes to a redirect.php page and then goes to a friendly url. the question is, It is better to have the friendly URL in the first link or it is the same for the robot having this in the finally url? Thanks
Technical SEO | | informatica8100 -
How to extract URLs from a site (without bringing the server down!)
Hi everybody. One of my clients is migrating to a new ecommerce platform, and we need to get a list of urls from the existing site to start mapping out the 301 redirects. Usually, I'd use a tool like Xenu or Integrity to crawl and output a list. However, the database and server setup is so bad that it can't handle the requests from these tools and it sends the site down. This, unsurprisingly, is one of the reasons for the migration. Does anybody know of a way to get a full list of urls without having to make a bunch of http requests which will kill the site? Any advice would be much appreciated!
Technical SEO | | neooptic0 -
URL Length
What is the ideal length for an item's URL. Theirs a few different options. A) www.mydomain.com/item-name B) www.mydomain.com/category-name/product-name C) www.mydomain.com/category-name/sub-category-name/product-name Please choose A, B, or C and explain why you made that decision. Looking forward to the responses.
Technical SEO | | Romancing0 -
URL Rewrite
Using the .htaccess file how do I rewrite a url from www.exampleurl.com/index.php?page=example to www.exampleurl.com/example removing index.php?page= Any help is muchly appreciated
Technical SEO | | CraigAddyman0