What to do with an extremely high number of URLs on your site?
-
Here is the situation:
The site has tons of business and personal profiles. The information needed to be categorized, so directories were created in an attempt to keep the URL structure clean. For example:
www.abc.com/product/um/name-here/city-name/state/lastname:3458765
Each profile has a unique ID number, and for some reason there needed to be a category for each user; in this case /um/ stands for user name.
The Webmaster Tools steps for resolving this say to use rel=canonical, which can be done for the /um/ directory, but I am concerned about the bot not being able to find the pages beyond that directory, such as the associated profile name, city, and state. So my ultimate question is: if I use rel=canonical, will the rest of the content not get crawled or indexed?
-
This is not what the canonical tag is intended for.
The personal profiles will most likely be very low-content duplicates of each other, like these, which are indexed and should not be:
If pages deeper in that folder have good content worthy of being indexed, then:
a) add noindex,follow to these profile pages
b) add index,follow to the deeper pages
That will keep the bots crawling through the profile pages to the deeper folders with the content you want indexed.
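As a rough illustration of steps a) and b), here is a minimal Python sketch that picks a robots meta tag from the URL path. The prefix /product/um/ comes from the example URL in the question; the depth rule (one segment after the prefix counts as a thin profile page, anything deeper counts as content) and the function name are assumptions made for the example, not something the site is known to use.

```python
from urllib.parse import urlparse

# Assumed from the example URL in the question.
PROFILE_PREFIX = "/product/um/"

def robots_meta(url: str) -> str:
    """Return the robots meta tag a page could emit (hypothetical rule)."""
    path = urlparse(url).path
    if not path.startswith(PROFILE_PREFIX):
        # Outside the profile folder: leave indexable.
        return '<meta name="robots" content="index, follow">'
    # Path segments that come after /product/um/.
    rest = [seg for seg in path[len(PROFILE_PREFIX):].split("/") if seg]
    if len(rest) <= 1:
        # Thin profile page: keep it out of the index, but let bots follow its links.
        return '<meta name="robots" content="noindex, follow">'
    # Deeper page with content worth indexing.
    return '<meta name="robots" content="index, follow">'

if __name__ == "__main__":
    print(robots_meta("http://www.abc.com/product/um/name-here"))
    print(robots_meta("http://www.abc.com/product/um/name-here/city-name/state/lastname:3458765"))
```

The point of noindex,follow is that the page still gets crawled, so links to the deeper pages are still discovered.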
You can also disallow the /um/ (user name) folder and allow the deeper folders with robots.txt commands. We were just discussing this:
http://www.seomoz.org/q/allow-or-disallow-first-in-robots-txt
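To sanity-check a rule set like that before deploying it, something along these lines might help. It uses Python's standard urllib.robotparser; the rules and the /product/um/name-here/city-name/ path are placeholders built from the example URL in the question, not the site's real robots.txt. Note that this parser applies rules in file order (first match wins) and does not understand wildcards, whereas Google documents that the most specific (longest) matching rule wins. Putting the Allow line first gives the same answer under both interpretations.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: allow one deeper folder, block the rest of /product/um/.
ROBOTS_TXT = """\
User-agent: *
Allow: /product/um/name-here/city-name/
Disallow: /product/um/
"""

def is_crawlable(url: str, agent: str = "*") -> bool:
    """Check a URL against the sample rules above."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    for url in (
        "http://www.abc.com/product/um/name-here",
        "http://www.abc.com/product/um/name-here/city-name/state/lastname:3458765",
    ):
        print(url, is_crawlable(url))
```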
-
Does everything need to be indexed? If not, perhaps the personal profiles could be noindexed. Let the search engines crawl all of your content, but only have them index pages that provide value to the SERPs.
Only use rel=canonical if the content on different URLs is exactly the same. Using it incorrectly can cause content not to be indexed.
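One way to apply the "exact same content" rule is to group URLs by a hash of their page body and only point rel=canonical at another URL when the bodies match byte for byte. The sketch below assumes you already have the page bodies in a dict; the function name and the choice of the shortest URL as the canonical target are arbitrary assumptions for illustration.

```python
import hashlib
from collections import defaultdict

def canonical_map(pages: dict[str, str]) -> dict[str, str]:
    """Map each URL to a canonical URL, grouping only identical bodies."""
    groups: dict[str, list[str]] = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[digest].append(url)

    result: dict[str, str] = {}
    for urls in groups.values():
        # Assumed tie-break: shortest URL, then alphabetical order.
        target = min(urls, key=lambda u: (len(u), u))
        for url in urls:
            result[url] = target
    return result

if __name__ == "__main__":
    # Made-up sample data.
    sample = {
        "/product/um/name-here": "profile A",
        "/product/um/name-here?ref=1": "profile A",
        "/product/um/other-name": "profile B",
    }
    for url, target in canonical_map(sample).items():
        print(f'{url} -> <link rel="canonical" href="{target}">')
```

A page with unique content maps to itself, so it never loses its place in the index to a different URL.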