Help Crawl friendliness for large site
-
After watching Rand's video I am trying to think of the best way to make my large site more crawl friendly.
Background
I have a large site with over 100k product skus and so when you get to a particular page of products there are tons of different refinements and options that help you sort the products. Most of these are noindex followed, but I was wondering if I should be nofollowing the internal links as well in order to keep bots out of those pages and going to the pages that I want them to go too. Is this a good way to handle it?
Also, does anyone have good recommendations of links to posts that deal with helping the crawl friendliness of a large site?
Thanks!
-
Good point. If you don't want the filter pages crawled at all, it would be better to just block them via robots.txt. My preferred approach is to use query parameters for filters, and canonicaling the filtered pages back to the original, unfiltered page.
Another approach is to use AJAX to dynamically filter the page. This takes more programming overhead, but won't result in tons of extra pages being crawled and potentially indexed.
-
Nofollowing internal links is almost never a good idea. You're just wasting valuable link juice.
Google actually just recently came out with a good guide for how to handle ecommerce navigation with lots of product options: http://googlewebmastercentral.blogspot.com/2014/02/faceted-navigation-best-and-5-of-worst.html
Also, if you have a lot of categories in you store, try to show navigation that is only relevant to the section of the store the user is in. For example, if the user is in the Flowers section, don't show a ton of links for Cellphones.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved URL dynamic structure issue for new global site where I will redirect multiple well-working sites.
Dear all, We are working on a new platform called [https://www.piktalent.com](link url), were basically we aim to redirect many smaller sites we have with quite a lot of SEO traffic related to internships. Our previous sites are some like www.spain-internship.com, www.europe-internship.com and other similars we have (around 9). Our idea is to smoothly redirect a bit by a bit many of the sites to this new platform which is a custom made site in python and node, much more scalable and willing to develop app, etc etc etc...to become a bigger platform. For the new site, we decided to create 3 areas for the main content: piktalent.com/opportunities (all the vacancies) , piktalent.com/internships and piktalent.com/jobs so we can categorize the different types of pages and things we have and under opportunities we have all the vacancies. The problem comes with the site when we generate the diferent static landings and dynamic searches. We have static landing pages generated like www.piktalent.com/internships/madrid but dynamically it also generates www.piktalent.com/opportunities?search=madrid. Also, most of the searches will generate that type of urls, not following the structure of Domain name / type of vacancy/ city / name of the vacancy following the dynamic search structure. I have been thinking 2 potential solutions for this, either applying canonicals, or adding the suffix in webmasters as non index.... but... What do you think is the right approach for this? I am worried about potential duplicate content and conflicts between static content dynamic one. My CTO insists that the dynamic has to be like that but.... I am not 100% sure. Someone can provide input on this? Is there a way to block the dynamic urls generated? Someone with a similar experience? Regards,
Technical SEO | | Jose_jimenez0 -
Transfering Site from Http to HTTPS
Migrating all of our pages from HTTP to HTTPS. I am listing few of my concerns regarding the same: Currently, all HTTPS traffic to our Homepage and SEO page is 301 Redirected to HTTP equivalent. So, when we enable HTTPS on all our pages and 301 all HTTP traffic to HTTPS and stop current 301 Redirection to HTTP, will it still cause a loop during Google crawl due to old indexing? Will we move whole SEO facing site to HTTPS at once or will it be in phases? Which of the two approach is better keeping SEO in mind? what all SEO changes will be required on all pages.(eg. Canonical URLs on our website as well as affiliate websites), sitemaps etc.
Technical SEO | | RobinJA1 -
Googlebot cannot access your site
Hello, I have a website http://www.fivestarstoneinc.com/ and earlier today I got an emil from webmaster tools saying "Googlebot cannot access your site" Wondering what the problem could be and how to fix it.
Technical SEO | | Rank-and-Grow0 -
Redirect chains after a site migration
Hi A clients site was originally canonicalised to the www. from the non www versions Now its migrating to an international config of www.domain.com/uk and www.domain.com/us with the existing pages/urls (such as www.domain.com/pageA) 301'd to the new www.domain.com/uk/pageA for example Will this will create a 301 redirect chain due to the existence of the original canonicalised urls or is the way that works 'catch all' so to speak, and automatically update the canonical 301 redirects of the non www old architexcture url's to the new international architecture URL's ? I presume so but just want to check ? cheers dan
Technical SEO | | Dan-Lawrence0 -
My beta site (beta.website.com) has been inadvertently indexed. Its cached pages are taking traffic away from our real website (website.com). Should I just "NO INDEX" the entire beta site and if so, what's the best way to do this? Please advise.
My beta site (beta.website.com) has been inadvertently indexed. Its cached pages are taking traffic away from our real website (website.com). Should I just "NO INDEX" the entire beta site and if so, what's the best way to do this? Are there any other precautions I should be taking? Please advise.
Technical SEO | | BVREID0 -
Way to spider Wordpress site
I have an old Wordpress site and I want to move it to a new server and take it off Wordpress (too many hacks). I am trying to spider the site so as to get static, non-Wordpress, pages. I am having trouble doing this. When I spider the site, it changes the URLs. For instance, if the URL is www.domain.com/page/ the URL I get out of the spider is /page/index.html And those are not the URLs in the search engine indices. There are about 2000 pages on this site, so it is not feasible to set up 301 redirects. I tried using these spidering programs: WinHTTack Website Copier and PageNest Does anyone know of another method of turning a Wordpress site into a non Wordpress site?
Technical SEO | | DanCrean0 -
How does Google Crawl Multi-Regional Sites?
I've been reading up on this on Webmaster Tools but just wanted to see if anyone could explain it a bit better. I have a website which is going live soon which is going to be set up to redirect to a localised URL based on the IP address i.e. NZ IP ranges will go to .co.nz, Aus IP addresses would go to .com.au and then USA or other non-specified IP addresses will go to the .com address. There is a single CMS installation for the website. Does this impact the way in which Google is able to search the site? Will all domains be crawled or just one? Any help would be great - thanks!
Technical SEO | | lemonz0 -
How do I set up a site review for a password protected site?
We need to conduct a SEO analysis for a website that is on a private, password protected development site -- is there anyway for SEOMoz tools to access and analyze a PW protected site? Thank you, Sara Merten
Technical SEO | | kev110