How do we ensure our new dynamic site gets indexed?
-
Just wondering if you can point me in the right direction. We're building a 'dynamically generated' website, so basically, pages don’t technically exist until the visitor types in the URL (or clicks an on-page link); the pages are then created on the fly for the visitor.
The major concern I’ve got is that Google won’t be able to index the site, as the pages don't exist until they're 'visited'. To top it off, they're rendered in JSPX, which makes it tricky to ensure the bots can view the content.
We’re going to build/submit a sitemap.xml to signpost the site for Googlebot but are there any other options/resources/best practices Mozzers could recommend for ensuring our new dynamic website gets indexed?
-
Hi Ryan,
Mirroring what Alan said, if the links are HTML text links - and they should be - then you will reduce your crawling problem with Google.
If you must use JavaScript links, make sure to duplicate them using <noscript> tags so that Google will follow them.
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66355
But be careful, Google doesn't treat <noscript> links like regular HTML links. At best, it's a poor alternative.
Google derives so many signals from HTML links (anchor text, PageRank, context, etc.) that it's almost essential for a search engine friendly site to include them.
The Beginners Guide to SEO has a relevant chapter on the basics of Search Engine Friendly Design and Development:
http://www.seomoz.org/beginners-guide-to-seo/basics-of-search-engine-friendly-design-and-development
Best of luck!
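To check how many of a page's links are plain HTML links, a small audit script can help. This is only a rough sketch using Python's standard library; the class name, the function name, and the sample markup are made up for illustration, and it looks only at the static HTML, not at anything JavaScript adds later.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect <a> tags, split by whether they carry a real URL in href."""

    def __init__(self):
        super().__init__()
        self.crawlable = []    # plain HTML links with a followable href
        self.script_only = []  # anchors that depend on JavaScript

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = (attrs.get("href") or "").strip()
        # An empty href, a bare "#", or a "javascript:" URL gives a crawler
        # nothing to follow.
        if not href or href.startswith("#") or href.lower().startswith("javascript:"):
            self.script_only.append(attrs.get("onclick") or href or "(no href)")
        else:
            self.crawlable.append(href)


def audit_links(page_source):
    """Return (crawlable_hrefs, script_only_anchors) for a page's HTML."""
    parser = LinkAudit()
    parser.feed(page_source)
    return parser.crawlable, parser.script_only


# Hypothetical sample markup.
sample = """
<a href="/products/widgets">Widgets</a>
<a href="javascript:void(0)" onclick="load('gadgets')">Gadgets</a>
<a href="#" onclick="load('about')">About</a>
"""
crawlable, script_only = audit_links(sample)
print("crawlable:", crawlable)
print("script-only:", script_only)
```

Anything that ends up in the script-only list is a candidate for conversion to a regular text link.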
-
Definitely want to get it right before launch. It's not going anywhere until it is absolutely ready!
-
The project this reminds me of took six months to complete and the 301s alone were a full-time job.
Get it right the first time... you do not want to restructure like this on a large dynamic site.
I must say the project worked out but I got all my grey hair the day we threw the switch...
-
When I say it's costly to rewrite 200,000+ URLs, I mean it. Correcting mistakes here can cost big dollars.
In this case it was costly to the tune of $60,000+ in costs and losses; however, the bottle of bubbly at the end of the six-month project was tasty.
The point is to do it right the first time.
As I said before, your best bet is documentation. Large dynamic sites generate large dynamic problems very quickly if not watched closely.
-
Thank you Khem, very helpful replies.
-
One more thing I missed: internal linking. Make sure each page is linked to with a text link, but avoid over-linking. Don't try to link to every page from the home page. Generally, we link all the categories and key pages from the footer or other site-wide links.
-
Okay, let's do it step by step.
First, if it's a product website, create a separate feed for products and submit the sitemap to Google.
If not, and you have separate news/articles/videos sections, create a separate XML sitemap for each section and submit each one to Google.
In any case, make sure to have only search engine friendly URLs. Who says rewriting 200,000+ pages is costly? Compare that cost with the business you'll lose if your products are not listed in Google. So make sure to rewrite all the dynamic URLs if you feel that Google might have problems crawling your website's URLs.
Second, study your Webmaster Tools data very carefully for warnings and errors, so that you can figure out which issues Google might be facing when it visits your website.
Avoid duplicate entries for products. Generally we don't pay attention to these things and show the same products on different pages in different categories. Google will filter out those duplicate pages and can even penalize your website for duplicate content.
Third, keep promoting, but avoid grey/black hat techniques. There is no shortcut to success; you'll have to spend time and money.
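The per-section sitemaps from the first step could be generated along these lines. This is a minimal sketch, assuming Python's standard library; the function names, file names, and the www.example.com URLs are placeholders. The sitemap protocol caps a single file at 50,000 URLs, so a site with 200,000+ pages needs several sitemap files plus an index that points at them.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a <urlset> sitemap document (as a string) for the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document pointing at the per-section sitemaps."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="unicode")


# Hypothetical sections and URLs, one sitemap per section.
sections = {
    "products": ["https://www.example.com/products/widgets",
                 "https://www.example.com/products/gadgets"],
    "articles": ["https://www.example.com/articles/choosing-a-widget"],
}
for name, urls in sections.items():
    print("sitemap-%s.xml:" % name, build_sitemap(urls))

print(build_sitemap_index(
    "https://www.example.com/sitemap-%s.xml" % name for name in sections
))
```

The index file is the one to submit; each section's file can then be regenerated on its own whenever that section changes.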
-
It's definitely something we're taking a very close look at. Another thing not mentioned is the use of canonical tags to head off duplicate content issues, which I'll be ensuring is implemented.
My next mugshot might have significantly grayer hair after this is all done...
-
Thanks very much for the replies.
I'll ensure proper cross-linking from the navigation and on the pages themselves, and submit a full XML sitemap, along with the social media options suggested. My other concern is that the content itself won't be visible to Googlebot due to the site being largely JavaScript driven, but that's something I'm working with the developers to resolve.
-
As you can tell from the response above, indexation is not what you should be worried about.
Dynamic content is not foolproof. The mistakes are costly, and you never want to be involved in rewriting a 200,000+ page dynamic rat's nest.
Sorting options can generate many dynamic URLs for the same listing, which creates duplicate content.
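One way to keep sort options from multiplying URLs is to map every variant back to a single canonical address and publish that in a canonical link tag. The sketch below is illustrative only: the parameter names in `PRESENTATION_PARAMS` and the example URLs are assumptions, and the real list depends on what the site's templates actually emit.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that change presentation, not content.
PRESENTATION_PARAMS = {"sort", "order", "view", "sessionid"}


def canonical_url(url):
    """Drop presentation-only query parameters and put the rest in a fixed order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PRESENTATION_PARAMS
    ]
    kept.sort()  # ?b=2&a=1 and ?a=1&b=2 become the same URL
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for a page's <head>."""
    return '<link rel="canonical" href="%s">' % html.escape(canonical_url(url), quote=True)


# Two sorted variants of the same (hypothetical) listing page.
print(canonical_url("https://www.example.com/shoes?sort=price&color=red&order=asc"))
print(canonical_url("https://www.example.com/shoes?color=red&sort=name"))
print(canonical_link_tag("https://www.example.com/shoes?size=9&color=red&sort=price"))
```

Both sorted variants collapse to the same address, so they stop competing with each other as duplicates.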
Structure changes or practice changes can cause crawl errors. Earlier today I looked at a report for a client that showed 3,000+ errors, compared to 20 last week. This was all due to a request made by the owner to the developer.
When not enough attention is paid to this stuff, it causes real issues.
The best advice I can offer is to make sure you have a best practices document that must be followed by all developers.
-
Make sure every page you would like to be crawled is linked to in some manner. You can create natural links to them, e.g. from your navigation or in text links, or you can list them in a sitemap.
You can also link to these pages from websites like Facebook and Twitter to get them crawled faster.
Make sure your robots.txt doesn't block Google from accessing your website, and make sure none of the pages you would like to be indexed carry the noindex value in the robots meta tag.
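Both checks can be scripted before launch. Here is a rough sketch with Python's standard library; the function names, the sample robots.txt, and the URLs are made up for illustration. It works on text you pass in, so fetching the live robots.txt and the rendered pages is left out.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt, url, agent="Googlebot"):
    """True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class RobotsMeta(HTMLParser):
    """Detect a noindex directive in a robots (or googlebot) meta tag."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        content = (attrs.get("content") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def has_noindex(page_source):
    """True if the page's HTML carries a noindex robots meta tag."""
    parser = RobotsMeta()
    parser.feed(page_source)
    return parser.noindex


# Hypothetical robots.txt and page markup.
robots = "User-agent: *\nDisallow: /admin/\n"
print(allowed_by_robots(robots, "https://www.example.com/products/widgets"))
print(allowed_by_robots(robots, "https://www.example.com/admin/login"))
print(has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))
```

Running something like this over every URL in the sitemap would flag pages that are listed for indexing but blocked or marked noindex.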
Good luck!
-
Any link. But I should correct what I said: they will be crawled, not necessarily indexed.
-
Thanks for the reply Alan, do you mean links from the sitemap?
-
If you have links to the pages they will be indexed; dynamic or static, it does not matter.