Getting dynamically loaded pages indexed in the search engines
-
SEO'ers,
I'm dealing with an issue I can't figure out the best way to handle. I'm working on a website that shows the definitions of words, which are loaded dynamically from an open source such as wiktionary.org.
When you visit a particular page to see the definition of a word, say www.example.com/dictionary/example/, the definition is there. However, how can we get all of the definition pages indexed in search engines? The WordPress sitemap plugin is not picking up these pages to add them automatically - I'm guessing because they're dynamic - but when using a sitemap crawler the pages are detected.
Can anybody give advice on how to go about getting the 200k+ pages indexed in the search engines? If it helps, here's a reference site that seems to load its definitions dynamically and has succeeded in getting its pages indexed: http://www.encyclo.nl/begrip/sample
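For context, this is roughly what the entries would look like if I generated a sitemap for the definition pages myself rather than relying on the plugin - a minimal sketch of the standard sitemap XML format, with placeholder URLs:
  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <!-- one <url> entry per dynamically generated definition page -->
    <url>
      <loc>http://www.example.com/dictionary/example/</loc>
      <changefreq>monthly</changefreq>
    </url>
    <url>
      <loc>http://www.example.com/dictionary/sample/</loc>
      <changefreq>monthly</changefreq>
    </url>
  </urlset>
With 200k+ URLs the entries would have to be split across multiple files (the sitemap protocol caps each file at 50,000 URLs) and referenced from a sitemap index that gets submitted to the search engines.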
-
I see what you mean there - thanks for sharing your expertise and views on this issue. Much appreciated
-
The only way I'd let those pages be indexed is if they had unique content on them AND/OR provided value in other ways besides just providing the Wiki definition. There are many possibilities for doing this, none of them scalable in an automated fashion, IMHO.
You could take the top 20% of those pages (based on traffic, conversions, revenue...) and really customize them by adding your own definitions and elaborating on the origin of the word, etc... Beyond that you'd probably see a decline in ROI.
-
Everett, yes that's correct. I will go ahead and follow up on what you said. I do still wonder what the best way would be to go about getting it indexed - if I wanted to do that in the future. If you could shed some light on how to go about that, I'd really appreciate it. Thanks so much in advance!
-
It appears that your definitions are coming from wiktionary.org and are therefore duplicate content. If you were providing your own definitions I would say keep the pages indexable, but in this case I would recommend adding a "noindex, follow" robots meta tag to the HTML head of those pages.
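For clarity, that tag sits in the <head> of each definition page and would look something like this (a minimal example - it tells the engines not to put the page in their index while still crawling and following the links on it):
  <head>
    <!-- keep this page out of the index, but still follow its links -->
    <meta name="robots" content="noindex, follow">
  </head>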
-
Hi Everett, I've been looking at the index for word definitions and there are so many pages that are very similar to each other. It's worth giving it a shot, I think. If you can provide feedback, please do. Here's the domain: http://freewordfinder.com. The dictionary is an addition for users who'd like to see what a word means after they've found a word from random letters. You can do a search at the top to see the results, then click through to the definition of the word. Thanks in advance
-
Ron,
We could probably tell you how to get those pages indexed, but then we'd have to tell you how to get them removed from the index when Google sees them all as duplicate content with no added value. My advice is to keep them unindexed, but if you really want them to be indexed tell us the domain and I'll have a look at how it's working and provide some feedback.
-
Hi Keri, were you thinking that the site might get penalized because it would in essence be duplicate content from another site, even though the source is linked from the page? Please let me know your thoughts when you can
-
No, they currently do not have additional information on them. They are simply better organized on my pages compared to the 3rd party's. The unique information is what drives visitors to the site, and those pages link to the definitions just in case visitors are interested in understanding the meaning of a word. Does that help?
-
Do the individual pages with the definitions have additional information on them, or are they just from a third party, with other parts of the site having the unique information?
-
Hi Keri, thanks for your response. Well, I see what you're saying. The pages that show the definition pulled from the 3rd party are actually supplementary to the solution the site provides (core value). Shouldn't that make a difference?
-
I've got a question back for you that's more of a meta question. Why would the search engines want to index your pages? If all the page is doing is grabbing information from another source, your site isn't offering any additional value to the users, and the search engine algos aren't going to see the point in sending you visitors.
Related Questions
-
Duplicate Content Regarding Translated Pages
If we have one page in English, and another that is translated into Spanish, does Google consider that duplicate content? I don't know if having something in a different language makes it different or if it will get flagged. Thanks, Ruben
International SEO | KempRugeLawGroup
-
How should I handle hreflang tags if it's the same language in all targeting countries?
My company is creating an international version of our site at international.example.com. We are located in the US with our main site at www.example.com targeting US & Canada but offering slightly different products elsewhere internationally. Ideally, we would have hreflang tags for different versions in different languages, however, it's going to be an almost duplicate site besides a few different SKUs. All language and content on the site is going to be in English. Again, the only content changing is slightly different SKUs, they are almost identical sites. The subdomain is our only option right now. Should we implement hreflang tags even if both languages are English and only some of the content is different? Or will having just canonicals be fine? How should we handle this? Would it make sense to use hreflang this way and include it on both versions? I believe this would be signaling for US & Canada visitors to visit our main site and all other users to go to the international site. Am I thinking about this correctly or should we be doing this a different way?
International SEO | tcope25
-
Redirect to 'default' or English (/en) version of site?
Hi Moz Community! I'm trying to work through a thorny internationalization issue with the 'default' and English versions of our site. We have an international set-up of: www.domain.com (in English), www.domain.com/en, www.domain.com/en-gb, www.domain.com/fr-fr, www.domain.com/de-de, and so on. All the canonicals and hreflangs are set up, except the English language version is giving me pause. If you visit www.domain.com, all of the internal links on that page (due to the current way our CMS works) point to the www.domain.com/en/ versions of the pages. Content is identical between the two versions. The canonical on, say, www.domain.com/en/products points to www.domain.com/products. It feels like we're pulling in two different directions with our internationalization signals: links go one way, the canonical goes another. Three options I can see:
1. Remove the /en/ version of the site - 301 all the /en versions of pages to /, and update the hreflangs to point EN-language users to the / version.
2. Redirect the / version of the site to /en - the reverse of the above.
3. Keep both the /en and the / versions, and update the links on the / version so that visitors to the / version follow links that don't take them to the /en site.
It feels like the /en version of the site is redundant and potentially sending confusing signals to search engines (it's currently a bit of a toss-up as to which version of a page ranks). I'm leaning toward removing the /en version and redirecting to the / version. It would be a big step, as currently - due to the internal linking - about 40% of our traffic goes through the /en path. Anything to be aware of? Any recommendations or advice would be much appreciated.
International SEO | MaxSydenham
-
How To Proceed With Int'l Language Targeting if Subfolders Not An Option?
I'm currently working with my team to sort out the best way to build out the international versions of our website. Any advice on how to move forward is greatly appreciated!
Current Setup: Subdirectories to target languages - i.e. domain.com/es/. We chose this because we are targeting languages rather than countries, our product offering does not change from country to country, and the translated site content is almost identical to the English version.
Current Problem: Our site is built on WordPress and our database can't handle the build-out of 4 more international versions of the site. The database is slowing down and our site speed is being affected for multiple reasons (the WordPress multilingual plugin being one of them).
What to do next? My developers have said that we cannot continue with our current subdirectory structure due to the technical infrastructure issues I've mentioned above (as well as others I'm yet to get full details on). Now I'm left with a decision: change to a subdomain structure, or change to a ccTLD structure - or is there an option 3?
From what I've read it does not make sense to build out language-targeted sites on a ccTLD structure because that limits the ability for people outside of the targeted country to find the content organically, i.e. a website at www.domain.es is targeted to searchers in Spain, so someone in Colombia is less likely to find that content through the engines. Is this correct? If so, how much can it hurt organic discovery? What's the optimal setup to move forward with in this case? Thanks!
International SEO | UnbounceVan
-
What's the difference between 'en-gb' and 'en-uk' when choosing search engines in campaign set-up?
Hi, what's the difference search-engine-wise and which one should I choose? I presume GB, since it covers the entire British landmass, whereas UK excludes Ireland according to the political definition. Is it the same according to Google (& other engines)? All the best, Dan
International SEO | Dan-Lawrence
-
Low Index: 72 pages submitted and only 1 Indexed?
Hi Mozers, I'm pretty stuck on this and wondering if anybody else can give me a heads-up about what might be causing the issues. I have 3 top-level domains: NZ, AU, and USA. For some odd reason I seem to be having a real issue with these pages indexing, and also with the sitemaps, and I'm considering hiring someone to get the issue sorted as neither I nor my developer can seem to find the cause. I have attached an example of the sitemap_au.xml file. As you can see, only 1 page has been indexed while 72 were submitted. Because we host all of our domains on the same server, I was told last time that our sitemaps were possibly being overwritten - hence the reason we have sitemap_au.xml, and the same for sitemap_nz.xml and sitemap_us.xml; I also originally had a sitemap.xml for each. Another issue I'm having is that the meta description for each home page in USA and AU is showing the meta description for New Zealand, but when you look at the .com and .com.au code the meta descriptions are all different, as you can see here http://bit.ly/1KTbWg0 and here http://bit.ly/1AU0f5k. Any advice around this would be so much appreciated! Thanks, Justin
International SEO | edward-may
-
Can I point some rel alternate pages to a 404?
Hi everyone, I'm just setting up a series of international websites and need to use rel="alternate" to make sure Google indexes the right thing and doesn't hit us with duplicate content. The problem is that rel="alternate" is page-specific, and our international websites aren't exact copies of the main UK website. We've taken out the ecommerce module and a few blog categories because they aren't relevant. Can I just blanket implement rel="alternate" and let it sometimes point to a 404 on the alternate websites? Or is Google going to find that a bit weird? Thanks,
James
International SEO | OptiBacUK
-
Duplicated 404 Pages (Travel Industry)
Our website has created numerous "future pages" with no alt tag or class tag that are showing up as 404 pages. To make matters worse, they are causing duplicate 404 pages because we have different languages. Visitors can't find the 404s but the searchbots can. Would it be better to remove the links, add them to robots.txt, or add a nofollow/noindex tag? Here are some examples:
http://www.solmelia.com/nGeneral.LINK_FAQ
http://www.solmelia.com/nGeneral.LINK_HOTELESDESTINOS_BODAS
http://www.solmelia.com/nGeneral.LINK_CONDICIONES
http://www.solmelia.com/nGeneral.LINK_MAPSITE
http://www.solmelia.com/nGeneral.LINK_HOTELESDESTINOS_EMPRESA
International SEO | Melia