Will Google Count Links Loaded from JavaScript Files After the Page Loads?
-
Hi,
I have a simple question. If I want to put an image with a link to another site on my page, like a banner ad, but do not want the link counted by Google, can I simply load the link and banner using jQuery on page load from a separate .js file?
The ideal result would be for Google to index a script tag instead of a link.
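To make the setup in the question concrete, here is a minimal sketch of such a separate script file. It uses plain JavaScript rather than jQuery (the jQuery equivalent of the load hook would be `$(window).on("load", ...)`). The file name `banner.js`, the `banner-slot` id, and the `directory.example` URLs are placeholders invented for illustration.

```javascript
// banner.js: a hypothetical external script, included with
// <script src="/js/banner.js"></script>

// Escape text so it is safe inside a double-quoted HTML attribute value.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the banner markup: an image wrapped in a link.
function buildBannerHtml(href, imgSrc, altText) {
  return '<a href="' + escapeAttr(href) + '">' +
    '<img src="' + escapeAttr(imgSrc) + '" alt="' + escapeAttr(altText) + '">' +
    '</a>';
}

// In a browser, inject the banner only after the page has finished loading.
// The guard lets the file also be loaded outside a browser (for example in Node).
if (typeof window !== "undefined" && typeof document !== "undefined") {
  window.addEventListener("load", function () {
    var slot = document.getElementById("banner-slot"); // placeholder id
    if (slot) {
      slot.innerHTML = buildBannerHtml(
        "https://directory.example/",            // placeholder link target
        "https://directory.example/banner.png",  // placeholder image
        "Example directory"
      );
    }
  });
}
```

With this in place, the page's initial HTML contains only the empty `banner-slot` div and the script tag. The link exists only in the DOM after the script has run.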
-
Good answer. I completely abandoned the banner I was thinking of using. It was from one of those directories that will list your site for free if you show their banner on your site. Their code, of course, had a link to them with some optimized text. I was looking for a way to display the banner without becoming a link farm for them.
Then I decided that I did not want that kind of thing on my site at all, even inside a JavaScript onload event, if Google is going to crawl it anyway. So I did not add it.
Then I started thinking about user-generated links. How could I let people cite a source that the user can click through to, without exposing my site to hosting spammy links? I originally used an ASP.NET LinkButton with a ConfirmButtonExtender from the AJAX Control Toolkit that would display the URL and ask the user if they wanted to go there. They would then click the confirm button and be redirected. The problem was that the URL of the page was in the head part of the DOM.
I replaced that with a modal popup that calls a JavaScript function when the LinkButton is clicked. That function makes an AJAX call to a web service that gets the link from the database, and the JavaScript then writes an iframe to a div in the modal's panel. The result should be that the user can see the source without leaving the site, but a lot of sites appear to block framing with headers such as X-Frame-Options. So I'm probably going to switch to a solution that uses the modal without the iframe. I am thinking of using something like curl to grab content from the page and write it to the modal panel along with a clickable link. All of this happens only after the user clicks the LinkButton, so none of it will be in the source code when the page loads.
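As a rough illustration of the framing problem described above, the check below decides from a page's response headers whether an iframe is likely to be refused, so the modal can fall back to a summary plus a clickable link. It assumes the headers were fetched server-side (for example by the curl-style request mentioned above) and handed over as a plain object. The function names and the two mode names are made up for this sketch, and the Content-Security-Policy handling is deliberately simplified.

```javascript
// Decide whether response headers forbid showing the page in an iframe on another site.
// `headers` is a plain object of header name -> value.
function isFramingBlocked(headers) {
  // Header names are case-insensitive, so normalise them first.
  var normalised = {};
  Object.keys(headers).forEach(function (name) {
    normalised[name.toLowerCase()] = String(headers[name]);
  });

  // X-Frame-Options: DENY and SAMEORIGIN both block a third-party frame.
  var xfo = (normalised["x-frame-options"] || "").trim().toUpperCase();
  if (xfo === "DENY" || xfo === "SAMEORIGIN") {
    return true;
  }

  // Content-Security-Policy: a frame-ancestors directive restricts framing too.
  // Simplification: any frame-ancestors list that does not contain "*" counts as blocking.
  var csp = normalised["content-security-policy"] || "";
  var match = /(?:^|;)\s*frame-ancestors\s+([^;]*)/i.exec(csp);
  if (match) {
    var sources = match[1].trim().split(/\s+/);
    return sources.indexOf("*") === -1;
  }
  return false;
}

// Pick how the modal should present the source (mode names are hypothetical).
function chooseModalMode(headers) {
  return isFramingBlocked(headers) ? "summary-with-link" : "iframe";
}
```

Checking the headers up front avoids showing the user an empty frame when the target site refuses to be embedded.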
-
I think what we really need to understand is: what is the purpose of hiding the link from Google? If it's to prevent the discovery of a URL or the indexation of a certain page (or set of pages), it's easier to achieve the same thing with meta noindex directives, wildcard-based robots.txt rules, or by simply denying Googlebot's user-agent access to certain pages entirely.
Is it that important to hide the link, or is it that you want to prevent access to certain URLs from within Google's SERPs? Another option is obviously to block users or sessions referred from Google (specifically) from accessing the pages. There's lots that can be done, but a bit of context would be cool.
By the way, nofollow does not prevent Google from following links. It actually just stops PageRank from passing across. I know, it was named wrong.
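For reference, nofollow is applied as a `rel` attribute on the link itself. The sketch below adds it only to links that point off-site. The helper names and host names are illustrative, and it assumes absolute URLs and text that has already been HTML-escaped.

```javascript
// Return the rel value for a link: "nofollow" for external hosts, "" for the site's own host.
// Assumes `href` is an absolute URL; new URL() throws on a relative one.
function relForLink(href, siteHost) {
  var host = new URL(href).hostname;
  return host === siteHost ? "" : "nofollow";
}

// Build an anchor tag, adding rel="nofollow" only when the link leaves the site.
// Assumes `href` and `text` are already HTML-escaped.
function buildLink(href, text, siteHost) {
  var rel = relForLink(href, siteHost);
  var relAttr = rel ? ' rel="' + rel + '"' : "";
  return '<a href="' + href + '"' + relAttr + ">" + text + "</a>";
}
```

The link stays fully visible and clickable for visitors. Only the `rel` hint changes.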
-
What about a form action? Instead of an a element with an href attribute, you would add a form element whose action attribute is set to what the href would have been.
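To show what that swap looks like, here are the two variants side by side as markup builders. This only illustrates the idea. Whether Google treats a form action differently from an href is not settled anywhere in this thread. The function names are made up, and inputs are assumed to be HTML-escaped already.

```javascript
// The ordinary link the form is meant to replace.
function buildLinkHtml(url, label) {
  return '<a href="' + url + '">' + label + "</a>";
}

// The same destination expressed as a form submission.
// Note: with method="get", the browser replaces any query string on the action URL
// with the form's fields, so this suits URLs without a query string.
function buildFormHtml(url, label) {
  return '<form action="' + url + '" method="get">' +
    '<button type="submit">' + label + "</button>" +
    "</form>";
}
```

The visitor clicks a button instead of a link, and the browser navigates to the action URL.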
-
Thanks for that answer. You obviously know a lot about this issue. I guess they would be able to tell if the .js file creates an a element with a specific href attribute and then adds that element to a specific div after the page loads.
It sounds like it might be easier just to nofollow those links instead of going to all the trouble of redirecting the .js file whenever Googlebot crawls the page. I fear that could be considered cloaking.
Another possibility would be an alert that requires user interaction before grabbing a URL from a database. The user would click a link without an href, the JavaScript onclick fires, the JavaScript grabs the URL from a database, the user is asked to click a button if they want to proceed, and then the user is redirected to the external URL. That should keep the external URL out of the script code.
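The click, lookup, confirm, redirect flow described above could be sketched roughly as follows. The lookup, the confirmation prompt, and the navigation are passed in as functions so the flow can be tried outside a browser. All names here, including the `/api/citations/` endpoint in the commented wiring, are hypothetical.

```javascript
// Resolve a citation id to its URL only after a click, ask for confirmation, then navigate.
// deps.fetchUrl(id) -> Promise of a URL string (or null if unknown)
// deps.confirm(message) -> boolean
// deps.navigate(url) -> performs the redirect
async function openCitation(citationId, deps) {
  var url = await deps.fetchUrl(citationId); // e.g. an AJAX call to a web service
  if (!url) {
    return "not-found";
  }
  var ok = deps.confirm("You are about to leave this site for:\n" + url);
  if (!ok) {
    return "cancelled";
  }
  deps.navigate(url);
  return "redirected";
}

// Possible browser wiring (the endpoint path is an assumption, not a real API):
// var browserDeps = {
//   fetchUrl: function (id) {
//     return fetch("/api/citations/" + encodeURIComponent(id))
//       .then(function (res) { return res.ok ? res.json() : null; })
//       .then(function (body) { return body ? body.url : null; });
//   },
//   confirm: function (message) { return window.confirm(message); },
//   navigate: function (url) { window.location.assign(url); }
// };
// element.addEventListener("click", function () { openCitation("42", browserDeps); });
```

Because the URL is fetched only on click, neither the page source nor the script file contains the external address. Only the citation id is present.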
-
Google can crawl JavaScript and its contents, but most of the time they are unlikely to do so. To do this, Google has to do more than a basic source-code scrape. Like everyone else seeking to scrape data from inside generated elements, Google has to check the modified source code after all of the scripts have run (the render), rather than the base (non-modified) source code before any scripts fire.
Google's mission is to index the web. There's no doubt that non-rendered crawls (which do not contain the generated HTML output of scripts) can be done in a fraction of the time it takes to get a rendered snapshot of the page code. On average, I have found rendered crawling to take 7x to 10x longer than basic source scraping.
What we have found is that Google is indeed capable of crawling generated text and links, but they won't do this all the time, or for everyone. Rendering resources are more precious to Google, so they crawl more sparingly in that manner.
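The difference between a basic source scrape and a rendered crawl can be illustrated with a toy link extractor. This is only a sketch. The regular expression is a crude stand-in for a real HTML parser, and both HTML strings are invented examples of what a crawler might see before and after scripts run.

```javascript
// Collect href values from <a> tags in an HTML string (double-quoted attributes only).
function extractHrefs(html) {
  var hrefs = [];
  var pattern = /<a\s[^>]*?href\s*=\s*"([^"]*)"/gi;
  var match;
  while ((match = pattern.exec(html)) !== null) {
    hrefs.push(match[1]);
  }
  return hrefs;
}

// What a non-rendered crawl sees: an empty container and a script reference.
var rawSource =
  '<div id="banner-slot"></div>' +
  '<script src="/js/banner.js"></script>';

// What a rendered snapshot might contain after the script has injected the banner.
var renderedSnapshot =
  '<div id="banner-slot">' +
  '<a href="https://directory.example/"><img src="banner.png" alt=""></a>' +
  "</div>" +
  '<script src="/js/banner.js"></script>';
```

Running `extractHrefs` on `rawSource` finds no links, while running it on `renderedSnapshot` finds the injected one. The link is discovered only when the crawler pays for the render.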
If you deployed the link in the manner you have described, I would expect Google not to notice or evaluate the link for a month or two (if you're not super popular). Eventually, they would detect the link, at which point it would be factored in and/or evaluated.
I suppose you could embed the script as a link to a '.js' module, and then use robots.txt to ban Google from crawling that particular JavaScript file. If they chose to obey that directive, the link would pretty much remain hidden from them. But remember, it's only a directive!
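As a sketch of how such a rule would apply, the snippet below shows a possible robots.txt entry (in a comment) and a small matcher for the path patterns, following the widely documented conventions: prefix matching, `*` as a wildcard, and a trailing `$` as an end anchor. The `/js/banner.js` path is a placeholder, and the matcher ignores Allow rules and rule precedence for brevity.

```javascript
// A robots.txt entry along the lines described above (placeholder path):
//
//   User-agent: Googlebot
//   Disallow: /js/banner.js

// Turn a robots.txt path pattern into a RegExp: "*" matches any run of characters,
// a trailing "$" anchors the end, and everything else is a literal prefix match.
function ruleToRegExp(rule) {
  var anchored = rule.endsWith("$");
  var body = anchored ? rule.slice(0, -1) : rule;
  var escaped = body
    .split("*")
    .map(function (part) { return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); })
    .join(".*");
  return new RegExp("^" + escaped + (anchored ? "$" : ""));
}

// True if any Disallow rule matches the path. Empty rules disallow nothing.
function isDisallowed(path, disallowRules) {
  return disallowRules.some(function (rule) {
    return rule !== "" && ruleToRegExp(rule).test(path);
  });
}
```

Because matching is prefix-based, the rule `/js/banner.js` also covers `/js/banner.js?v=2`, while other scripts in the same folder stay crawlable.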
If you wanted to be super harsh, you could block Googlebot (by user agent) from that JS file and do something like 301 them to the homepage when they try to access it (instead of allowing them to open and read the JS file). That would be pretty hardcore but would stand a higher chance of actually working.
Think carefully about this kind of stuff, though. It would be pretty irregular to go to such extremes, and I'm not certain what the consequences of such action(s) would be.
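For completeness, the user-agent redirect described above might look something like this as a routing decision. It is a sketch under assumptions: the script path is a placeholder, the user-agent test is a simple substring match (user-agent strings can be spoofed), and, as noted elsewhere in this thread, serving Googlebot something different from what visitors get could be considered cloaking.

```javascript
// Decide how to answer a request for the banner script.
// Returns a plain description of the response rather than sending it,
// so the logic can be plugged into any server framework.
function respondToScriptRequest(path, userAgent) {
  var isProtectedScript = path === "/js/banner.js";      // placeholder path
  var isGooglebot = /Googlebot/i.test(userAgent || ""); // naive user-agent check
  if (isProtectedScript && isGooglebot) {
    // Send Googlebot to the homepage instead of the script.
    return { status: 301, headers: { Location: "/" } };
  }
  // Everyone else gets the file as normal.
  return { status: 200, headers: {} };
}
```

Only requests for that one file from a Googlebot user agent are redirected. Every other request is served normally.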