Looks like it caught up with them: http://searchengineland.com/confirmed-google-venture-backed-thumbtack-hit-with-manual-action-for-unnatural-links-222664
Posts made by YairSpolter
-
RE: What's the Story on Mozscape Updates?
Thanks for the update, Rand, and good luck on getting Mozscape V2 up and running ASAP. I'm sure I'm not alone in wanting to see Moz emerge as the leader in this industry, bringing the tools up to par with the amazing community assets that you guys provide.
Looking forward to seeing the mustache disappear!
-Yair
-
RE: HELP! How do I stop scraper sites - is there any recourse?
Thanks for the reassuring response, Alick.
Based on what you're saying (and that post from Neil Patel), it's a waste of time to even fill out the Google form (these sites are not outranking us). Agree?
-
HELP! How do I stop scraper sites - is there any recourse?
Our site has lots of unique content and photos, and it is constantly being scraped and posted on other websites. Most of these are no-name sites that pop up and exist for AdWords revenue.
Aside from the fact that we don't want our content being copied, this is an SEO nightmare because they often link back to us from pages that are stuffed with keywords and have very low domain authority (it's a form of negative SEO).
My question is:
Does anyone have experience fighting this phenomenon?
What have you done that is effective?
Does anyone have experience with a service such as http://www.dmca.com/ProtectionPro.aspx ? Does it work/is it worth it?
Any input is appreciated!
-
Suggestion: Moz Domain Authority should take disavow into account
Since Moz is trying to predict how Google ranks your site, and Google claims to take the disavow file into account, I'd like to suggest that Moz allow webmasters to upload their disavow file. I imagine this data would be useful to Moz in determining Domain Authority (they may even think of other ways to use it, and it might even help settle the great debate), and it gives sites a chance to improve their Moz DA when they are bombarded by spammy links.
I'd love to hear the community's thoughts on this idea, as well as what the Wizards of Moz have to say.
-
RE: Moving to https: Double Redirects
Thanks Matt.
In the meantime, I found out that if we go to https we will lose a lot of our ad revenue (we rely heavily on AdSense), so we're going to hold off for now.
-
Moving to https: Double Redirects
We're migrating our site to https and I have the following question:
We have some old url's that we are 301ing to new ones. If we switch over to https then we will be forced to do a double-redirect for these url's. Will this have a negative SEO impact? If so, is there anything that we can do about it?
-
RE: Google Sitelinks Search Box
Thanks - but are you SURE that adding the schema will make the box appear? (That was my original question.)
-
RE: Google Sitelinks Search Box
I don't know what you're talking about. We have not added the code yet (we're doing it next week).
-
RE: Google Sitelinks Search Box
Thanks Ricardo, but I'm not sure you're understanding my issue. In your screenshot there is NO search box in the results for Hometalk.
-
RE: Google Sitelinks Search Box
Thanks for the response.
You're seeing a search box in the results?
Can you show me a screenshot?
Here's what I'm seeing:
and here's what I'd like to see:
-
Google Sitelinks Search Box
For some reason, a search for our company name (“hometalk”) does not produce the search box in the results (even though we do have sitelinks).
We are adding the schema markup as outlined here, but we're not sure about one thing:
Will adding the code make the search box appear (or at least increase the chances), or will it only change the functionality of the search box (to on-site search) for results that are already showing one?
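For reference, here's a minimal sketch of the sitelinks search box markup we're adding (the search URL pattern is hypothetical; the real one depends on our site's search implementation):

```html
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "WebSite",
  "url": "http://www.hometalk.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "http://www.hometalk.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>
```

This markup goes on the homepage only.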
-
RE: A few important mobile SEO questions
Thanks so much!
This is exactly what I wanted to know.
-
RE: Do you lose link juice when stripping query strings with canonicals?
Thanks for the quick and thorough response, Sajeet.
I just need a little clarification:
In the example you gave, www.mysite.com/main-page?medium=abc will be canonicaled to www.mysite.com/main-page. Are you saying that in such a case I will lose some link juice, but not when the query string has UTM parameters? If this is what you mean, how do you know that Google treats different query strings differently?
-
Do you lose link juice when stripping query strings with canonicals?
It is well known that when page A canonicals to page B, some link juice is lost (similar to a 301). So imagine I have the following pages:
Page A: www.mysite.com/main-page which has the tag: <link rel="canonical" href="http://www.mysite.com/main-page">
Page B: www.mysite.com/main-page/sub-page which is a variation of Page A, so it has a canonical tag pointing to Page A.
I know that links to page B will lose some of their SEO value, as if I was 301ing from page B to page A.
Question:
What about this link: www.mysite.com/main-page?utm_medium=moz&utm_source=qa&utm_campaign=forum
Will it also lose link juice since the query string is being stripped by the canonical tag? In terms of SEO, is this like a redirect?
-
A few important mobile SEO questions
I have a few basic questions about mobile SEO. I'd appreciate if any of you fabulous Mozzers can enlighten me.
Our site has a parallel mobile site with the same urls, using an m. domain for mobile and www. for desktop. On mobile pages, we have a rel="canonical" tag pointing to the matching desktop URL and on desktop pages we have a rel="alternate" tag pointing to the matching mobile URL. When someone visits a www. page using a mobile device, we 301 them to the mobile version.
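To illustrate, the annotations look something like this (URLs are placeholders for our real ones):

```html
<!-- On the desktop page: http://www.example.com/page -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="http://m.example.com/page">

<!-- On the mobile page: http://m.example.com/page -->
<link rel="canonical" href="http://www.example.com/page">
```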
Questions:
1. Do I want my mobile pages to be indexed by Google? From Tom's (very helpful) answers here, it seems that I only want Google indexing the full site pages, and if the mobile pages are indexed it's actually a duplicate content issue. This is really confusing to me, since Google knows that it's not duplicate content based on the canonical tag. But he makes a good point: what is the value of having the mobile page indexed if the same page on desktop is indexed? (I know that Google is indexing both because I see them in search results. When I search on mobile, Google serves the mobile page, and when I search on desktop, Google serves me the desktop page.) Are these pages competing with each other? Currently, we are doing everything we can to ensure that our mobile pages are crawled (deeply) and indexed, but now I'm not sure what the value of this is. Please share your knowledge.
2. Is a mobile page's ranking affected by social shares of the desktop version of the same page? Currently, when someone uses the share buttons on our mobile site, we share the desktop URL (www., not m.). The reason we do this is that we are afraid that if people share our content with two different URLs (m.mysite.com/some_post and www.mysite.com/some_post), the share count will not be aggregated across both URLs. What I'm wondering is: will this have a negative effect on mobile SEO, since it will seem to Google that our mobile pages have no shares? Or is this not a problem, since the desktop pages have a rel="alternate" tag pointing to the mobile pages, so Google gives the same ranking to the mobile page as the desktop page (which IS being shared)?
-
Does Bing Support the Same Sitemap for Full Site, Mobile, and Images?
We have 1 sitemap for our desktop site, mobile site, and images. This works for Google, but I'm not sure if it's supported by Bing or if they require separate sitemaps.
Anyone know?
-
Does Bing Support the Fragment Meta Tag?
We have a mobile site that uses AngularJS, so we are using pushState and adding the fragment meta tag to each page so Google receives an HTML snapshot.
My question is whether Bing supports this meta tag and will fetch the correct version of the page.
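To be clear, the tag I mean is the one from Google's AJAX crawling scheme (the tag itself got stripped out of the post above; this is the standard form for pushState sites):

```html
<meta name="fragment" content="!">
```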
-
RE: Block in robots.txt instead of using canonical?
Thanks Robert.
The pages that I'm talking about disallowing do not have rank or links. They are sub-pages of a profile page. If anything, the main page will be linked to, not the sub-pages.
Maybe I should have explained that I'm talking about a large site - around 400K pages. More than 1,000 new pages are created per week. That's why I am concerned about managing crawl budget. The pages that I'm referring to are not linked to anywhere on the site. Sure, Google can potentially get to them if someone decides to link to them on their own site, but this is unlikely and certainly won't happen on a large scale. So I'm not really concerned about losing PageRank on the main profile page if I disallow them. To be clear: we have many thousands of pages with content that we want to rank. The pages I'm talking about are not important in those terms.
So it's really a question of balance... if these pages (there are MANY of them) are included in the crawl (and in our sitemap), potentially it's a real waste of crawl budget. Doesn't this outweigh the minuscule, far-fetched potential loss?
I understand that Google designed rel=canonical for this scenario, but that does not mean that it's necessarily the best way to go considering the other options.
-
RE: Block in robots.txt instead of using canonical?
Thanks Takeshi.
Maybe I should have explained that I'm talking about a large site - around 400K pages. More than 1,000 new pages are created per week. That's why I am concerned about managing crawl budget. The pages that I'm referring to are not linked to anywhere on the site. Sure, Google can potentially get to them if someone decides to link to them on their own site, but this is unlikely (since it's a sub-page of the main profile page, which is where people would naturally link to) and certainly won't happen on a large scale. So I'm not really concerned about link-juice evaporation. According to AJ Kohn here, it's not enough to see in Webmaster Tools that Google has indexed all pages on our site. There is also the issue of how often pages are being crawled, which is what we are trying to optimize for.
So it's really a question of balance... if these pages (there are MANY of them) are included in the crawl (and in our sitemap), potentially it's a real waste of crawl budget. Doesn't this outweigh the minuscule, far-fetched potential loss?
Would love to hear your thoughts...
-
RE: Block in robots.txt instead of using canonical?
Thanks for the response, Robert.
I have read lots of SEO advice on maximizing your "crawl budget" - making sure your internal link system is built well to send the bots to the right pages. According to my research, since bots only spend a certain amount of time on your site when they are crawling, it is important to do whatever you can to ensure that they don't "waste time" on pages that are not important for SEO. Just as one example, see this post from AJ Kohn.
Do you disagree with this whole approach?
-
Block in robots.txt instead of using canonical?
When I use a canonical tag for pages that are variations of the same page, it basically means that I don't want Google to index this page. But at the same time, spiders will go ahead and crawl the page. Isn't this a waste of my crawl budget? Wouldn't it be better to just disallow the page in robots.txt and let Google focus on crawling the pages that I do want indexed?
In other words, why should I ever use rel=canonical as opposed to simply disallowing in robots.txt?
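For example, the robots.txt approach would be something like this (the path is just an illustration):

```text
User-agent: *
# Keep bots away from the duplicate variation entirely,
# so crawl budget is spent on the canonical pages instead
Disallow: /main-page/sub-page
```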
-
RE: Is it bad practice to create pages that 404?
Thanks Everett,
As far as I know, nofollows don't conserve crawl budget. The bots will still crawl the link; they just won't transfer any PR.
-
RE: Is it bad practice to create pages that 404?
Thanks for the clear and concise answer, Jane. You hit the nail right on the head! I appreciate your input.
One question, though. You say that noindex will block bot access to these pages. I'm pretty sure the bots will still crawl the pages (if they find them), just they won't be indexed and presumably they won't be "counted against us" like 404 pages. Is that what you meant?
If you have a minute, maybe you can help me out with this question next: http://moz.com/community/q/internal-nofollows
(Side note: Er_Maqul was referring to the original version of the question (before I edited it) where I had mistakenly written that we nofollow the links.)
-
Internal nofollows?
We have a profile page on our site for members who join. The profile page has child pages that are simply more specific drill-downs of what you get on the main profile page. For example: /roger displays all of roger's posts, questions, and favorites and then there are /roger/posts, /roger/questions, /roger/favorites.
Since the child pages contain subsets of the content on the main profile page, we canonical them back to the main profile page.
Here's my question:
The main profile page has navigation links to take you to the child pages. On /roger, there are links to: /roger/posts, /roger/questions, and /roger/favorites. Currently, we nofollow these links. Is this the right way to do it? It seems to me that it's a mistake, since the bots will still crawl those pages but will not transfer PR. What should we do instead:
1. Make the links js links so the child pages won't be crawled at all?
2. Make the links follow so that PR will flow (see Matt Cutts' advice here)? Apprehension about doing this: won't it dilute crawl budget (as opposed to #1)?
3. Something else?
In case the question wasn't confusing enough... here's another piece:
We also have a child page of the profile that is simply a list of members (/roger/friends). Since this page does not have any real content, we are currently noindex/nofollow -ing it and the link to this page is also nofollow. I'm thinking that there's a better solution for this as well. Would love your input!
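For illustration, option 2 would mean plain followed links on the profile page, with the child pages still canonicaled back to it (URLs are just examples):

```html
<!-- On /roger: normal followed navigation links -->
<a href="/roger/posts">Posts</a>
<a href="/roger/questions">Questions</a>
<a href="/roger/favorites">Favorites</a>

<!-- On /roger/posts: canonical back to the main profile page -->
<link rel="canonical" href="http://www.example.com/roger">
```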
-
RE: Is it bad practice to create pages that 404?
Wow - thanks for the thorough response, HashtagHustler!
Let me explain a little better...
We get hundreds of signups a day. Each new member has a profile page, which is empty until they do something. Sometimes they never do. So we don't link to the empty pages and they return a 404. As soon as the page has some content, we do link to it and it returns a 200.
Google is not reporting 404s for these pages because they are not linked to. In the past, when we did link to them, Google reported them as soft 404s.
The current system is working fine.
My question is simply whether it makes more sense to allow Google to find these pages (link to them) but noindex them, since they do not have content (and are considered soft 404s by Google), or whether we should continue doing it the way we are today (which makes me a little uncomfortable, since we are creating thousands of pages that return 404s and could theoretically be linked to by other sites)?
-
RE: Noindex search pages?
Our search results are not appearing in Google's index and we are not having any issues with getting our content discovered, so I really don't mind disallowing search pages and noindexing them. I was just wondering what advantage there is to disallowing and what I would lose if I only noindex. Isn't it better to allow many avenues of content discovery for the bots?
-
RE: Noindex search pages?
Thanks for the response, Doug.
The truth is that it's unlikely that the spiders will find the search results, but if they do why should I consider it a "spider trap"? Even though I don't want the search results pages indexed, I do want the spiders crawling this content. That's why I'm wondering if it's better to just noindex and not disallow in robots.txt?
-
RE: Is it bad practice to create pages that 404?
Thanks for your quick response, Er.
You are correct about the 404s and I realized that what I wrote in the question was a mistake. We don't have any internal links to these pages (not even nofollow). Until there is content on the page, we make all links to the page into js links. I corrected this in the question now.
Concerning what you said about the pages being useful for SEO even without content: I don't think this is correct. Before we started 404ing the empty profile pages, Webmaster Tools was reporting them as soft 404s. Doesn't this mean that they were hurting us (especially since we have many of them)?
-
Noindex search pages?
Is it best to noindex search results pages, exclude them using robots.txt, or both?
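To spell out the two options I'm weighing (paths are illustrative):

```html
<!-- Option 1: meta noindex on each search results page -->
<!-- (bots still crawl the page, but it stays out of the index) -->
<meta name="robots" content="noindex, follow">
```

```text
# Option 2: robots.txt - bots never fetch /search pages at all
# (so they would never even see a noindex tag there)
User-agent: *
Disallow: /search
```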
-
Is it bad practice to create pages that 404?
We have member pages on our site that are initially empty, until the member does some activity. Currently, since all of these pages are soft 404s, we return a 404 for all these pages and all internal links to them are js links (not links as far as bots are concerned). As soon as the page has content, we switch it to 200 and make the links into regular hrefs.
After doing some research, I started thinking that this is not the best way to handle this situation. A better idea would be to noindex/follow the pages (before they have content) and let the links to these pages be real links.
I'd love to hear input and feedback from fellow Mozzers. What are your thoughts?
-
RE: How to handle (internal) search result pages?
Thanks for the quick response.
If the pages are presently not indexed, is there any advantage to follow/noindex over blocking via robots.txt?
I guess my question is whether it's better or worse to have those pages spidered (by definition, any content that appears on these pages exists somewhere else on the site, since it is a search page)... what do you think?
-
RE: How to handle (internal) search result pages?
Hi Mark,
Can you explain why this is better than excluding the pages via robots.txt?