Yes, Google does not appear to be crawling or indexing any of the pages in question, and GWT doesn't note any issues with crawl budget.
EhrenReilly
@EhrenReilly
Job Title: SEO & Content Product Manager for a Major Online Publisher
Company: Ask.com
Favorite Thing about SEO
Helping people find what they want
Latest posts made by EhrenReilly
- RE: "Extremely high number of URLs" warning for robots.txt blocked pages
This is what my other research has suggested as well. Google is "discovering" millions of URLs that go into a queue to get crawled, and it reports the extremely high number of URLs in Webmaster Tools before it actually attempts to crawl them and sees that all these URLs are blocked by robots.txt.
- RE: "Extremely high number of URLs" warning for robots.txt blocked pages
Federico, my concern is how to get Google to stop spending so much crawl time on those pages. I don't want Google to waste time crawling pages that are blocked in my robots.txt.
- "Extremely high number of URLs" warning for robots.txt blocked pages
I have a section of my site that is exclusively for tracking redirects for paid ads. All URLs under this path do a 302 redirect through our ad tracking system:
http://www.mysite.com/trackingredirect/blue-widgets?ad_id=1234567 --302--> http://www.mysite.com/blue-widgets
This path of the site is blocked by our robots.txt, and none of the pages show up for a site: search.
User-agent: *
Disallow: /trackingredirect
However, I keep receiving messages in Google Webmaster Tools about an "extremely high number of URLs", and the URLs listed are in my redirect directory, which is ostensibly not indexed.
If not by robots.txt, how can I keep Googlebot from wasting crawl time on these millions of /trackingredirect/ links?
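As a sanity check on the question above, the Disallow rule can be tested against the example URLs with Python's standard-library `urllib.robotparser` (a hypothetical verification sketch, not anything from the original post; the domain and paths are taken from the example):

```python
# Verify that "Disallow: /trackingredirect" blocks the ad-tracking URLs
# while leaving the redirect destinations crawlable. Stdlib only.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /trackingredirect
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

blocked = "http://www.mysite.com/trackingredirect/blue-widgets?ad_id=1234567"
allowed = "http://www.mysite.com/blue-widgets"

print(parser.can_fetch("*", blocked))  # False: compliant bots skip it
print(parser.can_fetch("*", allowed))  # True: the destination stays crawlable
```

Note that robots.txt only stops compliant crawlers from *fetching* these URLs; Google can still discover and report them as long as they are linked.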
- Good technical SEO resources for a newly hired front end dev who has no SEO experience
I am an SEO/product manager for a big online publisher. I work with several front end developers who have great technical and design skills, but zero SEO or social media knowledge. I end up answering a lot of SEO 101 questions and/or making them redo things with SEO and social media in mind.
What are some good resources for them? I was already going to send them these SEOmoz links.
- http://www.seomoz.org/beginners-guide-to-seo/basics-of-search-engine-friendly-design-and-development
- http://www.seomoz.org/beginners-guide-to-seo/how-usability-experience-and-content-affect-search-engine-rankings
Can anyone recommend other, more thorough, more technical reading and reference materials (print or online)?
- RE: Rel Canonical on Home Page
Yes. One additional thing you can do to help Google pick the correct page is to make sure all your internal links from both domains point to the target domain that you want to have in the index. For example, if your sites are mydomain1.com and mydomain2.com, and you want to canonicalize everything to mydomain1.com, then any links on mydomain2.com that point to the homepage should point to http://www.mydomain1.com/index.html, not to http://www.mydomain2.com/index.html.
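The internal-link fix described above can be sketched as a small rewrite pass over URLs (a minimal stdlib sketch assuming the example domains from the answer, not any real tooling):

```python
# Rewrite internal links so both domains' homepage links point at the
# canonical host (www.mydomain1.com), per the advice above.
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.mydomain1.com"  # target domain from the example

def canonicalize(url: str) -> str:
    """Point any mydomain1/mydomain2 link at the canonical host."""
    parts = urlsplit(url)
    if parts.netloc in ("www.mydomain1.com", "www.mydomain2.com"):
        parts = parts._replace(netloc=CANONICAL_HOST)
    return urlunsplit(parts)

print(canonicalize("http://www.mydomain2.com/index.html"))
# http://www.mydomain1.com/index.html
```

External links (to other sites) pass through unchanged, so the pass is safe to run over every href in a template.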
Best posts made by EhrenReilly
- RE: Rel Canonical on Home Page
Yes. One additional thing you can do to help Google pick the correct page is to make sure all your internal links from both domains point to the target domain that you want to have in the index. For example, if your sites are mydomain1.com and mydomain2.com, and you want to canonicalize everything to mydomain1.com, then any links on mydomain2.com that point to the homepage should point to http://www.mydomain1.com/index.html, not to http://www.mydomain2.com/index.html.
I was an academic cognitive scientist. Then a brand namer at a brand naming agency. Then an SEO consultant for a big, fancy agency. Then an in-house growth hacker for a successful quiz/game/publishing startup. Then a product manager for a major online publisher. Now I run SEO & Growth for a variety of companies across IAC/InteractiveCorp.