Problem crawling a website with age verification page.
-
Hi everyone,
I urgently need your help. I need to crawl a website that first shows a page where you have to enter your age for verification; after that you are redirected to the website itself. My problem is that SEOmoz crawls only that first page, not the whole website. How can I crawl the whole website? Do you need me to post a link to the website?
Thank you very much
Catalin
-
Hello Catalin,
Our crawler will not be able to get past an age verification page. You will need to find or unlock a subfolder or subdomain that bypasses it if you would like our crawlers to be able to get through. Luckily, Google's crawlers are a bit more thorough and will be able to index your site properly. We are hoping to add this ability soon, and I hope you can find a way for us to get through in the meantime.
-
The problem is that the pages are not in a subfolder. I have to pass the verification page every time :(. SEOmoz is crawling only the first page.
-
Well, that's a small side note to your problem ;-). Are you able to just set up a crawl for a subfolder, or do you have to pass the verification every time?
-
OK, thank you for your short answer, but the thing is I didn't understand anything from what you wrote :).
I want to add that I do not own the website. I don't have access to the back end, CMS, etc. The client just wants me to crawl the whole website to see if something is wrong. I can see with my own eyes that the website has duplicate content, but SEOmoz doesn't crawl the website because of that first page with the verification.
-
Hi Catalin,
The best way to do this is of course to include a link from the verification page to the rest of the website (you could remove the link again once Roger has come by). If adding a link isn't an option, what you could also do is redirect visitors based on their user agent.
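As a rough illustration of the user-agent idea, here is a minimal Python sketch of the routing decision the site owner's server would have to make. It assumes the crawler identifies itself with "rogerbot" somewhere in its User-Agent header; the `/age-check` path and the function names are hypothetical and not taken from any real site or from Moz.

```python
from urllib.parse import quote

# Assumption: substrings that identify crawlers allowed past the age gate.
# "rogerbot" is the crawler discussed in this thread; adjust as needed.
CRAWLER_TOKENS = ("rogerbot",)

# Hypothetical path of the age verification page.
AGE_GATE_PATH = "/age-check"


def is_known_crawler(user_agent):
    """Return True if the User-Agent header contains a known crawler token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def route_request(path, user_agent, age_verified):
    """Decide how to answer a request.

    Returns (200, path) to serve the page, or (302, location) to send
    the visitor to the age verification page first.
    """
    if age_verified or is_known_crawler(user_agent) or path == AGE_GATE_PATH:
        return (200, path)
    # Remember where the visitor wanted to go so the gate can send them back.
    return (302, AGE_GATE_PATH + "?next=" + quote(path, safe=""))
```

This has to run on the website's own server, so it only helps if the client (or their developer) can add it. User-agent strings are easy to fake, so anyone who sets that header would also skip the age check.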
Hope this helps!