Crawling a website with redirects
-
Hi,
I started a campaign for a website that uses multiple redirects before showing the real content. In the crawl report, only one page is crawled.
Is there a way to let the crawler follow the redirects so that I get useful reports?
The website is www.cegeka.be
Thank you
-
Hi Peter
It would be helpful to know how you are redirecting these pages (301, etc.) and for what reason.
Also, are any of your pages blocked in your robots.txt file, perhaps to prevent duplicate content?
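To answer the "how are you redirecting" question, it can help to list each hop in the chain together with its status code. The following is only a minimal sketch using the Python standard library; the helper names `head` and `trace_redirects` are made up for illustration, and it only sees HTTP-level redirects (301, 302, and so on), not JavaScript or meta-refresh redirects.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url):
    """Send one HEAD request without following redirects.

    Returns (status, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=head, max_hops=10):
    """Follow a redirect chain hop by hop; return a list of (url, status)."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # the Location header may be relative
    return hops


# Example call (performs real network requests):
# for hop_url, status in trace_redirects("http://www.cegeka.be/"):
#     print(status, hop_url)
```

If the chain ends in a 200 after only one or two hops but the crawler still reports a single page, the redirects are probably not happening at the HTTP level, or something else (such as robots.txt) is stopping the crawl.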
Related Questions
-
Block Moz (or any other robot) from crawling pages with specific URLs
Hello! Moz reports that my site has around 380 pages with duplicate content. Most of them come from dynamically generated URLs that have some specific parameters. I have sorted this out for Google in Webmaster Tools (the new Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same number of duplicate content pages and, to stop it, I know I must use robots.txt. The trick is that I don't want to block every page, just the pages with specific parameters. I want to do this because among these 380 pages there are some other pages with no parameters (or different parameters) that I need to take care of. Basically, I need to clean this list to be able to use the feature properly in the future. I have read through the Moz forums and found a few topics related to this, but there is no clear answer on how to block only pages with specific URLs. Therefore, I have done my research and come up with these lines for robots.txt:
User-agent: dotbot
Disallow: /*numberOfStars=0
User-agent: rogerbot
Disallow: /*numberOfStars=0
My questions:
1. Are the above lines correct, and would they block Moz (dotbot and rogerbot) from crawling only pages that have the numberOfStars=0 parameter in their URLs, leaving other pages intact?
2. Do I need to have an empty line between the two groups (I mean between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot"), or does it even matter?
I think this would help many people, as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there. Thank you for your help!
Moz Pro | | Blacktie0 -
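For the numberOfStars=0 rule in the question above, one way to sanity-check which URLs the pattern would catch is to translate it into a regular expression. This is a rough sketch that assumes the crawler honors the common `*` wildcard and trailing `$` syntax; whether dotbot and rogerbot do so is not confirmed here. The function names and the `/hotels` paths are hypothetical, and the sketch ignores Allow rules and rule precedence.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern ('*' wildcard, optional
    trailing '$' anchor) into a compiled regular expression."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    body = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(url_path, disallow_rules):
    """Return True if any Disallow pattern matches from the start of
    the path (including the query string)."""
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)


rules = ["/*numberOfStars=0"]
print(is_disallowed("/hotels?city=rome&numberOfStars=0", rules))  # True
print(is_disallowed("/hotels?city=rome&numberOfStars=3", rules))  # False
print(is_disallowed("/hotels", rules))                            # False
```

Because the match is not anchored at the end, this pattern would also catch a value such as numberOfStars=05; appending `$` to the rule would restrict it to URLs that end exactly with numberOfStars=0.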
Help - Analyzing Web Traffic Across Multiple Websites
Hi Moz Community, Hope you can help. Is there any way to discover the most visited pages for a particular website, one that I do not administer? I wouldn't need exact numbers, just a relative breakdown of the "Most Visited" pages/sections. For example, if I was reviewing www.jcrew.com, I'd be interested in determining the 10/20/50 most visited pages/products. And just to provide another example, I would be interested in the 10/20/50 most visited pages/stories on www.buzzfeed.com. Any and all help is greatly appreciated. Thank you!
Moz Pro | | MountArashi0 -
Crawl Diagnostics
My site was crawled last night and the crawl found 10,000 errors due to a robots.txt change implemented last week in between Moz crawls. This is obviously very bad, so we corrected it this morning. We do not want to wait until next Monday (6 days) to see if the fix has worked. How do we force a Moz crawl now? Thanks
Moz Pro | | Studio330 -
SEO on-demand crawl
What happened to the on-demand crawl you could do in PRO when they switched to the new Moz site?
Moz Pro | | Vertz-Marketing0 -
A 301 redirect to a page with a rel canonical to a page with a 301 question...
Moz registers thousands of duplicate content (DC) and duplicate title errors on a Drupal site that has a slightly strange setup. Example: www.1234.com/en-us 301 redirects to www.realsite.com/en-us, which has a rel canonical to www.1234.com, which 301 redirects to www.realsite.com. If you're still with me, I thank you.
My question is: since Moz registers errors, is the rel canonical not being recognized because of the 301 redirect?
Moz Pro | | Crunchii0 -
Crawl Errors from URL Parameter
Hello, I am having this issue within SEOmoz's Crawl Diagnostics report. There are a lot of crawl errors happening with pages associated with /login. I will see site.com/login?r=http://.... and have several duplicate content issues associated with those URLs. Seeing this, I checked WMT to see if the Google crawler was showing this error as well. It wasn't. So what I ended up doing was going to the robots.txt and disallowing rogerbot. It looks like this:
User-agent: rogerbot
Disallow:/login
However, SEOmoz has crawled again and it is still picking up on those URLs. Any ideas on how to fix? Thanks!
Moz Pro | | WrightIMC0 -
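For the /login rule in the question above, a quick local check can show whether the rule itself is well formed before blaming the crawler. This sketch uses Python's standard `urllib.robotparser`, which does plain prefix matching; the URLs below are hypothetical stand-ins for the elided ones in the post, and the result says nothing about how rogerbot itself parses or caches the file.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post, written with the usual space after the colon.
robots_txt = """\
User-agent: rogerbot
Disallow: /login
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Hypothetical URLs standing in for the ones in the crawl report.
login_url = "http://site.com/login?r=http://site.com/account"
about_url = "http://site.com/about"

print(parser.can_fetch("rogerbot", login_url))   # False: blocked by the rule
print(parser.can_fetch("rogerbot", about_url))   # True: not under /login
print(parser.can_fetch("Googlebot", login_url))  # True: rule targets rogerbot only
```

If a check like this shows the rule blocking the URLs as intended, the remaining suspects are things such as the crawl having started before the robots.txt change went live.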
My website was hacked last Thursday
My business website was hacked (for the 2nd time in 12 months) last Thursday and all data was lost. I've been rebuilding the site and database since then, but I'm still getting hacking warnings each day. The latest warning says:
Dear Colin/Administrator,
Someone has attempted to inject SQL into your domain:
HACK DETECTED!
PHP TYPE
IP: 94.100.17.134
Scriptname: /index.cfm
PathInfo: /index.cfm
QueryString: src=http%3A%2F%2Fpicasa.com.oprst.in%2Fshow.php%3Fid%3D16907217
My technical advisor tells me the IP address is that of Inferno Solutions of The Netherlands. I wonder if anyone has suffered hacking like this, what steps they took, and what I could do about the potential hackers?
Colin
Moz Pro | | NileCruises0 -
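The query string in the warning above is percent-encoded, which makes it hard to read. As a small illustration, Python's standard `urllib.parse.parse_qs` can decode it so the actual value passed in the `src` parameter is visible:

```python
from urllib.parse import parse_qs

# Query string copied from the warning email.
query_string = "src=http%3A%2F%2Fpicasa.com.oprst.in%2Fshow.php%3Fid%3D16907217"

params = parse_qs(query_string)
print(params["src"][0])  # http://picasa.com.oprst.in/show.php?id=16907217
```

The decoded value is an external URL rather than an SQL fragment, so the request may be an attempt to get the page to load remote content; that reading is an interpretation, not something the warning itself confirms.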
Strange Website Activity
Hello, I have been building websites for about 4 months now and finally had my first real success with a website. I found a niche that I was able to get on the first page with. This site was fine and then, boom, it dropped to #17 around the time of the recent Google changes. I just thought it was that, so I decided to add it as a campaign on here. This is when I noticed that the pages were not being crawled.
Being new at all of this, I researched that. Of course, the main things to check were the privacy setting and robots.txt. The site is running on WordPress and I know not to set the blog to private, but I wasn't familiar with how to edit the robots.txt. I found a good plugin that easily allowed me to set the file to allow all bots. It seemed to be set to the normal WordPress settings before, and I never had a problem with any of my websites not being crawled. Anyway, once I set it to allow all bots on ALL of my website, the pages started being crawled again. My traffic went back up and I was on the first page all day yesterday.
Today, so far, there has been a big drop-off. So I deleted the campaign and set up a new one. Sure enough, no pages have been crawled yet. I made some security changes using Bulletproof Security and another plugin to see if that affects it. Nothing yet.
I am just really confused as to what is going on, so if any of you have any ideas, that would be great. It is a simple site, and I made some changes, like the theme, when I was trying to figure out why the pages weren't being crawled, so it is not the most beautiful design right now. Also, I try my best to put up well-written, useful content, so I don't think that is the reason for the ranking drops. I don't have many, if any, actual backlinks yet because of the newness of my site, which could be the reason for it acting strange, BUT none of that explains the pages not being crawled at the same time my site drops. Sorry this is so long, but I had to explain it all! Thanks in advance to anyone who has anything to say about this situation!
Edit: I should clarify that I put the security plugins on yesterday while I was on the first page, and I have since deleted them to see if that allows the pages to be crawled. Sorry if I wasn't clear.
Moz Pro | | iheartkelby0