Rogerbot crawls my site and causes error as it uses urls that don't exist
-
Whenever the rogerbot comes back to my site for a crawl it seems to want to crawl urls that dont exist and thus causes errors to be reported...
Example:- The correct url is as follows:
/vw-baywindow/cab_door_slide_door_tailgate_engine_lid_parts/cab_door_seals/genuine_vw_brazil_cab_door_rubber_68-79_10330/
But it seems to want to crawl the following:
/vw-baywindow/cab_door_slide_door_tailgate_engine_lid_parts/cab_door_seals/genuine_vw_brazil_cab_door_rubber_68-79_10330/?id=10330
This format doesn't exist anywhere and never has so I have no idea where its getting this url format from
The user agent details I get are as follows:
IP ADDRESS: 107.22.107.114
USER AGENT: rogerbot/1.0 (http://moz.com/help/pro/what-is-rogerbot-, rogerbot-crawler+pr1-crawler-17@moz.com) -
The first thing I would do is download the crawl report as an excel sheet. You can do this from your crawl report page.
From there, sort by the 404 error column, bringing "True" to the top. The top of the list is now the broken URL's. One of the very last columns on the right is the "referrer" column. This will show you the page where Roger is getting the bad link from.
Make Sense?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How do I fix 5xx errors on pdf's?
Hi all, I'm having a huge (and very frustrating) issue-- I have 437 pages with 5xx errors and have no idea how to go about fixing them. All of them are links to pdf's, and when I click on the link on the Site Crawl page, it opens just fine. I've been trying to find the answer to how to fix this issue for weeks with no luck at all. I tried linking the pdf's to their parent page, tried renaming the links to be more concise and crawler friendly, and I've searched all types of articles involving pdf's and 5xx errors, all with no luck. Does anyone have any idea why these pages are getting this error and how I can fix them? Huge thanks in advance! Daniela S.
Moz Pro | | laurelleafep0 -
Spammy inbound links: Don't Fix It If It's Not Broken?
Hi Moz community, Our website is nearing the end of a big redesign to be mobile-responsive. We decided to delay any major changes to text content so that if we do suffer a rankings drop upon launch, we'll have some ability to isolate the cause. In the meantime I'm analyzing our current SEO strengths and weaknesses. There is a huge discrepancy between our rankings and our inbound link profile. Specifically, we do great on most of our targeted keywords and in fact had a decent surge in recent months. But Link Profiler turned up hundreds of pages of inbound links from spammy domains, many of which don't even display a webpage when I click there. (shown in uploaded image) "Don't fix it if it's not broken" is conflicting with my natural repulsion to these sorts of referrals. Assuming we don't suffer a rankings drop from the redesign, how much of a priority should this be? There are too many and most are too spammy to contact the webmasters, so we'll need to do it through a Disavow. I couldn't even open the one at the top of the list because our business web proxy identified it as adult content. It seems like a common conception is that if Google hasn't penalized us for it yet, they will eventually. Are we talking about the algorithm just stumbling upon these links and hurting us or would this be something we would find in Manual Actions? (or both?) How long after the launch should we wait before attacking these bad links? Is there a certain spam score that you'd say is a threshold for "Yes, definitely get rid of it"? And when we do, should we Disavow domains one domain at a time to monitor any potential drops or all at once? (this seems kind of obvious but if the spam score and domain authority alone is enough of a signal that it won't hurt us, we'd rather get it done asap) How important is this compared to creating fresh new content on all the product pages? Each one will have new images as well as product reviews, but the product descriptions will be the same ones we've had up for years. I have new content written but it's delayed pending any fallout from the redesign. Thanks for any help with this! d1SB2JP.jpg
Moz Pro | | jcorbo0 -
Moz and HubSpot SSL - crawl error?
I'm getting an error message when Moz tries to crawl my site, however when I check in Google Search Console, they return no errors. Our site is hosted on HubSpot. Is Moz still having trouble crawling HubSpot sites that have enabled their SSL? I read an article that this should have been corrected in early 2017, but I'm getting an error.
Moz Pro | | jennygriffin0 -
Having 1 page crawl error on 2 sites
Help! A few weeks back, my dev team did some "changes" (that I don't know anything about), but ever since then, my Moz crawl has only shown one page for either http://betamerica.com or http://fanex.com. Moz service was helpful in talking about a redirect loop that existed, and I asked my team to fix it, which it looks to me like they have. Still, 1 page. I used SEO Book's spider tool and it also only sees 1 page, and sees the sites as http://https://betamerica.com (for example), which is just weird. I don't know enough about HT Access or server stuff to figure out what's going on, so if someone can help me figure that out, I'd appreciate it.
Moz Pro | | BetAmerica0 -
Crawl report shows Title Element too long but they aren't
Hi, My latest crawl report says that I have a stack of pages with Title Element Too Long on them - e.g. Build My Ride - charity team building event with real purposeBuild My Ride - charity team building event with real purpose http://www.teamelevate.co.nz/events/build-my-ride1.html You can see that it shows the title element as doubled-up. When I look at the title element on the live page it is not double. GWT shows that there are no issues with long title elements. Any ideas anyone...? Chris
Moz Pro | | chris.elevate0 -
How often does seomoz crawl the site? Can you force a crawl at a specific time ?
How often does seomoz crawl the site? Can you force a crawl at a specific time ?
Moz Pro | | stewbuch18720 -
Crawl Issues
My website - qtmoving.com - has 26 articles and when the SEOmoz did a crawl it only found 13 articles. Can someone please give me some insight as to why not all pages are being crawled.
Moz Pro | | CohesiveMarketing0 -
Duplicate page content reports duplicates, but pages don't show duplication
My duplicate page reports shows 376 pages with duplicate content. After reviewing the pages the report claims have duplicate content, i can't find duplications. could this be an error, or is there some source code that doesn't display that could be causing this issue?
Moz Pro | | noonzie0