How to find 20 hidden 404s
-
Hello,
We have like twenty 404s left to find. How do you find these when:
1. They don't show up in Google Webmaster Tools.
2. They don't have any other internal or external pages linking to them.
3. They don't show up in site:domain.com (we have 9,000 pages and only 600 show up; I already fixed the broken ones among those 600).
4. They are probably causing high bounce rates.
5. They're not in the sitemap.
Thanks!
-
Once you buy the paid version, you should be able to crawl the entire site without increasing the RAM allocation.
-
Right, the paid version won't stop! You can see on their website that the URL limit is removed once you buy the licence: http://www.screamingfrog.co.uk/seo-spider/licence/
It should catch everything, but maybe contact their support just to make sure; they are very responsive over Twitter.
-
Yes, but if it only checked 397 in the free version, is it going to stop there in the paid version as well? Just making sure.
Also, will it catch everything?
-
Yes, I just ran a crawl of 500,000 URLs.
-
The free version checked 397 URLs. Will the purchased version check all 9,000?
-
You can try crawling the site using Screaming Frog; make sure you check the box "Check Links Outside Folder" in the Spider menu.
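One caveat worth adding: a crawler can only find pages that something links to, so if truly nothing links to those last twenty URLs, no crawl will surface them. The one place every 404 is guaranteed to show up is the server access log, because hits are recorded there whether or not any page links to the URL. Here's a rough sketch of pulling unique 404'd URLs out of a log, assuming the common Apache/Nginx combined log format (the filename and regex are placeholders; adjust them for your server):

```python
# Rough sketch: pull unique 404'd URLs out of an access log.
# Assumes Apache/Nginx "combined" log format; LOG_PATH is a
# placeholder - point it at your real log file.
import re
from collections import Counter

LOG_PATH = "access.log"  # hypothetical filename

# Combined format: ... "GET /some/path HTTP/1.1" 404 1234 ...
line_re = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[^"]*" (\d{3}) ')

hits = Counter()
with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        m = line_re.search(line)
        if m and m.group(2) == "404":
            hits[m.group(1)] += 1

# Most-requested missing URLs first - these are the ones worth fixing.
for url, count in hits.most_common(50):
    print(f"{count:6d}  {url}")
```

Sort descending by hit count like this and the twenty stragglers should float to the top, even the ones nothing links to.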
Related Questions
-
Dealing with 404s during site migration
Hi everyone - What is the best way to deal with 404s on an old site when you're migrating to a new website? Thanks, Luke
Intermediate & Advanced SEO | McTaggart
-
Suspected hacking - Google has detected that some of your pages may contain hidden text or cloaking
I got the below message from Google, but I do not see any hidden text. Please check it.

http://www.astrologerravisharma.com/: Suspected hacking

Google has detected that some of your pages may contain hidden text or cloaking, techniques that are outside our Webmaster Guidelines. Specifically, we detected that your site may have been modified by a third party. Typically, the offending party gains access to an insecure directory that has open permissions. Many times, they will upload files or modify existing ones, which then show up as spam in our index.

Sample URLs:
http://www.astrologerravisharma.com/
http://www.astrologerravisharma.com/about-us/
http://www.astrologerravisharma.com/achievements/

Recommended action: Clean up the hacked content so that your site meets Google's Webmaster Guidelines.
Intermediate & Advanced SEO | bondhoward
-
2.3 million 404s in GWT - learn to live with 'em?
So I'm working on optimizing a directory site. Total size: 12.5 million pages in the XML sitemap. This is orders of magnitude larger than any site I've ever worked on; heck, every other site I've ever worked on combined would be a rounding error compared to this.

Before I was hired, the company brought in an outside consultant to iron out some of the technical issues on the site. To his credit, he was worth the money: indexation and organic Google traffic have steadily increased over the last six months. However, some issues remain. The company has access to a quality (i.e. paid) source of data for directory listing pages, but the last time the data was refreshed some months back, it threw 1.8 million 404s in GWT. That number has since grown progressively higher; now we have 2.3 million 404s in GWT.

Based on what I've been able to determine, links on this site break relative to the data feed for one of two reasons: the page just doesn't exist anymore (i.e. it wasn't found in the data refresh, so the page was simply deleted), or the URL had to change due to some technical issue (the page still exists, just under a different link).

With other sites I've worked on, 404s aren't that big a deal: set up a 301 redirect in htaccess and problem solved. In this instance, setting up that many 301 redirects, even if it could somehow be automated, just isn't an option due to the potential bloat in the htaccess file. Based on what I've read here and here, 404s in and of themselves don't really hurt site indexation or ranking. And the more I consider it, the really big sites (the Amazons and eBays of the world) have to contend with broken links all the time due to product pages coming and going. Bottom line: if we really want to refresh the data on the site on a regular basis, and I believe that is priority one if we want the bot to come back more frequently, we'll just have to put up with broken links on the site on a more regular basis.

So here's where my thought process is leading:

1. Go ahead and refresh the data. Make sure the XML sitemaps are refreshed as well; hopefully this will help the site stay current in the index.
2. Keep an eye on broken links in GWT. Implement 301s for really important pages (i.e. content-rich stuff that is really mission-critical). Otherwise, just learn to live with a certain number of 404s being reported in GWT on more or less an ongoing basis.
3. Watch the overall trend of 404s in GWT. At least make sure they don't increase. Hopefully, if we can make sure that the sitemap is updated when we refresh the data, the 404s reported will decrease over time.

We do have an issue with the site creating some weird pages with content that lives within tabs on specific pages. Once we can clamp down on those and a few other technical issues, I think keeping the data refreshed should help with our indexation and crawl rates.

Thoughts? If you think I'm off base, please set me straight. 🙂
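For what it's worth, "automate the 301s" doesn't have to mean millions of lines in htaccess: Apache's RewriteMap (server config only, not htaccess) and nginx's map directive can both consume a flat key-value lookup file, which keeps per-request overhead low even at this scale. Below is a minimal sketch of generating such a file by diffing two feed exports; the filenames and the id-based matching rule are assumptions about your data, not a drop-in solution:

```python
# Hypothetical sketch: build a bulk 301 redirect map by diffing two
# data-feed exports. "old_urls.txt" / "new_urls.txt" are assumed
# one-URL-per-line dumps of the listing URLs before and after a refresh.

def load_urls(path):
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}

def listing_key(url):
    # Assumed URL shape: /listings/<id>-<slug>. Key on the stable id so
    # a listing whose slug changed can be matched to its new URL.
    last = url.rstrip("/").rsplit("/", 1)[-1]
    return last.split("-", 1)[0]

old_urls = load_urls("old_urls.txt")
new_urls = load_urls("new_urls.txt")
new_by_key = {listing_key(u): u for u in new_urls}

with open("redirects.map", "w", encoding="utf-8") as out:
    for url in sorted(old_urls - new_urls):
        target = new_by_key.get(listing_key(url))
        if target:
            # Page still exists under a different URL: emit a 301 pair
            # for the server's lookup table (RewriteMap / nginx map).
            out.write(f"{url} {target}\n")
        # else: the listing is genuinely gone; letting it 404 is fine,
        # which matches point 2 of the plan above.
```

The idea is that the "URL had to change" cases can usually be matched mechanically on a stable identifier, while the "page is gone" cases are simply left to 404.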
Intermediate & Advanced SEO | ufmedia
-
Site architecture: Deep drop menus & flat hidden menu?
I hope this makes sense. I am creating a site that will have a normal drop-down menu structure about 3 levels deep: site.com/category/topic/sub-topic. I also want to add content under a hidden menu, with a sidebar module (placed on the relevant pages under the drop-down) linking to other custom pages relevant to the drop-down pages, because I'm hoping that the flat-structure pages will show better in search: site.com/content-page

The reason I am asking is that I have seen a competitor do this for a personal injury law firm, and they show everywhere (throughout California) for the vanity search "city car accident lawyer". When you go to the site, they have a personal injury drop-down that is 3 layers deep, but when you click down the layers and look at the URL, the pages are all "flat": site.com/car-accident-lawyer, not site.com/personal-injury/accidents/car-accident-lawyer.

Is having a hidden menu a problem? Is this strategy problematic in any way? Hope that makes sense. Thank you for any direction. BB
Intermediate & Advanced SEO | BBuck
-
Google's Stance on "Hidden" Content
Hi, I'm aware Google doesn't care if you have helpful content you can hide/unhide by user interaction. I am also aware that Google frowns upon hiding content from the user for SEO purposes. We're not considering anything similar to this.

The issue is, we will be displaying only a part of our content to the user at a time. We'll load 3 results on each page initially. These first 3 results are static, meaning on each initial page load/refresh, the same 3 results will display. However, we'll have a "Show Next 3" button which replaces the initial results with the next 3 results. This content will be preloaded in the source code so Google will know about it.

I feel like Google shouldn't have an issue with this since we're allowing the user action to cycle through all results. But I'm curious: is it an issue that the user action does NOT allow them to see all results on the page at once? I am leaning towards no, this doesn't matter, but would like some input if possible. Thanks a lot!
Intermediate & Advanced SEO | kirmeliux
-
Does having small ticket items (say under $1) available for customers to find & buy help or hurt our site?
I feel really silly asking this question to begin with, but... as a music store, we have a lot of "smalls" for products, like guitar picks. We sell picks for $0.50 each, or a single clarinet reed at $0.79. Some believe this is too small, finicky, and cumbersome to have listed for sale on our site. I wholeheartedly disagree with the notion of excluding "smalls", for a plethora of SEO, customer service, and online sales reasons. Also, we offer USPS shipping to keep shipping costs low on small goods. Can I really be wrong about this? Thanks, Kevin
Intermediate & Advanced SEO | Kevin_McLeish
-
Site less than 20 pages shows 1,400+ pages when crawled
Hello! I'm new to SEO and have been soaking up as much as I can. I really love it, and feel like it could be a great fit for me: I love the challenge of figuring out the SEO puzzle, plus I have a copywriting/PR background, so I feel like that would be perfect for helping businesses get a great jump on their online competition. In fact, I was so excited about my newfound love of SEO that I offered to help a friend who owns a small business with his site.

Once I started, though, I found myself hopelessly confused. The problem comes when I crawl the site. It was designed in WordPress, and is really not very big (part of my goal in working with him was to help him get some great content added!). Even though there are only 11 pages, and 6 posts, for the entire site, when I use Screaming Frog to crawl it, it sees HUNDREDS of pages. It stops at 500, because that is the limit for the free version. In the campaign I started here at SEOmoz, it says over 1,400 pages have been crawled, with something like 900 errors. Not good, right?

So I've been trying to figure out the problem. When I look closer in Screaming Frog, I can see that some things are being repeated over and over. If I sort by the Title, the URLs look like they're stuck in a loop somehow: one line will have /blog/category/postname, the next will have /blog/category/category/postname, the next will have /blog/category/category/category/postname, and so on, with another /category/ added each time.

So, with that, I have two questions:

1. Does anyone know what the problem is, and how to fix it?
2. Do professional SEO people troubleshoot this kind of stuff all of the time? Is this the best place to get answers to questions like that? And if not, where is?

Thanks so much in advance for your help! I've enjoyed reading all of the posts that are available here so far; it seems like a really excellent and helpful community. I'm looking forward to the day when I can actually answer the questions!! 🙂
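For what it's worth, that repeating /category/category/category/ pattern is the classic symptom of relative links: an href like "category/postname" (no leading slash) resolves against the current URL, so each level the crawler descends appends another segment. A quick sketch of how you might flag trap URLs like this in a crawl export; the repeat threshold is an arbitrary heuristic, not a rule:

```python
# Hypothetical sketch: flag crawled URLs that look like a relative-link
# trap, i.e. the same path segment repeated back to back, as in
# /blog/category/category/category/postname.
from urllib.parse import urlparse

def looks_like_trap(url, max_repeats=1):
    # Split the path into non-empty segments and count consecutive runs.
    segments = [s for s in urlparse(url).path.split("/") if s]
    run = 1
    for prev, cur in zip(segments, segments[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_repeats:
            return True
    return False

print(looks_like_trap("http://example.com/blog/category/category/post"))  # True
print(looks_like_trap("http://example.com/blog/category/post"))           # False
```

Running a crawl export through a filter like this separates the real 11 pages from the looped URLs; the underlying fix is to correct the relative hrefs in the theme or templates.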
Intermediate & Advanced SEO | K.Walters
-
Hidden keywords - how many per page?
Hi All, We have a booking website we want to optimize for keywords we cannot really show, because some of our partners wouldn't want it. We figured we can put said keywords, or close synonyms, on-page in various places that are not too dangerous (e.g. image names, image alt tags, URLs, etc.). The question is how many keywords we can target. We know keyword stuffing is detrimental, and we will not start to create long URLs stuffed with keywords, same for H1 tags or page titles. So how many is acceptable/not counterproductive? Thanks!
Intermediate & Advanced SEO | Philoups