How to handle blank, auto-generated system pages/URLs
-
Hi Guys
Our backend system has been creating listing pages based on out-of-date and irrelevant data, meaning we have hundreds of thousands of pages that are blank but currently indexable and active. They're almost impossible to reach from the front end and have no traffic pointing at them, but you can access them if you have the URL, and I'm pretty sure that, due to the site architecture, Google is crawling them regardless. For the most part, I think it's likely best to 301 these pages to the most closely related page on the site, but I'm concerned we're wasting crawl budget here. We don't want these pages to be crawled or found. Would a sound solution be to make them inactive, noindex them, and create a custom 404 in case anyone (or the crawler) manages to reach them? Would this enormous increase in 404 pages cause us issues?
Many thanks
-
Thanks for such a speedy reply! It's such a daunting task, as there are literally thousands and thousands of pages, so we want to be sure we're doing the right thing. I appreciate your help. Now I'll investigate blocking within robots.txt and using Google Search Console to remove the URLs.
-
First, do not 404 them; use a 410 status code instead, as that denotes intentional, permanent removal. In addition, I would block the files/folder in robots.txt. Finally, I would use Google Search Console to remove these URLs. Good luck.
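For anyone wanting to picture the 410-plus-robots.txt suggestion, here is a rough Python sketch. It assumes the blank pages all sit under one URL prefix; `/listings/auto/` and both function names are made up for illustration and are not taken from the original poster's site. One thing worth keeping in mind: a crawler that obeys a `Disallow` rule stops requesting those URLs, so it would never see the 410 response, and the order in which the two steps are rolled out may matter.

```python
from http import HTTPStatus

# Hypothetical prefix under which the blank auto-generated pages live.
RETIRED_PREFIXES = ("/listings/auto/",)


def status_for(path: str) -> HTTPStatus:
    """Return 410 Gone for retired pages and 200 OK for everything else."""
    # str.startswith accepts a tuple, so several prefixes can be checked at once.
    if path.startswith(RETIRED_PREFIXES):
        return HTTPStatus.GONE
    return HTTPStatus.OK


def robots_txt(prefixes=RETIRED_PREFIXES) -> str:
    """Build a robots.txt body that disallows crawling of the retired prefixes."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {prefix}" for prefix in prefixes]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(int(status_for("/listings/auto/12345")))
    print(robots_txt(), end="")
```

In a real site the same check would live in the web server or application routing rather than in a standalone function; the point is only that the retired URLs are matched by pattern, not listed one by one.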