Over 500 thin URLs indexed from dynamically created pages (for lightboxes)
-
I have a client who has a resources section. This section is primarily devoted to definitions of terms in the industry. These definitions appear in colored boxes that, when you click on them, turn into a lightbox with their own unique URL.
Example URL: /resources/?resource=dlna
The information for these lightboxes is pulled from a standard page: /resources/dlna.
Both are indexed, resulting in over 500 indexed pages that are either a simple lightbox or a full page with very minimal content. My question is this:
Should they be de-indexed? Another option I'm knocking around is working with the client to create Skyscraper pages, but this is obviously a massive undertaking given how many they have.
Would appreciate your thoughts. Thanks.
-
This is an issue that came up with Yoast during an upgrade on WordPress in the middle of 2018. It may be worth a little research into how their "purge" plugin worked, as it did exactly this.
Using the .htaccess file, simply tell Google not to index the resource pages. They will then fall out of the search results naturally over time. Alternatively, you can purge them through Google Search Console:
- Log into the Google Search Console and select the desired website.
- Click on “Optimization” in the left-hand navigation.
- Click on “Remove URL” in the sub-menu.
- Click on the button “create a new request for removal” on this page.
Once this is done and the pages are set to noindex, the problem is solved.
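To make the noindex idea concrete, here is a minimal Python sketch of the decision logic, not a drop-in server config. It assumes the lightbox URLs are the ones on the /resources/ path that carry a `resource` query parameter (as in the example URL above), and that the server can attach an `X-Robots-Tag` response header, which Google honors as an equivalent of the robots meta tag. The function name `robots_headers` and the two constants are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Assumptions taken from the example URL /resources/?resource=dlna
LIGHTBOX_PATH = "/resources"
LIGHTBOX_PARAM = "resource"


def robots_headers(url):
    """Return extra response headers for a requested URL.

    Lightbox URLs (listing path plus ?resource=...) get a noindex
    header; every other URL, including the standard resource pages
    such as /resources/dlna, gets no extra header.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    on_listing_path = parts.path.rstrip("/") == LIGHTBOX_PATH
    if on_listing_path and LIGHTBOX_PARAM in query:
        # "follow" keeps links on the page crawlable while the
        # page itself is dropped from the index.
        return {"X-Robots-Tag": "noindex, follow"}
    return {}


if __name__ == "__main__":
    for url in ("/resources/?resource=dlna", "/resources/dlna"):
        print(url, robots_headers(url))
```

Keying on the query parameter rather than the page itself means the standard resource pages stay in place and indexable, so nothing has to be deleted for the lightbox URLs to drop out.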
-
I think the only way to do that is to delete the actual Resource page, since that's where the lightbox pulls the content. I don't want to delete these pages.
-
I would simply delete the file and then it will 404.
However, if you think that valuable links might be pointing to it, then I would do a 301 redirect to the most relevant page.
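If the 301 route is taken for the lightbox URLs, the most relevant target is presumably the standard page each lightbox pulls from. The sketch below shows that mapping in Python. It assumes the `resource` value matches the page slug, as in the example (`dlna` maps to /resources/dlna); the function name `redirect_for` is hypothetical, and the real rule would live in whatever the site's server or CMS uses for redirects.

```python
from urllib.parse import urlsplit, parse_qs, quote


def redirect_for(url):
    """Return (301, target_path) for a lightbox URL, else None.

    Assumes lightbox URLs look like /resources/?resource=<slug>
    and that the standard page lives at /resources/<slug>.
    """
    parts = urlsplit(url)
    values = parse_qs(parts.query).get("resource")
    if parts.path.rstrip("/") != "/resources" or not values:
        return None
    # Percent-encode the slug so an odd value cannot alter the path.
    slug = quote(values[0], safe="")
    return 301, f"/resources/{slug}"


if __name__ == "__main__":
    print(redirect_for("/resources/?resource=dlna"))
    print(redirect_for("/resources/dlna"))
```

Note that redirecting these URLs would also stop the lightbox from opening at its own address, so this only fits if that behavior can be given up.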
-
Thanks for the advice. Any idea on how to deindex just the lightbox URL? Never been in this position and I'd like to direct my client to the right resolution.
-
Thank you. My answer is then easy.
If this was my site, I would give the entire definition on the resource page and get rid of the fancy pants lightbox.
A page with 500 definitions is a lot. So I might divide them up into logical categories and optimize those pages for more specific queries.
If there are definitions that are of very high visitor interest, I would make a full page article about them and link to them from the resource page.
-
Nope. Almost none.
-
over 500 indexed pages that are either a simple lightbox or a full page with very minimal content
Are these pages pulling in much traffic from search? I have other thoughts but would like to consider this information before saying how I would handle this if it were my site.