Mod rewrite question
-
Sorry in advance if this isn't the best place to ask this question.
Google Webmaster Tools has recently identified a ton of "Not Found" pages, which are actual pages with some digits appended at the end.
For example, suppose an actual page on my blog is:
(A) http://www.example.com/blog/2012/09/my-post-title/
This page works just fine.
However, GWT has identified the following page as a "not found" page:
(B) http://www.example.com/blog/2012/09/my-post-title/9157586677/1846732913010
This appears to be happening to hundreds of posts on my site. In each case, the "9157586677" portion of the URL is identical, but the remaining 13 digits change from page to page.
I haven't been able to determine exactly what is causing this. It's probably a social plug-in for WordPress, or perhaps Disqus, but I'm not sure which one. I'll go through a process of elimination to narrow it down over the coming week.
As a quick fix, I'd like to create a mod_rewrite rule so that requests for (B) get 301 redirected to (A). Since there are hundreds of posts, I need to do this in a way that works regardless of what's in the "/2012/09/my-post-title/" part of the URL.
Unfortunately, mod_rewrite is outside my area of expertise. Can somebody please suggest how I can handle this? Thanks in advance.
PS - As for tracking down the cause, I've looked at the source of the pages in the "Linked From" area of GWT and the Not Found link is nowhere to be found. That is why I assume the bad link is being generated by some JavaScript that is part of one of my plug-ins.
Update: It seems like Disqus is the source of these phantom links. There's considerable discussion here. I'll continue searching for a long-term solution. Meanwhile, I'd still appreciate help with the mod_rewrite question above. Thanks again.
-
I've found a solution and am posting it here in case anybody else is having the same problem:
RewriteRule ^([0-9]{4})/([0-9]{2})/([^/]+)/[0-9]+ /blog/$1/$2/$3/ [L,R=301]
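For anyone who wants to sanity-check the pattern before deploying it, here is a minimal sketch in Python. It is not part of the original solution. It assumes the rule lives in an .htaccess file inside the /blog/ directory, so that mod_rewrite matches against the path relative to /blog/ (which is what the missing "blog/" prefix in the pattern suggests). The function name redirect_target is hypothetical, and Python's re module stands in for Apache's regex engine, which should behave the same for a pattern this simple.

```python
import re

# Pattern copied from the RewriteRule above.
# Assumption: the rule sits in /blog/.htaccess, so the path being tested
# is relative to /blog/ and has no leading slash.
PATTERN = re.compile(r"^([0-9]{4})/([0-9]{2})/([^/]+)/[0-9]+")


def redirect_target(relative_path):
    """Return the 301 target for a path, or None if the rule does not apply.

    Hypothetical helper for illustration only; it mimics the substitution
    /blog/$1/$2/$3/ from the rule.
    """
    match = PATTERN.match(relative_path)
    if match is None:
        return None
    year, month, slug = match.groups()
    return f"/blog/{year}/{month}/{slug}/"


# URL (B): the phantom link with digits appended.
print(redirect_target("2012/09/my-post-title/9157586677/1846732913010"))
# URL (A): the real post, which must not be redirected.
print(redirect_target("2012/09/my-post-title/"))
```

The first call prints /blog/2012/09/my-post-title/ and the second prints None, so the real post is left alone. One caveat: the pattern fires on any numeric segment directly after the post slug, so if the site uses URLs of that shape legitimately (paginated posts, for example), those would be redirected too. Anchoring the pattern more tightly, for instance requiring two numeric segments, might be worth considering in that case.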
-
I hadn't seen the update about Disqus at the end of the post.
Please post any progress you make on this topic, ahirai.
Best regards!
-
Hi ahirai,
I was going to say you should check the "Linked From" tab in GWT, but since you've already done that, I'm fairly sure a plugin that generates content is creating this issue.
I'm not an Apache expert either, so I can't give you a method to do the dirty work, but I can tell you the problem is caused by a third-party plugin that injects content into the site.
Please post your progress in this topic!
Good luck!!