Mod rewrite question
-
Sorry in advance if this isn't the best place to ask this question.
Google Webmaster Tools has recently identified a ton of "Not Found" pages whose URLs are those of actual pages with some digits appended at the end.
For example, suppose an actual page on my blog is:
(A) http://www.example.com/blog/2012/09/my-post-title/
This page works just fine.
However, GWT has identified the following page as a "not found" page:
(B) http://www.example.com/blog/2012/09/my-post-title/9157586677/1846732913010
This appears to be happening to hundreds of posts on my site. In each case, the "9157586677" portion of the URL is identical, but the remaining 13 digits change from page to page.
I haven't been able to determine exactly what is causing this. It's probably a social plug-in for WordPress, or perhaps Disqus, but I'm not sure which one. I'll go through a process of elimination to narrow it down over the coming week.
As a quick fix, I'd like to create a mod_rewrite rule so that requests for (B) get 301-redirected to (A). Since there are hundreds of posts, I need to do this in a way that works regardless of what's in the "/2012/09/my-post-title/" part of the URL.
Unfortunately, mod_rewrite is outside my area of expertise. Can somebody please suggest how I can handle this? Thanks in advance.
PS - As for tracking down the cause, I've looked at the source of the pages in the "Linked From" area of GWT, and the Not Found link is nowhere to be found. That is why I assume the bad link is being generated by some JavaScript that is part of one of my plug-ins.
Update: It seems like Disqus is the source of these phantom links. There's considerable discussion here. I'll continue searching for a long-term solution. Meanwhile, I'd still appreciate help with the mod_rewrite question above. Thanks again.
-
I've found a solution and am posting it here in case anybody else is having the same problem:
RewriteRule ^([0-9]{4})/([0-9]{2})/([^/]+)/[0-9]+ /blog/$1/$2/$3/ [L,R=301]
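For anyone who wants to sanity-check the pattern before putting it live, here is a rough Python sketch of what the rule above should do. It assumes the rule sits in an .htaccess file inside /blog/, so that mod_rewrite matches against the path with the leading /blog/ already stripped. The function name `redirect_target` is made up for illustration, and Python's `re` module is not the same engine Apache uses, although a pattern this simple should behave the same way in both.

```python
import re

# Same pattern as the RewriteRule above.
# Assumption: the rule lives in /blog/.htaccess, so the path being
# matched is relative to /blog/ (no leading "/blog/").
PATTERN = re.compile(r"^([0-9]{4})/([0-9]{2})/([^/]+)/[0-9]+")


def redirect_target(path):
    """Return the 301 target for a path relative to /blog/,
    or None if the rule would not fire. (Hypothetical helper.)"""
    m = PATTERN.match(path)
    if m is None:
        return None
    year, month, slug = m.groups()
    # Mirrors the substitution /blog/$1/$2/$3/
    return "/blog/{}/{}/{}/".format(year, month, slug)


if __name__ == "__main__":
    # The phantom URL (B) from the question
    print(redirect_target("2012/09/my-post-title/9157586677/1846732913010"))
    # The real post URL (A)
    print(redirect_target("2012/09/my-post-title/"))
    # A post URL followed by a single numeric segment
    print(redirect_target("2012/09/my-post-title/2"))
```

The first call returns the clean post URL and the second returns None, so the real page is left alone. The third call also returns the clean post URL: any numeric segment after the post slug triggers the redirect. If the site uses multi-page posts with URLs of that shape, a tighter pattern might be worth considering.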
-
I hadn't seen the update about Disqus at the end of the post.
Please post any progress you make on this topic, Ahirai.
Best regards!
-
Hi ahirai,
I was going to say you should check the "Linked From" tab in GWT, but since you already did, I'm pretty sure that a plugin that generates content is creating this issue from scratch.
Since I'm not an Apache expert either, I can't give you a method to do the dirty work, but I can tell you the problem is created by some third-party plugin that generates content on the site.
Please post your progress in this topic!
Good luck!!