Mod rewrite question
-
Sorry in advance if this isn't the best place to ask this question.
Google Webmaster Tools has recently identified a ton of "Not Found" pages, which are actual pages with some digits appended at the end.
For example, suppose an actual page on my blog is:
(A) http://www.example.com/blog/2012/09/my-post-title/
This page works just fine.
However, GWT has identified the following page as a "not found" page:
(B) http://www.example.com/blog/2012/09/my-post-title/9157586677/1846732913010
This appears to be happening to hundreds of posts on my site. In each case, the "9157586677" portion of the URL is identical, but the remaining 13 digits change from page to page.
I haven't been able to determine exactly what is causing this to happen - it's probably a social plug-in for Wordpress, or perhaps Disqus, but I'm not sure which one. I'll go through a process of elimination to narrow it down over the coming week.
As a quick fix, I'd like to create a ModRewrite rule so that requests for (B) get 301 redirected to (A). Since there are hundreds of posts, I need to do this in a way that works regardless of what's in the "/2012/09/my-post-title/" part of the URL.
Unfortunately, mod-rewrite is outside of my area of expertise. Can somebody please suggest how I can handle this? Thanks in advance.
PS - As for tracking down the cause, I've looked at the source of the pages in the "Linked From" area of GWT and the Not Found link is nowhere to be found. That is why I assume the bad link is being generated by some javascript that is a part of one of my plug-ins.
Update: It seems like Disqus is the source of these phantom links. There's considerable discussion here. I'll continue searching for a long-term solution. Meanwhile, I'd still appreciate help with the mod-rewrite question above. Thanks again.
-
I've found a solution and am posting it here in case anybody else is having the same problem:
RewriteRule ^([0-9]{4})/([0-9]{2})/([^/]+)/[0-9]+ /blog/$1/$2/$3/ [L,R=301]
-
I hadnt seen the update over Disquss at the end of the post.
Please, post all your advances on this topic Ahirai
Best regards!
-
Hi ahirai,
I was gonna say you should check the linked from tab in GWT but since you actually did it, for me its pretty sure that a plugin that drives content is creating this issue from scratch.
Since i´m neither an apache expert, i can´t give you a method to do the dirty work, but i can tell you the problem is created by some 3rd party plugin driving content of site.
Please, post your advances in the topic!
Good luck!!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
De-indexing and SSL question
Few days ago Google indexed hundreds of my directories by mistake (error with plugins/host), my traffic dropped as a consequence. Anyway I fixed that and submitted a URL removal request. Now just waiting things to go back to normality. Meantime I was supposed to move my website to HTTPS this week. Question: Should I wait until this indexing error has been fixed or I may as well go ahead with the SSL move?
Technical SEO | | fabx0 -
Site Penalized - 301 Redirect Question
Hello, We have a website that was penalized roughly two years by Google for "Unnatural Links"... We are experiencing a lot of problems with this site, completely unrelated to the penalty or SERPS, and we're debating doing a 301 Re-direct to another site we own that is totally clean and has no "Unnatural Links". If we do a 301 from the penalized site to our alternative website, will there be any cross-contamination? Will the penalty carry over to our other site? Please let me know what you guys think. Thanks
Technical SEO | | Prime850 -
Questionable Referral Traffic
Hey SEOMozers, I'm working with a client that has a suspicious traffic pattern going on. In October, a referral domain called profitclicking.com started passing visits to the site. Almost, in parallel the overall visits decreased anywhere from 35 to 50%. After checking out profitclicking.com more, it promises more traffic "with no SEO knowledge". The client doesn't think that this service was signed up for internally. Regardless, it obviously smells pretty fishy, and I'm searching for a way I can disallow traffic from this site. Could I simply just write a simple disallow statement in the robots.txt and be done with it? Just wanted to see if anyone else had any other ideas before recommending a solution. Thanks!
Technical SEO | | kylehungate0 -
Duplicate Content Question (E-Commerce Site)
Hi All, I have a page that ranks well for the keyword “refurbished Xbox 360”. The ranking page is an eCommerce product details page for a particular XBOX 360 system that we do not currently have in stock (currently, we do not remove a product details page from the website, even if it sells out – as we bring similar items into inventory, e.g. more XBOX 360s, new additional pages are created for them). Long story short, given this way of doing things, we have now accumulated 79 “refurbished XBOX 360” product details pages across the website that currently, or at some point in time, reflected some version of a refurbished XBOX 360 in our inventory. From an SEO standpoint, it’s clear that we have a serious duplicate content problem with all of these nearly identical XBOX 360 product pages. Management is beginning to question why our latest, in-stock, XBOX 360 product pages aren't ranking and why this stale, out-of-stock, XBOX 360 product page still is. We are in obvious need of a better process for retiring old, irrelevant (product) content and eliminating duplicate content, but the question remains, how exactly is Google choosing to rank this one versus the others since they are primarily duplicate pages? Has Google simply determined this one to be the original? What would be the best practice approach to solving a problem like this from an SEO standpoint – 301 redirect all out of stock pages to in stock pages, remove the irrelevant page? Any thoughts or recommendations would be greatly appreciated. Justin
Technical SEO | | JustinGeeks0 -
Duplicate titles Question
Hi eveyone, I have around 1000 duplicate titles and meta description. The poblem was that I had pages in my home page and different pages had the same title. For example, index.php/site/articles/should_you_eat_protein_every_2-3_hours_for_muscle_growth/
Technical SEO | | anoopbal/index.php/site/articles/should_you_eat_protein_every_2-3_hours_for_muscle_growth/N12/
/index.php/site/articles/should_you_eat_protein_every_2-3_hours_for_muscle_growth/N1444/
/index.php/site/articles/should_you_eat_protein_every_2-3_hours_for_muscle_growth/N1448/
/index.php/site/articles/should_you_eat_protein_every_2-3_hours_for_muscle_growth/N1448/P6/
/index.php/site/articles/should_you_eat_protein_every_2-3_hours_for_muscle_growth/N1452/I have 172 of the same page!So I took off all the pagination on my home page and just added 'click fo more'. When they click more, it takes them to the category.So my question is will google slowly start deleting or non-indexing these duplicate titles or pages as I have removed it from my website? (Just so that you know I added a canonical link and figuring out how to add page numbers to met titles and meta description tags for categories with pages)
0 -
301 Redirect Question
I'm working on a site that has a lot of indexed pages and backlinks to both domain.com and www.domain.com. Will using a 301 redirect to send domain.com to www.domain.com merge all of the indexed pages and links over to www.domain.com, thereby strengthening the www?
Technical SEO | | Yo_Adrian0 -
SEO-MOZ bar question on root vs subdomain / canonicalization issues
When I look at the SEO-MOZ bar for our site and click next to subdomain (# links from #domains) it shows my main incoming links etc. but when I click on root domain ity shows mydomain/default.asp and 4 incoming links as well as a message that says this url redirects to another url. Does this imply canonicalization issues or is there a 301 redirect to my non /default.asp correcting this issue. Thanks kindly, Howard
Technical SEO | | mrkingsley0 -
Complex duplicate content question
We run a network of three local web sites covering three places in close proximity. Each sitehas a lot of unique content (mainly news) but there is a business directory that is shared across all three sites. My plan is that the search engines only index the business in the directory that are actually located in the place the each site is focused on. i.e. Listing pages for business in Alderley Edge are only indexed on alderleyedge.com and businesses in Prestbury only get indexed on prestbury.com - but all business have a listing page on each site. What would be the most effective way to do this? I have been using rel canonical but Google does not always seem to honour this. Will using meta noindex tags where appropriate be the way to go? or would be changing the urls structure to have the place name in and using robots.txt be a better option. As an aside my current url structure is along the lines of: http://dev.alderleyedge.com/directory/listing/138/the-grill-on-the-edge Would changing this have any SEO benefit? Thanks Martin
Technical SEO | | mreeves0