Lately I have noticed Google indexing many files on the site without the .html extension
-
Hello,
Our site, while we convert, remains in HTML 4.0.
File names such as http://www.sample.com/samples/index.shtml are being picked up in the SERPs as http://www.sample.com/samples/ even when I use the rel="canonical" tag and specify the full file name in it, as recommended. According to Moz, the truncated URL (http://www.sample.com/samples/) has fewer incoming links than the full file name.
I have recently seen a loss in placement (the Moz stats are showing a decline of late), and I am not sure whether this is the cause. Of course, I am aware of other possible reasons, such as the site not being in HTML5 yet.
Any help with this would be great.
Thank you in advance
-
Can you clarify what you're concerned about for 301 redirects in terms of link juice?
A 301 redirect doesn't carry as much link juice as a direct link, but that doesn't affect links that already point to the correct URL. It only affects the links that otherwise wouldn't pass any link juice to your end destination at all. (Though, if your canonical is working correctly, it'll pass the same amount of link juice as a 301 redirect.)
Dr. Pete goes into this a bit more over here: https://moz.com/community/q/do-canonical-tags-pass-all-of-the-link-juice-onto-the-url-they-point-to
-
Many thanks for taking the time to respond Kristina.
-
I don't like to do redirects, as so many have warned of the consequences in terms of link juice.
-
No, I don't link to the pages in question using "/" rather than the ".shtml" version of the page indexed.
-
I have found that a few external sources (recent linkers) use the "/" version, but they likely did so only because they saw it displayed that way in the SERPs. No commercial or other affiliate sites do.
The reason I was really confused is that some pages are indexed using the "/" while others are not, with no apparent reason I could find. The "/" versions of pages still remain on the first page for their keywords, even with far less domain authority and far fewer pages linking to them (for now!). We will be moving to another platform with a different default extension, so I wonder how that will be handled. Endless mysteries.
Thank you again for your time and suggestions,
Greg
-
Hmm, that doesn't seem good. It's hard to say whether this is causing the decline in your rankings, but either way, you want to make sure that you're not splitting your link equity between your / and .shtml pages. Here's what I'd do:
- If you can, 301 redirect / pages to .shtml pages. Obviously, it'd be easier if the canonical worked, but it sounds like it doesn't.
- Use ScreamingFrog or DeepCrawl to look through internal pages on your site to see if you're ever linking to the / version of pages rather than the .shtml pages. When Google chooses a different version of a URL over the canonical one, it's often because that's how it sees internal links pointing to the page. Make sure that you only have links to the .shtml version of the page.
- Use a tool like Moz or Ahrefs to find all external links to your site. For any links that you built yourself, or where you have a partnership with the owners, make sure that they're linking to the .shtml version of the page. I could especially see your ad partners using / because it looks cleaner in front of tracking parameters than .shtml does.
After that, wait and see if Google fixes the problem.
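If you don't have a crawler handy, the internal-link check from the second bullet can be roughed out by hand. This is only a minimal sketch in Python, not a replacement for ScreamingFrog or DeepCrawl: it looks at one page's HTML and lists the internal links whose path ends in "/". The `directory_style_links` function, the `LinkCollector` class, and the sample markup are all made up for illustration, and it assumes you have already fetched the page source yourself.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def directory_style_links(html, page_url):
    """Return internal links on the page whose path ends in "/"."""
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc
    found = []
    for href in collector.hrefs:
        # Resolve relative links against the page they appear on.
        absolute = urljoin(page_url, href)
        parts = urlparse(absolute)
        if parts.netloc == site_host and parts.path.endswith("/"):
            found.append(absolute)
    return found


# Hypothetical page source, using the example URLs from the question.
sample_html = """
<a href="/samples/">Samples (directory form)</a>
<a href="/samples/index.shtml">Samples (full file name)</a>
<a href="http://other.example.org/page/">External</a>
"""
print(directory_style_links(sample_html, "http://www.sample.com/about.shtml"))
```

Any URL this prints is an internal link you would want to change to the full .shtml file name.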
Also worth noting: have you thought about changing your default to /? That's more common today, so you're probably getting a lot of external links with / instead of .shtml, and you'll never be able to fix that problem. If that's a possible solution, you may want to explore it.
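If you do go that route, you'd need a list of old-to-new URLs to drive the 301 redirects. Here's a rough Python sketch of that mapping, assuming the default document is index.shtml (taken from the example URL in the question) and that only those index files collapse to "/". The `to_directory_url` function is hypothetical, and what to do with your other .shtml pages is a separate decision this doesn't cover.

```python
from urllib.parse import urlparse, urlunparse

# Assumption: the server's default document, as in the question's example URL.
DEFAULT_DOCUMENT = "index.shtml"


def to_directory_url(url):
    """Map .../index.shtml to .../ and leave every other URL unchanged."""
    parts = urlparse(url)
    if parts.path.endswith("/" + DEFAULT_DOCUMENT):
        # Drop the file name but keep the trailing slash of the directory.
        new_path = parts.path[: -len(DEFAULT_DOCUMENT)]
        return urlunparse(parts._replace(path=new_path))
    return url


old_urls = [
    "http://www.sample.com/samples/index.shtml",
    "http://www.sample.com/samples/widget.shtml",
]
for old in old_urls:
    print(old, "->", to_directory_url(old))
```

Each pair where the two URLs differ is a candidate 301 from the old address to the new one.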
Good luck!
Kristina