Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Increase of 404 error after change of encoding
-
Hello,
We just have launch a new version of our website with a new utf-8 encoding.
Thing is, we use comma as a separator and since the new website went live, I have a massive increase of 404 error of comma-encoded URL.
Here is an example :
http://web.bons-de-reduction.com/annuaire%2C321-sticker%2Csite%2Cpromotions%2C5941.html
instead of :
http://web.bons-de-reduction.com/annuaire,321-sticker,site,promotions,5941.html
I check with Screaming Frog SEO and Xenu, I can't manage to find any encoded URL.
Is anyone have a clue on how to fix that ?
Thanks
-
I will take a look at it but it's not the issue that SEOMoz tell me as this format concerns only images. It's actually a little trick to do lazyloading on images.
The link you pointed out on your example is good ("/annuaire,amkashop,site,promotion...) as comma are not encoded.
And for your example I see no issue except capitalization.
I bet this is a Moz problem because when I fetch as Googlebot, I don't find encoded URL...
-
just wanted to give you one more thing that I think would help http://www.w3schools.com/html5/att_meta_charset.asp
I believe you should clean up your encoding and that it will not be a big deal.
Sincerely,
Tom
-
I thought this may help as well because you do have to clean up your source code
The online quoted-printable encoder tool first encodes the input text in either UTF-8 or ISO-8859-1. The characters are then output according to this schema:
| Character | Result | Comment |
| "=" (0x3D) | =3D | Special handling of the equal sign |
| " " (0x20) to "~" (0x7E) | Unmodified | Printable ASCII (7 bits) |
| Any other | =XX | Hexadecimal char code |Since quoted-printable does not in itself specify the text character encoding, it is important to specify this correctly when used. The online quoted-printable decoder tool attempts to auto-detect the text encoding.
See the Wikipedia article on quoted-printable for more info.
-
I would use a tool similar to this http://www.percederberg.net/tools/text_converter.html
as you can see your links for your gif photos are encoded "data:image/gif;base64"
please give it a try and tell me if that helps?
Sincerely,
Thomas
-
Hello and thanks for your answer.
No word involved here.
We move from :
http-equiv="content-type" content="text/html; charset=iso-8859-1" />
to
charset="utf-8">
Everything is fine except for Mozbot
-
what you need to do is go into your site and cleanup the links that have been converted and messed up because of the change. Once you clean them you will have no problem this is what your links look like
data:image/gif;base64,R0lGODlhAQABAIAAAP///////yH+A1BTQQAsAAAAAAEAAQAAAgJEAQA7
utf-8 is definitely the right coding it's very good you just have to go in and clean it up looking your source code
"
| {"m":2571,"a":"wrap"}" width="108" height="65" data-original="/upload/merchants_logo/108-65/amkashop.jpeg" src="data:image/gif;base64,R0lGODlhAQABAIAAAP///////yH+A1BTQQAsAAAAAAEAAQAAAgJEAQA7"> <noscript></span><img data-merchant="2571" class="merchantLogo lazy" data-out="{"m":2571,"a":"wrap"}" width="108" height="65" src="/upload/merchants_logo/108-65/amkashop.jpeg" alt="Amkashop"><span></noscript> |
| |<a <span="">href</a><a <span="">="</a>/annuaire,amkashop,site,promotions,2685.html" title="Amkashop">Code promo Amkashop"
|
I hope I've been of help to you.
Thomas
-
Did you happened to possibly write it using Microsoft Word and paste content in? Or are you speaking about a website that you converted from another encoding to Unicode 8?
sincerely,
Thomas
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved Link Tracking List Error
"I have been maintaining 5 directories of backlinks in the 'Link Tracking List' section for several months. However, I am unable to locate any of these links at this time. Additionally, the link from my MOZ profile is currently broken and redirects to an error page, no to Elche Se Mueve. Given the premium pricing of MOZ's services, these persistent errors are unacceptable."
Moz Pro | | Alberto D.0 -
Pages with Duplicate Content Error
Hello, the result of renewed content appeared in the scan results in my Shopify Store. But these products are unique. Why am I getting this error? Can anyone please help to explain why? screenshot-analytics.moz.com-2021.10.28-19_53_09.png
Moz Pro | | gokimedia0 -
What are main factors to increase DA?
Hi, I'm running a blog and working continuously on it and creating both DoFollow and NoFollow links but DA of website best heated vest isn't increasing even I"m creating guest posts and other links. Can you please guide me what are the factors to increase DA so I can work on my website and increase my site's DA as well? Thank you!
Moz Pro | | loukris2350 -
Moz-Specific 404 Errors Jumped with URLs that don't exist
Hello, I'm going to try and be as specific as possible concerning this weird issue, but I'd rather not say specific info about the site unless you think it's pertinent. So to summarize, we have a website that's owned by a company that is a division of another company. For reference, we'll say that: OURSITE.com is owned by COMPANY1 which is owned by AGENCY1 This morning, we got about 7,000 new errors in MOZ only (these errors are not in Search Console) for URLs with the company name or the agency name at the end of the url. So, let's say one post is: OURSITE.com/the-article/ This morning we have an error in MOZ for URLs OURSITE.com/the-article/COMPANY1 OURSITE.com/the-article/AGENCY1 x 7000+ articles we have created. Every single post ever created is now an error in MOZ because of these two URL additions that seem to come out of nowhere. These URLs are not in our Sitemaps, they are not in Google... They simply don't exist and yet MOZ created an an error with them. Unless they exist and I don't see them. Obviously there's a link to each company and agency site on the site in the about us section, but that's it.
Moz Pro | | CJolicoeur0 -
Htaccess and robots.txt and 902 error
Hi this is my first question in here I truly hope someone will be able to help. It's quite a detailed problem and I'd love to be able to fix it through your kind help. It regards htaccess files and robot.txt files and 902 errors. In October I created a WordPress website from what was previously a non-WordPress site it was quite dated. I had built the new site on a sub-domain I created on the existing site so that the live site could remain live whilst I created on the subdomain. The site I built on the subdomain is now live but I am concerned about the existence of the old htaccess files and robots txt files and wonder if I should just delete the old ones to leave the just the new on the new site. I created new htaccess and robots.txt files on the new site and have left the old htaccess files there. Just to mention that all the old content files are still sat on the server under a folder called 'old files' so I am assuming that these aren't affecting matters. I access the htaccess and robots.txt files by clicking on 'public html' via ftp I did a Moz crawl and was astonished to 902 network error saying that it wasn't possible to crawl the site, but then I was alerted by Moz later on to say that the report was ready..I see 641 crawl errors ( 449 medium priority | 192 high priority | Zero low priority ). Please see attached image. Each of the errors seems to have status code 200; this seems to be applying to mainly the images on each of the pages: eg domain.com/imagename . The new website is built around the 907 Theme which has some page sections on the home page, and parallax sections on the home page and throughout the site. To my knowledge the content and the images on the pages are not duplicated because I have made each page as unique and original as possible. The report says 190 pages have been duplicated so I have no clue how this can be or how to approach fixing this. Since October when the new site was launched, approx 50% of incoming traffic has dropped off at the home page and that is still the case, but the site still continues to get new traffic according to Google Analytics statistics. However Bing Yahoo and Google show a low level of Indexing and exposure which may be indicative of the search engines having difficulty crawling the site. In Google Analytics in Webmaster Tools, the screen text reports no crawl errors. W3TC is a WordPress caching plugin which I installed just a few days ago to speed up page speed, so I am not querying anything here about W3TC unless someone spots that this might be a problem, but like I said there have been problems re traffic dropping off when visitors arrive on the home page. The Yoast SEO plugin is being used. I have included information about the htaccess and robots.txt files below. The pages on the subdomain are pointing to the live domain as has been explained to me by the person who did the site migration. I'd like the site to be free from pages and files that shouldn't be there and I feel that the site needs a clean up as well as knowing if the robots.txt and htaccess files that are included in the old site should actually be there or if they should be deleted... ok here goes with the information in the files. Site 1) refers to the current website. Site 2) refers to the subdomain. Site 3 refers to the folder that contains all the old files from the old non-WordPress file structure. **************** 1) htaccess on the current site: ********************* BEGIN W3TC Browser Cache <ifmodule mod_deflate.c=""><ifmodule mod_headers.c="">Header append Vary User-Agent env=!dont-vary</ifmodule>
Moz Pro | | SEOguy1
<ifmodule mod_filter.c="">AddOutputFilterByType DEFLATE text/css text/x-component application/x-javascript application/javascript text/javascript text/x-js text/html text/richtext image/svg+xml text/plain text/xsd text/xsl text/xml image/x-icon application/json
<ifmodule mod_mime.c=""># DEFLATE by extension
AddOutputFilter DEFLATE js css htm html xml</ifmodule></ifmodule></ifmodule> END W3TC Browser Cache BEGIN W3TC CDN <filesmatch ".(ttf|ttc|otf|eot|woff|font.css)$"=""><ifmodule mod_headers.c="">Header set Access-Control-Allow-Origin "*"</ifmodule></filesmatch> END W3TC CDN BEGIN W3TC Page Cache core <ifmodule mod_rewrite.c="">RewriteEngine On
RewriteBase /
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule .* - [E=W3TC_ENC:_gzip]
RewriteCond %{HTTP_COOKIE} w3tc_preview [NC]
RewriteRule .* - [E=W3TC_PREVIEW:_preview]
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} =""
RewriteCond %{REQUEST_URI} /$
RewriteCond %{HTTP_COOKIE} !(comment_author|wp-postpass|w3tc_logged_out|wordpress_logged_in|wptouch_switch_toggle) [NC]
RewriteCond "%{DOCUMENT_ROOT}/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html%{ENV:W3TC_ENC}" -f
RewriteRule .* "/wp-content/cache/page_enhanced/%{HTTP_HOST}/%{REQUEST_URI}/_index%{ENV:W3TC_PREVIEW}.html%{ENV:W3TC_ENC}" [L]</ifmodule> END W3TC Page Cache core BEGIN WordPress <ifmodule mod_rewrite.c="">RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]</ifmodule> END WordPress ....(((I have 7 301 redirects in place for old page url's to link to new page url's))).... #Force non-www:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www.domain.co.uk [NC]
RewriteRule ^(.*)$ http://domain.co.uk/$1 [L,R=301] **************** 1) robots.txt on the current site: ********************* User-agent: *
Disallow:
Sitemap: http://domain.co.uk/sitemap_index.xml **************** 2) htaccess in the subdomain folder: ********************* Switch rewrite engine off in case this was installed under HostPay. RewriteEngine Off SetEnv DEFAULT_PHP_VERSION 53 DirectoryIndex index.cgi index.php BEGIN WordPress <ifmodule mod_rewrite.c="">RewriteEngine On
RewriteBase /WPnewsiteDee/
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /subdomain/index.php [L]</ifmodule> END WordPress **************** 2) robots.txt in the subdomain folder: ********************* this robots.txt file is empty **************** 3) htaccess in the Old Site folder: ********************* Deny from all *************** 3) robots.txt in the Old Site folder: ********************* User-agent: *
Disallow: / I have tried to be thorough so please excuse the length of my message here. I really hope one of you great people in the Moz community can help me with a solution. I have SEO knowledge I love SEO but I have not come across this before and I really don't know where to start with this one. Best Regards to you all and thank you for reading this. moz-site-crawl-report-image_zpsirfaelgm.jpg0 -
404 Crawl Diagnostics with void(0) appended to URL
Hello I am getting loads of 404 reported in my Crawl report, all appended with void(0) at the end. For example: http://lfs.org.uk/films-and-filmmakers/watch-our-films/1289/void(0)
Moz Pro | | moshen
The site is running on Drupal 7, Has anyone come across this before? Kind Regards Moshe | http://lfs.org.uk/films-and-filmmakers/watch-our-films/1289/void(0) |0 -
404 even after Successful 301 Redirection
Hi, I've got quite few 404 error links on my site and I manually redirected all of them one by one with 301. They are redirecting successfully. But when I check my MOZ analysis, it still shows me as 404 error. I've done this about 4 days ago and MOZ crawled to my site couple times after that if it's not everyday. Do you know what the issue could be? And how can I fix it? PS: I've used Wordpress Redirection tool for it first and redirection did not work. Then I had to install the Simple 301 Redirects plugin to get it done.
Moz Pro | | nunobaronio0