Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Increase of 404 errors after change of encoding
-
Hello,
We have just launched a new version of our website with a new UTF-8 encoding.
The thing is, we use commas as separators in our URLs, and since the new website went live I have seen a massive increase in 404 errors for URLs in which the commas are percent-encoded.
Here is an example:
http://web.bons-de-reduction.com/annuaire%2C321-sticker%2Csite%2Cpromotions%2C5941.html
instead of:
http://web.bons-de-reduction.com/annuaire,321-sticker,site,promotions,5941.html
I checked with Screaming Frog SEO and Xenu, and I can't find any encoded URLs.
Does anyone have a clue how to fix this?
Thanks
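For readers comparing the two URLs above: `%2C` is simply the percent-encoded form of a comma, so both addresses name the same path once decoded. A minimal Python sketch of that mapping follows, assuming you only want to decode commas in the path and leave every other escape alone. The helper name `decode_commas` is hypothetical and is not something the site or Moz provides.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def decode_commas(url: str) -> str:
    """Return the URL with percent-encoded commas in the path decoded.

    Only %2C / %2c in the path is touched; the query string and any
    other escapes are left exactly as they were.
    """
    parts = urlsplit(url)
    path = re.sub(r"%2[Cc]", ",", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


encoded = "http://web.bons-de-reduction.com/annuaire%2C321-sticker%2Csite%2Cpromotions%2C5941.html"
print(decode_commas(encoded))
```

The same mapping could serve as the basis for a server-side redirect from the encoded form to the comma form, although whether that is appropriate depends on how the site routes requests.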
-
I will take a look at it, but it's not the issue that SEOmoz reports to me, as this format concerns only images. It's actually a little trick to do lazy loading on images.
The link you pointed out in your example is good ("/annuaire,amkashop,site,promotion...), as the commas are not encoded.
And for your example, I see no issue except capitalization.
I bet this is a Moz problem, because when I fetch as Googlebot I don't find any encoded URLs...
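One way to double-check the claim that no encoded URL appears in the served HTML is to scan the markup directly. The sketch below is only an illustration, assuming Python's standard `html.parser` module; the class `EncodedCommaFinder` and the function `find_encoded_commas` are hypothetical names, and the sample markup is made up from the URLs quoted in this thread.

```python
from html.parser import HTMLParser


class EncodedCommaFinder(HTMLParser):
    """Collect href/src attribute values that contain a percent-encoded comma."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Attribute values can be None (e.g. <a href>), so guard first.
            if name in ("href", "src") and value and "%2c" in value.lower():
                self.hits.append((tag, name, value))


def find_encoded_commas(html: str):
    finder = EncodedCommaFinder()
    finder.feed(html)
    finder.close()
    return finder.hits


sample = (
    '<a href="/annuaire,amkashop,site,promotions,2685.html">ok</a>'
    '<a href="/annuaire%2C321-sticker%2Csite%2Cpromotions%2C5941.html">encoded</a>'
)
print(find_encoded_commas(sample))
```

If a scan like this returns nothing for the real pages, the encoded form is probably being produced by the crawler or by an external link rather than by the site's own markup.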
-
I just wanted to give you one more thing that I think would help: http://www.w3schools.com/html5/att_meta_charset.asp
I believe you should clean up your encoding, and that it will not be a big deal.
Sincerely,
Tom
-
I thought this might help as well, because you do have to clean up your source code.
The online quoted-printable encoder tool first encodes the input text in either UTF-8 or ISO-8859-1. The characters are then output according to this schema:
| Character | Result | Comment |
| "=" (0x3D) | =3D | Special handling of the equal sign |
| " " (0x20) to "~" (0x7E) | Unmodified | Printable ASCII (7 bits) |
| Any other | =XX | Hexadecimal char code |
Since quoted-printable does not in itself specify the text character encoding, it is important to specify this correctly when used. The online quoted-printable decoder tool attempts to auto-detect the text encoding.
See the Wikipedia article on quoted-printable for more info.
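The schema in the table can be reproduced with Python's standard `quopri` module. This is a small sketch rather than a description of how the online tool works internally; the wrapper name `qp` is hypothetical.

```python
import quopri


def qp(text: str, charset: str) -> str:
    """Encode text to bytes in the given charset, then to quoted-printable."""
    return quopri.encodestring(text.encode(charset)).decode("ascii")


print(qp("a=b", "utf-8"))       # the equal sign gets special handling
print(qp("abc~", "utf-8"))      # printable ASCII is left unmodified
print(qp("é", "utf-8"))         # two bytes in UTF-8, so two =XX groups
print(qp("é", "iso-8859-1"))    # one byte in ISO-8859-1, so one =XX group
```

The last two calls show why the character encoding has to be stated separately: the same character produces different quoted-printable output depending on the charset chosen.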
-
I would use a tool similar to this: http://www.percederberg.net/tools/text_converter.html
As you can see, the links for your GIF images are encoded as "data:image/gif;base64".
Please give it a try and tell me if that helps.
Sincerely,
Thomas
-
Hello, and thanks for your answer.
No Word involved here.
We moved from:
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
to
<meta charset="utf-8">
Everything is fine except for Mozbot.
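As a side note on the charset change described above: a comma is the same single byte (0x2C) in both ISO-8859-1 and UTF-8, so its percent-encoded form does not depend on the page encoding, whereas accented characters do differ. The sketch below illustrates this with Python's `urllib.parse.quote`; the helper name `percent_encode` is hypothetical.

```python
from urllib.parse import quote


def percent_encode(text: str, charset: str) -> str:
    """Percent-encode every reserved or non-ASCII character using the given charset."""
    return quote(text, safe="", encoding=charset)


for charset in ("iso-8859-1", "utf-8"):
    print(charset, percent_encode(",", charset), percent_encode("é", charset))

# Listing the comma as a safe character leaves it unescaped.
print(quote("/annuaire,321-sticker.html", safe="/,"))
```

This suggests the `%2C` form is not a direct by-product of switching charsets, but of some component choosing to escape the comma.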
-
What you need to do is go into your site and clean up the links that have been converted and messed up by the change. Once you clean them up, you will have no problem. This is what your links look like:
data:image/gif;base64,R0lGODlhAQABAIAAAP///////yH+A1BTQQAsAAAAAAEAAQAAAgJEAQA7
UTF-8 is definitely the right encoding, which is very good; you just have to go in and clean it up by looking at your source code:
"
{"m":2571,"a":"wrap"}" width="108" height="65" data-original="/upload/merchants_logo/108-65/amkashop.jpeg" src="data:image/gif;base64,R0lGODlhAQABAIAAAP///////yH+A1BTQQAsAAAAAAEAAQAAAgJEAQA7"> <noscript><img data-merchant="2571" class="merchantLogo lazy" data-out="{"m":2571,"a":"wrap"}" width="108" height="65" src="/upload/merchants_logo/108-65/amkashop.jpeg" alt="Amkashop"></noscript>
<a href="/annuaire,amkashop,site,promotions,2685.html" title="Amkashop">Code promo Amkashop"
I hope I've been of help to you.
Thomas
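For anyone unsure what the `data:image/gif;base64,...` value quoted in this post contains, it can be decoded with Python's standard library. The sketch below assumes the string is a standard base64 data URI and reads only the GIF signature and the logical screen size; the function name `decode_gif_data_uri` is hypothetical.

```python
import base64
import struct

# The data URI quoted in the post above.
DATA_URI = "data:image/gif;base64,R0lGODlhAQABAIAAAP///////yH+A1BTQQAsAAAAAAEAAQAAAgJEAQA7"


def decode_gif_data_uri(uri: str):
    """Return (signature, width, height) for a base64-encoded GIF data URI."""
    header, payload = uri.split(",", 1)
    if header != "data:image/gif;base64":
        raise ValueError("not a base64 GIF data URI")
    raw = base64.b64decode(payload)
    signature = raw[:6].decode("ascii")
    # GIF stores width and height as little-endian 16-bit integers after the signature.
    width, height = struct.unpack("<HH", raw[6:10])
    return signature, width, height


print(decode_gif_data_uri(DATA_URI))
```

The decoded header describes a 1x1 pixel GIF, which is consistent with a placeholder image used for lazy loading, with the real image path held in the `data-original` attribute.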
-
Did you happen to write it in Microsoft Word and paste the content in? Or are you talking about a website that you converted from another encoding to UTF-8?
Sincerely,
Thomas