Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
XML Sitemap and unwanted URL parameters
-
We currently don't have an XML sitemap for our site. I generated one using Screaming Frog and it looks OK, but it also contains my tracking URL parameter (ref=), which I don't want Google to use, as I have specified in Google Webmaster Tools (GWT). Cleaning it up will require time and effort that I currently don't have. I also think that having a sitemap could help us on Bing.
So my question is: is it better to submit a "so-so" sitemap than to have none at all, or are the risks just too high? Could you explain what could go wrong?
Thanks!
-
Our IT department is on a big project and we won't have any support for almost a year, which is why I was looking at other solutions.
We currently add about 10 to 20 pages a month, so I could probably redo the sitemap once a month, right after the new content is published.
-
Glad I could help.
The only other issue I see with this is that your sitemap will get outdated quickly if you have a lot of content/pages being added to your site. Additional work or development may be needed to create a dynamic sitemap that auto-updates alongside the website.
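Until an auto-updating sitemap is available, regenerating the file by hand each month can be a small script. Here is a minimal sketch in Python, assuming you already have a cleaned list of URLs; the function name `build_sitemap` and the example.com URLs are made up for illustration, and only the `urlset`/`url`/`loc` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls) -> bytes:
    """Serialize an iterable of URLs as a sitemap <urlset> document (UTF-8 bytes)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # One <url><loc>...</loc></url> entry per page; ElementTree escapes
        # characters such as "&" when it serializes the text.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    sample = [
        "https://www.example.com/",
        "https://www.example.com/products?page=2&sort=name",
    ]
    print(build_sitemap(sample).decode("utf-8"))
```

Writing the returned bytes to a file named sitemap.xml and re-running the script after each batch of new pages would cover the once-a-month update described above.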
-
Thanks, I really like your answer.
I should have thought about cleaning it in Excel. I will get right on it!
-
Hi Jean-Francois,
I would try to keep your sitemap as clean as possible. Could you export all the data into a CSV and clean up the URLs using a formula? Put the full list of your URLs in column A in Excel, then use the following formula:
=IFERROR(LEFT(A1,FIND("ref=",A1)-2),A1)
Put this formula into cell B1 and drag it down all the rows. This should strip out the ref= parameter, along with the ? or & in front of it and anything after it; IFERROR leaves URLs that don't contain ref= unchanged. Then simply remove the duplicates and you have your list of URLs to create a clean sitemap.
Let me know if this helps.
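If Excel is not convenient, the same clean-up (drop the ref parameter, then remove duplicates) can be sketched in Python with the standard library. This is only an illustration: the function names `strip_param` and `clean_urls` and the example.com URLs are hypothetical, and unlike the spreadsheet formula it removes only the ref parameter and keeps any other query parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url: str, param: str = "ref") -> str:
    """Return url with every occurrence of the given query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    # urlencode re-encodes the remaining parameters, so unusual escaping in
    # the original query string may be normalized.
    return urlunsplit(parts._replace(query=urlencode(kept)))


def clean_urls(urls):
    """Strip the tracking parameter and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(strip_param(u) for u in urls))


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    sample = [
        "https://www.example.com/page-a?ref=home",
        "https://www.example.com/page-a",
        "https://www.example.com/page-b?color=red&ref=footer",
    ]
    for url in clean_urls(sample):
        print(url)
```

The resulting list can then be pasted back into a sitemap generator or used as the input for the sitemap file.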