Attack of the dummy urls -- what to do?
-
It occurs to me that a malicious program could set up thousands of links to dummy pages on a website:
www.mysite.com/dynamicpage/dummy123
www.mysite.com/dynamicpage/dummy456
etc.
How is this normally handled? Does a developer have to check every parameter for validity and, if it is invalid, automatically return a 301 redirect or a 404 Not Found? That would require a table lookup of acceptable URL parameters for all new visitors.
I was thinking that bad URL names would be rare, so it would be OK to just stop the program with a message, until I realized someone could intentionally set up links to nonexistent pages on a site.
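For illustration, the lookup described above could be sketched roughly like this in Python. It is only a sketch: the `VALID_SLUGS` set and the `status_for` function are hypothetical names, and a real site would load the list of valid pages from its own database or routing table rather than hard-code it.

```python
# Hypothetical data: the slugs that really exist under /dynamicpage/.
# On a real site this would come from the database or the routing table.
VALID_SLUGS = {"widgets", "gadgets"}


def status_for(path: str, valid_slugs: set = VALID_SLUGS) -> int:
    """Return 200 for a known /dynamicpage/<slug> path, otherwise 404."""
    # Split the path into its non-empty segments, e.g. ["dynamicpage", "widgets"].
    parts = [segment for segment in path.split("/") if segment]
    if len(parts) == 2 and parts[0] == "dynamicpage" and parts[1] in valid_slugs:
        return 200
    # Anything unrecognized gets a real 404 instead of a rendered page.
    return 404


print(status_for("/dynamicpage/widgets"))
print(status_for("/dynamicpage/dummy123"))
```

Because the check is a set membership test, it stays cheap even when a bot requests thousands of made-up URLs.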
-
Hello,
I am also having this issue, with hundreds of dummy URLs that never existed as part of our website's blog. Do I go into parameters and specify each of the dummy URLs to avoid this?
Thanks in advance for any help! (And sorry to piggyback on this question, Theodore. Hope you don't mind!)
-
Thanks Ray. Appreciate the advice!
-
It's great that you've identified issues like this. I also suggest that, if you know certain parameters are generated often and don't need to be indexed, you go into your Google Webmaster Tools account > Crawl > URL Parameters and proactively set the crawl option to 'No URLs' if appropriate. I do this with certain custom parameters for sites that are prone to having these extra URLs indexed by mistake.
-
Hi Ray-pp,
Thanks for your answer. I'm not getting anything significant, but occasionally a bot will come with extra stuff added to the parameter names, so it got me thinking that a malicious program or nasty competitor might want to do that to cause havoc. My understanding is that 404s don't hurt SEO ranking in Google. But the way things are set up now, no one would get a 404, and in fact Google would index the 'bad' pages. So maybe I need to do something proactively to 404 or 301 such pages so they never get put into an index at all.
Since my site has lots of dynamically generated pages, I've had my share of surprises, and am just trying to avoid any new ones!
-
Hi Theodore - You pose an interesting problem. Are you currently experiencing this issue? I don't see why someone would create a bunch of random nonexistent links to your site, but if they did (and the pages were receiving low-quality traffic), then I would proactively disavow the domains that created the links. That would be enough to prevent any penalties you're afraid of receiving.
If, however, you're noticing that specific 404 pages are receiving quality traffic (maybe an old page was removed but good traffic is still being sent to it), then you would want to 301 that page to the closest related page that deserves the traffic and authority.
Does that help? Maybe a little more information about your specific problem would allow me to tailor the advice better.
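As a rough illustration of the 301 suggestion above, here is a minimal Python sketch. The `REDIRECTS` mapping, its paths, and the `respond` function are all made up for the example; the idea is simply that removed pages which still earn good traffic get an explicit 301 target, and everything else unknown falls through to a 404.

```python
from typing import Optional, Tuple

# Hypothetical mapping: removed pages that still receive quality traffic,
# each pointing at its closest related live page.
REDIRECTS = {
    "/blog/old-post": "/blog/new-post",
}

# Hypothetical set of pages that currently exist.
LIVE_PAGES = {"/blog/new-post", "/blog/another-post"}


def respond(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, redirect_target) for a requested path."""
    if path in LIVE_PAGES:
        return 200, None
    if path in REDIRECTS:
        # Permanent redirect passes visitors on to the replacement page.
        return 301, REDIRECTS[path]
    # Dummy or unknown URLs are not redirected anywhere; they just 404.
    return 404, None


print(respond("/blog/old-post"))
print(respond("/blog/dummy123"))
```

Keeping the redirect list explicit means made-up URLs never get redirected to real content; only pages you have deliberately mapped do.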