Spamdexing
Spamdexing, or search engine spamming, is the practice of deliberately and dishonestly modifying HTML pages to increase their chance of being placed near the top of search engine results, or to influence the category to which a page is assigned. Many web page designers try to obtain a good ranking in search engines and design their pages accordingly; spamdexing refers exclusively to practices that are dishonest and that mislead search and indexing programs into giving a page a ranking it does not deserve.
People who do this are called search engine spammers. The word is a portmanteau of spamming and indexing, as well as a pun on spandex.
Search engines use a variety of algorithms to determine relevancy ranking. Some check whether the search term appears in the META keywords tag; others check whether it appears in the body text of a web page. Spamdexers exploit this with a variety of techniques, including listing chosen keywords on a page in a small font the same colour as the page background, which renders the text invisible to humans but not to search engine web crawlers.
Search engine spammers are generally aware that the content they promote is not very useful or relevant to the ordinary internet surfer. They use methods intended to make the website appear above more relevant websites in the search engine listings.
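To illustrate why such simple relevance checks are easy to abuse, here is a minimal sketch of a scorer that counts a search term in the body text and adds a bonus if it appears in the META keywords. The function name naive_score, the bonus weight and the sample pages are assumptions made for illustration; no real search engine is claimed to work exactly this way.

```python
import re

META_BONUS = 5  # hypothetical weight for a META keywords match


def naive_score(term, meta_keywords, body_text):
    """Score a page for one search term.

    The score is the number of times the term occurs in the body text,
    plus a fixed bonus if the term is listed in the META keywords.
    """
    term = term.lower()
    words = re.findall(r"[a-z0-9]+", body_text.lower())
    body_hits = words.count(term)
    meta_terms = [k.strip().lower() for k in meta_keywords.split(",")]
    bonus = META_BONUS if term in meta_terms else 0
    return body_hits + bonus


# An honest page that mentions the term naturally.
honest = naive_score(
    "guitar", "guitar, lessons", "Learn guitar with weekly guitar lessons."
)

# An unrelated page with the term repeated 50 times, e.g. as invisible text.
stuffed = naive_score(
    "guitar", "guitar, guitar", "Buy pills now. " + "guitar " * 50
)

print(honest, stuffed)
```

Under this scoring the unrelated page ranks far above the honest one, because the scorer cannot tell visible content from hidden filler.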
Techniques
Here are some common spamdexing techniques:
- Hidden or invisible text
- Disguising keywords and phrases by making them the same (or almost the same) colour as the background, using a tiny font size, or hiding them within the HTML code, for example in NOFRAMES sections, ALT attributes and NOSCRIPT sections. This makes a page appear relevant to topics it does not actually cover, so that it is more likely to be found. Example: A promoter of a Ponzi scheme wants to attract web surfers to a site where he advertises his scam. He places hidden text appropriate for a fan page of a popular music group on his page, hoping that the page will be listed as a fan site and receive many visits from music lovers.
- Keyword stuffing (also known as keyword spamming)
- Repeated use of a word to increase its frequency on a page. Older indexing programs simply counted how often a keyword appeared and used that count to determine relevance. Most modern search engines can analyze a page for keyword stuffing and determine whether the frequency is above a "normal" level.
- Meta tag stuffing
- Repeating keywords in the META tags, and using keywords that are unrelated to the site's content.
- Hidden links
- Putting links where visitors will not see them in order to increase link popularity.
- Mirror websites
- Hosting multiple websites, all with the same content but at different URLs. Some search engines give a higher rank to results where the keyword searched for appears in the URL.
- Gateway or doorway pages
- Creating low-quality web pages that contain very little content but are stuffed with very similar keywords and phrases. They are designed to rank highly within the search results. A doorway page will generally have "click here to enter" in the middle of it.
- Page redirects
- Taking the user to another page without his or her intervention, e.g. using META refresh tags, CGI scripts, Java, JavaScript or server-side redirects.
- Cloaking
- Sending a search engine a version of a web page different from the one web surfers see.
- Code swapping
- Optimizing a page for top ranking, then swapping another page in its place once a top ranking is achieved.
- Link spamming
- Link spam takes advantage of Google's PageRank algorithm, which gives a higher ranking to a website the more other websites link to it. A spammer may create multiple web sites at different domain names that all link to each other. Another technique is to take advantage of web applications such as weblogs and wikis that display hyperlinks submitted by anonymous or pseudonymous users. Link farms are another technique.
- Referrer log spamming
- When someone reaches a web page (the referee) by following a link from another web page (the referrer), the person's browser passes the address of the referrer to the referee. Some websites keep a referrer log that shows which pages link to the site. By having a robot access many sites repeatedly, with a message or a specific address given as the referrer, a spammer makes that message or address appear in the referrer logs of those sites. Since some search engines base the importance of a site on the number of different sites linking to it, referrer log spam may be used to increase the search engine rankings of the spammer's sites, by getting the referrer logs of many sites to link to them.
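The link-based entries above rest on the idea that a site counts as more important the more other sites link to it. The following sketch, which is a deliberately simplified stand-in and not Google's PageRank, counts the distinct hosts linking to each target host. The function name inbound_site_counts and all of the example addresses are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def inbound_site_counts(links):
    """Map each target host to the number of distinct other hosts linking to it.

    `links` is an iterable of (source_url, target_url) pairs.
    Links within the same host are ignored.
    """
    sources = defaultdict(set)
    for src, dst in links:
        src_host = urlparse(src).hostname
        dst_host = urlparse(dst).hostname
        if src_host and dst_host and src_host != dst_host:
            sources[dst_host].add(src_host)
    return {host: len(hosts) for host, hosts in sources.items()}


links = [
    ("http://blog.example/post", "http://honest.example/"),
    # Three domains controlled by the same spammer, all pointing at one site.
    ("http://farm1.example/", "http://spam.example/"),
    ("http://farm2.example/", "http://spam.example/"),
    ("http://farm3.example/", "http://spam.example/"),
    # A link within the same host does not count.
    ("http://spam.example/a", "http://spam.example/b"),
]

counts = inbound_site_counts(links)
print(counts)
```

Because the count only looks at how many hosts link in, a handful of cheaply registered domains, or addresses planted in published referrer logs, is enough to outweigh a site with one genuine inbound link.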
Spamdexing often gets confused with legitimate search engine optimization (SEO) techniques, which do not involve deceit.
Spamming involves getting websites more exposure than they deserve for their keywords, leading to unsatisfactory search results. Optimization involves getting websites the rank they deserve on the most targeted keywords, leading to satisfactory search experiences. There is, however, a large gray area between the two extremes. The root problem is that search engine administrators and website builders have different agendas: the search engine wants to present valuable search results, while the webmaster wants to come up first, particularly if he or she runs a commercial website and needs visitor traffic from search engines and directories. For that reason, many search engine administrators say that any form of search engine optimization used to improve a website's ranking is nothing other than spamdexing.
Many search engines check for instances of spamdexing and will remove suspect pages from their indexes.
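One such check could look for keyword stuffing by measuring how much of a page a single word accounts for. The sketch below is an assumption-laden illustration: the function name stuffed_keywords, the 10% density threshold, the minimum page length and the short stop-word list are all invented for the example and are not taken from any actual search engine.

```python
import re
from collections import Counter

# Very common words are skipped so ordinary prose is not flagged (illustrative list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "is"}


def stuffed_keywords(text, max_density=0.10, min_words=20):
    """Return {word: density} for words that exceed max_density.

    Density is the word's count divided by the total number of words.
    Pages shorter than min_words are not judged at all.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < min_words:
        return {}
    total = len(words)
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return {w: c / total for w, c in counts.items() if c / total > max_density}


spam_page = "cheap watches " * 15 + "visit our shop today for more details"
normal_page = (
    "Our shop repairs mechanical watches and sells replacement straps, "
    "with most repairs finished inside one week and a written quote "
    "given before any work begins."
)

flagged = stuffed_keywords(spam_page)
clean = stuffed_keywords(normal_page)
print(sorted(flagged), clean)
```

A fixed threshold like this is crude: it can be evaded by padding the page with filler, and it may flag legitimate pages that are naturally repetitive, which is one reason the boundary between optimization and spamdexing is hard to draw automatically.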
In 2002, search engine manipulator SearchKing filed suit in an Oklahoma court against the search engine Google. SearchKing's claim was that Google's tactics to prevent spamdexing constituted an unfair business practice. This may be compared to lawsuits that email spammers have filed against spam-fighters, as in various cases against MAPS and other DNSBLs. In January 2003, the court ruled in Google's favor. [1] (http://research.yale.edu/lawmeme/modules.php?name=Downloads&d_op=search&query=SearchKing)
External links
- Association of Search Engine Spammers (http://www.aosep.com/) Parody, but the methods listed are real.
- The Search Engine Spam Police (http://searchenginewatch.com/searchday/02/sd0115-spam-police.html)
- Search Engine Spamming - A Moment of Clarity (http://www.outfront.net/tutorials_02/business/sespamming.htm)
- Sins of The Internet: Spamming Search Engines (http://www.internet-tips.net/Legal/sins_searchspam.htm)
- Search Engine Damage Control (http://www.searchengineguide.com/wi/2002/1030_wi1.html)
- The Classification of Search Engine Spam (http://www.ebrandmanagement.com/whitepapers/spam-classification/)
- Online tool that detects spam techniques on web pages (http://tool.motoricerca.info/spam-detector/)
- Spamdexing Links (http://spamdexing.org/)