URL shortening

URL shortening is a technique on the World Wide Web in which a Uniform Resource Locator (URL) may be made substantially shorter in length and still direct to the required page. This is achieved by using a redirect on a domain name that is short, which links to the web page that has a long URL. For example, the URL "http://en.wikipedia.org/wiki/URL_shortening" can be shortened to "http://tinyurl.com/urlwiki". This is especially convenient for messaging technologies that limit the number of characters that may be used in a message, such as SMS, and for reducing the amount of typing required if the reader is copying a URL from a print source. In November 2009, the shortened links of the URL shortening service Bitly were accessed 2.1 billion times.[1]

Other uses of URL shortening are to "beautify" a link, track clicks, or disguise the underlying address. Although disguising the underlying address may be desired for legitimate business or personal reasons, it is open to abuse.[2] For this reason, some URL shortening service providers have found themselves on spam blacklists, because sites trying to bypass those very blacklists used their redirect services. Some websites, such as is.gd, prevent short, redirected URLs from being posted.

Purposes

There are several reasons to use URL shortening. Regular unshortened links are often aesthetically unpleasing. Many web developers pass descriptive attributes in the URL to represent data hierarchies, command structures, transaction paths or session information. This can result in URLs that are hundreds of characters long and that contain complex character patterns. Such URLs are difficult to memorize, type out or distribute, so they must be copied and pasted for reliability. Short URLs may therefore be more convenient for websites and for hard-copy publications (e.g. a printed magazine or a book). Without shortening, very long strings often have to be broken across multiple lines (as also happens with some e-mail software or internet forums) or truncated.

On Twitter and some instant-messaging services, there is a limit to the number of characters a message can carry. Using a URL shortener can allow linking to web pages whose addresses would otherwise violate this constraint. Some shortening services, such as goo.gl, tinyurl.com, and bit.ly, can generate URLs that are human-readable, although the resulting strings are longer than those generated by a length-optimized service. Finally, URL shortening sites provide detailed information on the clicks a link receives, which can be simpler than setting up an equally powerful server-side analytics engine.

URLs encoded in two-dimensional barcodes such as QR codes are often shortened by a URL shortener, either to reduce the printed area of the code or to allow printing at a lower density, which improves scanning reliability.

Registering a short URL

An increasing number of websites are registering their own short URLs to make sharing by SMS easier. This can normally be done online, at the web pages of a URL shortening service. Short URLs often circumvent the intended use of top-level domains for indicating the country of origin: domain registration in many countries requires proof of physical presence within that country, but a redirected URL carries no such guarantee.

Techniques

In URL shortening, every long URL is associated with a unique key, which is the part of the short URL after the service's domain name. For example, http://tinyurl.com/m3q2xt has a key of m3q2xt. Not all redirection is treated equally; the redirection instruction sent to a browser can contain in its header the HTTP status 301 (permanent redirect) or 302 or 307 (temporary redirects).
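The key lookup and the choice of redirect status can be illustrated with a minimal sketch. The in-memory registry, the function name build_redirect and the example.com target are assumptions made for illustration; they do not describe how any particular service is implemented.

```python
from http import HTTPStatus

# Hypothetical in-memory registry mapping keys to long URLs.
# A real service would use a persistent database instead.
REGISTRY = {"m3q2xt": "http://example.com/a/very/long/path?with=query"}


def build_redirect(path, permanent=True):
    """Return (status_code, headers) for a request path such as '/m3q2xt'."""
    key = path.lstrip("/")              # the key is the part after the domain name
    target = REGISTRY.get(key)
    if target is None:
        return HTTPStatus.NOT_FOUND.value, {}
    # 301 marks the redirect as permanent, 307 as temporary.
    status = (HTTPStatus.MOVED_PERMANENTLY if permanent
              else HTTPStatus.TEMPORARY_REDIRECT)
    return status.value, {"Location": target}
```

A browser that receives the Location header then issues a second request to the long URL.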

There are several techniques to implement URL shortening. Keys can be generated in base 36, using 26 letters and 10 digits. In this case, each character in the sequence will be 0, 1, 2, ..., 9, a, b, c, ..., y, z. Alternatively, if uppercase and lowercase letters are differentiated, then each character can represent a single digit within a number of base 62 (26 + 26 + 10). To form the key, a hash function can be used, or a random number can be generated so that the key sequence is not predictable. Alternatively, users may propose their own keys. For example, http://en.wikipedia.org/w/index.php?title=TinyURL&diff=283621022&oldid=283308287 can be shortened to http://bit.ly/tinyurlwiki.
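The base-36 and base-62 schemes described above can be sketched as follows. The function names encode and random_key, and the idea of encoding a sequential database ID, are assumptions made for illustration rather than the method of any named service.

```python
import secrets
import string

BASE36 = string.digits + string.ascii_lowercase   # 0-9 then a-z (36 symbols)
BASE62 = BASE36 + string.ascii_uppercase          # adds A-Z (62 symbols)


def encode(number, alphabet=BASE62):
    """Write a non-negative integer (e.g. a row ID) in the given alphabet."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return alphabet[0]
    base = len(alphabet)
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))   # most significant digit first


def random_key(length=6, alphabet=BASE62):
    """Draw an unpredictable key; the caller must still check for collisions."""
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

With this alphabet ordering, encode(35, BASE36) gives "z" and encode(62) gives "10". Six base-62 characters allow 62^6 = 56,800,235,584 distinct keys, compared with 36^6 = 2,176,782,336 in base 36.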

Not all protocols are capable of being shortened as of 2011, although protocols such as http, https, ftp, ftps, mailto, mms, rtmp, rtmpt, ed2k, pop, imap, nntp, news, ldap, gopher, dict and dns are being addressed by such services as URL Shortener. Typically, data: and javascript: URLs are not supported for security reasons. Some URL shortening services support the forwarding of mailto URLs, as an alternative to address munging, to avoid unwanted harvesting by web crawlers or bots. This may sometimes be done using short, CAPTCHA-protected URLs, but this is not common.[3]
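A scheme check of this kind might look like the sketch below. The allow-list is an assumed subset of the protocols named above, and is_shortenable is a hypothetical helper, not the API of any service.

```python
from urllib.parse import urlsplit

# Assumed allow-list: a subset of the schemes named in the text.
ALLOWED_SCHEMES = {"http", "https", "ftp", "ftps", "mailto"}


def is_shortenable(url):
    """Accept a URL only if its scheme is explicitly allowed.

    data: and javascript: URLs are rejected simply because they are
    not on the allow-list; so are strings with no scheme at all.
    """
    scheme = urlsplit(url.strip()).scheme.lower()
    return scheme in ALLOWED_SCHEMES
```

An allow-list is used here rather than a block-list so that schemes nobody thought to forbid are rejected by default.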

Makers of URL shorteners usually register domain names under less popular or esoteric top-level domains in order to achieve a short URL and a catchy name, often using domain hacks. As a result, URL shorteners are registered in many different countries, and there is no relation between the country where the domain has been registered and the URL shortener itself or the shortened links. Top-level domains of countries such as Libya (.ly), Samoa (.ws), Mongolia (.mn), Malaysia (.my) and Liechtenstein (.li) have been used, as well as many others. In some cases, the political or cultural aspects of the country in charge of the top-level domain may become an issue for users and owners,[4] but this is not usually the case.

Tinyarro.ws, urlrace.com, and qoiob.com use Unicode characters to achieve the shortest URLs possible, since a larger character set allows more distinct URLs to be encoded in a given number of characters than the standard Latin alphabet does.[citation needed]

Statistics

Services may record inbound statistics, which may be viewed publicly by others.[5]

Expiry and time-limited services

Many providers of shortened URLs claim that they will "never expire". There is always implied small print: the links last only as long as the provider remains in business and does not decide to discontinue the service, and a free service has no contract to breach, regardless of "promises".

A permanent URL is not necessarily a good thing. There are security implications, and obsolete short URLs remain in existence and may be circulated long after they cease to point to a relevant or even extant destination. Sometimes a short URL is useful simply to give to someone during a telephone conversation for one-off access or a file download, and is no longer needed a couple of minutes later.

Some URL shorteners offer a time-limited service, which will expire after a specified period. Services available include an ordinary, easy-to-say word as the URL with a lifetime from 5 minutes up to 24 hours, creation of a URL which will expire on a specified date or after a specified period, creation of a very-short-lived URL of only 5 characters for typing into a smartphone, restriction by the creator of the total number of uses of the URL, and password protection. A Microsoft Security Brief recommends the creation of short-lived URLs, but for reasons explicitly of security rather than convenience.[6]
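Expiry after a set time and a cap on the total number of uses can be sketched as below. The ShortLink class and its field names are hypothetical and only illustrate the two checks; password protection is omitted.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShortLink:
    target: str
    expires_at: Optional[float] = None   # Unix timestamp; None means no expiry
    max_uses: Optional[int] = None       # None means unlimited uses
    uses: int = 0

    def resolve(self, now=None):
        """Return the target URL, or None if the link expired or is used up."""
        now = time.time() if now is None else now
        if self.expires_at is not None and now >= self.expires_at:
            return None
        if self.max_uses is not None and self.uses >= self.max_uses:
            return None
        self.uses += 1                    # count only successful resolutions
        return self.target
```

For example, ShortLink("http://example.com/file", expires_at=time.time() + 300) would model a link with a five-minute lifetime.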

History

An early reference is US Patent 6957224, which describes

...a system, method and computer program product for providing links to remotely located information in a network of remotely connected computers. A uniform resource locator (URL) is registered with a server. A shorthand link is associated with the registered URL. The associated shorthand link and URL are logged in a registry database. When a request is received for a shorthand link, the registry database is searched for an associated URL. If the shorthand link is found to be associated with a URL, the URL is fetched, otherwise an error message is returned.[7]

The patent was filed in September 2000. Although it was not issued until 2005, patent applications are made public within 18 months of filing.

Another reference to URL shortening was in 2001.[8] The first notable URL shortening service, TinyURL, was launched in 2002. Its popularity influenced the creation of at least 100 similar websites,[9] although most are simply domain alternatives. Initially Twitter automatically translated long URLs using TinyURL, although it began using bit.ly in 2009.[10]

On 14 August 2009 WordPress announced the wp.me URL shortener for use when referring to any WordPress.com blog post.[11] In November 2009, shortened links on bit.ly were accessed 2.1 billion times.[12] Around that time, bit.ly and TinyURL were the most widely used URL-shortening services.[12]

One service, tr.im, stopped generating short URLs in 2009, blaming a lack of revenue-generating mechanisms to cover costs and Twitter's default use of the bit.ly shortener, and questioning whether other shortening services could be profitable from URL shortening in the longer term.[13] It resumed for a time,[14] then closed.

The shortest possible long-term URLs were generated by NanoURL from December 2009 until about 2011, associated with the top-level .to (Tonga) domain, in the form http://to./xxxx, where xxxx represents a sequence of random numbers and letters.[15]

On 14 December 2009 Google announced a service called Google URL Shortener at goo.gl, which originally was only available for use through Google products (such as Google Toolbar and FeedBurner)[16] and extensions for Google Chrome.[17] On 21 December 2009, Google introduced a YouTube URL Shortener, youtu.be.[18] From September 2010 Google URL Shortener became available via a direct interface. The goo.gl service provides analytics details and a QR code generator.

Advantages

The main advantage of a short link is that it is, in fact, short, and can be easily communicated and entered without error. To a very limited extent it may obscure the destination of the URL, though the destination is easily discoverable; this may be advantageous, disadvantageous, or irrelevant. A short link which expires, or can be terminated, has some security advantages.

Shortcomings

Abuse

URL shortening may be used by spammers or for illicit internet activities. As a result, many URL shortening services have been removed from online registries or shut down by web hosts or internet service providers.

Tonic Corporation, the registry for .to domains, states that it is "very serious about keeping domains spam free" and may remove URL shortening services from its registry if the service is abused.[19]

In addition, "u.nu" made the following announcement upon closing operations:

The last straw came on September 3, 2010, when the server was disconnected without notice by our hosting provider in response to reports of a number of links to child pornography sites. The disconnection of the server caused us serious problems, and to be honest, the level and nature of the abuse has become quite demoralizing. Given the choice between spending time and money to find a different home, or just giving up, the latter won out.[20]

Google's url-shortener discussion group has frequently included messages from frustrated users reporting that specific shortened URLs have been disabled after they were reported as spam.[21]

A study in May 2012 showed that 61% of URL shorteners had shut down (614 of 1002).[22] The most common cause cited was abuse by spammers.

Linkrot

The convenience offered by URL shortening also introduces potential problems, which have led to criticism of the use of these services. Short URLs, for example, will be subject to linkrot if the shortening service stops working; all URLs related to the service will become broken. It is a legitimate concern that many existing URL shortening services may not have a sustainable business model in the long term. This worry was highlighted by a statement from tr.im in August 2009 (see above).[12] In late 2009, the Internet Archive started the "301 Works" project,[23] together with (initially) twenty collaborating companies, whose short URLs will be preserved by the project.[12] The URL shortening service ur1.ca provides its entire database as a file download, so if its website stops working, other websites may be able to provide ways to correct broken links to URLs shortened with its service. One way to avoid the problem is for a website to provide its own short links instead of relying on a shortening service, but this is not common.

Transnational law

Shortened internet links typically use foreign country domain names, and are therefore under the jurisdiction of that nation. Libya, for instance, exercised its control over the .ly domain in October 2010 to shut down vb.ly for violating Libyan pornography laws. Failure to predict such problems with URL shorteners and investment in URL shortening companies may reflect a lack of due diligence.[24]

Blocking

Some websites prevent short, redirected URLs from being posted.

In 2009, the Twitter network replaced TinyURL with Bit.ly as its default shortener of links longer than twenty-six characters.[10] In April 2009, TinyURL was reported to be blocked in Saudi Arabia.[25] Yahoo! Answers blocks postings that contain TinyURLs,[citation needed] and Wikipedia does not accept links by any URL shortening services in its articles.[26]

Advertising

Sites such as Adf.ly use a number of interstitial advertising techniques to generate revenue. This may deter readers.

Privacy and security

Users may be exposed to privacy issues through a URL shortening service's ability to track a user's behavior across many domains.

A short URL obscures the target address and can be used to redirect to an unexpected site. Examples of this are rickrolling, redirecting to shock sites, or to affiliate websites. The short URL can allow blacklisted URLs to be accessed, bypassing blocks; this facilitates redirection of a user to blacklisted scam pages or pages containing malware or XSS attacks. TinyURL tries to disable spam-related links from redirecting.[27] ZoneAlarm, however, has warned its users: "TinyURL may be unsafe. This website has been known to distribute spyware." TinyURL countered this problem by offering an option to view a link's destination before using a shortened URL. This ability is installed on the browser via the TinyURL website and requires the use of cookies.[28] A destination preview may also be obtained by prefixing the word "preview" to the TinyURL URL; for example, the destination of http://tinyurl.com/8kmfp is revealed by entering http://preview.tinyurl.com/8kmfp. Other URL shortening services provide a similar destination display.[29] Security professionals suggest that users check a short URL's destination before accessing it, following an instance where shortening service cli.gs was compromised, exposing millions of users to security uncertainties.[30] There are several web applications that can display the destination of a shortened URL.
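The preview form described above can be derived mechanically from a short link. The helper below, tinyurl_preview, is a hypothetical sketch that only rewrites the host name as in the example given; it makes no network request and assumes the preview host behaves as described.

```python
from urllib.parse import urlsplit, urlunsplit


def tinyurl_preview(short_url):
    """Rewrite a TinyURL link to its preview.tinyurl.com form."""
    parts = urlsplit(short_url)
    if parts.netloc.lower() not in ("tinyurl.com", "www.tinyurl.com"):
        raise ValueError("not a TinyURL link")
    # Keep scheme, path, query and fragment; swap only the host.
    return urlunsplit((parts.scheme, "preview.tinyurl.com",
                       parts.path, parts.query, parts.fragment))
```

Applied to http://tinyurl.com/8kmfp, this returns http://preview.tinyurl.com/8kmfp.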

Some URL shortening services filter their links through bad-site screening services such as Google Safe Browsing. However, many sites that accept user-submitted content block links to certain domains in order to cut down on spam, and for this reason known URL redirection services are often themselves added to spam blacklists.

Additional layer of complexity

Short URLs, although they make it easier to access what might otherwise be a very long URL or user-space on an ISP server, add a layer of complexity to the process of retrieving web pages. Every access requires more requests (at least one more DNS lookup, though it may be cached, and one more HTTP/HTTPS request), thereby increasing latency (the time taken to access the page) and also the risk of failure, since the shortening service may become unavailable. Another operational limitation of URL shortening services is that browsers do not resend POST bodies when a redirect is encountered. This can be overcome by making the service a reverse proxy, or by elaborate schemes involving cookies and buffered POST bodies, but such techniques present security and scaling challenges, and are therefore not used on extranets or Internet-scale services.[original research?]

References

  1. Goo.gl Challenges Bit.ly as King of the Short – New York Times, 14 December 2009
  2. [citation text missing from source]
  3. [citation text missing from source]
  4. [citation text missing from source]
  5. [citation text missing from source]
  6. [citation text missing from source]
  7. US patent 6957224, Nimrod Megiddo and Kevin S. McCurley; assigned to IBM corp., "Efficient retrieval of uniform resource locators", issued 2005-10-18 
  8. "Comment thread 8916". Metafilter. 10 June 2001; Announcement of URL shortening service available at makeashorterlink.com
  9. "URL Shortening Services" shortenurl – Supported URL shortening services
  10. Wortham, Jenna (7 May 2009) "Bit.ly Eclipses TinyURL on Twitter" Bits (blog at The New York Times). Retrieved 1 January 2011.
  11. "WP.me — Shorten Your Links" WordPress. 14 August 2009.
  12. Ahmed, Murad (7 December 2009). "New Project in Scramble To Save Vanishing Internet Links: The Internet Archive Is Fighting To Preserve Shortened Web Links Created by Free Online Services That May Be Running Out of Money". The Times. Retrieved 1 January 2011.
  13. tr.im R.I.P. blog.tr.im
  14. tr.im Resurrected. blog.tr.im
  15. [citation text missing from source]
  16. [citation text missing from source]
  17. [citation text missing from source]
  18. [citation text missing from source]
  19. [citation text missing from source]
  20. http://u.nu/unu-discontinued "u.nu :: discontinued."
  21. [citation text missing from source]
  22. [citation text missing from source]
  23. [citation text missing from source]
  24. [citation text missing from source]
  25. [citation text missing from source]
  26. [citation text missing from source]
  27. Krebs, Brian (13 June 2006). "Spam Spotted Using TinyURL". Security Fixes (blog at The Washington Post). Retrieved 1 January 2011.
  28. [citation text missing from source]
  29. [citation text missing from source]
  30. "Updated: Cligs Got Hacked — Restoration from Backup Started" Blog at Cli.gs (16 June 2009).
