Overview[1]#

A Uniform Resource Locator (URL), defined in RFC 1738, is a reference to a resource that specifies the resource's location on a computer network and a mechanism for retrieving it.

A URL is a specific type of Uniform Resource Identifier (URI), although many people use the two terms interchangeably.

A URL implies the means to access an indicated resource, which is not true of every URI.

URLs most commonly reference web pages (http), but are also used for file transfer (ftp), email (mailto), database access (JDBC), and many other applications.

URL Format#

Every HTTP URL consists of the following, in the given order. Several schemes other than HTTP also share this general format, with some variation.
  • scheme - the scheme name (commonly called the protocol, although not every URL scheme is a protocol; mailto, for example, is not)
  • a colon,
  • two slashes
  • a host, normally given as a domain name
  • optionally a colon followed by a port number
  • the full path of the resource which can be broken down into:
    • path
    • query_string
    • fragment_id

The scheme says how to connect, the host specifies where to connect, and the remainder specifies what to ask for.

The syntax in more detail is:

scheme://domain:port/path?query_string#fragment_id
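As a minimal sketch of how this syntax breaks down in practice, the snippet below splits a URL into its components using Python's standard `urllib.parse` module. The URL itself is a made-up example and is not taken from any real site.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only so that every component is present.
url = "http://www.example.com:8080/docs/index.html?lang=en#intro"
parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /docs/index.html
print(parts.query)     # lang=en
print(parts.fragment)  # intro
```

Note that `urlsplit` only separates the pieces; it does not contact the host or check that the resource exists.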

Scheme#

The scheme, which in many cases is the name of a protocol (but not always), defines how the resource will be obtained. Examples include http, https, ftp, file and many others. Although schemes are case-insensitive, the canonical form is lowercase.

Domain Name#

The domain name or literal numeric IP address gives the destination location for the URL. A literal numeric IPv6 address may be given, but must be enclosed in square brackets, e.g. [db8:0cec::99:123a].
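The brackets matter because an IPv6 address contains colons itself, which would otherwise be confused with the port separator. A small sketch using Python's `urllib.parse` (the port 8080 and the path are assumptions added for illustration):

```python
from urllib.parse import urlsplit

# The bracketed IPv6 literal from the text, with a hypothetical port and path.
parts = urlsplit("http://[db8:0cec::99:123a]:8080/index.html")

# hostname drops the brackets; port is parsed from what follows them.
print(parts.hostname)  # db8:0cec::99:123a
print(parts.port)      # 8080
```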

For example, the domain google.com, or one of its numeric IP addresses such as 173.194.34.5, addresses Google's website.

The domain name portion of a URL is case-insensitive since DNS ignores case:

http://en.example.org/ and HTTP://EN.EXAMPLE.ORG/ both open the same page.
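One way to check this equivalence programmatically, sketched with Python's `urllib.parse`: the parser lowercases the scheme, and the `hostname` attribute lowercases the host, so the two spellings compare equal. The variable names here are illustrative only.

```python
from urllib.parse import urlsplit

a = urlsplit("http://en.example.org/")
b = urlsplit("HTTP://EN.EXAMPLE.ORG/")

# Scheme and host are case-insensitive, so both normalize to lowercase.
same_scheme = a.scheme == b.scheme
same_host = a.hostname == b.hostname

print(same_scheme, same_host)  # True True
```

The raw `netloc` attribute keeps the original casing, so comparisons should use `hostname` rather than `netloc`.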

Port#

The port number, given in decimal, is optional; if omitted, the default for the scheme is used.

For example,

http://vnc.example.com:5800 
connects to port 5800 of vnc.example.com, which may be appropriate for a VNC remote control session.

If the port number is omitted from an http: URL, the browser connects on port 80, the default HTTP port. The default port for an https: URL is 443.
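The fallback to a default port can be sketched as follows. The `DEFAULT_PORTS` table and the `effective_port` helper are hypothetical names introduced for this example, and the table covers only the two schemes mentioned above.

```python
from urllib.parse import urlsplit

# Assumed lookup table; only the defaults named in the text are listed.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the explicit port if given, else the scheme's default (or None)."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://vnc.example.com:5800"))  # 5800
print(effective_port("http://en.example.org/"))       # 80
print(effective_port("https://en.example.org/"))      # 443
```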

Path#

The path specifies, and may help locate, the requested resource. It may or may not correspond to folders on the web server's file system, and can differ greatly from the actual arrangement of folders there. The path is case-sensitive, though some servers, especially those based on Microsoft Windows, treat it as case-insensitive.

If the server is case-sensitive and http://en.example.org/wiki/URL is correct, then http://en.example.org/WIKI/URL or http://en.example.org/wiki/url will return an HTTP 404 error, unless these URLs point to valid resources themselves.

Query String#

The query string contains case-sensitive data to be passed to software running on the server. It may contain name/value pairs separated by ampersands, for example:
?first_name=John&last_name=Doe.

The "?" indicates the start of the query string, and each additional parameter, if present, is separated by "&".
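A short sketch of decoding such a query string with Python's `urllib.parse.parse_qs`. The host and path in the URL are assumptions; only the query string comes from the example above.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL carrying the query string from the example above.
url = "http://www.example.com/search?first_name=John&last_name=Doe"
query = urlsplit(url).query

# parse_qs maps each name to a list, since a name may appear more than once.
params = parse_qs(query)

print(query)   # first_name=John&last_name=Doe
print(params)  # {'first_name': ['John'], 'last_name': ['Doe']}
```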

Fragment Identifier#

The fragment identifier, if present, specifies a part or a position within the overall resource or document.

When used with HTML, it usually identifies a section or location within the page. In combination with anchor elements or the "id" attribute of an element, it causes the browser to scroll to display that part of the page.
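To illustrate how the fragment separates from the rest of the URL, the sketch below uses Python's `urllib.parse.urldefrag`. The fragment name `Syntax` is a made-up example.

```python
from urllib.parse import urldefrag

# Hypothetical URL; "Syntax" stands in for the id of a section on the page.
base, fragment = urldefrag("http://en.example.org/wiki/URL#Syntax")

print(base)      # http://en.example.org/wiki/URL
print(fragment)  # Syntax
```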

URIs, URLs, and URNs#

What is the difference between URIs, URLs, and URNs?

« This page (revision-9) was last changed on 01-Dec-2016 12:58 by jim