Posted

As far as I can tell, the difference between a URI and a URL is that a URI can identify anything on the Web, whereas a URL is the kind of URI that also tells you where a resource lives and how to reach it (for example over http). In everyday use the two terms are treated as interchangeable, and for most intents and purposes I'd say they are the same, but I'm sure someone will point out the error of my ways:

 

http://www.webopedia.com/TERM/U/URI.html

http://www.webopedia.com/TERM/U/URL.html
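
To make the distinction a little more concrete, here is a minimal Python sketch. The two example identifiers and the `describe` helper are my own illustration, not taken from the linked pages. Both strings are URIs, but only the first carries a network location; the second is a URN that names something without saying where to fetch it. Checking for a host part is only a rough heuristic, not a formal test of what counts as a URL.

```python
from urllib.parse import urlsplit

def describe(uri: str) -> dict:
    """Split a URI and report whether it carries a network location."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "netloc": parts.netloc,
        "path": parts.path,
        # Rough heuristic for this sketch only: a URI with a host part
        # tells you where to go, so treat it as a locator.
        "has_location": bool(parts.netloc),
    }

url_info = describe("http://www.xxx.com/cheese")
urn_info = describe("urn:isbn:0451450523")
print(url_info)
print(urn_info)
```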

 

The difference between -

 

http://www.xxx.com/

 

and

 

http://www.xxx.com

 

is practically nothing. www.xxx.com is the server's hostname (which is resolved via a DNS query to an IP address), and the browser opens a socket connection to it on port 80 (the default port for HTTP). It then sends a GET or POST request to the server for a particular file. If the URL was http://www.xxx.com/cheese, the request line would be "GET /cheese" followed by various request data (any cookies, acceptable languages, encodings etc). Accessing just http://www.xxx.com or http://www.xxx.com/ will send "GET /" (the browser adds the / for you if you don't put it on the end), which just means "try to access the base dir that this hostname resolves to". The webserver then maps that to an index file (index.html, index.php etc), a directory listing, or perhaps an error page. So the trailing / just says "I want the base dir", and you don't need to type it in most browsers because they add it automatically.

[Edit] And as skuinders says below, if you use an address with a path rather than the bare hostname (i.e. you are looking for a specific file), say http://www.cheese.com/test, and there is no file "test", most good webservers will redirect you to the directory if it exists. For a bare hostname, however, it is the browser that adds the /, as you can see if you watch the packets being sent.
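
Here is a small Python sketch of the browser-side step described above: splitting the URL and filling in "/" when no path was typed. The `request_line` function is a hypothetical helper written for this post, and it only models the request line, not the DNS lookup or the socket connection.

```python
from urllib.parse import urlsplit

def request_line(url: str, method: str = "GET") -> str:
    """Return the request line a client would send for this URL."""
    parts = urlsplit(url)
    # An empty path means "the root", so the client substitutes "/".
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return f"{method} {target} HTTP/1.1"

for url in ("http://www.xxx.com", "http://www.xxx.com/", "http://www.xxx.com/cheese"):
    print(url, "->", request_line(url))
```

With or without the trailing slash, the bare hostname ends up as the same request line, which is why the two URLs behave identically.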

 

The difference between .htm and .html is, again, pretty much nothing. As far as I am aware, .htm dates from systems that limited file extensions to three characters (DOS-style 8.3 filenames), so sticking to a 3 letter extension was the only way to keep a decent file name. Now that the limits are much larger, .html is used, as it is only one extra letter and states exactly what the file is. As far as I am aware, not many people use .htm any more, and the standard generally seems to be .html for static HTML pages. What the browser actually goes by is the Content-Type the server sends: you can send text/html with any other extension, which is what is done with things like PHP, Perl, ASP etc.
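
As a quick illustration of the point that the extension only matters as a hint for the content type, this Python sketch uses the standard `mimetypes` module to look up both extensions. The `content_type_for` wrapper is my own helper; a real webserver uses its own mapping table, which may differ from Python's.

```python
import mimetypes

def content_type_for(filename: str) -> str:
    """Guess a Content-Type from the extension, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"

print(content_type_for("index.htm"))
print(content_type_for("index.html"))
```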

Posted

The general rule of thumb is to always put the / at the end of the URL. Some (bad) sites even require this. When you leave it out, it basically makes the server do a little more work.

 

http://www.blah.com/stuff/

http://www.blah.com/stuff

 

where stuff is a directory that contains a file: index.html

 

If I type the first URL, the server looks in the directory 'stuff' for a file 'index.htm[l]' (or others, depending on the server configuration)

 

If I type the second URL, the server first checks whether 'stuff' is the name of a file in the top directory. It finds that it is a directory instead, and only then looks in the directory 'stuff' for the file 'index.htm[l]'.

 

So, you are helping the server out a bit by telling it that stuff is a directory and that it should look in there for a default file.
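
The lookup described above can be sketched in Python as follows. This is a simplified model, not how any particular webserver is implemented: `resolve` and `INDEX_NAMES` are made-up names, and I have assumed the server answers a slash-less directory request with a redirect, which is one common behaviour (others serve the index file directly).

```python
import os
import tempfile
from typing import Tuple

INDEX_NAMES = ("index.html", "index.htm")  # assumed server configuration

def resolve(docroot: str, url_path: str) -> Tuple[str, str]:
    """Return ("file", path), ("redirect", new_url_path) or ("not_found", url_path)."""
    fs_path = os.path.join(docroot, url_path.lstrip("/"))
    if os.path.isfile(fs_path):
        return ("file", fs_path)
    if os.path.isdir(fs_path):
        if not url_path.endswith("/"):
            # The extra work: tell the client to ask again with the slash.
            return ("redirect", url_path + "/")
        for name in INDEX_NAMES:
            candidate = os.path.join(fs_path, name)
            if os.path.isfile(candidate):
                return ("file", candidate)
    return ("not_found", url_path)

# Build a throwaway document root containing stuff/index.html.
with tempfile.TemporaryDirectory() as docroot:
    os.mkdir(os.path.join(docroot, "stuff"))
    with open(os.path.join(docroot, "stuff", "index.html"), "w") as fh:
        fh.write("<html></html>")
    with_slash = resolve(docroot, "/stuff/")
    without_slash = resolve(docroot, "/stuff")
    print(with_slash[0], without_slash)
```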

 

Really, it's not a big deal, but it is a good habit to include the trailing /.

 

.htm and .html can be used interchangeably. The .htm extension comes from the limitations on file names on the old DOS systems. The .html extension was the original extension for HTML documents. Most die hard UNIX/Linux folk (including myself) prefer the .html extension.

Posted

Interestingly, there does seem to be a difference between

 

index.htm and,

index.html,

 

At least if you have two files in the directory/folder, either the server or the browser loads one and not the other. (I forget which). This can mess you up if you are just trying to upload a new version of your default page to a directory.

Posted

With mine, it loads .html in preference to .htm (I assume you are talking about index files), but that is probably due to the order in which they are specified in the webserver config (I'll see if I can fish out the line in the config now).

 

[Edit]

 

Found it -

 

###
### DirectoryIndex: Name of the file or files to use as a pre-written HTML
### directory index. Separate multiple entries with spaces.
###
<IfModule mod_dir.c>
DirectoryIndex index.html index.html.var index.php index.php3 index.shtml index.cgi index.pl index.htm Default.htm default.htm
</IfModule>
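
To show why the order on that line decides which file wins, here is a rough Python sketch of a first-match lookup. `pick_index` is a hypothetical helper and only models the idea of trying the names in order; it is not the webserver's actual code.

```python
from typing import Iterable, Optional, Sequence

# The names from the DirectoryIndex line above, in the same order.
DIRECTORY_INDEX = (
    "index.html index.html.var index.php index.php3 index.shtml "
    "index.cgi index.pl index.htm Default.htm default.htm"
).split()

def pick_index(files_in_dir: Iterable[str],
               order: Sequence[str] = DIRECTORY_INDEX) -> Optional[str]:
    """Return the first configured index name that exists in the directory."""
    present = set(files_in_dir)
    for name in order:
        if name in present:
            return name
    return None

# Both files exist, but index.html is listed first, so it is the one served.
print(pick_index(["index.htm", "index.html", "logo.png"]))
```

So if you upload a new index.htm while an old index.html is still in the folder, this kind of config keeps serving the old one.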

Posted

 

HTTP URLs

 

 

HTTP stands for HyperText Transfer Protocol. HTTP servers are commonly used for serving hypertext documents, as HTTP is an extremely low-overhead protocol that capitalizes on the fact that navigation information can be embedded in such documents directly and thus the protocol itself doesn't have to support full navigation features like the FTP and Gopher protocols do.

A file called "foobar.html" on HTTP server "www.yoyodyne.com" in directory "/pub/files" corresponds to this URL:

 

 

http://www.yoyodyne.com/pub/files/foobar.html

 

The default HTTP network port is 80; if an HTTP server resides on a different network port (say, port 1234 on http://www.yoyodyne.com), then the URL becomes:

http://www.yoyodyne.com:1234/pub/files/foobar.html
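
The port rule in the quoted text can be sketched in a few lines of Python. `connection_target` and `DEFAULT_PORTS` are names I made up for this example; it returns the host and port a client would connect to, plus the path it would request.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def connection_target(url: str) -> tuple:
    """Return (hostname, port, path) for an http or https URL."""
    parts = urlsplit(url)
    # Use the explicit port if the URL has one, otherwise the scheme default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return (parts.hostname, port, parts.path or "/")

print(connection_target("http://www.yoyodyne.com/pub/files/foobar.html"))
print(connection_target("http://www.yoyodyne.com:1234/pub/files/foobar.html"))
```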

 

 

The bold line is a bit contradictory to what you people have said.

 

I thought it should be, pub/files/ ??

Posted
Quote: "Interestingly, there does seem to be a difference between index.htm and index.html. At least if you have two files in the directory/folder, either the server or the browser loads one and not the other. (I forget which). This can mess you up if you are just trying to upload a new version of your default page to a directory."

 

 

Which file is served as the default page for a folder is specified by the server. I always use index.php, and therefore make sure that it is first on the list when I install webservers. It's quite common for sites like geosites to use main.html instead of index.htm or index.html...

 

There is a difference between them because they have different names, but .html and .htm are generally interchangeable (as long as people linking to them know which one is in use).

Posted
Quote: "The bold line is a bit contradictory to what you people have said. I thought it should be, pub/files/ ??"

 

I think you are overcomplicating this. They are just saying the same thing. You can call a directory /pub/files, as you know it's a directory. When accessing it, you would say /pub/files/ so as not to confuse it with a normal file called "files". Also, the reason they use /pub/files is that they are saying it is in a directory called files, inside the directory pub, inside the DocumentRoot or base directory of the webserver (or virtual host, or however they have it set up). When you just say http://www.blahoordf.com/ you are saying you want to access the DocumentRoot or base directory of that webserver. You don't need to type the / as the browser provides it for you anyway.

Posted

Ok, thanks

 

However, the reason I am overcomplicating this is:

 

In Java, the URL class's constructor takes "host" and "file" parameters.

 

The program works if you set the string parameters as follows:

"host" by http://www.xxx.com

"file" by /somefile.bla

 

or

 

"host" by http://www.xxx.com/

"file" by /somefile.bla

 

However, it won't work if:

"host" by http://www.xxx.com

"file" by somefile.bla

 

or

 

"host" by http://www.xxx.com/

"file" by somefile.bla

 

 

any idea?

 

thanks

Posted

Yeah, Java doesn't do any additions for you. So you need to use http://www.xxx.com as the host and /somefile.bla as the file. The reason is that you aren't providing a URL to a browser, which would then convert it into an HTTP request; you are providing the parameters for a socket connection and for the GET request line of the HTTP request, so the / from http://www.xxx.com/ will not carry over.

 

GET / HTTP/1.1
Host: www.scienceforums.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

 

My web browser parsed the URL I provided, http://www.scienceforums.net (no /), and since it starts with http:// it knows to use the HTTP protocol. It then knows that everything up to the next / is the server address (if there is no /, all of it is), so it uses that to make a socket connection to the server on port 80. It then sends the above data as a request using GET / (inserting the /, as it wasn't provided by the user but is necessary) and giving the exact protocol version, HTTP/1.1. "GET HTTP/1.1" wouldn't work: the server wouldn't know to look in the base dir, and it isn't standard for the protocol to give no indication of what you are asking for. Providing http://www.xxx.com/ as the address to connect to wouldn't work either, as the slash isn't part of a valid address.
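
Here is a minimal Python sketch of building such a request by hand, to show where the host and the file each end up. `build_request` is a hypothetical helper written for this post; it sends only the headers needed for a basic HTTP/1.1 request and does not open a socket. It rejects a host containing a slash and a path that does not start with one, which mirrors the two mistakes discussed above.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request for a bare hostname and an absolute path."""
    if "/" in host:
        # Catches both "http://www.xxx.com" and "www.xxx.com/".
        raise ValueError("host must be a bare name such as www.xxx.com")
    if not path.startswith("/"):
        raise ValueError("path must start with '/'")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_request("www.scienceforums.net")
print(request.decode("ascii"))
```

The host goes into the connection and the Host header, while the path goes into the request line, so nothing joins them together for you.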

 
