Posted

<form action="/php/X_24_demo.php" name="DemoForm" method="POST" enctype="multipart/form-data" onsubmit="return Final_Check();">
<input type="file" name="UserFile">
<input type="hidden" name="Original">
<input type="submit" value="Send File">
</form>

The above is taken in part from http://chensh.loxa.edu.tw/php/X_24_demo.php. It is the form part of an HTML file.

My peer wrote this simple PHP file, and I would like to write a Java program that POSTs a small file to it and shows the output, yes, without a browser.

Since I know what the output should look like (from a browser, of course), I can tell whether my program works.

But so far I have had no success uploading the file: the output is the same as the original page (I can tell by looking at the HTML printed out), whereas the correct result has an output message in addition to the original page. You can try it yourself in a browser.

Below is my Java program:

 

import java.io.*;
import java.net.*;

public class test1{

      public static void main(String[] ad){

      try{
      URL u = new URL("http://chensh.loxa.edu.tw/php/X_24_demo.php");
      URLConnection uc = u.openConnection();

      uc.setUseCaches(false);
      uc.setDoOutput(true);
      uc.setRequestProperty("Content-type", "multipart/form-data");

      PrintStream ps = new PrintStream(uc.getOutputStream());
      ps.print(URLEncoder.encode("file=C:\\Documents and Settings\\Albert Lee\\桌面\\JAVA\\AWT\\progressiveupdate\\mona.jpg"));

      /*File pic = new File("C:/Documents and Settings/Albert Lee/桌面/JAVA/AWT/progressiveupdate/mona.jpg");
      FileInputStream fis = new FileInputStream(pic);

      for(int i=fis.read(); i!=-1; i=fis.read())
      ps.write(i);  */

      ps.flush();
      ps.close();

      InputStream is = uc.getInputStream();

      for(int i=is.read(); i!=-1; i=is.read())
      System.out.write(i);    }

      catch(IOException ioe){ioe.printStackTrace();}


      }


}

 

As you can see from the commented-out lines, I have tried POSTing the actual file as well as, in the uncommented version, a URL-encoded text message.

Does anyone know what I am missing or doing wrong????

 

thanks

Posted

I'm by no means a Java expert, but after a little fiddling around with Ethereal, watching how Firefox sends multipart form data, I found that you have to specify a boundary for the file contents and should also specify Content-Disposition and Content-Type fields specifically for the POST data, in addition to the normal HTTP headers. I was stumped for a bit as it wasn't working, but it turned out that each boundary line in the body needs a leading "--" in front of the boundary value given in the Content-Type header. Here's the working version -

 

import java.io.*;
import java.net.*;

public class test{

      public static void main(String[] ad) {

      /* Init Variables */
      String filename    = "PATH_TO_FILE_HERE";
      String ContentDis  = "Content-Disposition: form-data; name=\"userfile\"; filename=\"" + filename + "\"";
      String ContentType = "Content-Type: image/jpeg"; /* Change this, or calculate it */


      try{
          /* Setup URL Object */
          URL u = new URL("URLHERE");
          URLConnection uc = u.openConnection();
          uc.setUseCaches(false);
          uc.setDoOutput(true);
          uc.setRequestProperty("Content-Type", "multipart/form-data; boundary=\"--3245234--------\"");
          /* Open File,  Setup Output Stream (Raw HTTP Request) and write to it */
          PrintStream ps = new PrintStream(uc.getOutputStream());
          File pic = new File(filename);
          FileInputStream fis = new FileInputStream(pic);
          ps.println("----3245234--------\r"); /* Boundary, requires the leading "--" */
          ps.println(ContentDis + "\r"); /* Content Disposition (Info on content) */
          ps.println(ContentType + "\r\n\r"); /* Content Type of the file */
          for(int i=fis.read(); i!=-1; i=fis.read()) { /* File Contents */
              ps.write(i);
          }
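          ps.print("\r\n"); /* End of File */
          ps.println("----3245234----------\r"); /* Closing boundary: leading "--" + boundary + trailing "--" */
          fis.close();
          /* Flush Output */
          ps.flush();
          ps.close();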
          /* Read Result */
          InputStream is = uc.getInputStream();
          for(int i=is.read(); i!=-1; i=is.read())
              System.out.write(i);
      }
      catch(IOException ioe){ioe.printStackTrace();}
      }
}

 

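Where it says name=\"userfile\", that defines what the field is called, so when you access it from, say, a PHP script it would be for instance $_FILES['userfile'], then $_FILES['userfile']['tmp_name'] etc. The name is case-sensitive on the PHP side, so it has to match what the receiving script expects (the form in the first post calls its field UserFile).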
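For anyone who wants the same idea with the body built in one place, here is a minimal sketch rather than a definitive implementation. The class name, the boundary string and the application/octet-stream type are my own assumptions; the field name UserFile comes from the form in the first post. It writes every line ending as an explicit \r\n and finishes with the closing "--boundary--" line.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class MultipartUploadSketch {

    /** Builds a multipart/form-data body holding exactly one file part. */
    static byte[] buildBody(String boundary, String fieldName, String fileName,
                            String contentType, byte[] fileData) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        byte[] head = ("--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "\r\n").getBytes(StandardCharsets.UTF_8); // blank line ends the part headers
        byte[] tail = ("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        body.write(head, 0, head.length);
        body.write(fileData, 0, fileData.length);           // raw file bytes, untouched
        body.write(tail, 0, tail.length);                   // closing delimiter
        return body.toByteArray();
    }

    /** POSTs the body and returns the response text. */
    static String post(URL url, String boundary, byte[] body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        // The HTTP-level header only announces the type and the boundary value.
        conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
        conn.setFixedLengthStreamingMode(body.length);      // sends Content-Length
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n = in.read(buf); n != -1; n = in.read(buf)) {
                response.write(buf, 0, n);
            }
            // Assumption: the page is ISO-8859-1; adjust to the server's charset.
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java MultipartUploadSketch <url> <file>");
            return;
        }
        Path file = Paths.get(args[1]);
        String boundary = "----SketchBoundary3245234";      // hypothetical value
        byte[] body = buildBody(boundary, "UserFile", file.getFileName().toString(),
                "application/octet-stream", Files.readAllBytes(file));
        System.out.println(post(new URL(args[0]), boundary, body));
    }
}
```

Sending only the file's base name in filename="..." mirrors what the Firefox capture shows, rather than the full local path.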

 

Example of the HTTP Request Firefox sends -

 

POST /aeternus/test.php HTTP/1.1
Host: aeternus.no-ip.org
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Content-Type: multipart/form-data; boundary=---------------------------41184676334
Content-Length: 174

-----------------------------41184676334
Content-Disposition: form-data; name="UserFile"; filename="test.txt"
Content-Type: text/plain

Some text in this file

Posted

thanks Aeternus, you are great!! :)

btw, how did you get Ethereal to show you the HTTP request??

When I use that software I get plenty of packets, and I can scarcely understand them...

any help??

 

thanks

Posted

They're carriage returns and line feeds - together they act as the delimiter for the HTTP headers. Each header line ends with a \r\n, which tells the server where that header stops, and an empty line (just \r\n on its own) tells it the headers are finished.

 

Sorry if that sounds confusing, but I can't think of a decent way of putting it :)

Posted

alright, thx Dave

 

so, a line feed moves to the next line, and a carriage return is??

by the way, I thought HTTP headers were set by setRequestProperty()??

plz help.

 

thanx

Posted

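HTTP headers are, but those were headers specifically for the file data, so they aren't set along with the HTTP headers via setRequestProperty().

A carriage return (\r) moves back to the start of the line and a line feed (\n) moves down to the next one. The reason it is \r\n instead of just \n is that the HTTP and MIME specs define CRLF as the line ending, which also happens to be what Windows uses for a new line. The reason some were just \r and \r\n\r is that println() is "print line", so it appends a line ending anyway, and when it actually gets sent you will have \r\n and \r\n\r\n respectively in the raw data. That only holds where the platform line separator is \n (Linux, for example); on Windows println() appends \r\n itself, so print() with an explicit \r\n is the safer choice.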
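To make the println() point concrete, here is a small sketch (class and method names are made up for illustration). It shows what println() appends on the machine it runs on, next to a header line terminated with an explicit \r\n that comes out the same everywhere.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

final class CrlfSketch {

    /** One header line ending in an explicit CRLF, whatever the platform. */
    static byte[] headerLine(String header) {
        return (header + "\r\n").getBytes(StandardCharsets.US_ASCII);
    }

    /** What println() appends on this machine, shown as escaped text. */
    static String printlnTerminator() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(sink);
        ps.println();                       // writes only the platform line separator
        ps.flush();
        return escape(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }

    /** Makes CR and LF visible as \r and \n. */
    static String escape(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        System.out.println("println() appends: " + printlnTerminator());
        System.out.println("explicit line:     " + escape(
                new String(headerLine("Content-Type: image/jpeg"), StandardCharsets.US_ASCII)));
    }
}
```

On a machine whose separator is already \r\n, the println("...\r") trick would send \r\r\n, which is why the explicit form is the more portable one.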

 

As for Ethereal, it's pretty simple really: capture on the network device that you use for net access, do what you want to watch (i.e. upload the file with Firefox or similar), then stop capturing, and it'll list all the packets that went through that device while it was capturing. A lot of this will be random network traffic, but it lists the protocol used for each packet and HTTP is one of the ones it recognises, so if you click the Protocol column heading a couple of times it will sort by protocol and you can scroll down to find the HTTP packets.

Once you've done that, click on one and it will be displayed in the bottom window, with drop-downs that you can expand. Select the HTTP headers one or the data one and it will list them, along with the raw data in hex and in plain text in another bottom window.

Posted
HTTP Headers are but those were headers specifically for the File data and so werent to be set with the HTTP Headers with setRequestProperty.

 

 

thanks Aeternus, I wonder whether you are a computer science student.... ;)

but I don't feel convinced by the above quote. Why do you "POST" a request property???

 

thx

Posted

Not sure exactly what you mean there.

 

If you mean why you can't use setRequestProperty() again, it is because the Content-Disposition and Content-Type headers are specifically for the file data, not for the HTTP request. A Content-Type header has already been set in the HTTP headers to tell the server that the data being sent is multipart/form-data, so if a second Content-Type for the file were set alongside it, the server wouldn't know which one to use.

You'll note that these headers come after the boundary. This is A) so they are not taken to be HTTP headers and B) because multiple files or file fields can be sent, each needing its own Content-Disposition, Content-Type etc., with each file (or indeed plain POST field) separated by the boundary string.
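A rough sketch of point B, with one plain field and one file field in the same body, each carrying its own headers after its own boundary line. The class name, the boundary value and the value given to Original are assumptions for illustration; the field names Original and UserFile are the ones in the form from the first post.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class MultipartFormSketch {
    private final String boundary;
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();

    MultipartFormSketch(String boundary) {
        this.boundary = boundary;
    }

    private void write(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        body.write(bytes, 0, bytes.length);
    }

    /** Plain form field: only a Content-Disposition header. */
    MultipartFormSketch addField(String name, String value) {
        write("--" + boundary + "\r\n");
        write("Content-Disposition: form-data; name=\"" + name + "\"\r\n\r\n");
        write(value + "\r\n");
        return this;
    }

    /** File field: Content-Disposition with a filename, plus its own Content-Type. */
    MultipartFormSketch addFile(String name, String fileName, String contentType, byte[] data) {
        write("--" + boundary + "\r\n");
        write("Content-Disposition: form-data; name=\"" + name
                + "\"; filename=\"" + fileName + "\"\r\n");
        write("Content-Type: " + contentType + "\r\n\r\n");
        body.write(data, 0, data.length);
        write("\r\n");
        return this;
    }

    /** Appends the closing delimiter and returns the finished body. */
    byte[] finish() {
        write("--" + boundary + "--\r\n");
        return body.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = new MultipartFormSketch("----SketchBoundary41184676334")
                .addField("Original", "mona.jpg")
                .addFile("UserFile", "test.txt", "text/plain",
                        "Some text in this file".getBytes(StandardCharsets.US_ASCII))
                .finish();
        System.out.print(new String(bytes, StandardCharsets.US_ASCII));
    }
}
```

None of these per-part headers go through setRequestProperty(); only the overall multipart/form-data Content-Type with the boundary value does.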

 

If you are asking why the method used is POST, it's simply because a GET request would be problematic: the data is sent as part of the URL (e.g. http://www.google.com/?s=booga&b=cheese), and there is generally a limit on the length of a URL, so sending file data this way would be impractical. The POST method puts the data after the HTTP request headers and in theory it can be as big as you wish, making it the prime choice for sending data this way.

There is also a PUT method, which is very similar to POST but is specifically for uploading files to a server: you actually PUT to a file, so PUT /aeternus/cheese.html would cause the server to write to the file at %HTTP_ROOT%/aeternus/cheese.html. You would usually have some sort of security involved, and Apache, for example, allows you to specify a script to handle PUT requests in a variety of situations (per directory, whole server, per location etc.). It isn't what you want in this instance because A) it isn't widely used as far as I know, B) it uploads to a specific location and would need server configuration to get a single script to handle it, whereas POST lets a normal PHP script handle it with little configuration, like you are doing, and C) POST allows multiple fields to be sent along with files using multipart/form-data, so it is more suitable for this application.

 

--------------------

 

PS - Heh, I'm not a computer science student YET. I start at Swansea University in late September, doing MEng Computing (the 4 year MEng version of the basic 3 year Computer Science course).

Posted

POST /php/blah.php HTTP/1.1\r\n 
Host: chensh.loxa.edu.tw\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-TW;...\r\n
Accept: text/xml,application/xml,application/xhtml+xml,text/html\r\n
Accept-Language: zh-tw,en-us;q=0.7,en;q=0.3\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset:...\r\n
Keep-Alive:...\r\n
Connection: keep-alive\r\n
Referer:...\r\n
Cookie: PHPSESSID=183a044cd1be3f8a471a337c3474b6d3\r\n
Content-Type: multipart/form-data; boundary=...\r\n
Content-Length: 299\r\n
\r\n

 

Thanks for your post, Matthew. If you don't think it's obnoxious, may I ask what you normally learn in Computing??? My school teaches little in the way of formal computing courses, mostly art software like Photoshop and 3DS MAX, but never coding. I am learning Java entirely on my own, you know, the hard way. *sigh*

Anyway, looking at the request above, can you tell me which parts are the HTTP request that my computer sends to the server, and which are the response???

 

thanks

Posted

btw, I am a bit too lazy to dig this out: what's the difference between AS and A level??

ps, I do know A level, by the way, which is kind of an extended version of GCSE.

Posted

 

 

GCSE: qualification at 16

A-level: qualification at 18

AS level: the first year of an A-level (the second year being A2). AS is also now a standalone qualification, so you do an AS and then do another year to make it an A-level...

Posted

As for what I did in A Level Computing, the specification is here. We used Turbo Pascal for the projects. We don't learn any Java or other languages in Computing at A Level; I've learnt a little bit in my free time, along with some other languages and ideas.

AS Level is basically half of an A Level. A Levels take 2 years; at the end of the first year you take an AS Level exam and can then choose to stop there or continue to complete the A Level. It is basically so that people who don't finish the 2 years still get something, or so you can take fewer modules in certain subjects if you want a bit extra (like I did with Further Maths).

As for your question on responses, all of that is an HTTP request. I didn't include the response. The response would be of the form -

 

HTTP/1.1 200 OK
Date: Sun, 28 Aug 2005 20:54:10 GMT
Server: Apache/2.0.54 (Gentoo/Linux) mod_ssl/2.0.54 OpenSSL/0.9.7d PHP/4.4.0
X-Powered-By: PHP/4.4.0
Content-Length: 264
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=ISO-8859-1

Files-
array(1) {
  ["UserFile"]=>
  array(5) {
    ["name"]=>
    string(8) "test.txt"
    ["type"]=>
    string(10) "text/plain"
    ["tmp_name"]=>
    string(14) "/tmp/phpfdUdCM"
    ["error"]=>
    int(0)
    ["size"]=>
    int(32)
  }
}

Post-
array(0) {
}

 

The stuff from "Files-" onwards is the actual text being sent, which is what you would see in your browser, whereas the stuff above that is the HTTP response headers. In this case the body is simply a PHP var_dump() of $_FILES and $_POST that I was using so I'd know what was going on. You can play about with this quite easily using telnet, although I have had varying degrees of success with the Windows telnet software (I use the FreeBSD implementation on Linux). The response is also visible in Ethereal if you look for the HTTP packets coming from the server's IP to your own (instead of the other way around), and there it would probably show \r\n's instead of just new lines; I got this output straight from telnet as I was in a hurry.
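If you want to separate the two halves in code rather than by eye, a small sketch like the following would do it (names are made up; it assumes the raw response is already in a String and uses \r\n line endings). The first empty line is where the headers stop and the body starts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ResponseSplitSketch {

    /** Everything before the first empty line: status line plus headers. */
    static String headerBlock(String raw) {
        int split = raw.indexOf("\r\n\r\n");
        return split < 0 ? raw : raw.substring(0, split);
    }

    /** Everything after the first empty line: what the browser would show. */
    static String body(String raw) {
        int split = raw.indexOf("\r\n\r\n");
        return split < 0 ? "" : raw.substring(split + 4);
    }

    /** Parses the "Name: value" lines, skipping the status line. */
    static Map<String, String> headers(String raw) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] lines = headerBlock(raw).split("\r\n");
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                result.put(lines[i].substring(0, colon).trim(),
                        lines[i].substring(colon + 1).trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String raw = "HTTP/1.1 200 OK\r\n"
                + "Content-Length: 264\r\n"
                + "Content-Type: text/html; charset=ISO-8859-1\r\n"
                + "\r\n"
                + "Files-\r\narray(1) {\r\n";
        System.out.println(headers(raw));
        System.out.println(body(raw));
    }
}
```

When you go through URLConnection, Java does this split for you: getHeaderField() gives the headers and getInputStream() gives only the body.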

 

You might find this link or this link useful.

Posted

oh yes,

 

as I've observed, the "boundary" is constantly changing...

do you people know what the so-called "boundary" in HTTP is??

 

thx

Posted

As far as I know it's simply a string of characters used to mark the boundary between fields/files in the POST data when dealing with the multipart/form-data content type. I'm not sure exactly why it changes, but I would assume it is to try to ensure that the boundary string doesn't appear in the file and isn't some constant that might turn up in one of the fields.

 

Link to some Info

Posted

thx Aeternus...

 

I have searched around and found two concise quotes explaining Content-Length and boundary.

"The Content-Length entity-header field indicates the size of the entity-body, in [i]decimal number of OCTETs[/i]"

I don't understand the italic bit...

 

btw, the quote below is the one I found most useful for "boundary":

"As stated previously, each body part is preceded by a boundary delimiter line that contains the boundary delimiter. The boundary delimiter MUST NOT appear inside any of the encapsulated parts, on a line by itself or as the prefix of any line. [b]This implies that it is crucial that the composing agent be able to choose and specify a unique boundary parameter value that does not contain the boundary parameter value of an enclosing multipart as a prefix.[/b]"

but the bold sentence is not clear to me. What does it mean by the "unique boundary parameter value that does not contain the boundary parameter value of an enclosing multipart as a prefix"???

 

 

any more help???

 

thx

Posted

http://en.wikipedia.org/wiki/Octet#Computers_and_networking

 

An octet is 8 bits, i.e. one byte. So it wants the number of octets (sets of 8 bits) that the data takes up, written in decimal (base ten). For instance, for a text file in ASCII (or extended ASCII, as basic ASCII is only 7 bits), each character takes up 8 bits or 1 byte/octet, so the Content-Length would be the number of characters in the body (including the boundary lines and the headers inside each part).
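A tiny sketch of the difference between counting characters and counting octets (class and method names are just for illustration, and UTF-8 is an assumed encoding). For plain ASCII text the two numbers agree; once a character needs more than one byte they don't, and Content-Length wants the byte count.

```java
import java.nio.charset.StandardCharsets;

final class ContentLengthSketch {

    /** Content-Length counts the octets (bytes) of the encoded body, not its characters. */
    static int contentLength(String body) {
        return body.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String ascii = "Some text in this file";
        String accented = "caf\u00e9";   // the last character takes two bytes in UTF-8

        System.out.println(ascii.length() + " chars, " + contentLength(ascii) + " octets");
        System.out.println(accented.length() + " chars, " + contentLength(accented) + " octets");
    }
}
```

For a binary file such as a JPEG there are no characters to count at all, so the length is simply the size of the byte array that gets written to the connection.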

 

"This implies that it is crucial that the composing agent be able to choose and specify a unique boundary parameter value that does not contain the boundary parameter value of an enclosing multipart as a prefix."

 

The wording sounds a bit complicated, but I think it's just saying that the client (a web browser or any other form of HTTP client, your Java app for instance) must pick a boundary that doesn't occur naturally in the data it is sending. It wouldn't really work as a boundary otherwise, as the server would think there was a boundary WITHIN one of the data fields/files and so would treat it as 2 data fields/files, which is not what you would want. Therefore a boundary that isn't found in the data you are sending must be used. The "enclosing multipart" bit covers nesting: if one multipart body sits inside another, the inner boundary must not start with the outer one, or the outer boundary would be matched inside the inner part.
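One way a client could do this, sketched under my own assumptions (this is not how any particular browser does it, and the names and the "----SketchBoundary" prefix are made up): generate a random candidate and try again if it happens to occur in the data.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

final class BoundarySketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Naive search: true if pattern occurs anywhere inside data. */
    static boolean contains(byte[] data, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= data.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    /** Picks a random boundary and retries if it already appears in the data. */
    static String chooseBoundary(byte[] data) {
        while (true) {
            String candidate = "----SketchBoundary" + Long.toHexString(RANDOM.nextLong());
            if (!contains(data, candidate.getBytes(StandardCharsets.US_ASCII))) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) {
        byte[] data = "Some text in this file".getBytes(StandardCharsets.US_ASCII);
        System.out.println("boundary=" + chooseBoundary(data));
    }
}
```

With a long random value a collision is so unlikely that the retry almost never triggers, which would also explain why the boundary looks different on every request.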

 

This is all as far as I know / can work out. I'm not an expert by any means, so I'm sure someone else might be able to clarify or provide more information :).

Posted

oh, thx Aeternus again...

so, what does the entity-body include?? the actual file, the boundary, and what else??

btw, from your explanation, it sounds like every file has a boundary?? ...and I must specify a boundary that's not found in the file??

and do different boundary values have any effect?? could they cause an error??

since it seems to me it is the web browser that sets the boundary for me....

 

thx for your patience for answering my ques..

 

ps, I know you are not expert because only "expert" will need to read the specs thoroughly...:)

 

*sigh* why no others are helping me in this thread?? is it the "boundary" of Science Forums's knowledge????

Posted

... It's just some random string. I don't know of a fixed length, and I don't think it's a set amount (the MIME spec, RFC 2046, does cap it at 70 characters). Usually it's given in text form, so several bytes (i.e. --3333 would be 6 bytes, --33334343 would be 10 bytes etc.). I don't think it's anything specific. Maybe I'm misunderstanding what you mean? (It's a bit late here, heh (01:35).)

Posted

I suppose there is no reason it couldn't (just have the number as the boundary value with a few random characters afterwards), but I don't think that is in the spec or actually used that way. If you simply mean, can you use the boundaries to tell how many parts there are, then sure, that's the point. The boundary, as you say, is the delimiter: it sections the POST fields up so they can be seen as separate and you know where one ends and the next begins, so you can simply go from boundary to boundary and count how many data segments there are. As far as I am aware there is no way (within the spec) of recording in the HTTP headers how many POST fields there are, but don't quote me on that. If I find out there is, I'll mention it, and would ask you to do the same :)
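The "go from boundary to boundary and count" idea could look roughly like this. It is a simplified sketch with made-up names: it treats the body as text and assumes the boundary never occurs inside the data, which is exactly what the sender is supposed to guarantee.

```java
final class PartCountSketch {

    /** Counts parts by walking from one delimiter line to the next. */
    static int countParts(String body, String boundary) {
        String delimiter = "--" + boundary;
        int count = 0;
        int pos = body.indexOf(delimiter);
        while (pos >= 0) {
            int after = pos + delimiter.length();
            if (body.startsWith("--", after)) {
                break;                      // "--boundary--" closes the whole body
            }
            count++;                        // an ordinary delimiter opens one more part
            pos = body.indexOf(delimiter, after);
        }
        return count;
    }

    public static void main(String[] args) {
        String body = "--B\r\n"
                + "Content-Disposition: form-data; name=\"Original\"\r\n\r\n"
                + "mona.jpg\r\n"
                + "--B\r\n"
                + "Content-Disposition: form-data; name=\"UserFile\"; filename=\"test.txt\"\r\n"
                + "Content-Type: text/plain\r\n\r\n"
                + "Some text in this file\r\n"
                + "--B--\r\n";
        System.out.println(countParts(body, "B") + " parts");
    }
}
```

A real parser would work on bytes and only accept the delimiter at the start of a line, but the counting logic is the same.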

Posted

Well, sorry I dont understand...

 

why is the HTTP protocol designed to delimit file content with a boundary?

and moreover, how does the browser know at what point the file content has to be delimited???
