albertlee
Posted August 27, 2005

    <form action="/php/X_24_demo.php" name="DemoForm" method="POST" enctype="multipart/form-data" onsubmit="return Final_Check();">
      <input type="file" name="UserFile">
      <input type="hidden" name="Original">
      <input type="submit" value="Send File">
    </form>

The above is taken partially from http://chensh.loxa.edu.tw/php/X_24_demo.php. It is the form part of an HTML file. Since my peer has made this simple PHP file, I would like to write a Java program that can POST a small file and show the output, yes, without a browser. Since I know what the output should look like (via a browser, of course), I can tell whether my program works.

But I have had no success uploading the file: the output is the same as the original page (I can tell by looking at the HTML that gets printed out), whereas the correct result has an output message in addition to the original one. You can try that yourself via a browser, of course.

Below is my Java program:

    import java.io.*;
    import java.net.*;

    public class test1{
        public static void main(String[] ad){
            try{
                URL u = new URL("http://chensh.loxa.edu.tw/php/X_24_demo.php");
                URLConnection uc = u.openConnection();
                uc.setUseCaches(false);
                uc.setDoOutput(true);
                uc.setRequestProperty("Content-type", "multipart/form-data");
                PrintStream ps = new PrintStream(uc.getOutputStream());
                ps.print(URLEncoder.encode("file=C:\\Documents and Settings\\Albert Lee\\桌面\\JAVA\\AWT\\progressiveupdate\\mona.jpg"));
                /*File pic = new File("C:/Documents and Settings/Albert Lee/桌面/JAVA/AWT/progressiveupdate/mona.jpg");
                FileInputStream fis = new FileInputStream(pic);
                for(int i=fis.read(); i!=-1; i=fis.read())
                    ps.write(i);
                */
                ps.flush();
                ps.close();
                InputStream is = uc.getInputStream();
                for(int i=is.read(); i!=-1; i=is.read())
                    System.out.write(i);
            }
            catch(IOException ioe){ioe.printStackTrace();}
        }
    }

As you can see from the commented lines, I've tried POSTing an actual file (the commented part) as well as a URL-encoded text message (the uncommented part). Does anyone know what I am lacking or doing wrong? Thanks.
Aeternus
Posted August 27, 2005

I'm by no means a Java expert, but after a little fiddling around with Ethereal, watching how Firefox sends multipart form data, I found that you have to specify a boundary for the file contents and should also specify Content-Disposition and Content-Type fields specifically for the POST data, in addition to the normal HTTP headers. I was stumped for a bit as it wasn't working, but found the boundary needed a trailing "--". Here's the working version:

    import java.io.*;
    import java.net.*;

    public class test{
        public static void main(String[] ad) {
            /* Init Variables */
            String filename = "PATH_TO_FILE_HERE";
            String ContentDis = "Content-Disposition: form-data; name=\"userfile\"; filename=\"" + filename + "\"";
            String ContentType = "Content-Type: image/jpeg"; /* Change this, or calculate it */
            try{
                /* Setup URL Object */
                URL u = new URL("URLHERE");
                URLConnection uc = u.openConnection();
                uc.setUseCaches(false);
                uc.setDoOutput(true);
                uc.setRequestProperty("Content-Type", "multipart/form-data; boundary=\"--3245234--------\"");

                /* Open File, Setup Output Stream (Raw HTTP Request) and write to it */
                PrintStream ps = new PrintStream(uc.getOutputStream());
                File pic = new File(filename);
                FileInputStream fis = new FileInputStream(pic);
                ps.println("----3245234--------\r");    /* Boundary, requires the leading "--" */
                ps.println(ContentDis + "\r");          /* Content Disposition (Info on content) */
                ps.println(ContentType + "\r\n\r");     /* Content Type of the file */
                for(int i=fis.read(); i!=-1; i=fis.read()) { /* File Contents */
                    ps.write(i);
                }
                ps.print("\r\n"); /* End of File */

                /* Flush Output */
                ps.flush();
                ps.close();

                /* Read Result */
                InputStream is = uc.getInputStream();
                for(int i=is.read(); i!=-1; i=is.read())
                    System.out.write(i);
            }
            catch(IOException ioe){ioe.printStackTrace();}
        }
    }

Where it says name=\"userfile\", that defines what the field is called, so when you access it from, say, a PHP script it would be for instance $_FILES['userfile'], then $_FILES['userfile']['tmp_name'] etc.

Example of the HTTP request Firefox sends:

    POST /aeternus/test.php HTTP/1.1
    Host: aeternus.no-ip.org
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive
    Content-Type: multipart/form-data; boundary=---------------------------41184676334
    Content-Length: 174

    -----------------------------41184676334
    Content-Disposition: form-data; name="UserFile"; filename="test.txt"
    Content-Type: text/plain

    Some text in this file
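For comparison, here is a minimal sketch of how the body of such a request could be assembled with every line ending written explicitly and with the closing delimiter ("--" + boundary + "--") that the multipart format ends on. The class name `MultipartBodySketch`, the method `buildBody` and the boundary value are made up for illustration; they are not from the code above, and this is an assumption-laden sketch rather than a drop-in replacement.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the bytes of a one-file multipart/form-data body.
class MultipartBodySketch {
    private static final String CRLF = "\r\n";

    static byte[] buildBody(String boundary, String fieldName, String fileName,
                            String contentType, byte[] fileBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        // Opening delimiter is "--" + boundary, followed by the part's own headers
        // and one empty line that separates them from the file content.
        String partHeaders =
                "--" + boundary + CRLF
              + "Content-Disposition: form-data; name=\"" + fieldName
              + "\"; filename=\"" + fileName + "\"" + CRLF
              + "Content-Type: " + contentType + CRLF
              + CRLF;
        byte[] head = partHeaders.getBytes(StandardCharsets.UTF_8);
        out.write(head, 0, head.length);

        // Raw file content, written as bytes so binary data is not re-encoded.
        out.write(fileBytes, 0, fileBytes.length);

        // Closing delimiter: "--" + boundary + "--".
        byte[] tail = (CRLF + "--" + boundary + "--" + CRLF).getBytes(StandardCharsets.UTF_8);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = buildBody("3245234", "UserFile", "test.txt", "text/plain",
                "Some text in this file".getBytes(StandardCharsets.UTF_8));
        System.out.print(new String(body, StandardCharsets.UTF_8));
    }
}
```

To send it, the connection would get `setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary)` and the returned bytes would be written to `uc.getOutputStream()`.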
albertlee (Author)
Posted August 27, 2005

Thanks Aeternus, you are great!! Btw, how did you get Ethereal to show the HTTP request? When I use that software, I get plenty of packets and I can scarcely understand them... any help?? Thanks
albertlee (Author)
Posted August 27, 2005

Moreover, what are those "\r\n", "\r\n\r", and so on for?? thx
Dave
Posted August 28, 2005

They're carriage returns and line feeds. It's basically a delimiter for the HTTP headers: each header is followed by a \r\n, and that tells the server that what you've just typed in is a header. Sorry if that sounds confusing, but I can't think of a decent way of putting it.
albertlee (Author)
Posted August 28, 2005

Alright, thx Dave. So a line feed moves to the next line, and a carriage return is?? By the way, I thought HTTP headers are set by setRequestProperty()?? Please help. Thanks
Aeternus
Posted August 28, 2005

HTTP headers are, but those were headers specifically for the file data, so they weren't to be set along with the HTTP headers through setRequestProperty(). The reason it is \r\n instead of \n is that the HTTP and MIME specifications define a carriage return followed by a line feed (CRLF) as the line terminator, which is the same convention Windows uses for new lines in text files. The reason some were just \r and \r\n\r is that println() is "print line", so it appends a newline character anyway; when it actually gets sent you will have \r\n and \r\n\r\n respectively in the raw data. (That holds where the platform's line separator is \n. On Windows println() appends \r\n itself, so print() with an explicit "\r\n" is the safer choice.)

In regards to Ethereal, it is pretty simple really. Capture on the network device that you use for net access, do what you want to watch (i.e. upload the file with Firefox or similar), stop capturing, and then it'll list all packets that went through that device while it was capturing. A lot of this will be random network traffic, but it lists the protocol being used for each packet, and HTTP is one of the ones it recognises, so if you click the protocol column a couple of times it will sort by protocol and you can scroll down to find the HTTP packets. Once you've done that, click on one and it will be displayed in the bottom window, offering drop-downs that you can select. Select the HTTP headers one or the data one and it will list them, along with the raw data in hex and in plain text in another bottom window.
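The println() point can be checked locally without a server. The following is only a sketch with made-up names (`LineEndingSketch`, `viaPrintln`, `viaPrint`): it captures what a PrintStream actually writes, so the two styles can be compared on whatever platform it runs on.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: compare println(x + "\r") with print(x + "\r\n").
class LineEndingSketch {
    // What println() really emits: the text, "\r", then the platform line separator.
    static String viaPrintln(String header) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(buf);
        ps.println(header + "\r");
        ps.flush();
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // What print() emits: exactly the text plus the CRLF we wrote ourselves.
    static String viaPrint(String header) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(buf);
        ps.print(header + "\r\n");
        ps.flush();
        return new String(buf.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Makes the invisible characters visible for printing.
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String header = "Content-Type: image/jpeg";
        System.out.println("println: " + visible(viaPrintln(header)));
        System.out.println("print:   " + visible(viaPrint(header)));
    }
}
```

Where the line separator is "\n" both lines end in \r\n; where it is "\r\n" the println variant ends in \r\r\n, which is why writing the CRLF explicitly is more predictable.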
albertlee (Author)
Posted August 28, 2005

[quote]HTTP Headers are but those were headers specifically for the File data and so werent to be set with the HTTP Headers with setRequestProperty.[/quote]

Thanks Aeternus, I wonder whether you are a computer science student... but I don't feel convinced by the above quote. Why do you "POST" request properties??? thx
Aeternus
Posted August 28, 2005

Not sure exactly what you mean there. If you mean why can't you use setRequestProperty() again, it is because the Content-Disposition and Content-Type headers are specifically for the file data, not for the HTTP request. A Content-Type header has already been set in the HTTP headers to inform the server that the data being sent is multipart/form-data, so if another Content-Type declaration for the file were defined alongside those headers it would be extremely confusing and the server wouldn't know which to use. You'll note that these headers come after the boundary. This is A) so they are not assumed to be HTTP headers and B) because multiple files/file fields can be sent, each requiring its own Content-Disposition, Content-Type etc. fields, with each file (or indeed plain POST field) separated by the boundary string.

If you are asking why the method used is POST, this is simply because a GET request would be problematic: the data is sent as part of the URL, and there is generally a limit on the length of a URL (i.e. http://www.google.com/?s=booga&b=cheese etc.), so sending file data this way would be impractical. The POST method appends the data after the HTTP request headers, and it can in theory be as big as you wish, making it the prime choice for sending data this way.

There is also a PUT method, which is extremely similar to POST but is specifically for uploading files to a server. You actually PUT to a file: PUT /aeternus/cheese.html would cause the server to write to a file at %HTTP_ROOT%/aeternus/cheese.html, although you would usually have some sort of security involved, and Apache for example allows you to specify a script to handle PUT requests in a variety of situations (per directory, whole server, per location etc.). It isn't what you want in this instance because A) it isn't widely used as far as I know, B) it uploads to a specific location and would require server configuration to get a single script to handle it, whereas POST allows a normal PHP script to handle it with little configuration, like you are doing, and C) POST allows multiple fields to be used along with files via multipart/form-data, so it is more suitable for this application.

--------------------

PS - Heh, I'm not a computer science student YET. I start at Swansea University in late September doing MEng Computing (the 4-year MEng version of the basic 3-year Computer Science course). Link.
albertlee (Author)
Posted August 28, 2005

    POST /php/blah.php HTTP/1.1\r\n
    Host: chensh.loxa.edu.tw\r\n
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-TW;...\r\n
    Accept: text/xml,application/xml,application/xhtml+xml,text/html\r\n
    Accept-Language: zh-tw,en-us;q=0.7,en;q=0.3\r\n
    Accept-Encoding: gzip,deflate\r\n
    Accept-Charset:...\r\n
    Keep-Alive:...\r\n
    Connection: keep-alive\r\n
    Referer:...\r\n
    Cookie: PHPSESSID=183a044cd1be3f8a471a337c3474b6d3\r\n
    Content-Type: multipart/form-data; boundary=...\r\n
    Content-Length: 299\r\n
    \r\n

Thanks for your post, Matthew, and if you don't think it's obnoxious, may I ask what you normally learn in Computing??? As for me, my school does little formal computing teaching; it is generally art, like Photoshop, 3DS MAX... but never coding. I learn Java solely by myself, you know, the hard way. *sigh*

Anyway, as you see from the above result, can you tell me which are the HTTP requests that my computer sets before connecting to the server, and which are the responses??? Thanks
albertlee (Author)
Posted August 28, 2005

Btw, I am a bit lazy to dig this out: what's the difference between AS and A level?? PS, I know A level by the way, which is kind of an extended version of GCSE.
Klaynos
Posted August 28, 2005

[quote]btw, I am abit lazy to dig this out, what's the difference between AS and A level?? ps, I know A level by the way, which is kinda extended version of GCSE.[/quote]

GCSE: qualification at 16
A-level: qualification at 18
AS level: the first year of an A-level (the second year being A2). AS is also now a stand-alone qualification, so you do an AS and then another year to make it an A-level.
Aeternus
Posted August 28, 2005

As for what I did in A Level Computing, the specification is here. We used Turbo Pascal for the projects. We don't learn any Java or any other languages in Computing at A Level; I've learnt a little bit in my free time, along with some other languages and ideas.

AS Level is basically half of an A Level. A Levels take 2 years; at the end of the first year you take an AS Level exam and can then choose to stop there or continue to complete the A Level. It is basically so that people who don't finish the 2 years still get something, or so you can take fewer modules for certain things if you want a bit extra (like I did with Further Maths).

As for your question on responses, all of that is an HTTP request. I didn't include the response. The response would be of the form:

    HTTP/1.1 200 OK
    Date: Sun, 28 Aug 2005 20:54:10 GMT
    Server: Apache/2.0.54 (Gentoo/Linux) mod_ssl/2.0.54 OpenSSL/0.9.7d PHP/4.4.0
    X-Powered-By: PHP/4.4.0
    Content-Length: 264
    Keep-Alive: timeout=15, max=100
    Connection: Keep-Alive
    Content-Type: text/html; charset=ISO-8859-1

    Files-
    array(1) {
      ["UserFile"]=>
      array(5) {
        ["name"]=>
        string(8) "test.txt"
        ["type"]=>
        string(10) "text/plain"
        ["tmp_name"]=>
        string(14) "/tmp/phpfdUdCM"
        ["error"]=>
        int(0)
        ["size"]=>
        int(32)
      }
    }
    Post-
    array(0) {
    }

The stuff from "Files-" onwards is the actual text being sent, which you would see in your browser, whereas the stuff above that is the HTTP response headers. In this case it is simply a var_dump() in PHP of $_FILES and $_POST that I was using so I'd know what was going on. You can play about with this quite easily using telnet, although I have had varying degrees of success with the Windows telnet software (I use the FreeBSD implementation on Linux).

The response is also available from Ethereal if you look for the HTTP packets coming from the server's IP to your own (instead of the other way around). There it would show \r\n's instead of just new lines; I got this output straight from telnet as I was in a hurry. You might find this link or this link useful.
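The split between response headers and the text the browser shows can be sketched in a few lines of Java. This is an illustration only: the class `ResponseSplitSketch` and its methods are hypothetical names, and it assumes the whole raw response is already available as one string with \r\n line endings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: separates a raw HTTP response into headers and body.
class ResponseSplitSketch {
    // The first empty line (CRLF CRLF) marks the end of the headers.
    private static final String HEADER_END = "\r\n\r\n";

    static Map<String, String> parseHeaders(String raw) {
        int end = raw.indexOf(HEADER_END);
        String head = end < 0 ? raw : raw.substring(0, end);
        String[] lines = head.split("\r\n");
        Map<String, String> headers = new LinkedHashMap<>();
        // lines[0] is the status line ("HTTP/1.1 200 OK"), so start at 1.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
        return headers;
    }

    static String body(String raw) {
        int end = raw.indexOf(HEADER_END);
        return end < 0 ? "" : raw.substring(end + HEADER_END.length());
    }

    public static void main(String[] args) {
        String raw = "HTTP/1.1 200 OK\r\n"
                   + "Content-Type: text/html; charset=ISO-8859-1\r\n"
                   + "Connection: Keep-Alive\r\n"
                   + "\r\n"
                   + "Files- array(1) { ... }";
        System.out.println(parseHeaders(raw));
        System.out.println(body(raw));
    }
}
```

In the original program `URLConnection` already does this separation: `getInputStream()` returns only the body, and `getHeaderField("Content-Type")` reads an individual response header.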
albertlee (Author)
Posted August 28, 2005

Wow, the A Level computing course looks stunning!! Thanks for the help and the link btw, Aeternus!!
albertlee (Author)
Posted August 28, 2005

Oh yes, as I've observed, the "boundary" is constantly changing... do you people know what the so-called "boundary" in HTTP is?? thx
Aeternus
Posted August 28, 2005

As far as I know it's simply a string of characters that marks the boundary between fields/files in the POST data when the content type is multipart/form-data. I'm not sure exactly why it changes, but I would assume it changes to try to ensure that the boundary string doesn't appear in the file and isn't some constant that might appear in one of the fields. Link to some info
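One way a client could pick such a boundary is to generate a random string and retry if it happens to occur in the data. The sketch below is an assumption about how this might be done, not how Firefox actually does it; `BoundarySketch`, `randomBoundary`, `contains` and `boundaryFor` are names invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

// Hypothetical helper: choose a boundary that does not occur in the payload.
class BoundarySketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    // A fixed prefix plus 16 random characters from 0-9 and a-z.
    static String randomBoundary() {
        StringBuilder sb = new StringBuilder("----sketch");
        for (int i = 0; i < 16; i++) {
            sb.append(Character.forDigit(RANDOM.nextInt(36), 36));
        }
        return sb.toString();
    }

    // Naive byte search; fine for a sketch, slow for very large files.
    static boolean contains(byte[] data, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= data.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) continue outer;
            }
            return true;
        }
        return false;
    }

    // Retry until the delimiter ("--" + boundary) is absent from the data.
    static String boundaryFor(byte[] data) {
        String boundary;
        do {
            boundary = randomBoundary();
        } while (contains(data, ("--" + boundary).getBytes(StandardCharsets.US_ASCII)));
        return boundary;
    }

    public static void main(String[] args) {
        byte[] file = "Some text in this file".getBytes(StandardCharsets.US_ASCII);
        System.out.println("boundary=" + boundaryFor(file));
    }
}
```

With 16 random characters a collision is so unlikely that many clients skip the check entirely, which would explain why the value looks different on every request.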
albertlee (Author)
Posted August 29, 2005

thx Aeternus... I have searched around and found two concise quotes explaining Content-Length and boundary.

[quote]The Content-Length entity-header field indicates the size of the entity-body, in [i]decimal number of OCTETs[/i][/quote]

I don't understand the italic bits... Btw, the quote below is what I've found the most useful for "boundary":

[quote]As stated previously, each body part is preceded by a boundary delimiter line that contains the boundary delimiter. The boundary delimiter MUST NOT appear inside any of the encapsulated parts, on a line by itself or as the prefix of any line. [b]This implies that it is crucial that the composing agent be able to choose and specify a unique boundary parameter value that does not contain the boundary parameter value of an enclosing multipart as a prefix.[/b][/quote]

But the bold line is not clear to me. What does it mean by the "unique boundary parameter value that does not contain the boundary parameter value of an enclosing multipart as a prefix"??? Any more help??? thx
Aeternus
Posted August 29, 2005

http://en.wikipedia.org/wiki/Octet#Computers_and_networking

An octet is 8 bits, or one byte. So it wants the number of octets (sets of 8 bits) that the data takes up, written in decimal (base ten) form. For instance, if it were a text file, you are most likely representing the letters in ASCII (or extended ASCII, as basic ASCII is only 7 bits), which takes up 8 bits or 1 byte/octet per character, and so the Content-Length would be the number of characters (including boundary characters etc.).

[quote]This implies that it is crucial that the composing agent be able to choose and specify a unique boundary parameter value that does not contain the boundary parameter value of an enclosing multipart as a prefix.[/quote]

The wording sounds a bit complicated, but I think it's just saying that the client (a web browser or any form of HTTP client, your Java app for instance) must pick a boundary that doesn't occur naturally in the data it is sending. It wouldn't really work as a boundary otherwise, as the server would think there was a boundary WITHIN one of the data fields/files and so would assume it was 2 data fields/files, and this is not what you would want. Therefore a boundary that isn't found in the data you are sending must be used.

This is all as far as I know / can work out. I'm not an expert by any means; I'm sure someone else might be able to clarify or provide more information.
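The difference between counting characters and counting octets shows up as soon as the text is not plain ASCII. A small sketch, with the hypothetical class `ContentLengthSketch` and method `octets` introduced only for this example:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: Content-Length counts encoded bytes, not characters.
class ContentLengthSketch {
    static int octets(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "Some text in this file";
        // Pure ASCII text: one octet per character.
        System.out.println(ascii.length() + " chars, "
                + octets(ascii, StandardCharsets.US_ASCII) + " octets");

        // The folder name from the path in the first post, written with
        // Unicode escapes so the source file encoding does not matter.
        String desktop = "\u684C\u9762";
        System.out.println(desktop.length() + " chars, "
                + octets(desktop, StandardCharsets.UTF_8) + " octets in UTF-8");
    }
}
```

For a multipart request the value is the size of the whole body: every delimiter line, every part header, every CRLF and the file bytes together.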
albertlee (Author)
Posted August 29, 2005

Oh, thx Aeternus again... So, what does the entity-body include?? The actual file, the boundary, and what else?? Btw, from your explanation it sounds like any file has a boundary?? ...must the client specify a boundary that's not found in the file?? And is there any effect from different boundary values?? Can they cause any error?? It seems to be the web browser that sets the boundary for me... thx for your patience in answering my questions.

PS, I know you are not an expert, because only an "expert" will need to read the specs thoroughly... *sigh* Why are no others helping me in this thread?? Is it the "boundary" of Science Forums's knowledge????
albertlee (Author)
Posted August 29, 2005

Btw, what's the unit for the value of the boundary?? Bit, byte, or what?
Aeternus
Posted August 30, 2005

... It's just some random string/data set. I don't know if there's a limit, but I don't think it's a set amount. Usually it's specified in text form, so several bytes?? (i.e. --3333 would be 6 bytes, --33334343 would be 10 bytes etc.). I don't think it's anything specific. Maybe I'm misunderstanding what you mean? (It's a bit late here, heh (01:35).)
albertlee (Author)
Posted August 30, 2005

A bit late?? Maybe playing around with the computer?? OK, thx. So does the boundary tell how many parts are being delimited??
Aeternus
Posted August 30, 2005

I suppose there is no reason it couldn't (just have the number as the boundary value with a few random characters afterwards), but I don't think that is in the spec or actually used that way. If you simply mean "can you use the boundaries to tell how many parts there are", then sure, that's the point. The boundary, as you say, is the delimiter: it sections each POST field up so they can be seen as separate and you know where one ends and the next begins. You can simply go from boundary to boundary to boundary and count how many data segments there are. As far as I am aware there is no way (within the spec) of recording in the HTTP headers how many POST fields there are, but don't quote me on that. If I find out there is, I'll mention it, and would ask you to do the same.
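Going "from boundary to boundary" can be sketched as a simple scan. This is a simplified illustration: `PartCountSketch` and `countParts` are made-up names, the body is assumed to fit in a string, and the scan does not verify that each delimiter sits at the start of a line as a real parser would.

```java
// Hypothetical helper: count the parts of a multipart body by its delimiters.
class PartCountSketch {
    static int countParts(String body, String boundary) {
        String delimiter = "--" + boundary;
        int count = 0;
        int idx = body.indexOf(delimiter);
        while (idx >= 0) {
            int after = idx + delimiter.length();
            // "--" + boundary + "--" is the closing delimiter: no part follows it.
            if (body.startsWith("--", after)) {
                break;
            }
            count++;
            idx = body.indexOf(delimiter, after);
        }
        return count;
    }

    public static void main(String[] args) {
        String body =
              "--B\r\n"
            + "Content-Disposition: form-data; name=\"Original\"\r\n"
            + "\r\n"
            + "some value\r\n"
            + "--B\r\n"
            + "Content-Disposition: form-data; name=\"UserFile\"; filename=\"test.txt\"\r\n"
            + "Content-Type: text/plain\r\n"
            + "\r\n"
            + "Some text in this file\r\n"
            + "--B--\r\n";
        System.out.println(countParts(body, "B") + " parts");
    }
}
```

The example body carries one plain field and one file field, which is the situation the form in the first post produces.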
albertlee (Author)
Posted August 30, 2005

Well, sorry, I don't understand... why is the HTTP protocol designed to delimit file content via a boundary? And moreover, how does the browser know at what point the file content has to be delimited???