Ilya Gazman Posted October 13, 2013 Posted October 13, 2013 Hi, My name is Ilya Gazman. I found an exact solution for Traveling salesmen problem. Currently the best implementation according to wiki is Held–Karp algorithm that solves the problem in time O(N^2 * 2^N). I believe that my algorithm can do this in O(N*C * 2^N) when C is a bit smaller than 1. I need your help to officially approve my work and update the wiki page. So if this is interesting you here is how I did it: Lets say we want to solve a 6 cities route with the brout algorithm. There are (6-1)! options for that, we will need to test them all and return the shortest route founded. So it will look something like that(Cities names are: A, B, C, D, E, F): Option 1 : A -> B -> C -> D -> E -> F -> A Option 2 : A -> B -> C -> D -> F -> E -> A Option 3 : A -> C -> B -> D -> E -> F -> A Option 4 : A -> C -> B -> D -> F -> E -> A . . Option 119 Option 120 Now I am saying that after calculating option 1, you can skip over options 2 and only calculate part of options 3 and 4. How do you do that? It's simple: When calculating option 1 you need to calculate what will be the shortest route starting from City D, finishing in City A, and going thru cities E, F. Now you can skip option 2 because you already calculated it in option 1. And when you start calculating options 3 and 4 you can stop when reaching City D, because you already know what would be the shortest route starting at city D, finishing in City A and going thru cities E, F. This is the principle that I used in my algorithm. I run a brute algorithm and mapped all the sub results, those results are not sub routes, do not confuse there. They are just part of calculation that need to be done in order to find the shortest route. So each time I recognize I am doing the same calculation I used a solution from a map. Here is an output of my algorithm running over 19 cities, (in the attached file you can find java code that implements it). Source(19) [10.0,65.0, 34.0,52.0, 37.0,98.0, 39.0,44.0, 43.0,37.0, 45.0,89.0, 66.0,79.0, 69.0,74.0, 7.0,76.0, 70.0,15.0, 77.0,27.0, 78.0,11.0, 78.0,13.0, 80.0,5.0, 81.0,38.0, 82.0,64.0, 87.0,7.0, 90.0,61.0, 93.0,31.0] Finish MapEngine test after 321550 mills Created: 20801457 Map(3) Write 2448 Read 34272 Map(4) Write 12240 Read 159120 Map(5) Write 42840 Read 514080 Map(6) Write 111384 Read 1225224 Map(7) Write 222768 Read 2227680 Map(8) Write 350064 Read 3150576 Map(9) Write 437580 Read 3500640 Map(10) Write 437580 Read 3084270 Map(11) Write 352185 Read 2344256 Map(12) Write 245131 Read 1382525 Map(13) Write 135638 Read 570522 Map(14) Write 54320 Read 156758 Map(15) Write 15077 Read 27058 Map(16) Write 2809 Read 2087 Map(17) Write 306 Read 0 Map(18) Write 18 Read 0 Map(19) Write 1 Read 0 0) 295.5947584525372> [10.0,65.0, 34.0,52.0, 39.0,44.0, 43.0,37.0, 70.0,15.0, 78.0,13.0, 78.0,11.0, 80.0,5.0, 87.0,7.0, 77.0,27.0, 93.0,31.0, 81.0,38.0, 90.0,61.0, 82.0,64.0, 69.0,74.0, 66.0,79.0, 45.0,89.0, 37.0,98.0, 7.0,76.0, 10.0,65.0] Source(19) is the input cities. It took my PC 321550 mills to calculate, (about 5 minutes). Created: 20801457 represent the number of atomic actions that the algorithm performed(about 20M actions). Map(3) speaks about the number of maps with 3 cities that been created. It created 2448 3 cities maps and used them 34272 times. To calculate the efficiency of my algorithm all you need to do is to sum all the maps that he produce, then you will get the answer. So the number of maps that my algorithm will produce with K cities size in N cities route will be: The number of times I can select the first city of my map: N, multiplies the number of times I can choose different selection of my cities from the remaining cities: (n-1)! / ((n - k - 1)! * (k-1)!). Thas come to n! / ((n - k - 1)! * (k-1)!). Assuming that creating a map of size 3 is an atomic action, then my algorithm efficiency will be the sum of all those maps. So my algorithm have the next efficiency. N * (N - 1) * (N - 2) / 2! + N * (N - 1) * (N - 2) * (N - 3) / 3! + N * (N - 1) * (N - 2) * (N - 3) (N -4) / 4! + ... N! / (N - 1)! = N * (N - 1) * (N - 2) / 2! + N * (N - 1) * (N - 2) * (N - 3) / 3! + N * (N - 1) * (N - 2) * (N - 3) (N -4) / 4! + ... N Now lets solve this efficient algorithm with N from 7 to 100, and compare it to the previous results(result of N = 9 with N =8, result of N = 24 with N = 23). I found out that for big numbers of N the comparison result is 2. Then I did the same with the traditional dynamic programing algorithm efficiency. Here is the list of what I got: 7 2.55769 2.72222 2.98397 8 2.40601 2.61224 2.74973 9 2.31562 2.53125 2.60507 10 2.2582 2.46913 2.50912 11 2.21972 2.42 2.44169 12 2.19258 2.38016 2.39191 13 2.17251 2.34722 2.35356 14 2.15701 2.31952 2.32293 15 2.14456 2.29591 2.29774 16 2.13424 2.27555 2.27652 17 2.12548 2.25781 2.25832 18 2.1179 2.24221 2.24248 19 2.11124 2.22839 2.22853 20 2.10533 2.21606 2.21614 21 2.10003 2.205 2.20503 22 2.09525 2.19501 2.19503 23 2.09091 2.18595 2.18596 24 2.08696 2.17769 2.17769 25 2.08333 2.17013 2.17014 26 2.08 2.1632 2.1632 27 2.07692 2.1568 2.1568 28 2.07407 2.15089 2.15089 29 2.07142 2.1454 2.1454 30 2.06896 2.1403 2.1403 31 2.06666 2.13555 2.13555 32 2.06451 2.13111 2.13111 33 2.0625 2.12695 2.12695 34 2.0606 2.12304 2.12304 35 2.05882 2.11937 2.11937 36 2.05714 2.11591 2.11591 37 2.05555 2.11265 2.11265 38 2.05405 2.10956 2.10956 39 2.05263 2.10664 2.10664 40 2.05128 2.10387 2.10387 41 2.05 2.10125 2.10125 42 2.04878 2.09875 2.09875 43 2.04761 2.09637 2.09637 44 2.04651 2.0941 2.0941 45 2.04545 2.09194 2.09194 46 2.04444 2.08987 2.08987 47 2.04347 2.0879 2.0879 48 2.04255 2.08601 2.08601 49 2.04166 2.0842 2.0842 50 2.04081 2.08246 2.08246 51 2.04 2.0808 2.0808 52 2.03921 2.0792 2.0792 53 2.03846 2.07766 2.07766 54 2.03773 2.07618 2.07618 55 2.03703 2.07475 2.07475 56 2.03636 2.07338 2.07338 57 2.03571 2.07206 2.07206 58 2.03508 2.07079 2.07079 59 2.03448 2.06956 2.06956 60 2.03389 2.06837 2.06837 61 2.03333 2.06722 2.06722 62 2.03278 2.06611 2.06611 63 2.03225 2.06503 2.06503 64 2.03174 2.06399 2.06399 65 2.03125 2.06298 2.06298 66 2.03076 2.06201 2.06201 67 2.0303 2.06106 2.06106 68 2.02985 2.06014 2.06014 69 2.02941 2.05925 2.05925 70 2.02898 2.05839 2.05839 71 2.02857 2.05755 2.05755 72 2.02816 2.05673 2.05673 73 2.02777 2.05594 2.05594 74 2.02739 2.05516 2.05516 75 2.02702 2.05441 2.05441 76 2.02666 2.05368 2.05368 77 2.02631 2.05297 2.05297 78 2.02597 2.05228 2.05228 79 2.02564 2.05161 2.05161 80 2.02531 2.05095 2.05095 81 2.025 2.05031 2.05031 82 2.02469 2.04968 2.04968 83 2.02439 2.04907 2.04907 84 2.02409 2.04848 2.04848 85 2.0238 2.0479 2.0479 86 2.02352 2.04733 2.04733 87 2.02325 2.04678 2.04678 88 2.02298 2.04624 2.04624 89 2.02272 2.04571 2.04571 90 2.02247 2.04519 2.04519 91 2.02222 2.04469 2.04469 92 2.02197 2.04419 2.04419 93 2.02173 2.04371 2.04371 94 2.0215 2.04324 2.04324 95 2.02127 2.04277 2.04277 96 2.02105 2.04232 2.04232 97 2.02083 2.04188 2.04188 98 2.02061 2.04144 2.04144 99 2.0204 2.04102 2.04102 100 2.0202 2.0406 2.0406 Column 1 is N, column 2 is my algorithm efficiency compare, column 3 is dynamic programming algorithm compare and column 4 is my algorithm efficiency multiply N compare. See how column 3 and 4 are almost the same. This is how I found it. Please verify my work, take a look at the code, tell me if you agree or not with me. If not please show me where my algorithm or my math is not working by exact sample. CityMapSample2.zip 1
Sensei Posted October 14, 2013 Posted October 14, 2013 (edited) If city has location x,y then distance between two cities is sqrt((city1.x-city2.x)^2+(city1.y-city2.y)^2) sqrt() operation is slow to cpu. It might be possible to get ride of it during the main calculation because sqrt(x)>sqrt(y) is returning the same as x>y We are interested what is greater after all, not what is exact value, isn't? sqrt() use in final calculation of path length for user, where speed is not important. pow(x,y) with any y is also slow. For such simple thing as ^2 I would never thought about using pow(), simply x*x private double mathOption(City city1, City city2){ double powerX = Math.pow(city1.x - city2.x, 2); double powerY = Math.pow(city1.y - city2.y, 2); return Math.sqrt(powerX + powerY); } Change to: private double mathOption(City city1, City city2){ double dx = city1.x - city2.x; double dy = city1.y - city2.y; dx *= dx; dy *= dy; return ( dx + dy ); } And try whether it's running faster. Did you thought about precalculating data? If you have x cities, you can allocate x^2 distance table and fill it at beginning And then reuse instead of calculating over and over again distance from city to city[j]. But this would require using index to city, instead of City object. distance=distances[ i*count+j ] i = 0...count-1 j = 0...count-1 Dynamic tables are slow, they require reallocating and copying data. In the worstest implementation of dynamic table, it's extended every added element to array. Inserting element at beginning must end up with copying and pasting the all elements (the only way to get ride of it, is using double linked lists. I doubt java uses them). If we know how many elements table will have (city count), better is to allocate static table and then fill, instead of relying on dynamic table/array. (I noticed this in Path class) Edited October 14, 2013 by Sensei
Ilya Gazman Posted October 14, 2013 Author Posted October 14, 2013 Thank you Quark, for reviewing the code. This indeed will improve the speed. As you probably saw more improvements can be done. This is just a sample and it's purpose to explain the algorithm. This is why I used pow. Another improvement could be working with integers, instead of doubles. And only use double for the full calculation(Of course it will require more code and testing if this is possible, for the current given input of the cities, and I think it will be possible in 99% of the cases). Another problem of the code, is that it is using Ram memory and not database, even so db is slower, it will allow to perform bigger calculation. On my debugger, I am getting out of memory for N = 20;
Sensei Posted October 14, 2013 Posted October 14, 2013 Another problem of the code, is that it is using Ram memory and not database, even so db is slower, it will allow to perform bigger calculation. On my debugger, I am getting out of memory for N = 20; The problem is that Path and therefor PathList contain copies of the all Cities. If you have ArrayList<Object> it's taking count * sizeof( Object ) memory. sizeof( City ) = 20, so count * 20 for each Path at least. You really don't want whole structure of City copied there. What you need is just 32 bit index, 4 bytes long, to City from table of cities. It will reduce immediately memory requirement to 20% what you have now. Using unsigned short, 16 bit, would reduce number of cities to 16384, and memory requirement to 10% of now.
Ilya Gazman Posted October 14, 2013 Author Posted October 14, 2013 Its not working this way. You should read about shallow copy. I do not copy the cities but only the pointers.
Enthalpy Posted October 14, 2013 Posted October 14, 2013 Just a sub-sub-detail: integers are often slower than doubles or floats. It's often the case for multiplication - just because Cpu designers don't pay much attention to integers - and always for division, because integer division uses an exact algorithm which take about one cycle per bit (or slightly less now) while double division and sqrt are approximate and iterative algorithms that double the accurate bits at each iteration. Presently, Cpu take one cycle per multiply or add on doubles (or even on double pairs, some Cpu on double quadruplets) while an int multiply can need several cycles. And did I read that since the Core2, integer computations are made by the Sse hardware? Though, if building hardware purposely, integers do save gate counts and may be faster, sure. Sorry for interrupting, I understand operation speed is not a part of the algorithm's intrinsic speed. Any improvement on such a well-known (...and useless!) problem would be remarquable.
Sensei Posted October 14, 2013 Posted October 14, 2013 Pointer on 64 bit machine is 8 bytes, while 16 bit index is 2 bytes, so it's 4x less memory.
Ilya Gazman Posted October 15, 2013 Author Posted October 15, 2013 Organism you are the man! I had no idea about anything of this. I just posted my second part of the algorithm. You are welcome to read it. Also there is a link to chat with me one of the post comments. http://cs.stackexchange.com/questions/16076/traveling-salesman-heldkarp-algorithm-big-improvement
Sensei Posted October 15, 2013 Posted October 15, 2013 What said Enthalpy is about SSE extension: it's used when you enable it in C/C++ compiler, and get used usually when you are using vector math, f.e. adding/subtracting/multiplying vector with 4 elements by other vector with 4 elements. Basically execute instruction doing multiple maths in one. http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions Visual Studio 2008 Express is supporting only no-SSE, SSE1, SSE2. I am not sure how it's with newer Express and commercial Visual Studio. But you're using Java, not C/C++. From experience with very math heavy ray-tracing engine I can say you that speed difference between no-SEE and SSE1 compiled project was 8% and between SSE1 and SSE2 there was also 8% (no-SSE -> SSE2 ~15%). It was tested by compiling project after switching compiler option and trying it to render the same 3d scene. SSE are truly useful when you have a lot of data to process, and they are in very simple form, like adding-subtracting-multiplying one image from another image (so there is continuous array of RGB/RGBA pixels), and programmer will call SSE processor functions manually.
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now