Jump to content

Recommended Posts

Posted
Hi,


My name is Ilya Gazman. I found an exact solution for Traveling salesmen problem. Currently the best implementation according to wiki is Held–Karp algorithm that solves the problem in time O(N^2 * 2^N).


I believe that my algorithm can do this in O(N*C * 2^N) when C is a bit smaller than 1. I need your help to officially approve my work and update the wiki page.


So if this is interesting you here is how I did it:


Lets say we want to solve a 6 cities route with the brout algorithm. There are (6-1)! options for that, we will need to test them all and return the shortest route founded. So it will look something like that(Cities names are: A, B, C, D, E, F):


Option 1 : A -> B -> C -> D -> E -> F -> A

Option 2 : A -> B -> C -> D -> F -> E -> A

Option 3 : A -> C -> B -> D -> E -> F -> A

Option 4 : A -> C -> B -> D -> F -> E -> A

.

.

Option 119

Option 120


Now I am saying that after calculating option 1, you can skip over options 2 and only calculate part of options 3 and 4. How do you do that? It's simple: When calculating option 1 you need to calculate what will be the shortest route starting from City D, finishing in City A, and going thru cities E, F. Now you can skip option 2 because you already calculated it in option 1. And when you start calculating options 3 and 4 you can stop when reaching City D, because you already know what would be the shortest route starting at city D, finishing in City A and going thru cities E, F.


This is the principle that I used in my algorithm. I run a brute algorithm and mapped all the sub results, those results are not sub routes, do not confuse there. They are just part of calculation that need to be done in order to find the shortest route. So each time I recognize I am doing the same calculation I used a solution from a map.


Here is an output of my algorithm running over 19 cities, (in the attached file you can find java code that implements it).




Source(19) [10.0,65.0, 34.0,52.0, 37.0,98.0, 39.0,44.0, 43.0,37.0, 45.0,89.0, 66.0,79.0, 69.0,74.0, 7.0,76.0, 70.0,15.0, 77.0,27.0, 78.0,11.0, 78.0,13.0, 80.0,5.0, 81.0,38.0, 82.0,64.0, 87.0,7.0, 90.0,61.0, 93.0,31.0]
Finish MapEngine test after 321550 mills
Created: 20801457
Map(3) Write 2448 Read 34272
Map(4) Write 12240 Read 159120
Map(5) Write 42840 Read 514080
Map(6) Write 111384 Read 1225224
Map(7) Write 222768 Read 2227680
Map(8) Write 350064 Read 3150576
Map(9) Write 437580 Read 3500640
Map(10) Write 437580 Read 3084270
Map(11) Write 352185 Read 2344256
Map(12) Write 245131 Read 1382525
Map(13) Write 135638 Read 570522
Map(14) Write 54320 Read 156758
Map(15) Write 15077 Read 27058
Map(16) Write 2809 Read 2087
Map(17) Write 306 Read 0
Map(18) Write 18 Read 0
Map(19) Write 1 Read 0


0) 295.5947584525372> [10.0,65.0, 34.0,52.0, 39.0,44.0, 43.0,37.0, 70.0,15.0, 78.0,13.0, 78.0,11.0, 80.0,5.0, 87.0,7.0, 77.0,27.0, 93.0,31.0, 81.0,38.0, 90.0,61.0, 82.0,64.0, 69.0,74.0, 66.0,79.0, 45.0,89.0, 37.0,98.0, 7.0,76.0, 10.0,65.0]


Source(19) is the input cities. It took my PC 321550 mills to calculate, (about 5 minutes). Created: 20801457 represent the number of atomic actions that the algorithm performed(about 20M actions). Map(3) speaks about the number of maps with 3 cities that been created. It created 2448 3 cities maps and used them 34272 times.


To calculate the efficiency of my algorithm all you need to do is to sum all the maps that he produce, then you will get the answer.


So the number of maps that my algorithm will produce with K cities size in N cities route will be: The number of times I can select the first city of my map: N, multiplies the number of times I can choose different selection of my cities from the remaining cities: (n-1)! / ((n - k - 1)! * (k-1)!). Thas come to n! / ((n - k - 1)! * (k-1)!). Assuming that creating a map of size 3 is an atomic action, then my algorithm efficiency will be the sum of all those maps.


So my algorithm have the next efficiency.




N * (N - 1) * (N - 2) / 2! + N * (N - 1) * (N - 2) * (N - 3) / 3! + N * (N - 1) * (N - 2) * (N - 3) (N -4) / 4! + ... N! / (N - 1)! = N * (N - 1) * (N - 2) / 2! + N * (N - 1) * (N - 2) * (N - 3) / 3! + N * (N - 1) * (N - 2) * (N - 3) (N -4) / 4! + ... N



Now lets solve this efficient algorithm with N from 7 to 100, and compare it to the previous results(result of N = 9 with N =8, result of N = 24 with N = 23). I found out that for big numbers of N the comparison result is 2. Then I did the same with the traditional dynamic programing algorithm efficiency. Here is the list of what I got:




7 2.55769 2.72222 2.98397
8 2.40601 2.61224 2.74973
9 2.31562 2.53125 2.60507
10 2.2582 2.46913 2.50912
11 2.21972 2.42 2.44169
12 2.19258 2.38016 2.39191
13 2.17251 2.34722 2.35356
14 2.15701 2.31952 2.32293
15 2.14456 2.29591 2.29774
16 2.13424 2.27555 2.27652
17 2.12548 2.25781 2.25832
18 2.1179 2.24221 2.24248
19 2.11124 2.22839 2.22853
20 2.10533 2.21606 2.21614
21 2.10003 2.205 2.20503
22 2.09525 2.19501 2.19503
23 2.09091 2.18595 2.18596
24 2.08696 2.17769 2.17769
25 2.08333 2.17013 2.17014
26 2.08 2.1632 2.1632
27 2.07692 2.1568 2.1568
28 2.07407 2.15089 2.15089
29 2.07142 2.1454 2.1454
30 2.06896 2.1403 2.1403
31 2.06666 2.13555 2.13555
32 2.06451 2.13111 2.13111
33 2.0625 2.12695 2.12695
34 2.0606 2.12304 2.12304
35 2.05882 2.11937 2.11937
36 2.05714 2.11591 2.11591
37 2.05555 2.11265 2.11265
38 2.05405 2.10956 2.10956
39 2.05263 2.10664 2.10664
40 2.05128 2.10387 2.10387
41 2.05 2.10125 2.10125
42 2.04878 2.09875 2.09875
43 2.04761 2.09637 2.09637
44 2.04651 2.0941 2.0941
45 2.04545 2.09194 2.09194
46 2.04444 2.08987 2.08987
47 2.04347 2.0879 2.0879
48 2.04255 2.08601 2.08601
49 2.04166 2.0842 2.0842
50 2.04081 2.08246 2.08246
51 2.04 2.0808 2.0808
52 2.03921 2.0792 2.0792
53 2.03846 2.07766 2.07766
54 2.03773 2.07618 2.07618
55 2.03703 2.07475 2.07475
56 2.03636 2.07338 2.07338
57 2.03571 2.07206 2.07206
58 2.03508 2.07079 2.07079
59 2.03448 2.06956 2.06956
60 2.03389 2.06837 2.06837
61 2.03333 2.06722 2.06722
62 2.03278 2.06611 2.06611
63 2.03225 2.06503 2.06503
64 2.03174 2.06399 2.06399
65 2.03125 2.06298 2.06298
66 2.03076 2.06201 2.06201
67 2.0303 2.06106 2.06106
68 2.02985 2.06014 2.06014
69 2.02941 2.05925 2.05925
70 2.02898 2.05839 2.05839
71 2.02857 2.05755 2.05755
72 2.02816 2.05673 2.05673
73 2.02777 2.05594 2.05594
74 2.02739 2.05516 2.05516
75 2.02702 2.05441 2.05441
76 2.02666 2.05368 2.05368
77 2.02631 2.05297 2.05297
78 2.02597 2.05228 2.05228
79 2.02564 2.05161 2.05161
80 2.02531 2.05095 2.05095
81 2.025 2.05031 2.05031
82 2.02469 2.04968 2.04968
83 2.02439 2.04907 2.04907
84 2.02409 2.04848 2.04848
85 2.0238 2.0479 2.0479
86 2.02352 2.04733 2.04733
87 2.02325 2.04678 2.04678
88 2.02298 2.04624 2.04624
89 2.02272 2.04571 2.04571
90 2.02247 2.04519 2.04519
91 2.02222 2.04469 2.04469
92 2.02197 2.04419 2.04419
93 2.02173 2.04371 2.04371
94 2.0215 2.04324 2.04324
95 2.02127 2.04277 2.04277
96 2.02105 2.04232 2.04232
97 2.02083 2.04188 2.04188
98 2.02061 2.04144 2.04144
99 2.0204 2.04102 2.04102
100 2.0202 2.0406 2.0406



Column 1 is N, column 2 is my algorithm efficiency compare, column 3 is dynamic programming algorithm compare and column 4 is my algorithm efficiency multiply N compare.


See how column 3 and 4 are almost the same. This is how I found it. Please verify my work, take a look at the code, tell me if you agree or not with me. If not please show me where my algorithm or my math is not working by exact sample.

CityMapSample2.zip

Posted (edited)

If city has location x,y

then distance between two cities is

sqrt((city1.x-city2.x)^2+(city1.y-city2.y)^2)

 

sqrt() operation is slow to cpu.

It might be possible to get ride of it during the main calculation because

sqrt(x)>sqrt(y)

is returning the same as

x>y

We are interested what is greater after all, not what is exact value, isn't?

 

sqrt() use in final calculation of path length for user, where speed is not important.

 

pow(x,y) with any y is also slow.

For such simple thing as ^2 I would never thought about using pow(), simply x*x

 

private double mathOption(City city1, City city2){
double powerX = Math.pow(city1.x - city2.x, 2);
double powerY = Math.pow(city1.y - city2.y, 2);
return Math.sqrt(powerX + powerY);
}

 

Change to:

private double mathOption(City city1, City city2){
double dx = city1.x - city2.x;
double dy = city1.y - city2.y;

dx *= dx;

dy *= dy;
return ( dx + dy );
}

 

And try whether it's running faster.

 

 

Did you thought about precalculating data?

 

If you have x cities, you can allocate x^2 distance table and fill it at beginning

And then reuse instead of calculating over and over again distance from city to city[j].

But this would require using index to city, instead of City object.

 

distance=distances[ i*count+j ]

i = 0...count-1

j = 0...count-1

 

Dynamic tables are slow, they require reallocating and copying data.

In the worstest implementation of dynamic table, it's extended every added element to array. Inserting element at beginning must end up with copying and pasting the all elements (the only way to get ride of it, is using double linked lists. I doubt java uses them).

If we know how many elements table will have (city count), better is to allocate static table and then fill, instead of relying on dynamic table/array.

(I noticed this in Path class)

Edited by Sensei
Posted

Thank you Quark, for reviewing the code. This indeed will improve the speed. As you probably saw more improvements can be done. This is just a sample and it's purpose to explain the algorithm. This is why I used pow. Another improvement could be working with integers, instead of doubles. And only use double for the full calculation(Of course it will require more code and testing if this is possible, for the current given input of the cities, and I think it will be possible in 99% of the cases).

 

Another problem of the code, is that it is using Ram memory and not database, even so db is slower, it will allow to perform bigger calculation. On my debugger, I am getting out of memory for N = 20;

Posted

Another problem of the code, is that it is using Ram memory and not database, even so db is slower, it will allow to perform bigger calculation. On my debugger, I am getting out of memory for N = 20;

 

The problem is that Path and therefor PathList contain copies of the all Cities.

If you have ArrayList<Object> it's taking count * sizeof( Object ) memory. sizeof( City ) = 20, so count * 20 for each Path at least.

You really don't want whole structure of City copied there. What you need is just 32 bit index, 4 bytes long, to City from table of cities.

It will reduce immediately memory requirement to 20% what you have now.

Using unsigned short, 16 bit, would reduce number of cities to 16384, and memory requirement to 10% of now.

Posted

Just a sub-sub-detail: integers are often slower than doubles or floats. It's often the case for multiplication - just because Cpu designers don't pay much attention to integers - and always for division, because integer division uses an exact algorithm which take about one cycle per bit (or slightly less now) while double division and sqrt are approximate and iterative algorithms that double the accurate bits at each iteration.

 

Presently, Cpu take one cycle per multiply or add on doubles (or even on double pairs, some Cpu on double quadruplets) while an int multiply can need several cycles. And did I read that since the Core2, integer computations are made by the Sse hardware? Though, if building hardware purposely, integers do save gate counts and may be faster, sure.

 

Sorry for interrupting, I understand operation speed is not a part of the algorithm's intrinsic speed. Any improvement on such a well-known (...and useless!) problem would be remarquable.

Posted

What said Enthalpy is about SSE extension: it's used when you enable it in C/C++ compiler, and get used usually when you are using vector math, f.e. adding/subtracting/multiplying vector with 4 elements by other vector with 4 elements. Basically execute instruction doing multiple maths in one.

http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions

Visual Studio 2008 Express is supporting only no-SSE, SSE1, SSE2.

I am not sure how it's with newer Express and commercial Visual Studio.

But you're using Java, not C/C++.

 

From experience with very math heavy ray-tracing engine I can say you that speed difference between no-SEE and SSE1 compiled project was 8% and between SSE1 and SSE2 there was also 8% (no-SSE -> SSE2 ~15%).

It was tested by compiling project after switching compiler option and trying it to render the same 3d scene.

SSE are truly useful when you have a lot of data to process, and they are in very simple form, like adding-subtracting-multiplying one image from another image (so there is continuous array of RGB/RGBA pixels), and programmer will call SSE processor functions manually.

 

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.