CSC 5542 Neural Networks  Summer, 2004 

Assignment #2

Due Date: June 29, 2004

 

TSP Neural Network.

 

Reference: "A Fuzzy Hopfield-Tank TSP Model" -- in class notes.

 

 

Implement a neural network that "solves" the Euclidean Traveling Salesman Problem.  To do this you will represent the problem as an nxn array (grid) of neurons, where states that have one maximal neural activation in each row and column correspond to a tour (permutation matrix).  The neural activations will be initialized to very small random values, and then the network dynamics will drive the network into a state that corresponds to a permutation matrix (or close to it). 

 

Convergence Problems:  Unfortunately, there may be cases when the neural activations do not converge to a state that corresponds, unambiguously, to a permutation matrix.  First of all, the activations will only approach the maximum and minimum values (e.g.: .98), but you should have seen something similar with the KWTA network.  The best way to handle this is to pick a threshold value (say, .80), and then check the activations in the array at each iteration for "convergence".

 

Convergence Criteria:  If there is a unique neuron in each row and column with activation above the threshold, then halt the simulation and declare that permutation matrix to be the "output".

 

It is still possible that the activations will not satisfy these criteria.  So, you should have an absolute maximum number of iterations (Note:  I am not sure what a good value is for this -- I would guess at 10,000 iterations, but half that might be sufficient).  If most simulation runs reach the maximum number of iterations without satisfying the convergence criteria, then you should lower the threshold to 0.60 and try again.  The lower threshold may cause an occasional premature convergence, but it will also make the network run faster (fewer iterations).  In other words, with a higher threshold you may get slightly better tours, but it will cost you a little more run time.

 

Fuzzy Read Out:  In a creative attempt to "read" the network activations at every iteration (i.e.: convert the activations into a tour even when the activations are very small and when there is no clear winner in each row and column)  I invented what I call the "fuzzy read out".  In short, this means finding the center of mass of the positive activations in each row and then ordering the cities according ling.   The center of mass calculation is simplified by finding the maximum activation, and the corresponding column index, and then doing the calculation for a window of plus or minus "base", where "base" is set to a value of 2 or 3.  Also, we must pay attention to wrap around, so you will see some modular arithmetic in the pseudo-code below.

 

Here is the fuzzy read out calculation:

 

1.  Assuming the grid indices:

columns (time stops):           i

rows (cities):              j

 

2. Set: base = 2         // this is the width of the center of mass calculation.

 

3. Center of Mass calculation:        

for j = 0 to N-1 // for each city

            // calculate the max activation and corresponding index:

            max_act = -999999999

            i_max = -1

            for i = 0 to N-1

                        if act(i, j) > max_act then

max_act = act(i, j)

i_max = i

                                    end if

            next

           

            // calculate the center of mass:

            sum1 = 0

            sum2 = 0

            for k = i_max - base to i_max + base

                        if a((N+k)%N, j) > 0 then

                                    sum1 = sum1 + k * a((N+k)%N, j)

                                    sum2 = sum2 + a((N+k)%N, j)

                        end if

            next

 

            // now store the center of mass ("value") and the corresponding city index:

           

            center(j).value = sum1/sum2            // this is the "center of mass"

            center(j).city  = j                                 // this is the city associated with that center of mass

next

 

// Now SORT center by value (use a sort routine).

// After the sort, center(0).value, center(1).value, ...., center(N-1).value should be an increasing sequence // of center of mass values,

// And, we should have a corresponding sequence of cities: center(0).city, center(1).city, etc.

// So:

            for j = 0 to N-1

                        tour(j) = center(j).city

            next

// That's it.  The sequence of cities in tour(j) specifies the tour.

 

                       

Simulation Parameters:

 

Use an nxn grid of neurons.

 

n(i,x) is the neuron at the ith column (time stop) and xth row (city).

 

a(i,x) is the activation of neuron(i,x).

 

w(i,x,j,y) is the connection strength between neuron(i,x) and neuron(j,y).

 

net(i,x) = Sy Sj (w(i,x,j,y) * a(j,y))

Note: external input = 0.

 

Row connections:

if y = x and j <> i :

w(i,x,j,y) = 1/n2 - 1/n

 

Column connections:

if y <> x and j = i :

w(i,x,j,y) = 1/n2 - 1/n

 

Self connections:

            if y = x and j = i :

w(i,x,j,y) = 1/n2 - 2/n

 

Distance connections (neighboring columns):

            if y <> x and j = i+1 or j = i-1 :

w(i,x,j,y) = 1/n2 - d(x,y)/n

 

All other connections:

            if j <> i-1, i, i+1,  and y <> x :

                        w(i,x,j,y) = 1/n2                       

 

Maximum neural activation:

            M = 1

 

Minimum neural activation:

            m = -1/(n-1)

 

Dynamics:

            a(i,x)t+1 = a(i,x)t   +  step * (a(i,x)t - m) * (M - a(i,x)t) * net(i,x)t

 

Step size:

            step = 1.0

 

Initial activations:

            very small random numbers:

between 0 and 10-10

 

Convergence:

 

Use the convergence criteria described above.

 

City positions: In a unit square:

 

Case1: uniformly placed on a circle of radius = 0.25, centered at (0.5, 0.5)

           

            Case2: random (x,y) coordinates:

                        0 <= x <= 1

                        0 <= y <= 1

 

Number of cities:

 

            n = 20.

 

Post on your home page:

 

Write a report about this implementation, including:

 

A. An introduction that describes the TSP problem and the general idea for the neural network approach.  Include observations you made after seeing the simulations, including any ideas you have for what made, or could make, the network better (get shorter tours more consistently in fewer iterations, etc.).

 

B. A copy of the code.

 

C. Show the specific results for running the network twice for each case (case1, case2), with the following:

 

                        1. City positions.

                        2. Tour that resulted.

                        3. Drawing of the tour.

                        4. Length of the tour.

                        5. Final activations of the network.

6. Number of iterations for convergence.

7. A plot of the energy as a function of iteration.

 

D.  Convergence Issues:

 

1. What "convergence criteria" did you use? (How did your program know when to stop the simulation run?).

 

2.  It is assumed that you did more than a couple of simulation runs.  Approximately what percent of the time did the network converge to an ambiguous state (no clear winner in each row and column)?