Multi-Core computing

It's of interest to note that raw instruction speed is reaching a local if not absolute upper bound. Therefore, creating faster machines can only be done with increased parallel processing: adding multiple CPU sockets, or more efficiently and more likely, adding more cores to each CPU (core = independent processor within a given physical processing unit). Currently Intel offers both Dual and Quad core CPUs. They are effectively 2 and 4 independent processing units, respectively.

I have an Intel Core 2 Duo 2.66. Until now, I have not attempted to speed up my apps through multi-threading (since I have always had a single socketed desktop rather than a multi-socketed server, and therefore haven't until now stood to gain anything from it). Now, with this new chip however,  running more than one thread halves the completion time of CPU intensive tasks. For example: below is a simple program that attacks a problem with one and two threads and shows the respective times. If they're the same, then the chip doesn't support N-core technology (or doesn't have more than one socket on the mb); if they're different, then the 2nd one will be faster, and the difference is how much the 2nd thread was able to complete with the 1st thread concurrently.

On my 1.8 ghz P4 (at work), this program reports 18 seconds with 1 thread and 20 seconds with 2 threads (but is +- a few seconds, so basically the same).

At home on my 2.66 ghz Core 2 Duo, this takes ~4.3 s. and ~2.1 s. respectively. Each of my cores is 4 times faster than the 1.8 ghz P4, and if I multithread it, it takes 1/8th the total time on a Core 2 Duo (2.66) as it does on a 1.8 ghz P4.

'kb3/07 threading
Imports System.Threading
Module Module1
 
Dim N As Long
Dim M As Long = 2 'number of threads
Dim t(M - 1) As Thread
Delegate Sub ModTasks(ByVal N As Long)
Delegate Sub Printf(ByVal value As Object)
Delegate Function GetfromUser() As Object
 
 
 
Sub DoRegular(ByVal N As Long)
ModTask(New Long() {1, N, N})
End Sub
Sub DoWithThreads(ByVal N As Long)
For k As Integer = 0 To UBound(t)
Dim i As Long = k + 1
t(k) = New Thread(AddressOf ModTask)
t(k).Start(New Long() {(i - 1) * (N) \ M + 1, i * N \ M, N})
Next
For k As Integer = 0 To UBound(t)
t(k).Join()
Next
End Sub
Sub PrintConsole(ByVal value As Object)
Console.WriteLine(value.ToString)
End Sub
Function GetFromConsole() As Object
Return Console.ReadLine
End Function

 

Sub Main()
 
N = 402542656
Dim dR As New ModTasks(AddressOf DoRegular)
Dim dT As New ModTasks(AddressOf DoWithThreads)
Dim dP As New Printf(AddressOf PrintConsole)
Dim dG As New GetfromUser(AddressOf GetFromConsole)
 
Dim ticks As Long = Now.Ticks
dP.Invoke("Please wait while the operations are performed in one thread.")
dR.Invoke(N)
dP.Invoke("It took " & (Now.Ticks - ticks) / 10000000 & " s. single threaded.")
dP.Invoke("Doing the same thing with two threads, each doing half...")
ticks = Now.Ticks
dT.Invoke(N)
dP.Invoke("It took " & (Now.Ticks - ticks) / 10000000 & " s. double threaded.")
dG.Invoke()
End Sub
 
 
Sub ModTask(ByVal obj As Object)
Dim lb As Long = CLng(obj(0))
Dim ub As Long = CLng(obj(1))
Dim num As Long = CLng(obj(2))
For k As Long = lb To ub
Dim x As Long = num Mod k
Next
End Sub
End Module

 

 

Vista purportedly takes more advantage of multi-core CPUs than previous generation OSs. This will improve Vista's performance for multi-tasking but it doesn't improve the performance when running one application. I think in many years, a single application running on a single (state of the art) processor will perform about as well as that same application performs now on a single (state of the art) processor. Of course, computers may have 1000 or more of these processors per machine (1000 processors contained in a handful, or even just 1, physical CPU).

This change in the paradigm however, should impact the computer market negatively once consumers get wise to it;  although now everyone appears to be mostly oblivious to it.  Programmers need to take advantage of multi-core processing architecture, and they must change the way they write programs to do so. Previously, chips just ran faster and faster and the programmer didn't have to do much but run his old serial program on a faster CPU. It ran faster because the next gen CPU was simply able to complete instruction sets in less time. Voila, things got done quicker. Unfortunately, parallel programming doesn't work this way. I, as the programmer, must farm out independent tasks to different threads, then cobble them together into a coherent data structure or step and decide what to do with it. And I have to recode and recompile it. With just the standard executable in my hand, all the multi-processing cores in the world won't help. The code will execute whenever the fastest chip, running by itself, finishes the tasks.

Adding threads to programs is easy in general, but it's more than just additional steps.  If I had 80 cores just now on my CPU (as Intel just debuted a short bit ago), I could certainly make use of them.

I would divide the particle system into 80 parts, and for the main computational subroutine, spawn a thread to handle each part and update the results to its own region of a common data structure (an array of a particle object). After all of these had completed, the program performs the tasks required to display the particle system on the screen.

There is one little problem here though. I have to wait for all of the threads to finish. I've launched 80 concurrent processes, each of them referencing the same 'computation' procedure and passing different arguments (like which part of the common data structure each will update). The sub then waits for them to finish. Suppose one of those threads encountered a problem. It would be very hard to track down because, instead of erring out on one statement, and only one statement, in a very long sequence of instructions, I have a fork that goes in 80 different directions at a certain point. Debugging will be a nightmare if I don't know which thread is broken, and if I am viewing parameter values, it's going to be difficult if not impossible to see what the Nth thread was doing when it broke. I haven't tried "breaking" my data structure so that just one thread gets baked when it runs (I only have two threads at home, but I'll try this later) to see what VS.net 2005 will report.

Instructions that require previous results are thwarted by parallel processing. Animals use massive parallel processing. The human mind is the latest and best available animal computer. It uses many millions of slow, parallel processors (where clusters of neurons count as 'processors' where individual neurons better relate to transistor clusters) for everything from vision to thought. It cannot churn out millions of instructions per second sequentially, and in fact, serially, it's much slower than the earliest serial computers.

What it does do however is launch millions of threads simultaneously and cobble them all back together into a coherent object or idea after a few milliseconds.  This is the destiny of computers one way or another; and this process is incomprehensibly complicated and muddled. The human mind as a parallel processing program is unreadable. These future machines will be likewise unreadable, but not because they were designed by trial and error dictated only by random culling of populations and the exchange of DNA over billions of years as our brains were, but because it isn't possible to serialize parallel processes precisely. A rough cut can be done at a very slow pace (because each thread must be plucked and serialized), but the specifics of each process will remain largely unknown.

In the end, computers will be able to generate a familiar world because they will have become more like us.  In order to continue evolving, they must. The laws of physics are becoming increasingly heavy and limiting. Machines are currently the least like us that they will ever be: They're enormously fast, binary, singular, computing objects that push the boundaries of physics in one direction, but their design is as simple as it could be. Trivially simple. In order to improve, they will have to interact with themselves in chaotic ways as our neurons do. In the end, they'll have the best of both worlds.

3/09/07 kurt

---------------------

Monty Hall game, computed

 
I first heard of the Monty Hall problem in 1993 while an undergrad. It was mentioned in my discrete math class. Basically, the problem is this.
 
There are 3 closed doors. Behind one of the doors is a prize. Behind the other two are goats. You wish to choose the door behind which there is a prize.
So, you choose a door. After you choose, the game show host will open one of the other two doors to reveal a goat.
 
You are then asked if you wish to stay with the door you first selected or switch to the other still closed door. What do you do, or really, which choice gives you the best chance of choosing the door with the prize behind it?
 
The intuitive answer is to say, well, there are just two doors left, and it could be in either, so my chance is 50/50.
 
However, if you compute it mathematically, it suggests you have a 2/3 chance of winning if you switch and a 1/3 chance of winning if you stick with the door you chose initially.
 
The solution 2/3-1/3 is of course arrived at by noting that your chances of choosing the door with the prize initially is 1/3, and that you only have two choices: switch or stay. If you have 2 choices, and the sum of the individual probabilities is always 1, then (1-1/3) = 2/3  is the probability of winning if you  switch.
 
I programmed this in a simple C# program which can be used to demonstrate that in fact, you get a round a 2/3 win rate if you switch and around 1/3 if you don't switch over thousands of trials (or as many trials as you wish to use)
 

 

When running this for 10,000 trials each (10,000 switching and 10,000 not switching), the results are

Which is quite close to the theoretical 2/3, 1/3 analytic expected value.

 

 

using System;
using System.Collections.Generic;
using System.Text;
 
 
namespace goats
{
class Program
{
static System.Random R = new Random();
static int pick(int N)
{
double d = (double)R.Next((int)1, (int)N);
return (int)Math.Round(d, 0);
}// end int pick
static bool noSwitch()
{
int goat = pick(4);
int mychoice = pick(4);
if (goat == mychoice) { return true; }
else { return false; };
}
static bool Switch()
{
int goat = pick(4);
int mychoice = pick(4);
//Console.WriteLine(goat.ToString() + ", " + mychoice.ToString());
if (goat != mychoice) { return true; }
else { return false; };
}
static void Main(string[] args)
{
Console.Write("Enter the number of trials-> ");
int num = int.Parse (Console.ReadLine());
if (num < 1) { num = 1; };
int intNoSwitch = 0;
int intSwitch = 0;
for (int k = 0; k < num; k++)
{
if (noSwitch() == true) { /*Console.WriteLine("NoSwitch=true");*/ intNoSwitch++; };
if (Switch() == true) { /*Console.WriteLine("Switch=true");*/ intSwitch++; };
 
};
Console.WriteLine("Person switched: {0} wins in {1} trials->Win(%) {2}",intSwitch,num,(double)100*intSwitch / num);
Console.WriteLine("Person did not switch: {0} wins in {1} trials->Win(%){2} " ,intNoSwitch, num, (double)100*intNoSwitch / num);
Console.ReadLine();
}// end void main
}// end class program
}

 

kurt bingham 5/16/07

 
Home