Voice from the field

Downloads: Demo Project

Task vs Thread

May 13 2016

Many developers believe that a task always runs in a separate thread. Honestly, I also thought that tasks run in separate threads to gain performance, but that is not always true.

I believed tasks work as in the following example:

  1. The user interacts with the WinForms app by clicking the "Open" button (A).
  2. The app calls Web service S1 (B) to get one portion of the data.
  3. The app calls Web service S2 (C) to get another portion of the data.
  4. The app merges the responses and presents the model in the UI for the user (D).

The process has 3 threads. Thread #1 is the main STA thread; it starts the tasks and handles other activity, such as responding to UI interaction, while the tasks execute.

Thread #2 is a pool thread that executes the first task and retrieves data from Web service S1.

Thread #3 is a pool thread that executes the second task and retrieves data from Web service S2.

At time position P1 the process has 2 threads executing tasks. At position P2 only thread #2 is still running: the other task has finished, the result has been updated with its data, and its thread has returned to its initial state. The main thread is still waiting for the remaining data to arrive.

At the end, "D", the process has all the data delivered and bound and can continue executing further instructions.

 

In C# code it might look like this:

C#

private void button1_Click(object sender, EventArgs e)
{
     Task.Run(() => GetDataFromWebOne()); // runs on pool thread #2
     Task.Run(() => GetDataFromWebTwo()); // runs on pool thread #3
}

private static async Task GetDataFromWebOne()
{
    /* load data */
    return;
}

private static async Task GetDataFromWebTwo()
{
    /* load data */
    return;
}

So the "threading" scenario usually happens when the code calls Task.Run(..), but it does not happen when the code simply awaits a task!
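To make the difference concrete, here is a minimal console sketch that records which thread starts an async method when it is called directly and when it is wrapped in Task.Run. The class and method names (ThreadIdDemo, GetDataAsync, CompareAsync) are hypothetical, and Task.Delay is only a stand-in for a real web service call.

```csharp
using System;
using System.Threading.Tasks;

public static class ThreadIdDemo
{
    // Hypothetical stand-in for a web service call. It returns the ID of the
    // thread that executed the synchronous part of the method (before the await).
    public static async Task<int> GetDataAsync()
    {
        int startThreadId = Environment.CurrentManagedThreadId;
        await Task.Delay(100); // simulated I/O wait; no thread is blocked here
        return startThreadId;
    }

    public static async Task<(int caller, int direct, int viaTaskRun)> CompareAsync()
    {
        int caller = Environment.CurrentManagedThreadId;

        // Called directly: the method starts running on the calling thread.
        Task<int> direct = GetDataAsync();

        // Wrapped in Task.Run: the method is queued to the thread pool.
        Task<int> viaTaskRun = Task.Run(() => GetDataAsync());

        return (caller, await direct, await viaTaskRun);
    }

    public static void Main()
    {
        var result = CompareAsync().GetAwaiter().GetResult();
        Console.WriteLine($"caller thread:          {result.caller}");
        Console.WriteLine($"started directly on:    {result.direct}");
        Console.WriteLine($"started by Task.Run on: {result.viaTaskRun}");
    }
}
```

When run from the main thread of a console app, the first two IDs are the same and the third is typically a different (pool) thread.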

 

However, in the real world a task does NOT need to run in a separate thread. Why? Simply because the number of threads is limited. Moreover, the overhead of creating, starting and synchronizing threads wastes CPU.

Asynchronous tasks were invented to deal with this problem :) Let's look at the same example using the async/await pattern. No extra threads, no wasted CPU.

  1. The user interacts with the WinForms app by clicking the "Open" button (A).
  2. The app calls Web service S1 (B) to get one portion of the data.
  3. The app calls Web service S2 (C) to get another portion of the data.
  4. The app merges the responses and presents the model in the UI for the user (D).

 

All requests are executed in the same thread (the main STA thread). While the code waits for responses from the Web services, it is still able to process UI interaction.

The magic is hidden in the awaiting of the task. At time position P1 the code simply follows the straightforward logic and awaits task #1. Execution then continues and starts awaiting task #2 (position P2). By position P3 the code has run two tasks and awaited them while remaining responsive to user interaction. It basically switches between executing task #1, task #2 and other code. How? "It's a state machine, baby!" :)

Under the hood, the C# compiler generates a state machine for each async method, and the runtime resumes it by moving it from one state to the next. At position P3 the main thread switches between the state machines and its other work so that all of them get processed.
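To give a feel for what such a state machine does, here is a heavily simplified, hand-written analogue of a method with two awaits. It is only a sketch under my own assumptions, not the code the compiler really generates: the class TwoAwaitsStateMachine and its members are hypothetical, and details such as synchronization contexts are left out.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public sealed class TwoAwaitsStateMachine
{
    private readonly Func<Task> _first;
    private readonly Func<Task> _second;
    private readonly TaskCompletionSource<bool> _done = new TaskCompletionSource<bool>();
    private TaskAwaiter _awaiter;
    private int _state; // 0 = not started, 1 = after first await, 2 = after second, -1 = finished

    public TwoAwaitsStateMachine(Func<Task> first, Func<Task> second)
    {
        _first = first;
        _second = second;
    }

    public Task Completion => _done.Task;
    public int State => _state;

    // Each call runs the method body up to the next incomplete await and returns.
    public void MoveNext()
    {
        try
        {
            switch (_state)
            {
                case 0:
                    _awaiter = _first().GetAwaiter();
                    _state = 1;
                    if (!_awaiter.IsCompleted)
                    {
                        _awaiter.OnCompleted(MoveNext); // resume here later
                        return;                         // the thread is free again
                    }
                    goto case 1;
                case 1:
                    _awaiter.GetResult(); // rethrows if the first task failed
                    _awaiter = _second().GetAwaiter();
                    _state = 2;
                    if (!_awaiter.IsCompleted)
                    {
                        _awaiter.OnCompleted(MoveNext);
                        return;
                    }
                    goto case 2;
                case 2:
                    _awaiter.GetResult();
                    _state = -1;
                    _done.SetResult(true);
                    return;
            }
        }
        catch (Exception ex)
        {
            _state = -1;
            _done.SetException(ex);
        }
    }
}

public static class StateMachineProgram
{
    public static void Main()
    {
        var machine = new TwoAwaitsStateMachine(() => Task.Delay(100), () => Task.Delay(100));
        machine.MoveNext(); // returns at the first incomplete await
        Console.WriteLine($"state after first MoveNext: {machine.State}");
        machine.Completion.Wait();
        Console.WriteLine($"state at the end: {machine.State}");
    }
}
```

The point of the sketch is that MoveNext never blocks: whenever an awaited task is not finished yet, it registers itself as the continuation and returns to the caller.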

 

In C# it looks like the following example:

C#

private async void button2_Click(object sender, EventArgs e)
{
     /* P1 */
     await GetDataFromWebOne();
     /* P2 */
     await GetDataFromWebTwo();
     /* P3 */
     /*
     ... do something or leave the function
     */
     /* P4 */
}
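Note that with two consecutive awaits the second request starts only after the first one has completed. If the two calls are independent, both can be started first and awaited together, still without extra threads. The sketch below compares the two approaches; the class OverlapDemo is hypothetical and Task.Delay stands in for the real web service calls.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class OverlapDemo
{
    // Hypothetical stand-ins for the two web service calls.
    public static async Task<string> GetDataFromWebOne() { await Task.Delay(300); return "S1"; }
    public static async Task<string> GetDataFromWebTwo() { await Task.Delay(300); return "S2"; }

    // Second call starts only after the first has completed.
    public static async Task<string> LoadSequentially()
    {
        string one = await GetDataFromWebOne();
        string two = await GetDataFromWebTwo();
        return one + "+" + two;
    }

    // Both calls are started before either is awaited, so the waits overlap.
    public static async Task<string> LoadConcurrently()
    {
        Task<string> one = GetDataFromWebOne();
        Task<string> two = GetDataFromWebTwo();
        string[] results = await Task.WhenAll(one, two);
        return results[0] + "+" + results[1];
    }

    public static async Task Main()
    {
        var sw = Stopwatch.StartNew();
        Console.WriteLine($"{await LoadSequentially()} in {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        Console.WriteLine($"{await LoadConcurrently()} in {sw.ElapsedMilliseconds} ms");
    }
}
```

Both methods return the same merged result; the concurrent version should simply finish in roughly the time of the slower call rather than the sum of the two.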

 

I know that "one dump tells you more than a hundred articles", so I will explain it from the dump analysis perspective. You can download the app attached to this post.

Please compile it in DEBUG mode first.

Let's run the project, attach WinDbg to it and select the "Y" option to run everything in the main thread, as in a pure async/await pattern.

C#

private static async Task RunTaskInMainThreads()
{
     List<Task> tasks = new List<Task>();
     Enumerable.Range(1, 10).ToList().ForEach(i => tasks.Add(KeepCPUBusy(i)));
     await Task.WhenAll(tasks);
}

If you run in "Debug" mode, execution breaks in the task with ID = 5:

C#

public static async Task KeepCPUBusy(int taksid)
{
       Console.WriteLine("Start task {0}", taksid);
#if DEBUG
       if (taksid == 5) DebugBreak();
#endif
       Random rnd = new Random(DateTime.Now.Millisecond);
       byte[] buffer = new byte[1024 * 1024 * 10];
       rnd.NextBytes(buffer);
...
}

In WinDbg I use the netext extension to dump threads.

You can see that a state machine has been created and is executed through MoveNext(..), which changes its state and runs the "KeepCPUBusy(..)" task.

 

You can repeat the same operation with the "N" option selected to run everything in the thread pool:

C#

private static async Task RunTaskInThreadPool()
{
     List<Task> tasks = new List<Task>();
     Enumerable.Range(1, 10).ToList().ForEach(i => tasks.Add(Task.Run(() => KeepCPUBusy(i))));  // Task.Run runs the task in the thread pool
     await Task.WhenAll(tasks);
}

Break in the 5th task and dump the threads. Surprise! No state machine; the tasks are running in the pool.

 

Now let's compile the code in Release mode and run the performance test.

 

This demonstrates a case where threads are better than async. The thread pool demo runs ~10 times faster than the async one because this specific demo is CPU-bound. If your code is I/O-bound and retrieves data from the network or disk, async/await will give you better performance.
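The reason CPU-bound work gains nothing from a plain async call can be sketched as follows: a method that never awaits anything incomplete runs to the end on the calling thread, so several such calls execute one after another, while wrapping each call in Task.Run spreads them across pool threads. The names below (CpuVsIoDemo, BurnCpu, MeasureAsync) are hypothetical, and BurnCpu is only an assumed stand-in for the demo's CPU load.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class CpuVsIoDemo
{
    // Hypothetical CPU-bound work. It never awaits, so it runs synchronously
    // on whichever thread calls it and returns an already-completed task.
    public static Task<long> BurnCpu(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++) sum += i % 7;
        return Task.FromResult(sum);
    }

    // Runs the given work and returns the elapsed time in milliseconds.
    public static async Task<long> MeasureAsync(Func<Task> work)
    {
        var sw = Stopwatch.StartNew();
        await work();
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        const int iterations = 50_000_000;
        const int taskCount = 8;

        // Called directly: all calls run one after another on the calling thread.
        long onCaller = await MeasureAsync(() =>
            Task.WhenAll(Enumerable.Range(0, taskCount).Select(i => BurnCpu(iterations))));

        // Wrapped in Task.Run: the calls are spread across thread pool threads.
        long onPool = await MeasureAsync(() =>
            Task.WhenAll(Enumerable.Range(0, taskCount).Select(i => Task.Run(() => BurnCpu(iterations)))));

        Console.WriteLine($"direct calls: {onCaller} ms");
        Console.WriteLine($"Task.Run:     {onPool} ms");
    }
}
```

On a multi-core machine the Task.Run variant should finish noticeably faster; the exact ratio depends on the number of cores and on the workload.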

 

In the next posts I will try to demonstrate scenarios where the thread pool and Parallel are better (or worse) than async/await.