PLINQ

April 29, 2020

PLINQ, or Parallel LINQ, was introduced to automate parallelization. As you might have already guessed, it is parallelization using the same LINQ queries that we are used to. All we do is call .AsParallel(), and the rest of the query can then run as a parallel query. If we were to execute a certain operation in parallel by hand, we would need to partition the work into tasks, execute those tasks on multiple threads, and then combine the results. The good news is that PLINQ does all of this automatically, and that's why it is declarative rather than imperative.
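As a minimal sketch of the idea above, the snippet below runs the same query once with plain LINQ and once with AsParallel(). The data set and the query itself are made up for illustration; only AsParallel() and the standard LINQ operators come from the framework.

```csharp
using System;
using System.Linq;

// Hypothetical workload: sum the squares of the even numbers in 1..1000.
int[] numbers = Enumerable.Range(1, 1000).ToArray();

// Plain sequential LINQ query.
long sequentialSum = numbers
    .Where(n => n % 2 == 0)
    .Select(n => (long)n * n)
    .Sum();

// Same query; AsParallel() lets PLINQ partition the work,
// run it on several threads, and combine the results.
long parallelSum = numbers
    .AsParallel()
    .Where(n => n % 2 == 0)
    .Select(n => (long)n * n)
    .Sum();

// Both queries compute the same sum.
Console.WriteLine(sequentialSum == parallelSum);
```

For a workload this small, the parallel version is unlikely to be faster; the point is only that the query text barely changes.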

Operations that can prevent a query from being parallelized

  • Take, TakeWhile, Skip, SkipWhile, ElementAt, and the indexed overloads of Select and SelectMany

So if you use one of these operators and the source elements are no longer in their original index positions, the query may run sequentially even though you called AsParallel().

Anomalies

Some operators are anomalies when it comes to performance. They can be parallelized, but they use an expensive partitioning strategy that can sometimes make the query slower than its sequential equivalent:

  • Join, GroupBy, GroupJoin, Distinct, Union, Intersect, Except

Forcing parallel execution

Sometimes PLINQ switches to sequential execution if it believes the query will be better off running that way. This is by design, because users often don't realize that a query might run faster sequentially than in parallel. It is done simply to avoid the overhead of parallelization.

However, if you are an expert user and you would like to force parallelism, all you need to do is call WithExecutionMode right after AsParallel and pass in ParallelExecutionMode.ForceParallelism: .AsParallel().WithExecutionMode(ParallelExecutionMode.ForceParallelism). PLINQ will then run that query in parallel instead of falling back to sequential execution.
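A small sketch of forcing parallelism might look like this. The input array and the doubling projection are placeholders; WithExecutionMode and ParallelExecutionMode.ForceParallelism are the framework pieces being shown.

```csharp
using System;
using System.Linq;

// Hypothetical input data.
int[] numbers = Enumerable.Range(1, 100).ToArray();

// WithExecutionMode comes right after AsParallel(). ForceParallelism asks
// PLINQ not to fall back to sequential execution for this query.
int[] doubled = numbers
    .AsParallel()
    .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
    .Select(n => n * 2)
    .ToArray();

// The result order is not guaranteed here, only the contents.
Console.WriteLine(doubled.Length);
```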

I/O intensive queries

I/O-intensive work is something like an API call, a database call, or even a request for a web page. In those situations the threads spend most of their time waiting, so PLINQ can be told to parallelize such queries effectively by calling .AsParallel().WithDegreeOfParallelism(n), where n is the degree you want. That's one very handy way of dealing with I/O-intensive operations.

Using degree of parallelism
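Here is a rough sketch of the I/O case. The URLs are made up, and Thread.Sleep stands in for a real API or database call so the snippet runs without a network; the degree of 8 is an arbitrary assumption, not a recommended value.

```csharp
using System;
using System.Linq;
using System.Threading;

// Hypothetical list of endpoints; nothing is actually requested.
string[] urls = Enumerable.Range(1, 8)
    .Select(i => $"https://example.invalid/item/{i}")
    .ToArray();

int[] lengths = urls
    .AsParallel()
    .WithDegreeOfParallelism(8)   // assumed value; tune it for the real workload
    .Select(url =>
    {
        Thread.Sleep(50);         // simulated I/O latency, not a real request
        return url.Length;        // placeholder for "the response"
    })
    .ToArray();

Console.WriteLine(lengths.Length);
```

Because the simulated calls mostly wait, asking for more concurrent workers than PLINQ would pick on its own can shorten the total time, which is the situation the section describes.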

ConcurrentBag

With Parallel.For or Parallel.ForEach, the work still executes in parallel, and ordered results are obtained by sorting afterwards; PLINQ offers AsOrdered() for the same purpose. That extra ordering step sometimes means the whole thing ends up slower than the sequential version. However, it does let you get ordered output.

ConcurrentBag<T> has one advantage: it lets you add results as they arrive. Any time a result is produced, it can go straight into the bag. The downside is that the bag keeps no order. So it is faster, but you can't force it to give you an ordered result. It simply won't sort for you; you'll have to do that manually.
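The trade-off can be sketched as follows. The squares workload is invented for the example; ConcurrentBag<T>, Parallel.ForEach, and AsOrdered() are the framework pieces being compared.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical input data.
int[] numbers = Enumerable.Range(1, 20).ToArray();

// ConcurrentBag<T> is thread-safe, so each iteration can add its result
// as soon as it is produced. The bag keeps no ordering.
var bag = new ConcurrentBag<int>();
Parallel.ForEach(numbers, n => bag.Add(n * n));

// Order has to be restored manually, for example with OrderBy.
int[] sortedSquares = bag.OrderBy(x => x).ToArray();

// PLINQ alternative: AsOrdered() preserves the source order, at some extra cost.
int[] orderedSquares = numbers
    .AsParallel()
    .AsOrdered()
    .Select(n => n * n)
    .ToArray();

// Both approaches end up with the same ordered sequence.
Console.WriteLine(sortedSquares.SequenceEqual(orderedSquares));
```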

Merge Options

  • AsParallel().WithMergeOptions(ParallelMergeOptions.NotBuffered)
  • AsParallel().WithMergeOptions(ParallelMergeOptions.FullyBuffered)
  • AsParallel().WithMergeOptions(ParallelMergeOptions.AutoBuffered)

NotBuffered: results are not buffered at all; each one is yielded as soon as it is produced.
FullyBuffered: all results are buffered and nothing is yielded until the whole query has finished; everything is then returned together.
AutoBuffered: something in between; results are returned in chunks.
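A short sketch of two of these options in use. The numbers and the projection are placeholders; with such a tiny input the difference in timing will not be visible, so this only shows where the option goes in the query.

```csharp
using System;
using System.Linq;

// Hypothetical input data.
int[] numbers = Enumerable.Range(1, 10).ToArray();

// NotBuffered: each result is handed to the consumer as soon as it is ready.
var streamed = numbers
    .AsParallel()
    .WithMergeOptions(ParallelMergeOptions.NotBuffered)
    .Select(n => n * 10);

int count = 0;
int total = 0;
foreach (int value in streamed)
{
    count++;          // results may arrive in any order
    total += value;
}

// FullyBuffered: nothing is yielded until the whole query has finished.
int[] buffered = numbers
    .AsParallel()
    .WithMergeOptions(ParallelMergeOptions.FullyBuffered)
    .Select(n => n * 10)
    .ToArray();

Console.WriteLine(count == buffered.Length);
```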
