Tag: Performance

Concurrency Performance Optimization – Reducing Lock Granularity

Original article  Author: Adrianos Dadis  Translator: 买蓉 (sky.mairong@gmail.com)  Proofreader: 方腾飞

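Performance is critical in heavily loaded multithreaded applications. To achieve better performance, developers must understand the importance of concurrency. When we use concurrency, there is often a resource that must be shared by two or more threads.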


  • Context switching
  • Memory synchronization
  • Blocking


Write Combining


Modern CPUs employ many techniques to counteract the latency cost of going to main memory. These days a CPU can process hundreds of instructions in the time it takes to read or write data to the DRAM memory banks.

The major tool used to hide this latency is multiple layers of SRAM cache. In addition, SMP systems employ message-passing protocols to achieve coherence between caches. Unfortunately, CPUs are now so fast that even these caches cannot always keep up. So, to further hide this latency, a number of lesser-known buffers are used.

This article explores “write combining store buffers” and how we can write code that uses them effectively.

CPU caches are effectively unchained hash maps in which each bucket is typically 64 bytes. This bucket is known as a “cache line”. The cache line is the effective unit of memory transfer. For example, an address A in main memory would hash to a given cache line C.
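The mapping from an address to a cache line can be sketched in a few lines of Java. This is a simplified, direct-mapped illustration only: the class name CacheLineIndex, the bucket count NUM_SETS, and the modulo "hash" are assumptions made for the example. Real CPUs are usually set-associative and choose their own index bits; only the 64-byte line size comes from the text above.

```java
public final class CacheLineIndex {
    // 64-byte cache line, as described in the text.
    static final int LINE_SIZE = 64;
    // Hypothetical number of buckets; real caches differ per CPU model.
    static final int NUM_SETS = 64;

    /** Position of the addressed byte inside its 64-byte line. */
    static long offsetInLine(long address) {
        return address & (LINE_SIZE - 1);
    }

    /** Number of the 64-byte block of main memory that holds the address. */
    static long lineNumber(long address) {
        return address >>> 6; // divide by 64
    }

    /** Bucket the block maps to in this simplified, direct-mapped model. */
    static int bucket(long address) {
        return (int) (lineNumber(address) % NUM_SETS);
    }

    public static void main(String[] args) {
        long a = 0x12345L;
        System.out.printf("address=0x%x line=%d offset=%d bucket=%d%n",
                a, lineNumber(a), offsetInLine(a), bucket(a));

        // Two addresses exactly NUM_SETS * LINE_SIZE bytes apart share a bucket.
        // With no chaining, loading one would evict the other in this model.
        long b = a + (long) NUM_SETS * LINE_SIZE;
        System.out.println("same bucket: " + (bucket(a) == bucket(b)));
    }
}
```

In this model, every byte within the same 64-byte block shares a line number, which is why touching one byte brings its 63 neighbours into the cache along with it.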

