Framework: Category Archive

Dissecting the Disruptor: Why is it so fast? (Part Three) Demystifying Memory Barriers

Original article: http://ifeve.com/disruptor-memory-barriers/

Translator: 杜建雄    Proofreader: 欧振聪

My recent slow-down in blog posts is because I've been busy writing a post introducing memory barriers and how they apply to the Disruptor. The problem is, no matter how much material I read, and no matter how many times I ask the ever-patient Martin and Mike to clarify some point, I just can't grasp the subject intuitively. I guess I don't have the deep background knowledge needed to understand it thoroughly.

So, rather than make a fool of myself trying to explain something I haven't fully understood, I'm going to share what I do know about the area at an abstract, heavily simplified level. Martin has already written a post going into memory barriers in some detail, so I'll skip over that here.

Disclaimer: any errors in this article are entirely mine; they have nothing to do with the implementation of the Disruptor or with the people at LMAX who actually understand this stuff.

Read more

Dissecting the Disruptor: What's so special about the ring buffer?

Original article: http://ifeve.com/ringbuffer/

Author: Trisha    Translator: 寒桐    Proofreader: 方腾飞

We recently open-sourced the LMAX Disruptor, the key to why our trading system is so fast (LMAX is a new trading platform that claims to handle millions of orders per second on a single thread). Why did we open-source it? We realised that some of the conventional wisdom about high-performance programming is a bit off. We found a better, faster way to share data between threads, and it would be selfish not to share it with the world. Plus, it makes us look rather clever.

From that site you can download a document explaining what the Disruptor is and why it performs so well. I wasn't much involved in writing it; I really only inserted some punctuation and reworked a few sentences I didn't understand, but I'm pleased to say it still improved my own writing.

I find it a bit much to explain everything all at once, so I'm going to explain it piece by piece, to suit my NADD audience.

First up: the ring buffer. My initial impression of the Disruptor was the ring buffer itself. But I later realised that although the ring buffer is at the core of the pattern, the real key is how the Disruptor controls access to it.

Read more

False Sharing

Original article: http://ifeve.com/false-sharing/

Author: Martin Thompson    Translator: 丁一

A cache system stores data in units called cache lines. A cache line is a power-of-two number of contiguous bytes, typically 32 to 256 bytes; the most common cache line size is 64 bytes. When multiple threads modify variables that are independent of each other but happen to share the same cache line, they unwittingly hurt each other's performance. This is false sharing. Write contention on cache lines is the single most important factor limiting the scalability of parallel threads on an SMP system. False sharing has been described as a silent performance killer, because it is very hard to tell from the code alone whether it will occur.

To make scalability linear in the number of threads, we must ensure that no two threads write to the same variable or the same cache line. Two threads writing to the same variable can be spotted in the code. To find out whether independent variables share a cache line, we need to understand the memory layout, or use a tool that can tell us; Intel VTune is one such profiler. In this article I'll explain how Java objects are laid out in memory and how we can pad cache lines to avoid false sharing.
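To make the padding idea concrete, here is a minimal sketch (my illustration, not code from the article), assuming 64-byte cache lines: a counter surrounded by enough unused long fields that no other hot field can land on the same line.

```java
// Minimal padding sketch, assuming 64-byte cache lines. The unused longs are
// never read; they only push 'value' onto a cache line of its own.
// Note: field layout is JVM-dependent, and some JVMs may reorder or strip
// unused fields, which is why more robust schemes (inheritance tricks,
// @Contended) exist.
public final class PaddedCounter {
    private long p1, p2, p3, p4, p5, p6, p7;   // 56 bytes of padding before the hot field
    private volatile long value;               // the field a single writer thread updates
    private long q1, q2, q3, q4, q5, q6, q7;   // 56 bytes of padding after it

    public void increment() { value++; }       // safe only with one writer per counter
    public long get()       { return value; }
}
```

Each thread increments its own PaddedCounter; compared with an unpadded version, the padded counters avoid bouncing a shared cache line between cores.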
Read more

Dissecting the Disruptor: Why is it so fast? (Part Two) Magic Cache Line Padding

Original article: http://ifeve.com/disruptor-padding/

Author: Trisha    Translator: 方腾飞    Proofreader: 丁一

We use the phrase Mechanical Sympathy a lot; it is even the title of Martin's blog (translator's note: Martin Thompson). Mechanical Sympathy is about understanding how the underlying hardware operates, and programming in a way that works with it rather than against it. (Translator's note: Martin Thompson is fond of the phrase Mechanical Sympathy; it comes from motor racing, where it describes a driver's innate feel for the car that lets them get the very best out of it.)

After I mentioned the RingBuffer in the previous post, we received some comments and questions about the cache line padding in it. Since this topic lends itself to pretty pictures, I decided it was the next thing I should tackle.

An introduction to computers

One of the things I love about working at LMAX is that what I learned at university and in A Level Computing turns out to actually mean something. As a developer you can get away without understanding CPUs, data structures or Big O notation; I spent ten years of my career forgetting all of that. But it turns out that if you know these things and apply them, you can write some very clever, very fast code.

So this is a refresher for those who studied it at school, and a gentle introduction for those who didn't. Be warned: this article contains massive oversimplifications.

The CPU is the heart of your machine; ultimately it performs all the calculations and runs all the programs. Main memory (RAM) is where your data, including your code, lives. I'm going to ignore things like disk drives and networking here, because the Disruptor's goal is to run as much as possible in memory.

There are several layers of cache between the CPU and main memory, because even going straight to main memory is too slow. If you're performing the same operation on a piece of data many times, it makes sense to load it somewhere close to the CPU while you do so (think of a loop counter: you don't want to go out to main memory on every iteration just to fetch it and increment it).
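As a rough, home-made illustration of the point above (not an example from the post), the two loops below do exactly the same arithmetic over the same array, but the first walks memory sequentially and stays within cache lines, while the second jumps to a different row, and usually a different cache line, on every access; on typical hardware the first is noticeably faster.

```java
// Illustrative only: contrasting cache-friendly vs cache-hostile traversal
// of the same int[ROWS][COLS] array. Timings are rough and JVM-dependent.
public final class TraversalDemo {
    static final int ROWS = 4_096, COLS = 4_096;

    public static void main(String[] args) {
        int[][] data = new int[ROWS][COLS];

        long t0 = System.nanoTime();
        long rowSum = 0;
        for (int r = 0; r < ROWS; r++)          // row-major: sequential memory access
            for (int c = 0; c < COLS; c++)
                rowSum += data[r][c];
        long t1 = System.nanoTime();

        long colSum = 0;
        for (int c = 0; c < COLS; c++)          // column-major: cache-miss-prone stride
            for (int r = 0; r < ROWS; r++)
                colSum += data[r][c];
        long t2 = System.nanoTime();

        System.out.printf("row-major:    %d ms (sum=%d)%n", (t1 - t0) / 1_000_000, rowSum);
        System.out.printf("column-major: %d ms (sum=%d)%n", (t2 - t1) / 1_000_000, colSum);
    }
}
```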

Read more

False Sharing

Original article: http://mechanical-sympathy.blogspot.com/2011/07/false-sharing.html (the original is blocked, so it is reposted here)

Author: Martin Thompson

Memory is stored within the cache system in units known as cache lines.  Cache lines are a power of 2 of contiguous bytes which are typically 32-256 in size.  The most common cache line size is 64 bytes.   False sharing is a term which applies when threads unwittingly impact the performance of each other while modifying independent variables sharing the same cache line.  Write contention on cache lines is the single most limiting factor on achieving scalability for parallel threads of execution in an SMP system.  I’ve heard false sharing described as the silent performance killer because it is far from obvious when looking at code.

To achieve linear scalability with number of threads, we must ensure no two threads write to the same variable or cache line.  Two threads writing to the same variable can be tracked down at a code level.   To be able to know if independent variables share the same cache line we need to know the memory layout, or we can get a tool to tell us.  Intel VTune is such a profiling tool.  In this article I’ll explain how memory is laid out for Java objects and how we can pad out our cache lines to avoid false sharing.
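As a concrete, hedged demo of the effect (my code, not Martin's), the sketch below has two threads hammering their own slot of an AtomicLongArray: first slots 0 and 1, which are 8 bytes apart and so share a 64-byte cache line, then slots 0 and 16, which are 128 bytes apart and do not. The padded run is typically several times faster, though the exact ratio depends on the hardware and JVM.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Illustrative micro-benchmark (not the code from the article): two threads
// each hammer their own slot of an AtomicLongArray. Adjacent slots sit on the
// same 64-byte cache line and exhibit false sharing; slots 16 apart do not.
public final class FalseSharingDemo {
    private static final long ITERATIONS = 50_000_000L;

    public static void main(String[] args) throws InterruptedException {
        System.out.println("adjacent slots: " + run(0, 1) + " ms");
        System.out.println("padded slots:   " + run(0, 16) + " ms");
    }

    private static long run(int indexA, int indexB) throws InterruptedException {
        AtomicLongArray slots = new AtomicLongArray(32);
        Thread a = new Thread(() -> hammer(slots, indexA));
        Thread b = new Thread(() -> hammer(slots, indexB));
        long start = System.nanoTime();
        a.start(); b.start();
        a.join();  b.join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    private static void hammer(AtomicLongArray slots, int index) {
        for (long i = 0; i < ITERATIONS; i++) {
            slots.lazySet(index, i);   // ordered write; cannot be optimised away
        }
    }
}
```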

[Figure 1. (image: cache-line.png)]

Read more

Dissecting the Disruptor: How do I read from the ring buffer?

Original article: http://mechanitis.blogspot.com/2011/06/dissecting-disruptor-how-do-i-read-from.html (the original is blocked, so it is reposted here)

Author: Trisha

The next in the series of understanding the Disruptor pattern developed at LMAX.

After the last post we all understand ring buffers and how awesome they are.  Unfortunately for you, I have not said anything about how to actually populate them or read from them when you’re using the Disruptor.

ConsumerBarriers and Consumers
I’m going to approach this slightly backwards, because it’s probably easier to understand in the long run.  Assuming that some magic has populated it: how do you read something from the ring buffer?

(OK, I’m starting to regret using Paint/Gimp.  Although it’s an excellent excuse to purchase a graphics tablet if I do continue down this road.  Also UML gurus are probably cursing my name right now.)
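For anyone reading this against today's library rather than the API the post describes: the ConsumerBarrier/Consumer pair roughly corresponds to the later SequenceBarrier, EventProcessor and EventHandler machinery. A reading-side sketch using the modern Disruptor DSL (the class and event names are mine) looks something like this:

```java
import com.lmax.disruptor.EventHandler;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;

// Minimal reading-side sketch against the modern Disruptor DSL; the original
// post predates this API and talks about ConsumerBarriers instead.
public final class ReaderSketch {
    // A trivial mutable event; ring buffer entries are reused, not reallocated.
    static final class ValueEvent {
        long value;
    }

    public static void main(String[] args) {
        Disruptor<ValueEvent> disruptor = new Disruptor<>(
                ValueEvent::new,              // pre-allocates every slot up front
                1024,                         // ring size, must be a power of two
                DaemonThreadFactory.INSTANCE);

        // The handler plays the role of the "consumer": it is called with each
        // event once the barrier says that slot has been published.
        EventHandler<ValueEvent> handler =
                (event, sequence, endOfBatch) ->
                        System.out.println("read " + event.value + " at seq " + sequence);

        disruptor.handleEventsWith(handler);
        disruptor.start();

        // Publish a few events so the handler has something to read.
        for (long i = 0; i < 3; i++) {
            final long v = i;
            disruptor.getRingBuffer().publishEvent((event, seq) -> event.value = v);
        }
        disruptor.shutdown();                 // waits until the handler has caught up
    }
}
```

The handler never touches the ring buffer directly; the barrier (hidden inside the DSL here) is what tells it which sequences are safe to read.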

Read more

Dissecting the Disruptor: Writing to the ring buffer

Original article: http://mechanitis.blogspot.com/2011/07/dissecting-disruptor-writing-to-ring.html (the original is blocked, so it is reposted here)    Author: Trisha

This is the missing piece in the end-to-end view of the Disruptor.  Brace yourselves, it’s quite long.  But I decided to keep it in a single blog so you could have the context in one place.

The important areas are: not wrapping the ring; informing the consumers; batching for producers; and how multiple producers work. Read more
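For reference against the current API (the post itself describes the older ProducerBarrier claim/commit calls), the claim-then-publish cycle it covers, including waiting so the ring does not wrap over unread slots and then informing the consumers, looks roughly like this with the 3.x RingBuffer; the ValueEvent and WriterSketch names are mine:

```java
import com.lmax.disruptor.RingBuffer;

// Hedged sketch of the two-phase write against the modern RingBuffer API
// (the post itself describes the older ProducerBarrier claim/commit calls).
final class WriterSketch {
    static final class ValueEvent {
        long value;
    }

    // Claims the next slot (blocking if the ring would wrap over an unread
    // slot), fills it in place, then publishes so consumers can see it.
    static void write(RingBuffer<ValueEvent> ringBuffer, long newValue) {
        long sequence = ringBuffer.next();               // claim: may wait for consumers
        try {
            ValueEvent event = ringBuffer.get(sequence); // reuse the pre-allocated entry
            event.value = newValue;                      // mutate in place, no allocation
        } finally {
            ringBuffer.publish(sequence);                // inform the consumers, always
        }
    }

    public static void main(String[] args) {
        RingBuffer<ValueEvent> rb =
                RingBuffer.createSingleProducer(ValueEvent::new, 1024);
        write(rb, 42L);
        System.out.println("published value " + rb.get(rb.getCursor()).value);
    }
}
```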

Disruptor 2.0 – All Change Please

Original article: http://mechanitis.blogspot.com/2011/08/disruptor-20-all-change-please.html (the original is blocked, so it is reposted here)    Author: Trisha

Martin recently announced version 2.0 of the Disruptor – basically there have been so many changes since we first open-sourced it that it’s time to mark that officially.  His post goes over all the changes; the aim of this article is to attempt to translate my previous blog posts into new-world-speak, since it’s going to take a long time to re-write each of them all over again. Now I see the disadvantage of hand-drawing everything.

Read more

Dissecting the Disruptor: What’s so special about a ring buffer?

Original article: http://mechanitis.blogspot.com/2011/06/dissecting-disruptor-whats-so-special.html (the original is blocked, so it is reposted here)    Author: Trisha

Recently we open sourced the LMAX Disruptor, the key to what makes our exchange so fast.  Why did we open source it?  Well, we’ve realised that conventional wisdom around high performance programming is… a bit wrong. We’ve come up with a better, faster way to share data between threads, and it would be selfish not to share it with the world.  Plus it makes us look dead clever.

On the site you can download a technical article explaining what the Disruptor is and why it’s so clever and fast.  I even get a writing credit on it, which is gratifying when all I really did is insert commas and re-phrase sentences I didn’t understand.

However I find the whole thing a bit much to digest all at once, so I’m going to explain it in smaller pieces, as suits my NADD audience.

First up – the ring buffer.  Initially I was under the impression the Disruptor was just the ring buffer.  But I’ve come to realise that while this data structure is at the heart of the pattern, the clever bit about the Disruptor is controlling access to it.
Read more

Dissecting the Disruptor: Why is it so fast? (Part One) Locks Are Bad

Original article: http://ifeve.com/disruptor-locks-are-bad/

Author: Trisha    Translators: 张文灼, 潘曦    Editing and proofreading: 方腾飞, 丁一

Martin Fowler has written a really good article that not only describes the Disruptor, but also explains how it fits into the architecture at LMAX. It covers some concepts I haven't touched on yet, but the most frequently asked question is still "What exactly is the Disruptor?".

I'm working up to answering that, but first I want to answer "Why is it so fast?".

These questions keep coming up, but I can't explain why it's fast without saying what it does, and I can't explain what it is without saying why it was built that way.

So I'm stuck in an impasse, an impasse of how to write this blog.

To break the impasse, I'm going to answer the first question in the simplest way possible, and with any luck I'll come back to it in a later post if it still needs explaining: the Disruptor provides a way to pass information between threads.

As a developer, my alarm bells go off at the word "threads", because it implies concurrency, and concurrency is hard.

Read more

The LMAX Architecture

Original article: http://martinfowler.com/articles/lmax.html    Author: Martin Fowler

Translation: http://www.jdon.com/42452    Translator: banq

LMAX is a new retail financial trading platform that can generate a large volume of trades (throughput) at very low latency. The system is built on the JVM; at its core is a Business Logic Processor that can handle 6 million orders per second on a single thread. The Business Logic Processor runs entirely in memory and is driven by event sourcing. It is surrounded by Disruptors, a concurrency component that implements a network of queues operating concurrently without locks. Their research suggests that the current direction of so-called high-performance work is at odds with how modern CPUs are designed.
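As a toy sketch of the "in-memory state plus event sourcing" idea mentioned above (my illustration, nothing to do with LMAX's actual code), every input event is appended to a journal and then applied to plain in-memory state, so the state can always be rebuilt by replaying the journal:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch of an in-memory, event-sourced processor (illustrative only):
// every input event is journalled, then applied to in-memory state, so the
// state can always be rebuilt by replaying the journal from the start.
public final class EventSourcedProcessorSketch {
    static final class PlaceOrder {                  // the only event type in this sketch
        final String account;
        final long quantity;
        PlaceOrder(String account, long quantity) { this.account = account; this.quantity = quantity; }
    }

    private final List<PlaceOrder> journal = new ArrayList<>();   // stand-in for a durable log
    private final Map<String, Long> positions = new HashMap<>();  // the in-memory state

    void onEvent(PlaceOrder event) {
        journal.add(event);                                        // record the event first...
        positions.merge(event.account, event.quantity, Long::sum); // ...then apply it to state
    }

    Map<String, Long> replay() {                                   // rebuild state from the journal
        Map<String, Long> rebuilt = new HashMap<>();
        for (PlaceOrder e : journal) rebuilt.merge(e.account, e.quantity, Long::sum);
        return rebuilt;
    }

    public static void main(String[] args) {
        EventSourcedProcessorSketch p = new EventSourcedProcessorSketch();
        p.onEvent(new PlaceOrder("acct-1", 100));
        p.onEvent(new PlaceOrder("acct-1", -40));
        System.out.println(p.positions + " equals " + p.replay()); // both print {acct-1=60}
    }
}
```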

Contents

  1. Overall architecture
  2. The Business Logic Processor
  3. Input and output Disruptors
  4. Queues and their lack of mechanical sympathy
  5. Should you use this architecture?

 

For the last few years we have kept hearing that the free lunch is over: we can no longer expect faster performance from a single CPU, so we need to write concurrent software that uses multiple cores. Unfortunately, writing concurrent software is hard; locks and semaphores are difficult to reason about and difficult to test, which means we end up spending more time on the computer's problems than on our domain problems. Various concurrency models, such as Actors and Software Transactional Memory (STM), aim to make this easier, but they tend to trade one set of problems for another and still bring bugs and complexity.

So I was surprised to hear a talk at QCon last March: LMAX is a new retail financial trading platform. Its business innovation is to let anyone trade a range of financial derivative products. That demands very low latency and very fast processing, because the market moves quickly; and being a retail platform, it has the added complexity of many people using it at once. More users mean more trades, and the volume keeps growing rapidly.

Given the shift to multi-core thinking, such demanding performance requirements would naturally suggest an explicitly parallel programming model, yet they propose handling 6 million orders per second with a single thread, on commodity hardware.

How do they achieve both low latency and high throughput for large volumes of trades, without the complexity of concurrent code? Now that LMAX has been in production for some time, it is time to lift the veil on this fascinating design.

Read more

Dissecting the Disruptor: Demystifying Memory Barriers

Original article: http://mechanitis.blogspot.com/2011/08/dissecting-disruptor-why-its-so-fast.html (the original is blocked, so it is reposted here)

My recent slow-down in posting is because I’ve been trying to write a post explaining memory barriers and their applicability in the Disruptor. The problem is, no matter how much I read and no matter how many times I ask the ever-patient Martin and Mike questions trying to clarify some point, I just don’t intuitively grasp the subject. I guess I don’t have the deep background knowledge required to fully understand.

So, rather than make an idiot of myself trying to explain something I don’t really get, I’m going to try and cover, at an abstract / massive-simplification level, what I do understand in the area.  Martin has written a post going into memory barriers in some detail, so hopefully I can get away with skimming the subject.

Read more

Dissecting the Disruptor: Why it’s so fast (part two) – Magic cache line padding

Original article: http://mechanitis.blogspot.com/2011/07/dissecting-disruptor-why-its-so-fast_22.html (the original is blocked, so it is reposted here)

We mention the phrase Mechanical Sympathy quite a lot, in fact it’s even Martin’s blog title.  It’s about understanding how the underlying hardware operates and programming in a way that works with that, not against it.

We get a number of comments and questions about the mysterious cache line padding in the RingBuffer, and I referred to it in the last post.  Since this lends itself to pretty pictures, it’s the next thing I thought I would tackle.
Read more

Dissecting the Disruptor: Why it’s so fast (part one) – Locks Are Bad

Original article: http://mechanitis.blogspot.com/2011/07/dissecting-disruptor-why-its-so-fast.html (the original is blocked, so it is reposted here)

Martin Fowler has written a really good article describing not only the Disruptor, but also how it fits into the architecture at LMAX.  This gives some of the context that has been missing so far, but the most frequently asked question is still “What is the Disruptor?”.

I’m working up to answering that.  I’m currently on question number two: “Why is it so fast?”.

 

These questions do go hand in hand, however, because I can’t talk about why it’s fast without saying what it does, and I can’t talk about what it is without saying why it is that way.

So I’m trapped in a circular dependency.  A circular dependency of blogging.

To break the dependency, I’m going to answer question one with the simplest answer, and with any luck I’ll come back to it in a later post if it still needs explanation: the Disruptor is a way to pass information between threads.
Read more
