Problem description
A global variable is shared between 2 concurrently running threads on 2 different cores. The threads write to and read from the variable. If the variable is atomic, can one thread read a stale value? Each core might hold a copy of the shared variable in its cache, so when one thread writes to the copy in its cache, the other thread on a different core might read a stale value from its own cache. Or does the compiler enforce strong memory ordering so that the latest value is read from the other cache? The C++11 standard library has std::atomic support. How is this different from the volatile keyword? How will volatile and atomic types behave differently in the above scenario?
Recommended answer
Firstly, volatile does not imply atomic access. It is designed for things like memory-mapped I/O and signal handling. volatile is completely unnecessary when used with std::atomic, and unless your platform documents otherwise, volatile has no bearing on atomic access or memory ordering between threads.
If you have a global variable which is shared between threads, such as:
std::atomic<int> ai;
then the visibility and ordering constraints depend on the memory ordering parameter you use for operations, and the synchronization effects of locks, threads and accesses to other atomic variables.
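As a minimal sketch of what "the memory ordering parameter you use" looks like in code: every load and store on a std::atomic accepts an optional std::memory_order argument, and omitting it means std::memory_order_seq_cst. The names shared_value and demo_orderings are made up for this illustration.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> shared_value{0};  // hypothetical shared variable

// Shows the different ways of spelling the ordering on a load or store.
// Run from a single thread, so every load sees the last store.
int demo_orderings() {
    shared_value.store(1);                                 // seq_cst by default
    shared_value.store(2, std::memory_order_release);      // explicit release store
    int a = shared_value.load();                           // seq_cst by default
    int b = shared_value.load(std::memory_order_acquire);  // explicit acquire load
    int c = shared_value.load(std::memory_order_relaxed);  // atomicity only, no ordering
    assert(a == 2 && b == 2 && c == 2);
    return a + b + c;
}
```

The ordering argument changes nothing in this single-threaded call; it only matters for what other threads are allowed to observe.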
In the absence of any additional synchronization, if one thread writes a value to ai then there is nothing that guarantees that another thread will see the value in any given time period. The standard specifies that it should be visible "in a reasonable period of time", but any given access may return a stale value.
The default memory ordering of std::memory_order_seq_cst provides a single global total order for all std::memory_order_seq_cst operations across all variables. This doesn't mean that you can't get stale values, but it does mean that the value you do get determines and is determined by where in this total order your operation lies.
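One common way to see what the single total order buys you is the "store buffering" test, which is not part of the original answer and is offered here only as a sketch. Each thread stores to one variable and then loads the other. With the default std::memory_order_seq_cst, whichever store comes first in the total order must be visible to the other thread's later load, so both loads returning 0 is not allowed. The function name both_reads_zero is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Repeats the store-buffering test and reports whether the outcome
// r1 == 0 && r2 == 0 was ever observed. With seq_cst operations the
// standard does not permit that outcome.
bool both_reads_zero(int iterations) {
    assert(iterations > 0);
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;
        std::thread a([&] { x.store(1); r1 = y.load(); });
        std::thread b([&] { y.store(1); r2 = x.load(); });
        a.join();
        b.join();
        if (r1 == 0 && r2 == 0) return true;
    }
    return false;
}
```

If the stores and loads were changed to std::memory_order_relaxed, the (0,0) outcome would become permitted, and it can show up on real hardware.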
If you have 2 shared variables x and y, initially zero, and have one thread write 1 to x and another write 2 to y, then a third thread that reads both may see either (0,0), (1,0), (0,2) or (1,2) since there is no ordering constraint between the operations, and thus the operations may appear in any order in the global order.
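The three-thread scenario can be sketched as follows. The helper names run_independent_writers and is_allowed_pair are invented for this example; the code only demonstrates that any single run lands on one of the four listed pairs, not which one.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>

// True for the four outcomes listed above: x is 0 or 1, y is 0 or 2.
bool is_allowed_pair(std::pair<int, int> p) {
    return (p.first == 0 || p.first == 1) && (p.second == 0 || p.second == 2);
}

// Two writers on different variables plus one reader of both.
// Returns the (x, y) pair the reader observed in this run.
std::pair<int, int> run_independent_writers() {
    std::atomic<int> x{0}, y{0};
    int rx = -1, ry = -1;
    std::thread w1([&] { x.store(1); });
    std::thread w2([&] { y.store(2); });
    std::thread r([&] { rx = x.load(); ry = y.load(); });
    w1.join();
    w2.join();
    r.join();
    return {rx, ry};
}
```

Which pair you get depends entirely on timing, so repeated calls may return different results.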
If both writes are from the same thread, which does x=1 before y=2, and the reading thread reads y before x, then (0,2) is no longer a valid option, since the read of y==2 implies that the earlier write to x is visible. The other 3 pairings (0,0), (1,0) and (1,2) are still possible, depending on how the 2 reads interleave with the 2 writes.
If you use other memory orderings such as std::memory_order_relaxed or std::memory_order_acquire then the constraints are relaxed even further, and the single global ordering no longer applies. Threads don't even necessarily have to agree on the ordering of two stores to separate variables if there is no additional synchronization.
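To show what the weaker orderings still do guarantee, here is a sketch of the usual release/acquire "publish" pattern, which is an illustration added here rather than something taken from the answer. The names consume_published_value, payload and ready are assumptions. The release store and the acquire load that reads it synchronize with each other, so the plain write to payload is visible to the consumer, but no global order across unrelated variables is implied.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Producer writes plain data, then sets a flag with release ordering.
// Consumer spins on the flag with acquire ordering, then reads the data.
int consume_published_value() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // publication flag
    int seen = -1;
    std::thread producer([&] {
        payload = 42;
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // wait until the flag becomes visible
        }
        seen = payload;  // safe: happens after the producer's write
    });
    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

If both flag operations used std::memory_order_relaxed instead, reading payload in the consumer would be a data race.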
The only way to guarantee you have the "latest" value is to use a read-modify-write operation such as exchange(), compare_exchange_strong() or fetch_add(). Read-modify-write operations have an additional constraint that they always operate on the "latest" value, so a sequence of ai.fetch_add(1) operations by a series of threads will return a sequence of values with no duplicates or gaps. In the absence of additional constraints, there's still no guarantee which threads will see which values though. In particular, it is important to note that the use of an RMW operation does not force changes from other threads to become visible any quicker; it just means that if the changes are not seen by the RMW then all threads must agree that they are later in the modification order of that atomic variable than the RMW operation. Stores from different threads can still be delayed by arbitrary amounts of time, depending on when the CPU actually issues the store to memory (rather than just its own store buffer), physically how far apart the CPUs executing the threads are (in the case of a multi-processor system), and the details of the cache coherency protocol.
Working with atomic operations is a complex topic. I suggest you read a lot of background material, and examine published code before writing production code with atomics. In most cases it is easier to write code that uses locks, and not noticeably less efficient.