

Scalable way to access every element of ConcurrentHashMap&lt;Element, Boolean&gt; exactly once

Question

I have 32 machine threads and one ConcurrentHashMap&lt;Key,Value&gt; map, which contains a lot of keys. Key defines a public method visit(). I want to visit() every element of map exactly once, using the processing power I have available and possibly some sort of thread pooling.

Things I could try:

  • I could use the method map.keys(). The resulting Enumeration&lt;Key&gt; could be iterated over using nextElement(), but since a call to key.visit() is very brief, I won't manage to keep the threads busy. The Enumeration is inherently single-threaded.
  • I could use a synchronised HashSet&lt;Key&gt; instead, invoke the method toArray() and split the work on the array across all 32 threads. I have serious doubts about this solution, since the method toArray() is likely to be a single-threaded bottleneck.
  • I could try to inherit from ConcurrentHashMap, get my hands on the instances of its inner Segment&lt;K,V&gt;, try to group them into 32 groups and work on each group separately. This sounds like a hardcore approach, though.
  • Or similar magic with Enumeration&lt;Key&gt;.
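The second idea above can be sketched as follows: snapshot the keys with toArray() once (the suspected single-threaded step) and split the array into one chunk per worker. This is a minimal illustration, not the author's code; the class name and the counter standing in for key.visit() are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class SnapshotSplitDemo {
    public static int run(int threads, int keys) {
        ConcurrentHashMap<Integer, Boolean> map = new ConcurrentHashMap<>();
        for (int i = 0; i < keys; i++) map.put(i, Boolean.TRUE);

        Object[] snapshot = map.keySet().toArray();   // the single-threaded step
        AtomicInteger visited = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        int chunk = (snapshot.length + threads - 1) / threads;
        for (int t = 0; t < threads; t++) {
            int from = t * chunk;
            int to = Math.min(snapshot.length, from + chunk);
            futures.add(pool.submit(() -> {
                for (int i = from; i < to; i++)
                    visited.incrementAndGet();        // stand-in for key.visit()
            }));
        }
        try {
            for (Future<?> f : futures) f.get();      // wait for every chunk
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        pool.shutdown();
        return visited.get();
    }
}
```

Note that the chunked visiting parallelises well here; the objection in the text is specifically about the cost of building the snapshot array.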

Ideally:

  • Ideally, ConcurrentHashMap&lt;Key, Value&gt; would define a method keysEnumerator(int approximatePosition), which could give me an enumerator missing approximately the first 1/32 of elements, i.e. map.keysEnumerator(map.size()/32).

Am I missing anything obvious? Has anybody run into a similar problem before?

EDIT

我已經(jīng)進行了分析,看看這個問題是否真的會影響實踐中的性能.由于目前我無法訪問集群,因此我使用筆記本電腦并嘗試將結(jié)果外推到更大的數(shù)據(jù)集.在我的機器上,我可以創(chuàng)建一個 200 萬個鍵 ConcurrentHashMap,并且在每個鍵上調(diào)用 visit() 方法來迭代它大約需要 1 秒.該程序應(yīng)該擴展到 8500 萬鍵(及以上).集群的處理器稍微快一些,但它仍然需要大約 40 秒來迭代整個地圖.現(xiàn)在談?wù)劤绦虻倪壿嬃鞒?呈現(xiàn)的邏輯是順序的,即在上一步中的所有線程都完成之前,不允許任何線程進行下一步:

I've had a go at profiling to see whether this problem is actually going to affect the performance in practice. As I don't have access to the cluster at the moment I used my laptop and tried to extrapolate the results to a bigger dataset. On my machine I can create a 2 million keys ConcurrentHashMap and it takes about 1 second to iterate over it invoking the visit() method on every key. The program is supposed to scale to 85 million keys (and over). The cluster's processor is slightly faster, but it still should take about 40 seconds to iterate over entire map. Now a few words about the logic flow of the program. The logic presented is sequential, i.e. it is not allowed for any thread to proceed to the next step until all the threads in the previous step are finished:

  1. 創(chuàng)建哈希映射,創(chuàng)建鍵并填充哈希映射
  2. 遍歷訪問所有鍵的整個哈希映射.
  3. 進行一些數(shù)據(jù)混洗,即并行插入和刪除.
  4. 將第 2 步和第 3 步重復(fù)數(shù)百次.
  1. Create the hash map, create the keys and populate the hash map
  2. Iterate over entire hash map visiting all the keys.
  3. Do some data shuffling which is parallel insertions and deletions.
  4. Repeat step 2 and 3 a few hundred times.
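The barrier structure of the steps above can be sketched with ExecutorService.invokeAll(), which only returns once every task of a phase has completed. The class name and the counter standing in for the real visiting/shuffling work are illustrative assumptions, not the author's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

public class PhasedRoundsDemo {
    public static long run(int threads, int rounds) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong work = new AtomicLong();
        try {
            for (int round = 0; round < rounds; round++) {
                // invokeAll blocks until all tasks of the phase finish,
                // so no thread starts the next step early.
                pool.invokeAll(tasks(threads, work)); // step 2: visit keys
                pool.invokeAll(tasks(threads, work)); // step 3: shuffle data
            }
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        pool.shutdown();
        return work.get();
    }

    private static List<Callable<Void>> tasks(int n, AtomicLong counter) {
        List<Callable<Void>> out = new ArrayList<>();
        for (int i = 0; i < n; i++)
            out.add(() -> { counter.incrementAndGet(); return null; });
        return out;
    }
}
```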

That logic flow means that a 40-second iteration is going to be repeated a few hundred times, say 100, which gives us a bit over an hour spent just visiting the nodes. With a set of 32 parallel iterators it could go down to just a few minutes, which is a significant performance improvement.

現(xiàn)在談?wù)?ConcurrentHashMap 是如何工作的(或者我認為它是如何工作的).每個 ConcurrentHashMap 都由段組成(默認為 16 個).對哈希映射的每次寫入都會在相關(guān)段上同步.假設(shè)我們正在嘗試將兩個新鍵 k1 和 k2 寫入哈希映射,并且它們將被解析為屬于同一段,例如 s1.如果嘗試同時寫入它們,則其中一個將首先獲取鎖,然后再添加另一個.兩個元素被解析為屬于同一段的機會是多少?如果我們有一個好的散列函數(shù)和 16 個段,那么它就是 1/16.

Now a few words on how ConcurrentHashMap works (Or how I believe it works). Every ConcurrentHashMap consists of segments (by default 16). Every write to a hash map is synchronised on a relevant segment. So say we're trying to write two new keys k1 and k2 to the hash map and that they would be resolved to belong to the same segment, say s1. If they are attempted to be written simultaneously, one of them is going to acquire the lock first and be added earlier then the other. What is the chance of two elements to be resolved to belong to the same segment? In case we have got a good hash function and 16 segements it is 1/16.
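One knob the pre-Java-8 implementation does expose for this is the concurrencyLevel constructor argument, which controls how many segments (and therefore independent locks) the map is built with. The capacity and load-factor values below are arbitrary illustrations.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrencyLevelDemo {
    public static ConcurrentHashMap<String, Boolean> make() {
        // initialCapacity = 1024, loadFactor = 0.75f, concurrencyLevel = 32:
        // requests roughly 32 segments instead of the default 16, halving
        // the chance of two simultaneous writes colliding on one lock.
        return new ConcurrentHashMap<>(1024, 0.75f, 32);
    }
}
```

This helps write contention but still gives no public handle on the per-segment iterators, which is the gap the text goes on to describe.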

I believe that ConcurrentHashMap should have a method concurrentKeys(), which would return an array of Enumerations, one per segment. I have a few ideas on how to add it to ConcurrentHashMap through inheritance and I'll let you know if I succeed. For now the solution seems to be to create an array of ConcurrentHashMaps and pre-hash every key to resolve to one member of that array. I'll share that code as well, once it's ready.

EDIT

This is the same problem in a different language:

并行迭代器

Recommended Answer

The solution I will eventually go for is an array of ConcurrentHashMaps instead of one ConcurrentHashMap. This is ad hoc, but seems relevant for my use case. I don't care about the second step being slow, as it doesn't affect my code's performance. The solution is:

對象創(chuàng)建:

  1. 創(chuàng)建一個大小為 t 的 ConcurrentHashMaps 數(shù)組,其中 t 是線程數(shù).
  2. 創(chuàng)建一個大小為 t 的 Runnables 數(shù)組.
  1. Create an array of size t of ConcurrentHashMaps, where t is a number of threads.
  2. Create an array of Runnables, also of size t.

數(shù)組填充(單線程,不是問題):

Array Population (single threaded, not an issue):

  1. 創(chuàng)建鍵并應(yīng)用 pre-hash 函數(shù),它將返回 0 ... t-1 范圍內(nèi)的 int.在我的情況下,只需模 t.
  2. 通過訪問數(shù)組中的適當(dāng)條目,將鍵放入哈希圖中.例如.如果預(yù)哈希導(dǎo)致索引為 4,則使用 hashArray[4].put(key)
  1. Create the keys and apply pre-hash function, which will return an int in the range 0 ... t-1. In my case simply modulo t.
  2. Put the key in the hashmap, by accessing appropriate entry in the array. E.g. if the pre-hash has resulted in index 4, then go for hashArray[4].put(key)
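The population steps above can be sketched as follows. The class name, the preHash() helper and the Boolean payload are illustrative assumptions; only the hashArray name and the modulo-t pre-hash come from the text.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ShardedPutDemo {
    final ConcurrentHashMap<Integer, Boolean>[] hashArray;
    final int t;

    @SuppressWarnings("unchecked")
    public ShardedPutDemo(int t) {
        this.t = t;
        hashArray = new ConcurrentHashMap[t];   // one map per thread
        for (int i = 0; i < t; i++) hashArray[i] = new ConcurrentHashMap<>();
    }

    // Pre-hash: simply modulo t, yielding an index in 0 .. t-1.
    public int preHash(int key) { return Math.floorMod(key, t); }

    public void put(int key) {
        // E.g. for a pre-hash of 4 this is hashArray[4].put(key, ...).
        hashArray[preHash(key)].put(key, Boolean.TRUE);
    }
}
```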

數(shù)組迭代(很好的多線程,性能提升):

Array Iteration (nicely multithreaded, performance gain):

  1. Assign every thread from the Runnables array the job of iterating over the hash map with the corresponding index. This should give a t-times-shorter iteration compared to single-threaded.
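The iteration step can be sketched like this: worker i walks only hashArray[i], so the t iterations proceed fully in parallel. The class name, the latch-based join and the counter standing in for key.visit() are illustrative assumptions, not the author's code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ShardedIterateDemo {
    @SuppressWarnings("unchecked")
    public static int run(int t, int keys) {
        ConcurrentHashMap<Integer, Boolean>[] hashArray = new ConcurrentHashMap[t];
        for (int i = 0; i < t; i++) hashArray[i] = new ConcurrentHashMap<>();
        for (int k = 0; k < keys; k++)
            hashArray[Math.floorMod(k, t)].put(k, Boolean.TRUE);

        AtomicInteger visited = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(t);
        CountDownLatch done = new CountDownLatch(t);
        for (int i = 0; i < t; i++) {
            final int idx = i;                       // Runnable #idx owns shard #idx
            pool.execute(() -> {
                for (Integer key : hashArray[idx].keySet())
                    visited.incrementAndGet();       // stand-in for key.visit()
                done.countDown();
            });
        }
        try {
            done.await();                            // wait for all t iterators
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        pool.shutdown();
        return visited.get();
    }
}
```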

To see the proof-of-concept code (it has some dependencies from the project, so I can't post it here), head to my project on github.

EDIT

Actually, implementing the above proof of concept for my system has proven to be time-consuming, bug-prone and grossly disappointing. Additionally, I've discovered I would miss many features of the standard-library ConcurrentHashMap. The solution I have been exploring recently, which looks much less ad hoc and much more promising, is to use Scala, which produces bytecode that is fully interoperable with Java. The proof of concept relies on the stunning library described in this paper, and AFAIK it is currently IMPOSSIBLE to achieve a corresponding solution in vanilla Java without writing thousands of lines of code, given the current state of the standard library and the corresponding third-party libraries.

import scala.collection.parallel.mutable.ParHashMap

class Node(value: Int, id: Int){
    var v = value   // mutable payload, incremented by visit()
    var i = id      // key under which the node is stored
    override def toString(): String = v.toString
}

object testParHashMap{
    def visit(entry: Tuple2[Int, Node]){
        entry._2.v += 1
    }
    def main(args: Array[String]){
        val hm = new ParHashMap[Int, Node]()
        for (i <- 1 to 10){
            var node = new Node(0, i)
            hm.put(node.i, node)
        }

        println("========== BEFORE ==========")
        hm.foreach{println}

        hm.foreach{visit}

        println("========== AFTER ==========")
        hm.foreach{println}

    }
}

這篇關(guān)于訪問 ConcurrentHashMap&lt;Element, Boolean&gt; 的每個元素的可擴展方式恰好一次的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,也希望大家多多支持html5模板網(wǎng)!

【網(wǎng)站聲明】本站部分內(nèi)容來源于互聯(lián)網(wǎng),旨在幫助大家更快的解決問題,如果有圖片或者內(nèi)容侵犯了您的權(quán)益,請聯(lián)系我們刪除處理,感謝您的支持!

相關(guān)文檔推薦

Convert List of Strings into Map using Java-8 Streams API(使用 Java-8 Streams API 將字符串列表轉(zhuǎn)換為 Map)
Getting data from JSON(從 JSON 獲取數(shù)據(jù))
java linkedhashmap iteration(javalinkedhashmap迭代)
Converting a list of objects to Map(將對象列表轉(zhuǎn)換為 Map)
Create a HashMap with a fixed Key corresponding to a HashSet. point of departure(用一個固定的Key對應(yīng)一個HashSet創(chuàng)建一個HashMap.出發(fā)點)
HttpMessageConverter exception : RestClientException: Could not write request: no suitable HttpMessageConverter found(HttpMessageConverter 異常:RestClientException:無法寫入請求:找不到合適的 HttpMessageConverter) - IT屋-程序員
主站蜘蛛池模板: 长沙中央空调维修,中央空调清洗维保,空气能热水工程,价格,公司就找维小保-湖南维小保环保科技有限公司 | 中图网(原中国图书网):网上书店,尾货特色书店,30万种特价书低至2折! | 禹城彩钢厂_钢结构板房_彩钢复合板-禹城泰瑞彩钢复合板加工厂 | 网带通过式抛丸机,,网带式打砂机,吊钩式,抛丸机,中山抛丸机生产厂家,江门抛丸机,佛山吊钩式,东莞抛丸机,中山市泰达自动化设备有限公司 | 手机游戏_热门软件app下载_好玩的安卓游戏下载基地-吾爱下载站 | 非小号行情 - 专业的区块链、数字藏品行情APP、金色财经官网 | 苏州伊诺尔拆除公司_专业酒店厂房拆除_商场学校拆除_办公楼房屋拆除_家工装拆除拆旧 | 复合土工膜厂家|hdpe防渗土工膜|复合防渗土工布|玻璃纤维|双向塑料土工格栅-安徽路建新材料有限公司 | 二手Sciex液质联用仪-岛津气质联用仪-二手安捷伦气质联用仪-上海隐智科学仪器有限公司 | 北京成考网-北京成人高考网| 杭州标识标牌|文化墙|展厅|导视|户内外广告|发光字|灯箱|铭阳制作公司 - 杭州标识标牌|文化墙|展厅|导视|户内外广告|发光字|灯箱|铭阳制作公司 | 复合肥,化肥厂,复合肥批发,化肥代理,复合肥品牌-红四方 | 会议会展活动拍摄_年会庆典演出跟拍_摄影摄像直播-艾木传媒 | 河南包装袋厂家_河南真空袋批发价格_河南服装袋定制-恒源达包装制品 | 定时排水阀/排气阀-仪表三通旋塞阀-直角式脉冲电磁阀-永嘉良科阀门有限公司 | 耐腐蚀泵,耐腐蚀真空泵,玻璃钢真空泵-淄博华舜耐腐蚀真空泵有限公司 | 钢骨架轻型板_膨石轻型板_钢骨架轻型板价格_恒道新材料 | 交变/复合盐雾试验箱-高低温冲击试验箱_安奈设备产品供应杭州/江苏南京/安徽马鞍山合肥等全国各地 | 招商帮-一站式网络营销服务|搜索营销推广|信息流推广|短视视频营销推广|互联网整合营销|网络推广代运营|招商帮企业招商好帮手 | 承插管件_不锈钢承插管件_锻钢高压管件-温州科正阀门管件有限公司 | 美国PARKER齿轮泵,美国PARKER柱塞泵,美国PARKER叶片泵,美国PARKER电磁阀,美国PARKER比例阀-上海维特锐实业发展有限公司二部 | 烟台金蝶财务软件,烟台网站建设,烟台网络推广 | 兰州牛肉面加盟,兰州牛肉拉面加盟-京穆兰牛肉面 | 机器视觉检测系统-视觉检测系统-机器视觉系统-ccd检测系统-视觉控制器-视控一体机 -海克易邦 | 钢托盘,铁托盘,钢制托盘,镀锌托盘,饲料托盘,钢托盘制造商-南京飞天金属13260753852 | 四川成都干燥设备_回转筒干燥机_脉冲除尘器_输送设备_热风炉_成都川工星科机电设备有限公司 | 艺术生文化课培训|艺术生文化课辅导冲刺-济南启迪学校 | 法兰螺母 - 不锈钢螺母制造厂家 - 万千紧固件--螺母街 | 展厅设计公司,展厅公司,展厅设计,展厅施工,展厅装修,企业展厅,展馆设计公司-深圳广州展厅设计公司 | 「安徽双凯」自动售货机-无人售货机-成人用品-自动饮料食品零食售货机 | 螺钉式热电偶_便携式温度传感器_压簧式热电偶|无锡联泰仪表有限公司|首页 | PAS糖原染色-CBA流式多因子-明胶酶谱MMP-上海研谨生物科技有限公司 | atcc网站,sigma试剂价格,肿瘤细胞现货,人结肠癌细胞株购买-南京科佰生物 | 称重传感器,测力传感器,拉压力传感器,压力变送器,扭矩传感器,南京凯基特电气有限公司 | 清管器,管道清管器,聚氨酯发泡球,清管球 - 承德嘉拓设备 | 微信聊天记录恢复_手机短信删除怎么恢复_通讯录恢复软件下载-快易数据恢复 | 土壤检测仪器_行星式球磨仪_土壤团粒分析仪厂家_山东莱恩德智能科技有限公司 | 瓶盖扭矩仪(扭力值检测)-百科 | 长沙印刷厂-包装印刷-画册印刷厂家-湖南省日大彩色印务有限公司 青州搬家公司电话_青州搬家公司哪家好「鸿喜」青州搬家 | 成都治疗尖锐湿疣比较好的医院-成都治疗尖锐湿疣那家医院好-成都西南皮肤病医院 | 船用锚链|专业锚链生产厂家|安徽亚太锚链制造有限公司 |