
This post looks at why the .sum() method in pandas can give inconsistent results and how to handle it.

Problem description

I have a large DataFrame (circa 4e+07 rows).

When summing it, I get two significantly different results depending on whether I sum before or after selecting the columns.
The dtype also changes from float32 to float64, even though the totals are all below 2**31.

df[[col1, col2, col3]].sum()
Out[1]:
col1         9.36e+07
col2         1.39e+09
col3         6.37e+08
dtype: float32

df.sum()[[col1, col2, col3]]
Out[2]:
col1         1.21e+08
col2         1.70e+09
col3         7.32e+08
dtype: float64

I am obviously missing something. Has anybody had the same issue?

Thanks for your help.

Recommended answer

You can lose precision with np.float32 relative to np.float64.

np.finfo(np.float32)

finfo(resolution=1e-06, min=-3.4028235e+38, max=3.4028235e+38, dtype=float32)

and

np.finfo(np.float64)

finfo(resolution=1e-15, min=-1.7976931348623157e+308, max=1.7976931348623157e+308, dtype=float64)
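
To make the precision gap concrete, here is a minimal sketch (not part of the original answer). A float32 has a 24-bit significand, so above 2**24 it can no longer represent every integer, which is a much tighter limit than the 2**31 mentioned in the question. The helper name `accumulate` is made up for this illustration.

```python
import numpy as np


def accumulate(start, steps, dtype):
    """Add 1 to `start` `steps` times, keeping the running total in `dtype`.

    Hypothetical helper, used only to illustrate accumulator precision.
    """
    total = dtype(start)
    one = dtype(1)
    for _ in range(steps):
        total = total + one  # the result is rounded to `dtype` at every step
    return total


# Gap between adjacent float32 values at 2**24: adding 1 there is lost to rounding.
gap_at_2_24 = np.spacing(np.float32(2**24))

f32_total = accumulate(2**24 - 5, 10, np.float32)  # stalls once it reaches 2**24
f64_total = accumulate(2**24 - 5, 10, np.float64)  # keeps counting

print(gap_at_2_24, f32_total, f64_total)
```

The float32 total stops growing at 16777216 (2**24), while the float64 total reaches 16777221. A long float32 sum can stall in a similar way once the running total is large compared with the values being added.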

A contrived example

import numpy as np
import pandas as pd

# x stays float64; y and z are downcast to float32
df = pd.DataFrame(dict(
    x=[-60499999.315, 60500002.685] * int(2e7),
    y=[-60499999.315, 60500002.685] * int(2e7),
    z=[-60499999.315, 60500002.685] * int(2e7),
)).astype(dict(x=np.float64, y=np.float32, z=np.float32))

print(df.sum()[['y', 'z']], df[['y', 'z']].sum(), sep='\n\n')

y    80000000.0
z    80000000.0
dtype: float64

y    67108864.0
z    67108864.0
dtype: float32
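
One possible workaround, sketched here as an assumption rather than as the original answer's recommendation, is to upcast the selected columns to float64 before summing, so the accumulation happens in float64 whichever columns are selected. The helper name `sum_as_float64` and the small test frame are made up for this illustration.

```python
import math

import numpy as np
import pandas as pd


def sum_as_float64(frame, columns):
    """Sum the selected columns after upcasting them to float64.

    Hypothetical helper: astype() makes a float64 copy of the selected columns.
    """
    return frame[columns].astype(np.float64).sum()


# Small stand-in frame with float32 columns (the real data has ~4e7 rows).
small = pd.DataFrame({
    "y": np.full(100_000, 0.1, dtype=np.float32),
    "z": np.full(100_000, 0.1, dtype=np.float32),
})

wide = sum_as_float64(small, ["y", "z"])

# Reference total computed with Python's exactly rounded math.fsum.
reference = math.fsum(small["y"].astype(np.float64).tolist())

print(wide.dtype, wide["y"], reference)
```

The trade-off is memory: the upcast temporarily copies the selected columns at twice their float32 size, which matters on a frame with tens of millions of rows.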
