Problem solved: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame

Reposted from: y小川

SettingWithCopyWarning: a solution

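Scenario: after reading a CSV file, I needed to add a new feature column and set its values based on an existing feature. When I modified the new column I ran into SettingWithCopyWarning, and it took me a long time to resolve.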

A simplified example:

import pandas as pd
import numpy as np

aa = np.array([1, 0, 1, 0])
bb = pd.DataFrame(aa.T, columns=['one'])
print(bb)

The output is:

output[]:
   one
0    1
1    0
2    1
3    0

Add a new column and print again:

bb['two'] = 0
print(bb)

output[]:
   one  two
0    1    0
1    0    0
2    1    0
3    0    0

Modifying the new column by condition and printing again triggers the warning:

for i in range(bb.shape[0]):
    if bb['one'][i] == 0:
        bb['two'][i] = 1
print(bb)

output[]:
C:/PycharmProjects/NaiveBayesProduct/pandas/try_index.py:22: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  bb['two'][i] = 1
   one  two
0    1    0
1    0    1
2    1    0
3    0    1
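To make the cause more concrete: the flagged line bb['two'][i] = 1 is what the linked pandas documentation calls chained assignment. The following is a minimal sketch, not part of the original post, that splits the line into the two calls Python actually makes. The variable name col is only for illustration.

```python
import pandas as pd

bb = pd.DataFrame({'one': [1, 0, 1, 0]})
bb['two'] = 0

# bb['two'][1] = 1 is evaluated as two separate calls:
col = bb['two']   # 1) __getitem__ returns an intermediate Series
col[1] = 1        # 2) __setitem__ is applied to that Series, not to bb directly

# The intermediate Series always holds the new value.
print(col.tolist())
```

Because the assignment lands on the intermediate Series, pandas cannot guarantee that bb itself is updated. Whether it is depends on the pandas version and its copy-on-write setting, which is what the warning is pointing at.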

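So how can this be solved? I read many posts on Stack Overflow and tried loc, iloc, and similar functions, but none of my attempts worked. Eventually I realized that the order of operations was wrong. The approach that worked is to build the array with the correct values first and then insert it into the DataFrame. Below I redo the example above this way.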

import pandas as pd
import numpy as np

aa = np.array([1, 0, 1, 0])
bb = pd.DataFrame(aa.T, columns=['one'])
# Create an ndarray to hold the values to insert
two = np.zeros(bb.shape[0])
# Modify two according to the condition
for i in range(bb.shape[0]):
    if bb['one'][i] == 0:
        two[i] = 1
# When done, insert two into the DataFrame
bb.insert(1, 'two', two)
print(bb)

output[]:
   one  two
0    1  0.0
1    0  1.0
2    1  0.0
3    0  1.0
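For comparison, here are two alternative sketches that are not from the original post. The first keeps the same build-then-insert idea but replaces the loop with np.where. The second uses a single .loc call with a boolean mask, which is the form the documentation linked in the warning recommends in place of chained indexing. The names via_insert and via_loc are hypothetical, chosen only for this illustration.

```python
import numpy as np
import pandas as pd

bb = pd.DataFrame({'one': [1, 0, 1, 0]})

# Variant 1: build the full column first (vectorized), then insert it.
two = np.where(bb['one'] == 0, 1.0, 0.0)
via_insert = bb.copy()
via_insert.insert(1, 'two', two)

# Variant 2: add the column, then assign rows and column in one .loc call.
via_loc = bb.copy()
via_loc['two'] = 0
via_loc.loc[via_loc['one'] == 0, 'two'] = 1

print(via_insert)
print(via_loc)
```

Both variants avoid the bb['two'][i] pattern entirely, so no intermediate Series is assigned to. Variant 1 yields a float column like the loop version above, while variant 2 keeps the column as integers.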