Summary of Python Problems

1.dict is not callable

tree is a dict, so its values are read by subscripting, not by calling it like a function.

tree("left") -> tree["left"]
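A minimal sketch of the difference, assuming a small hypothetical tree dict (its keys and values are made up for illustration):

```python
# Hypothetical tree node; only the "left" key comes from the note above.
tree = {"left": 1.0, "right": 2.0}

try:
    tree("left")          # calling the dict like a function
except TypeError as err:
    message = str(err)    # the interpreter reports that a dict is not callable

left = tree["left"]       # subscripting is what was intended
```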

2.list indices must be integers or slices, not tuple

dataset is a native Python array, that is, a list (Python's built-in array type is called list).

errorMerge = sum(power(dataset[:, -1] - treeMean, 2))

Indexing a native list with NumPy's array indexing syntax raises this error.
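The failure and one possible fix can be sketched as follows. The sample data and the treeMean value are assumptions for illustration; converting the list with np.array is one way to make the slice valid.

```python
import numpy as np

# Hypothetical data: a plain nested list, as in the note above.
dataset = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
treeMean = 4.0  # hypothetical mean value

try:
    dataset[:, -1]                 # a tuple index on a list
except TypeError as err:
    message = str(err)

# Converting to an ndarray makes the column slice valid.
arr = np.array(dataset)
errorMerge = np.sum(np.power(arr[:, -1] - treeMean, 2))
```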

3.'NoneType' object is not iterable

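The code returns None, value; operating directly on the first element of the return value (for example, iterating over it) raises this error.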
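A minimal sketch of the situation, using a hypothetical function chooseSplit that returns None as its first value when it finds nothing to split on (the function and its logic are assumptions, not the original code):

```python
def chooseSplit(data):
    """Hypothetical helper: returns (indices, value), with None for no split."""
    if len(data) < 2:
        return None, sum(data) / len(data)
    return [0, 1], data[0]

feature, value = chooseSplit([3.0])

try:
    for f in feature:              # feature is None here
        pass
except TypeError as err:
    message = str(err)

# Checking for None before iterating avoids the error.
indices = list(feature) if feature is not None else []
```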

4.shapes (1,1) and (4,1) not aligned: 1 (dim 1) != 4 (dim 0)

from numpy import *

# Generate k new samples from the given dataset to use as centroids
def randCentoids(dataset, k):
    n = shape(dataset)[1]
    centoids = mat(zeros((k, n)))
    for j in range(n):
        minJ = min(dataset[:, j])
        maxJ = max(dataset[:, j])
        rangJ = maxJ - minJ
        centoids[:, j] = mat(minJ + rangJ * random.rand(k, 1))
    return centoids

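This error means that the operation was treated as a matrix multiplication and the row and column counts do not line up, because min() and max() both return numpy.matrix objects here. Why a matrix? Because dataset itself is a matrix, so the result, although a single value, is still treated as a 1x1 matrix.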

rangJ = float(maxJ - minJ)

After casting to float, the problem is solved.
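The effect can be reproduced in a small sketch. The sample data is an assumption, np.matrix stands in for mat, and .item() is used to pull out the scalar, which has the same effect on a 1x1 matrix as the float(...) cast above.

```python
import numpy as np

# Hypothetical 3x2 dataset stored as a matrix.
dataset = np.matrix([[1.0, 2.0], [3.0, 6.0], [5.0, 4.0]])
k = 4

# The built-in min/max iterate over rows, so each result is a 1x1 matrix.
minJ = min(dataset[:, 0])
maxJ = max(dataset[:, 0])

rangJ = maxJ - minJ                    # still a 1x1 matrix
try:
    rangJ * np.random.rand(k, 1)       # matrix product of (1,1) and (4,1)
    aligned = True
except ValueError:
    aligned = False

rangJ = (maxJ - minJ).item()           # plain Python float
column = minJ + rangJ * np.random.rand(k, 1)   # elementwise, shape (4, 1)
```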

5.could not broadcast input array from shape (2) into shape (1,1)

sampleCenterRecord = mat(zeros((m, 1)))
...
dist = distCaculate(centroids[j, :], dataset[i, :])

The dimensions of sampleCenterRecord were defined incorrectly; changing the shape to (m, 2) solved the problem.
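The assignment that triggered the error is not shown above. The sketch below assumes each row stores two values (for example a centroid index and a distance), which is a guess consistent with the (m, 2) fix; the values themselves are made up.

```python
import numpy as np

m = 3  # hypothetical number of samples

# One column cannot hold a two-value record.
record = np.matrix(np.zeros((m, 1)))
try:
    record[0, :] = 1, 0.25         # two values into a one-column row
    fits = True
except ValueError:
    fits = False

# With two columns the same assignment succeeds.
record = np.matrix(np.zeros((m, 2)))
record[0, :] = 1, 0.25
```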

6.IndexError: index 0 is out of bounds for axis 0 with size 0

os.chdir("D:\\galaxy\\aliyunsvn\\code\\MLInAction\\dataset")
dataArr = loadDataSet("ex00.txt")
dataMat = mat(dataArr)
value = [[0.996757]]
feature = 0
dataMat[nonzero(dataMat[:, feature] > value)[0], :][0]

This happens because no rows in dataMat satisfy the condition, so the final [0] index raises an out-of-bounds error.
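One way to guard against the empty selection is to check the number of matching rows before indexing. This is a sketch with made-up data; the column is converted to a flat ndarray first so that the row indices are a plain 1-D array.

```python
import numpy as np

# Hypothetical data in which no value in column 0 exceeds the threshold.
dataMat = np.matrix([[0.2, 1.0], [0.5, 2.0], [0.9, 3.0]])
feature = 0
value = 0.996757

# Row indices where the condition holds (empty here).
rows = np.nonzero(np.asarray(dataMat[:, feature]).ravel() > value)[0]
subset = dataMat[rows, :]

# Only take the first row when at least one row matched.
first = subset[0] if subset.shape[0] > 0 else None
```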

7.unhashable type: 'numpy.ndarray'

for splitVal in set(dataSet[:,featIndex].A):
    ...

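The exception used to be unhashable type: 'matrix'. I then added .A to try converting to an array, but it still failed.

The exception means that set only accepts hashable elements, such as Python's native scalar types. Iterating over the column yields NumPy row objects, which cannot be hashed, hence unhashable. In essence it is an argument type mismatch.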

7.only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

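This exception indicates that the value used as an array index has the wrong type:

overLap = nonzero(logical_and(dataMat[:, item].A>0, dataMat[:, j].A>0))[0]

item is passed in as a parameter, but the external caller mistakenly passed a function for it, hence the error.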
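The sketch below reproduces the mistake with made-up data and adds a type check on the index. The helper name overlapRows and the check are assumptions for illustration, not part of the original code.

```python
import numpy as np

# Hypothetical rating matrix.
dataMat = np.matrix([[1, 0, 2], [3, 4, 0], [0, 5, 6]])

try:
    dataMat[:, len]                # a function passed where a column index belongs
    valid = True
except (IndexError, TypeError):
    valid = False

def overlapRows(dataMat, item, j):
    """Hypothetical helper: rows where both columns are positive."""
    if not isinstance(item, (int, np.integer)):
        raise TypeError("item must be an integer column index")
    a = np.asarray(dataMat[:, item]).ravel() > 0
    b = np.asarray(dataMat[:, j]).ravel() > 0
    return np.nonzero(np.logical_and(a, b))[0]

overLap = overlapRows(dataMat, 0, 2)
```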

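8.data type must provide an itemsize

xTx = xMat.T * xMat

This line raised the error. The cause was that loadDataset did not convert the values to float and used them as read, so the matrix multiplication was attempted on strings. Converting the values to float solved the problem.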

def loadDataset(fileName):
    X = []
    y = []
    for line in open(fileName):
        values = line.split()
        lineArr = []
        lineArr.append(float(values[0]))
        lineArr.append(float(values[1]))
        X.append(lineArr)
        y.append(float(values[-1]))
    return X, y
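To see why the conversion matters, the sketch below builds a matrix from unconverted fields and from converted ones. The two sample lines are made up and stand in for the file contents.

```python
import numpy as np

# Hypothetical file contents: whitespace-separated fields, still strings.
lines = ["1.0 0.5 3.2", "1.0 0.9 4.1"]
raw = [line.split() for line in lines]

# Without conversion the matrix holds strings, not numbers.
rawMat = np.matrix(raw)
kind = rawMat.dtype.kind           # "U" means a Unicode string dtype

# Converting each field to float gives a numeric matrix that can be multiplied.
X = [[float(v) for v in row[:2]] for row in raw]
xMat = np.matrix(X)
xTx = xMat.T * xMat
```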

9. unhashable type: 'matrix'

for splitValue in set(dataset[:, featureIndex]):
    ... ...

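This is because a Python set computes a hash for each of its elements and stores the element by that hash value. Wrapped types such as numpy.ndarray / matrix are not hashable, so they cannot be set elements; only hashable values such as native Python numbers and strings can. The following change solves the problem: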

for splitValue in set(dataset[:, featureIndex].A.flatten().tolist()):
    ... ...
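A small sketch of both the failure and the fix, with made-up data. np.asarray(...) is used here in place of .A and is assumed to be equivalent for this purpose.

```python
import numpy as np

# Hypothetical dataset stored as a matrix.
dataset = np.matrix([[1.0, 0.5], [2.0, 0.5], [1.0, 0.7]])
featureIndex = 0
column = dataset[:, featureIndex]

try:
    set(column)                    # elements are 1x1 matrices, not hashable
    hashable = True
except TypeError:
    hashable = False

# Flattening to a list of Python floats gives hashable elements.
values = set(np.asarray(column).flatten().tolist())
```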

10. ValueError: Unknown label type: 'continuous'

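This exception occurred because I used RandomForestClassifier but passed float (continuous) values as y. A classifier's y must hold discrete class labels such as ints; otherwise there is nothing to classify.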
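If the targets really are class labels that happen to be stored as floats, one option is to check that and cast before fitting. This is a sketch with made-up labels; it uses only NumPy and does not call the classifier itself.

```python
import numpy as np

# Hypothetical targets: class labels that happen to be stored as floats.
y = np.array([0.0, 1.0, 1.0, 2.0])

# Only cast when every value is a whole number; otherwise the targets are
# genuinely continuous and casting would silently discard information.
if np.all(np.mod(y, 1) == 0):
    y_labels = y.astype(int)
else:
    raise ValueError("targets look continuous; a regressor may be the better fit")
```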