sklearn.datasets.make_classification

Date: 2019-11-11


sklearn.datasets.make_classification(n_samples=100, n_features=20, n_informative=2,
    n_redundant=2, n_repeated=0, n_classes=2, n_clusters_per_class=2, weights=None,
    flip_y=0.01, class_sep=1.0, hypercube=True, shift=0.0, scale=1.0,
    shuffle=True, random_state=None)

Purpose: generates a synthetic sample set, typically used with classification algorithms.

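Parameters:

n_features: total number of features. These comprise n_informative informative features, n_redundant redundant features, n_repeated repeated features, and n_features - n_informative - n_redundant - n_repeated noise features drawn at random.
n_informative: number of informative features.
n_redundant: number of redundant features, generated as random linear combinations of the informative features.
n_repeated: number of repeated features, drawn at random from the informative and redundant features.
n_classes: number of classes.
n_clusters_per_class: number of clusters that make up each class.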
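
The feature layout described above can be checked with a minimal sketch. It assumes shuffle=False so that the columns keep their generation order (informative, redundant, repeated, then noise) and leaves shift and scale at their defaults. The sizes and the seed are arbitrary choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification

# Illustrative sizes: 3 informative + 2 redundant + 1 repeated = 6 of 10 features.
n_features, n_informative, n_redundant, n_repeated = 10, 3, 2, 1

X, y = make_classification(
    n_samples=200,
    n_features=n_features,
    n_informative=n_informative,
    n_redundant=n_redundant,
    n_repeated=n_repeated,
    shuffle=False,   # keep columns in generation order
    random_state=0,  # arbitrary seed
)

# Whatever is left over is filled with random noise features.
n_noise = n_features - n_informative - n_redundant - n_repeated

informative = X[:, :n_informative]
redundant = X[:, n_informative:n_informative + n_redundant]
repeated = X[:, n_informative + n_redundant]

# Redundant columns should be reproduced by a linear fit on the informative ones.
coef, *_ = np.linalg.lstsq(informative, redundant, rcond=None)
redundant_is_linear = np.allclose(informative @ coef, redundant)

# The repeated column should be an exact copy of an informative or redundant column.
repeated_is_copy = any(
    np.array_equal(repeated, X[:, j]) for j in range(n_informative + n_redundant)
)

print(n_noise, redundant_is_linear, repeated_is_copy)
```

With the default shuffle=True the same kinds of columns are present, but their positions are permuted, so they cannot be located by index.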

weights: a list giving the proportion of samples assigned to each class. If None, the classes are balanced.
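
As a rough illustration of weights, the sketch below requests a 90/10 class split. It sets flip_y=0 (a choice made here for the example) so that random label flips do not disturb the counts; the sample size and seed are arbitrary.

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=1000,
    weights=[0.9, 0.1],  # roughly 90% class 0, 10% class 1
    flip_y=0,            # no random label flips, so counts follow the weights
    random_state=0,      # arbitrary seed
)

counts = np.bincount(y)  # samples per class, indexed by label
print(counts)
```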

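class_sep: the factor multiplying the hypercube size. Larger values spread out the clusters/classes and make the classification task easier. Defaults to 1.0.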
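
One way to see the effect of class_sep is to compare the distance between the two class means for a small and a large value. This is only a sketch under simplifying assumptions chosen here: two informative features and nothing else, one cluster per class, and no label noise. The helper name class_mean_distance is made up for this example.

```python
import numpy as np
from sklearn.datasets import make_classification

def class_mean_distance(class_sep):
    """Distance between the two class means for a given class_sep."""
    X, y = make_classification(
        n_samples=1000,
        n_features=2,
        n_informative=2,
        n_redundant=0,
        n_clusters_per_class=1,
        flip_y=0,
        class_sep=class_sep,
        random_state=0,  # arbitrary seed
    )
    return np.linalg.norm(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0))

near = class_mean_distance(0.5)
far = class_mean_distance(3.0)
print(near, far)  # the second distance should be noticeably larger
```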

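random_state: if an int, it is the seed used by the random number generator; if a RandomState instance, it is the random number generator itself; if None, the random number generator is the RandomState instance used by np.random.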
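
The practical consequence is reproducibility: passing the same integer seed yields the same data on every call. A minimal sketch (the seed values are arbitrary):

```python
import numpy as np
from sklearn.datasets import make_classification

X1, y1 = make_classification(random_state=42)
X2, y2 = make_classification(random_state=42)
X3, y3 = make_classification(random_state=7)

# Same seed: identical samples and labels.
same_seed_matches = np.array_equal(X1, X2) and np.array_equal(y1, y2)
# Different seed: the generated samples differ.
different_seed_matches = np.array_equal(X1, X3)

print(same_seed_matches, different_seed_matches)
```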

Returns:

X: array of shape [n_samples, n_features]
The generated samples.

y: array of shape [n_samples]
The integer labels for the class membership of each sample.
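
A quick check of the returned shapes with the default arguments (only random_state is set here, as an arbitrary seed):

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(random_state=0)

print(X.shape)       # (100, 20)
print(y.shape)       # (100,)
print(np.unique(y))  # [0 1]
```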
