numpy
numpy.vstack
Stack arrays in sequence vertically (row wise).
Appends data row by row.
>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a, b))
array([[1, 2, 3],
       [2, 3, 4]])
>>> a = np.array([[1], [2], [3]])
>>> b = np.array([[2], [3], [4]])
>>> np.vstack((a, b))
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])
numpy.ravel
Flattens multi-dimensional data into one-dimensional data.
Returns:
y : array_like
If a is a matrix, y is a 1-D ndarray; otherwise y is an array of the same subtype as a. The shape of the returned array is (a.size,). Matrices are special-cased for backward compatibility.
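As a minimal sketch of the behavior described above, the following flattens a small 2-D array with np.ravel. The array x and its values are made up for illustration.

```python
import numpy as np

# Hypothetical 2x3 input array, chosen only for illustration.
x = np.array([[1, 2, 3], [4, 5, 6]])

# Default order is row-major ('C'): rows are read one after another.
flat = np.ravel(x)
print(flat)        # [1 2 3 4 5 6]
print(flat.shape)  # (6,), i.e. (x.size,)

# Column-major ('F') order reads down the columns instead.
flat_f = np.ravel(x, order="F")
print(flat_f)      # [1 4 2 5 3 6]
```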
matplotlib
pandas
describe
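Prints summary statistics for the dataset. By default only numeric columns are summarized; to summarize categorical (object) columns, use
df.describe(include=['O']), where O is the capital letter O.
include, exclude : list-like, 'all', or None (default)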
Specify the form of the returned result. Either:
None to both (default). The result will include only numeric-typed columns or, if none are, only categorical columns.
A list of dtypes or strings to be included/excluded. To select all numeric types use numpy.number. To select categorical objects use type object. See also the select_dtypes documentation, e.g. df.describe(include=['O']).
If include is the string 'all', the output column-set will match the input one.
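The three include options above can be compared on a small frame. This is only a sketch: the DataFrame, its column names, and its values are invented for illustration, and the text column is given an explicit object dtype so that include=['O'] selects it.

```python
import pandas as pd

# Hypothetical data: one numeric column and one object (string) column.
df = pd.DataFrame({
    "age": [22, 35, 58],
    "city": pd.Series(["Oslo", "Lima", "Oslo"], dtype=object),
})

numeric_summary = df.describe()              # default: numeric columns only
object_summary = df.describe(include=["O"])  # capital letter O: object columns
full_summary = df.describe(include="all")    # every input column

print(list(numeric_summary.columns))  # ['age']
print(list(object_summary.columns))   # ['city']
print(list(full_summary.columns))     # ['age', 'city']
```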
sklearn
sklearn.datasets.make_classification
n_samples : int, optional (default=100)
The number of samples.
n_features : int, optional (default=20)
The total number of features. These comprise n_informative informative features, n_redundant redundant features, n_repeated duplicated features and n_features-n_informative-n_redundant-n_repeated useless features drawn at random.
n_informative : int, optional (default=2)
The number of informative features. Each class is composed of a number of gaussian clusters each located around the vertices of a hypercube in a subspace of dimension n_informative. For each cluster, informative features are drawn independently from N(0,1) and then randomly linearly combined within each cluster in order to add covariance. The clusters are then placed on the vertices of the hypercube.
n_classes : int, optional (default=2)
The number of classes (or labels) of the classification problem.
weights : list of floats or None (default=None)
The proportions of samples assigned to each class. If None, then classes are balanced. Note that if len(weights) == n_classes - 1, then the last class weight is automatically inferred. More than n_samples samples may be returned if the sum of weights exceeds 1.
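A short sketch of how these parameters fit together. The specific values are arbitrary choices for illustration; n_redundant, n_repeated and random_state are further parameters of make_classification that are not described in the list above.

```python
from sklearn.datasets import make_classification

# Illustrative settings: 10 features in total, of which 3 are informative,
# 2 redundant, 0 repeated, leaving 5 useless features drawn at random.
X, y = make_classification(
    n_samples=200,
    n_features=10,
    n_informative=3,
    n_redundant=2,
    n_repeated=0,
    n_classes=2,
    weights=[0.8, 0.2],  # roughly 80% of samples in class 0
    random_state=0,      # fixed seed so the sketch is reproducible
)

print(X.shape)  # (200, 10)
print(y.shape)  # (200,)
```

The class proportions in y are only approximately 80/20, because a small fraction of labels is flipped at random by default.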
sklearn.metrics.confusion_matrix
y_true : array, shape = [n_samples]
Ground truth (correct) target values.
y_pred : array, shape = [n_samples]
Estimated targets as returned by a classifier.
labels : array, shape = [n_classes], optional
List of labels to index the matrix. This may be used to reorder or select a subset of labels. If none is given, those that appear at least once in y_true or y_pred are used in sorted order.
sample_weight : array-like of shape = [n_samples], optional
Sample weights.
Returns:
C : array, shape = [n_classes, n_classes]
Confusion matrix.
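A small worked example may make the layout of C clearer: rows correspond to true labels and columns to predicted labels. The label lists below are made up for illustration.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions for a 3-class problem.
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]

# Without labels, classes are taken from the data in sorted order: 0, 1, 2.
C = confusion_matrix(y_true, y_pred)
print(C)
# [[2 0 0]
#  [0 0 1]
#  [1 0 2]]

# labels can reorder the matrix or restrict it to a subset of classes.
C_subset = confusion_matrix(y_true, y_pred, labels=[2, 0])
print(C_subset)
# [[2 1]
#  [0 2]]
```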