Function Usage

numpy

numpy.vstack

Stack arrays in sequence vertically (row wise).
Appends data row by row.

>>> import numpy as np
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a, b))
array([[1, 2, 3],
       [2, 3, 4]])

>>> a = np.array([[1],[2],[3]])
>>> b = np.array([[2],[3],[4]])
>>> np.vstack((a,b))
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])

numpy.ravel

Flattens multi-dimensional data into a one-dimensional array.
Returns:
y : array_like
If a is a matrix, y is a 1-D ndarray; otherwise y is an array of the same subtype as a. The shape of the returned array is (a.size,). Matrices are special-cased for backward compatibility.
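A minimal sketch of the flattening behaviour described above. The array `m` and the variable names are illustrative; `order="F"` is an optional argument of `np.ravel` that is not discussed in these notes and is shown only for contrast.

```python
import numpy as np

# Illustrative 2x3 array (not from the original notes)
m = np.array([[1, 2, 3],
              [4, 5, 6]])

# Default is row-major (C order): rows are read one after another
flat = np.ravel(m)

# Column-major (Fortran order): columns are read one after another
flat_f = np.ravel(m, order="F")

print(flat.shape)   # (6,), i.e. (m.size,)
print(flat)
print(flat_f)
```

In both cases the result is one-dimensional with `m.size` elements; only the order in which elements are read differs.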

matplotlib

pandas

describe

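Prints summary statistics for a dataset. By default only numeric columns are summarized. To summarize categorical (object) columns instead, use
df.describe(include=['O']), where O is the uppercase letter O.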

include, exclude : list-like, 'all', or None (default)
Specify the form of the returned result. Either:
None to both (default). The result will include only numeric-typed columns or, if none are, only categorical columns.
A list of dtypes or strings to be included/excluded. To select all numeric types use numpy.number. To select categorical objects use type object. See also the select_dtypes documentation. E.g. df.describe(include=['O'])
If include is the string 'all', the output column-set will match the input one.
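The three `include` settings above can be compared on a small frame. This is only a sketch: the DataFrame and its column names (`age`, `city`) are made up for illustration, and the string column is created with an explicit object dtype so that `include=['O']` selects it regardless of the pandas version's default string dtype.

```python
import pandas as pd

# Hypothetical data; "city" is forced to object dtype on purpose
df = pd.DataFrame({
    "age": [22, 35, 58],
    "city": pd.Series(["Oslo", "Lima", "Oslo"], dtype=object),
})

numeric_summary = df.describe()               # numeric columns only (default)
object_summary = df.describe(include=["O"])   # object columns only
full_summary = df.describe(include="all")     # every input column

print(numeric_summary)
print(object_summary)
print(full_summary)
```

For object columns the summary reports count, unique, top and freq rather than mean and quantiles; with `include="all"`, statistics that do not apply to a column are filled with NaN.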

sklearn

sklearn.datasets.make_classification

n_samples : int, optional (default=100)
The number of samples.

n_features : int, optional (default=20)
The total number of features. These comprise n_informative informative features, n_redundant redundant features, n_repeated duplicated features and n_features - n_informative - n_redundant - n_repeated useless features drawn at random.

n_informative : int, optional (default=2)
The number of informative features. Each class is composed of a number of Gaussian clusters each located around the vertices of a hypercube in a subspace of dimension n_informative. For each cluster, informative features are drawn independently from N(0, 1) and then randomly linearly combined within each cluster in order to add covariance. The clusters are then placed on the vertices of the hypercube.

n_classes : int, optional (default=2)
The number of classes (or labels) of the classification problem.

weights : list of floats or None (default=None)
The proportions of samples assigned to each class. If None, then classes are balanced. Note that if len(weights) == n_classes - 1, then the last class weight is automatically inferred. More than n_samples samples may be returned if the sum of weights exceeds 1.
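A short sketch of how these parameters fit together. The specific values are arbitrary choices for illustration; `n_redundant`, `n_repeated` and `random_state` are further arguments of `make_classification` that appear in the descriptions above or in the scikit-learn signature but are not listed separately in these notes.

```python
from sklearn.datasets import make_classification

# Illustrative settings: 10 features = 3 informative + 2 redundant
# + 0 repeated + 5 random noise features
X, y = make_classification(
    n_samples=200,
    n_features=10,
    n_informative=3,
    n_redundant=2,
    n_repeated=0,
    n_classes=2,
    weights=[0.8, 0.2],   # roughly 80% class 0, 20% class 1
    random_state=0,       # fixed seed so the run is repeatable
)

print(X.shape, y.shape)
print((y == 0).sum(), (y == 1).sum())
```

The class proportions are approximate rather than exact because a small fraction of labels is randomly flipped by default.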

sklearn.metrics.confusion_matrix

y_true : array, shape = [n_samples]
Ground truth (correct) target values.

y_pred : array, shape = [n_samples]
Estimated targets as returned by a classifier.

labels : array, shape = [n_classes], optional
List of labels to index the matrix. This may be used to reorder or select a subset of labels. If none is given, those that appear at least once in y_true or y_pred are used in sorted order.

sample_weight : array-like of shape = [n_samples], optional
Sample weights.

Returns:
C : array, shape = [n_classes, n_classes]
Confusion matrix.
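A small worked example may make the role of `labels` clearer. The label values below are invented for illustration. In the returned matrix, rows correspond to true labels and columns to predicted labels, in the order given by `labels`.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions
y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]
y_pred = ["cat", "cat", "cat", "bird", "dog", "dog"]

# Explicit label order: row/column 0 = bird, 1 = cat, 2 = dog
cm = confusion_matrix(y_true, y_pred, labels=["bird", "cat", "dog"])
print(cm)
# [[1 0 0]
#  [0 2 1]
#  [0 1 1]]

# Passing a subset of labels restricts the matrix to those classes
cm_subset = confusion_matrix(y_true, y_pred, labels=["cat", "dog"])
print(cm_subset)
# [[2 1]
#  [1 1]]
```

Reading row 1 of `cm`: of the three true "cat" samples, two were predicted as "cat" and one as "dog".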
