The iris dataset
The iris flower dataset is a classic dataset for multi-class classification (150 samples, 4 features, 3 classes).
    from sklearn.datasets import load_iris

    iris = load_iris()

    # data is a 2-D array: one row per sample, one column per feature
    n_samples, n_features = iris.data.shape
    print("Number of samples:", n_samples)
    print("Number of features:", n_features)

    print(iris.data[0])        # feature vector of the first sample
    print(iris.data.shape)     # (150, 4)
    print(iris.target.shape)   # (150,)
    print(iris.target)         # integer class labels 0, 1, 2
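To make the "multi-class classification" use concrete, here is a minimal sketch that fits a classifier on iris. The choice of `LogisticRegression`, the 25% hold-out split, and `random_state=0` are assumptions made for illustration; they are not part of the original post.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

iris = load_iris()

# Hold out 25% of the samples; stratify keeps the class proportions equal
# in the training and test sets (assumed split, chosen for illustration)
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, stratify=iris.target, random_state=0
)

# max_iter is raised so the default solver converges on the unscaled features
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# score() returns the mean accuracy on the held-out samples
accuracy = clf.score(X_test, y_test)
print("Test accuracy:", accuracy)
```

Any other scikit-learn classifier could be swapped in here; only the `fit` and `score` calls matter for the pattern.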
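The digits dataset
The handwritten digits dataset is another dataset suited to multi-class classification (1,797 grayscale images of 8x8 pixels, covering the digits 0 to 9).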
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_digits

    digits = load_digits()
    print(digits.keys())

    n_samples, n_features = digits.data.shape
    print((n_samples, n_features))   # (1797, 64)
    print(digits.images.shape)       # (1797, 8, 8)

    # Plot the first 64 digits on an 8x8 grid
    fig = plt.figure(figsize=(6, 6))
    fig.subplots_adjust(left=0, right=1, bottom=0, top=1, hspace=0.05, wspace=0.05)

    for i in range(64):
        ax = fig.add_subplot(8, 8, i + 1, xticks=[], yticks=[])
        ax.imshow(digits.images[i], cmap=plt.cm.binary, interpolation='nearest')
        # Write the true label in the lower-left corner of each image
        ax.text(0, 7, str(digits.target[i]))

    plt.show()
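The snippet above prints both `digits.data.shape` and `digits.images.shape`. As a small sketch of how the two relate, the check below flattens each 8x8 image and compares the result with `digits.data`. The variable names `flattened` and `same` are made up for this example.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# Flatten each 8x8 image into a 64-element row vector
flattened = digits.images.reshape(len(digits.images), -1)

# data holds the same pixel values as images, one flattened image per row
same = np.array_equal(flattened, digits.data)
print(flattened.shape)
print("images flattened == data:", same)
```

In practice this means `digits.data` can be passed directly to an estimator, while `digits.images` is the convenient form for plotting.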
The breast_cancer dataset
The breast cancer dataset is a classic dataset for binary classification (569 samples, 30 features).
    from sklearn.datasets import load_breast_cancer

    breast_cancer = load_breast_cancer()

    n_samples, n_features = breast_cancer.data.shape
    print("Number of samples:", n_samples)
    print("Number of features:", n_features)
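Since this is a binary task, it can help to look at how the samples are divided between the two classes before training anything. The following is a minimal sketch; counting with `np.bincount` is just one convenient way to do it.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

breast_cancer = load_breast_cancer()

# target holds integer labels; target_names maps each label to its class name
counts = np.bincount(breast_cancer.target)
for name, count in zip(breast_cancer.target_names, counts):
    print(name, count)
```

The classes are not perfectly balanced, which is worth keeping in mind when reading a plain accuracy score.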
The diabetes dataset
The diabetes dataset is a classic dataset for regression (442 samples, 10 features).
    from sklearn.datasets import load_diabetes

    diabetes = load_diabetes()

    n_samples, n_features = diabetes.data.shape
    print("Number of samples:", n_samples)
    print("Number of features:", n_features)
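As a rough illustration of the regression use case, the sketch below fits an ordinary least-squares model on diabetes and reports R² on held-out data. The model choice, the default 25% test split, and `random_state=0` are assumptions for this example, not something the original post prescribes.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

diabetes = load_diabetes()

# Default split: 75% training, 25% test (random_state fixed for repeatability)
X_train, X_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, random_state=0
)

reg = LinearRegression()
reg.fit(X_train, y_train)

# For regressors, score() returns the coefficient of determination (R^2)
r2 = reg.score(X_test, y_test)
print("Test R^2:", r2)
```

A modest R² is expected here; the target is noisy and a linear model explains only part of its variance.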
The boston dataset
The Boston house prices dataset is a classic dataset for regression.
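    # Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
    # so this snippet only runs on scikit-learn versions older than 1.2.
    from sklearn.datasets import load_boston

    boston = load_boston()

    n_samples, n_features = boston.data.shape
    print("Number of samples:", n_samples)
    print("Number of features:", n_features)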
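The linnerud dataset
The Linnerud physical exercise dataset is a classic dataset for multivariate (multi-output) regression: 20 samples, each with 3 exercise features and 3 physiological targets.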
    from sklearn.datasets import load_linnerud

    linnerud = load_linnerud()

    n_samples, n_features = linnerud.data.shape
    print("Number of samples:", n_samples)
    print("Number of features:", n_features)
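What sets linnerud apart is that the target is itself a matrix with several columns. A minimal sketch of multi-output regression, assuming a plain `LinearRegression` (which accepts a 2-D target directly), could look like this:

```python
from sklearn.datasets import load_linnerud
from sklearn.linear_model import LinearRegression

linnerud = load_linnerud()
print(linnerud.feature_names)   # names of the exercise variables
print(linnerud.target_names)    # names of the physiological variables

# Fit one linear model that predicts all target columns at once
model = LinearRegression()
model.fit(linnerud.data, linnerud.target)

# Predictions have one column per target variable
predictions = model.predict(linnerud.data)
print(predictions.shape)
```

With only 20 samples, this is meant to show the shapes involved rather than to produce a useful model.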