The data set contains 150 records under five attributes: sepal length, sepal width, petal length, petal width and species.

The iris data set is widely used as a beginner's data set for machine learning purposes. The data set is included in base R and, for Python, in the machine learning library
scikit-learn, so that users can access it without having to find a source for it. Several versions of the data set have been published.

===R code illustrating usage===
The example R code shown below reproduces the scatterplot displayed at the top of this article:

<syntaxhighlight lang="r">
# Show the data set
iris

# Show the help page, with information about the data set
?iris

# Create scatterplots of all pairwise combinations of the 4 variables in the data set
pairs(iris[1:4],
      main = "Iris Data (red=setosa,green=versicolor,blue=virginica)",
      pch = 21,
      bg = c("red", "green3", "blue")[unclass(iris$Species)])
</syntaxhighlight>

Alternatively, using ggplot2 and GGally:

<syntaxhighlight lang="r">
# Install the packages if you don't have them
install.packages(c("ggplot2", "GGally"))

# Load libraries
library(ggplot2)
library(GGally)

# Plot a scatter plot matrix for the iris data set
ggpairs(data = iris,                                      # your iris data
        columns = 1:4,                                    # columns for the scatter plot
        mapping = aes(colour = Species, fill = Species),
        title = 'Scatter Plot Matrix for Iris Data Set') +
  theme(plot.title = element_text(hjust = 0.5, face = 'bold')) +
  scale_color_brewer(palette = 'Set1')
</syntaxhighlight>

===Python code illustrating usage===
<syntaxhighlight lang="python">
from sklearn.datasets import load_iris

iris = load_iris()
print(iris)
</syntaxhighlight>

This code gives:

<pre>
{'data': array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3., 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],...
'target': array([0, 0, 0, ... 1, 1, 1, ... 2, 2, 2, ...
'target_names': array(['setosa', 'versicolor', 'virginica'], dtype=...
</pre>

==See also==