iris-exploration

Data science activity on the Iris data set.

  1. Set up a Git Repository for this activity on https://github.com/.

  2. Using Python, R, or Jupyter, load the ‘Iris’ dataset.
     a. If using Python: use the sklearn package https://scikit-learn.org/stable/auto_examples/datasets/plot_iris_dataset.html
     b. If using R: use the data/datasets library

  3. Perform Exploratory Data Analysis on the Iris dataset. Create visualisations of your choice and comment on any insights/trends.

  4. Store your code on the Git repository you have created and provide a link to your repository.

  5. Describe the steps and checks you would take to prepare this dataset for a machine learning model and provide comments on any challenges you may face in modelling.
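As a starting point for step 5, the basic checks can be sketched in Python. This is a minimal sketch, not a prescribed workflow: the 80/20 split, the `random_state` value, and the choice of `StandardScaler` are assumptions made for illustration, and `load_iris(as_frame=True)` assumes a reasonably recent scikit-learn with pandas installed.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the data as a pandas DataFrame: four feature columns plus 'target'.
iris = load_iris(as_frame=True)
df = iris.frame

# Basic data-quality checks.
missing = int(df.isna().sum().sum())        # total missing values
duplicates = int(df.duplicated().sum())     # fully duplicated rows
class_counts = df["target"].value_counts().sort_index()  # class balance

print("missing values:", missing)
print("duplicate rows:", duplicates)
print("samples per class:", class_counts.tolist())

# Hold out a test set, stratified so each class keeps its proportion.
# test_size and random_state are illustrative choices.
X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the scaler on the training data only, then apply it to both sets,
# so no information from the test set leaks into preprocessing.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Scaling matters for distance-based or gradient-based models and is largely unnecessary for tree-based ones, so whether to keep that step depends on the model chosen.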

=========================================================

The Iris Dataset

This dataset consists of petal and sepal measurements (length and width) for 3 different types of irises (Setosa, Versicolour, and Virginica), stored in a 150x4 numpy.ndarray.

The rows are the samples and the columns are: Sepal Length, Sepal Width, Petal Length and Petal Width.
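The layout described above can be confirmed directly from scikit-learn. A minimal sketch, assuming scikit-learn is installed; the variable names `X` and `y` are just illustrative:

```python
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data      # numpy.ndarray: one row per sample, one column per feature
y = iris.target    # integer class labels (0, 1, 2)

print(X.shape)                   # (150, 4)
print(iris.feature_names)        # sepal/petal length and width, in cm
print(list(iris.target_names))   # the three iris species
```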

See https://en.wikipedia.org/wiki/Iris_flower_data_set for more information on this dataset.