Purpose: This extension gives the user a statistical summary of their data, which can be used to decide which data preparation tasks are needed before further analysis.
Steps to Run:
- Go to the Tools menu
- Select the Data Science option
- Select the Data Description option
- Enter the data table the analysis should be performed on
Output: The output is a single table with the columns below.
- The first column (column) specifies the column being analyzed.
- The second column (count) gives the number of records. Because nulls are not excluded, this number should be the same for every column.
- The third column (type) lists the data type of the specified column.
- The fourth column (num.na) counts the null cells in the specified column.
- The remaining columns describe the data in terms of min, max, mean, median and other statistical measures.
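For readers who want to see the shape of such a table outside the extension, the output described above can be approximated with a short Python sketch using pandas. This is not the extension's own implementation: the function name describe_table is hypothetical, only the four statistics named above (min, max, mean, median) are included, and computing them for numeric columns only is an assumption.

```python
import pandas as pd


def describe_table(df: pd.DataFrame) -> pd.DataFrame:
    """Build a one-row-per-column summary resembling the output described above."""
    rows = []
    for name in df.columns:
        col = df[name]
        numeric = pd.api.types.is_numeric_dtype(col)
        rows.append({
            "column": name,
            "count": len(col),                # all records, nulls included
            "type": str(col.dtype),
            "num.na": int(col.isna().sum()),  # null cells in this column
            # Assumption: statistics are left empty for non-numeric columns.
            "min": col.min() if numeric else None,
            "max": col.max() if numeric else None,
            "mean": col.mean() if numeric else None,
            "median": col.median() if numeric else None,
        })
    return pd.DataFrame(rows)


# Illustrative data only: one numeric and one text column, each with one null.
example = pd.DataFrame({"age": [30, None, 50], "city": ["A", "B", None]})
print(describe_table(example))
```

In this sketch, count is the same for every column because it is taken from the table length rather than from the non-null values, which matches the behavior described for the count column.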