The purpose of this article is to explain how to set up and run Data Description.
Purpose: This extension provides the user with a statistical summary of their data that can be used to determine which data preparation tasks are necessary for further analysis.
Inputs:
- Data table
Steps to Run:
- Go to the Tools menu
- Select the Data Science option
- Select the Data Description option
- Enter the data table the analysis should be performed on.
- Data Table
- Check the boxes for the transformations to apply: [*, 1/x, log10, sqrt]
- Click OK.
Outputs: Without Transformation
- The output is a single table.
- The first column (column) specifies the column being analyzed.
- The second column (count) gives the number of records. It should be the same for all columns because nulls are not excluded.
- The third column (type) lists the data type of the specified column.
- The fourth column (num.na) counts the cells that contain nulls.
- The remaining columns describe the data in terms of min, max, mean, median and other statistical measures.
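To make the layout of the summary table concrete, the following is a minimal Python sketch of how such a table could be computed. It is not the extension's implementation. The function name `describe_table`, the dictionary-based table, and the "numeric"/"string" type labels are assumptions made for illustration, and only a few of the statistical columns are shown.

```python
import statistics


def describe_table(table):
    """Summarize a table given as {column_name: list_of_values}.

    None stands in for a null cell. The keys mirror the first output
    columns described above; the type labels are hypothetical.
    """
    rows = []
    for name, values in table.items():
        present = [v for v in values if v is not None]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        is_numeric = bool(present) and len(numeric) == len(present)
        row = {
            "column": name,
            "count": len(values),  # all records, nulls included
            "type": "numeric" if is_numeric else "string",
            "num.na": len(values) - len(present),
        }
        if is_numeric:
            # Statistics are computed on the non-null values only.
            row["min"] = min(numeric)
            row["max"] = max(numeric)
            row["mean"] = statistics.mean(numeric)
            row["median"] = statistics.median(numeric)
        rows.append(row)
    return rows


# Made-up sample data: two columns, four records each, one null per column.
sample = {
    "age": [34, 51, None, 27],
    "city": ["Oslo", None, "Lima", "Pune"],
}
summary = describe_table(sample)
for row in summary:
    print(row)
```

In this sketch both columns report the same count because nulls are included in it, while num.na reports the nulls separately.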
Outputs: With Transformation
- One table visualization (17 columns) with the statistical summary and distribution characteristics of each numeric column (same as above).
- One table visualization (5 columns) with the transformation results for normality.
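The article does not say how the transformation results are scored for normality. As a rough illustration only, the Python sketch below applies the 1/x, log10, and sqrt transformations to a numeric column and compares skewness, where a value near 0 indicates a more symmetric distribution. Skewness as the measure, and the names `skewness`, `TRANSFORMS`, and `compare_transforms`, are assumptions. The "*" option is omitted because the article does not define it.

```python
import math


def skewness(values):
    """Population skewness; 0 for a perfectly symmetric distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Transformations named in the dialog (the "*" option is not defined
# in the article, so it is left out of this sketch).
TRANSFORMS = {
    "1/x": lambda x: 1.0 / x,
    "log10": math.log10,
    "sqrt": math.sqrt,
}


def compare_transforms(values):
    """Return {transform_name: skewness} for strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("these transformations need strictly positive values")
    result = {"none": skewness(values)}
    for name, fn in TRANSFORMS.items():
        result[name] = skewness([fn(v) for v in values])
    return result


# Made-up, strongly right-skewed sample column.
right_skewed = [1, 10, 100, 1000, 10000]
scores = compare_transforms(right_skewed)
for name, score in scores.items():
    print(name, round(score, 3))
```

For this sample the log10 transformation yields evenly spaced values, so its skewness is closest to 0. Real data will rarely be this clean, and a formal normality test may be more appropriate than skewness alone.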
How to filter a subset of the data:
- Open the filter panel by clicking the filter icon on the top bar.
- Choose the correct filtering scheme.
- Click the "Refresh Data Table" icon on the Data Description table.
For additional information watch video: