Describe Dataset



The Describe Dataset tool provides a detailed look at your dataset. Before running analysis on large amounts of data, use this tool to determine what your input data contains and where it is located. By default, the result is a JSON string outlining key descriptors of your input layer and a table of summary statistics for each field.

Optionally, you can create additional output feature layers to further describe your data: a sample layer, an extent layer, or both.

For example, say you are given a big data file share containing 15 datasets. Each dataset has 10 million area features representing buildings and houses across different regions of your country. You are tasked with analyzing region C, but you do not know which dataset contains these features. To find out which dataset you should use, run Describe Dataset and choose to create an extent layer to investigate which dataset is in your study region.

As another example, imagine you are tasked with completing an analysis workflow on a large volume of data. You want to try the workflow, but it could take hours or days with your full dataset. Instead of spending time and resources on the full analysis, create a sample layer and test your complete workflow on it first.

Choose dataset to describe


The layer containing point, line, area, or tabular features that will be described, summarized, and sampled.

In addition to choosing a layer from your map, you can select Choose Analysis Layer at the bottom of the drop-down list to browse your contents for a big data file share dataset or feature layer.

Understand your dataset by creating a (optional)


Output additional descriptive layers to improve your understanding of your big data. You can choose to output zero, one, or both of the following layers:

  • Extent layer—creates an area feature representing the extent of your input features or area of interest.
  • Sample layer—creates a subset layer containing a specified number of input features from within your dataset or area of interest.

Sample layer


Output a layer containing a subset of features from your input layer. If Sample layer is selected, you can specify the number of features to return in the sample layer. By default, 100 sample features are returned in the output layer.

The value must be greater than zero. If you specify a number greater than the total number of features, all features are returned.

If Use current map extent is selected, the sample layer contains only features from within the extent of the map.
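
The sample-size rules above can be illustrated with a short Python sketch. It is not the tool's implementation: the function name `sample_features`, the representation of features as (x, y) tuples, and the use of random selection are all assumptions made for illustration, since this topic does not state how the tool picks which features to return.

```python
import random


def sample_features(features, sample_size=100, extent=None, seed=None):
    """Return up to sample_size features, optionally limited to an extent.

    Hypothetical helper for illustration only.
    features: list of (x, y) tuples standing in for input features.
    extent:   optional (xmin, ymin, xmax, ymax) standing in for the map extent.
    """
    # The sample size must be greater than zero.
    if sample_size <= 0:
        raise ValueError("sample_size must be greater than zero")

    # With an extent, only features inside it are candidates for the sample.
    if extent is not None:
        xmin, ymin, xmax, ymax = extent
        features = [
            (x, y) for x, y in features
            if xmin <= x <= xmax and ymin <= y <= ymax
        ]

    # Asking for more features than are available returns all of them.
    k = min(sample_size, len(features))

    # Assumption: random selection; the tool's actual method is not documented here.
    return random.Random(seed).sample(features, k)
```

For example, calling `sample_features(points, 50)` on a list of 10 points returns all 10, while passing `extent=(2, 2, 5, 5)` restricts the candidates to points inside that rectangle before sampling.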

Extent layer


Select the Extent layer button to output a feature layer containing a single feature that represents the extent of the input features. This is selected by default.

If Use current map extent is selected, the extent layer represents the visible extent of the map.
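
As a rough illustration of what a single extent feature could look like, the sketch below computes an axis-aligned bounding rectangle around a set of points and turns it into a closed polygon ring. This assumes the extent is a simple bounding rectangle in the data's coordinate system, which this topic does not state explicitly. The function names `bounding_extent` and `extent_polygon` are hypothetical.

```python
def bounding_extent(points):
    """Return (xmin, ymin, xmax, ymax) for a non-empty list of (x, y) tuples."""
    if not points:
        raise ValueError("at least one point is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def extent_polygon(extent):
    """Convert an extent into a closed ring of corner coordinates."""
    xmin, ymin, xmax, ymax = extent
    # Counterclockwise ring that ends where it starts.
    return [
        (xmin, ymin),
        (xmax, ymin),
        (xmax, ymax),
        (xmin, ymax),
        (xmin, ymin),
    ]
```

When Use current map extent is selected, the equivalent step in this sketch would be to pass the map's visible extent directly to `extent_polygon` instead of deriving it from the features.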

Result layer name


The name of the layer that will be created. If you are writing to an ArcGIS Data Store, your results will be saved in My Content and added to the map. If you are writing to a big data file share, your results will be stored in the big data file share and added to its manifest; they will not be added to the map. The default name is based on the tool name and the input layer name. If a layer with the same name already exists, the tool will fail.