Find Outliers

The Find Outliers tool will determine if there are any statistically significant outliers in the spatial pattern of your data.

Are there anomalous areas in the pattern of your data (crime incidents, trees, traffic accidents)? How can you be sure?
Have you truly discovered a statistically significant outlier (for spending, infant mortality, consistently high test scores) or would your map tell a different story if you changed the way it was symbolized?

The Find Outliers tool will help you answer these questions with confidence.

When we look at a map, our eyes and brains naturally try to find patterns, even when none exist. Consequently, it can be difficult to know whether the patterns in your data are the result of real spatial processes or just random chance. This is why researchers and analysts use statistical methods like Find Outliers (Anselin Local Moran's I) to quantify spatial patterns.

When you do find statistically significant outliers or clustering in your data, you have valuable information. Knowing where and when outliers have occurred can provide important clues about the processes promoting the patterns you're seeing. The next step is to investigate why things are significantly different in those outlier areas. For example, knowing that residential burglaries are significantly higher in a particular neighborhood, even though the surrounding neighborhoods have few burglaries, is vital information if you need to design effective prevention strategies, allocate limited police resources, initiate neighborhood watch programs, authorize in-depth criminal investigations, or identify potential suspects.

Choose layer for which outliers will be calculated

The point or area layer from which outliers will be found.

Find outliers of

This analysis answers the question: Where are the spatial outliers in my data?

If your layer contains points and you choose Point Counts, this tool will evaluate the spatial arrangement of the point features to answer the question: Where are points unexpectedly clustered or spread out?

If you choose a field, this tool will evaluate the spatial arrangement of the values associated with each feature to answer the questions: Where are there low values surrounded by high values? Where are there high values surrounded by low values?
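As a rough illustration of those two questions, the sketch below labels a feature as a high value among low neighbors (HL) or a low value among high neighbors (LH) by comparing the feature's value and the mean of its neighbors with the overall mean. The function name and the labels are illustrative assumptions, not part of the tool, and the tool also tests for statistical significance before reporting an outlier.

```python
from statistics import mean

def outlier_type(value, neighbor_values, all_values):
    """Label one feature relative to the overall mean (hypothetical helper)."""
    overall = mean(all_values)
    neighbors = mean(neighbor_values)
    if value > overall and neighbors < overall:
        return "HL"  # high value surrounded by low values
    if value < overall and neighbors > overall:
        return "LH"  # low value surrounded by high values
    return "not an outlier candidate"

# Made-up values: one feature (40) sits among much lower values.
values = [2, 3, 2, 40, 3, 2]
label = outlier_type(40, [2, 3, 2], values)
print(label)
```

A candidate flagged this way would still need a significance test before it could be called a statistically significant outlier.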

Count points within

The default is to count points within a fishnet grid created by the tool based on your point data. Alternatively, you can choose to count points within a hexagon grid or to provide an area layer (typically, these would reflect administrative reporting districts such as census tracts, municipal boundaries, or counties) to answer the question: Given the number of points counted within each area feature, are there locations with statistically significant high or low point counts compared to their neighbors?
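Counting points within a square grid can be sketched as follows. This is a minimal illustration that assumes planar (x, y) coordinates and a grid aligned to the origin; the function name is hypothetical, and the tool's own grid placement and cell size are determined from your data.

```python
from collections import Counter

def count_points_in_grid(points, cell_size):
    """Count (x, y) points per square cell; keys are (column, row) indices."""
    counts = Counter()
    for x, y in points:
        # Floor division maps each coordinate to the index of its cell.
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts

# Made-up point locations for illustration.
points = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.1), (2.9, 2.9)]
counts = count_points_in_grid(points, cell_size=1.0)
print(counts[(0, 0)], counts[(1, 0)], counts[(1, 1)])
```

Cells that contain no points are not stored, but looking them up in the Counter returns 0.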

Define where points are possible

Either draw or provide a layer defining where points are possible to answer the question: Within the areas, are there any locations with unexpectedly high or low point concentrations?

The area features you draw or the features in the area layer you specify should define where points could possibly occur. To draw these areas, click the Draw button and click a location on the map to create an area shape. To draw additional areas, click the Draw button again and click a location on the map to continue.

Divide by

Sometimes you may want to analyze patterns that take into account underlying distributions. For example, if your points represent crimes, dividing by total population would result in an analysis of crimes per capita rather than raw crime counts. Choosing an attribute to divide by is often referred to as normalization.

Choosing Esri Population will enrich each area feature with population values, which will then be used as the attribute to divide by. This option will use credits.
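The effect of dividing by an attribute can be seen in a small sketch. The area names and numbers below are made up for illustration, and the function name is hypothetical; areas with no population are given no rate rather than a division error.

```python
def rate_per_capita(counts, populations):
    """Divide each area's count by its population (hypothetical helper)."""
    rates = {}
    for area, count in counts.items():
        population = populations.get(area, 0)
        # Skip the division when there is no population to divide by.
        rates[area] = count / population if population > 0 else None
    return rates

crimes = {"A": 50, "B": 50, "C": 3}
population = {"A": 1000, "B": 10000, "C": 0}
rates = rate_per_capita(crimes, population)
print(rates)
```

Areas A and B have the same raw count, but A's rate is ten times higher once population is taken into account.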

Optimize for

You may choose to optimize for speed or precision.

This tool uses permutations to determine how different from random the spatial pattern of your data is. Increasing the number of permutations increases precision but also increases processing time.
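The trade-off can be illustrated with a simplified permutation test for a single feature. This is a sketch under stated assumptions, not the tool's implementation: the statistic (the feature's deviation from the mean multiplied by its neighbors' mean deviation), the one-sided comparison, and all names are illustrative choices.

```python
import random
from statistics import mean

def pseudo_p_value(value, neighbor_values, other_values, permutations=199, seed=0):
    """Simplified permutation test for one feature (hypothetical helper).

    other_values holds the values of every other feature, neighbors included.
    """
    rng = random.Random(seed)
    overall = mean([value] + other_values)
    k = len(neighbor_values)

    def statistic(neighbors):
        # Negative when the feature and its neighbors fall on opposite
        # sides of the overall mean, which is the outlier case.
        return (value - overall) * (mean(neighbors) - overall)

    observed = statistic(neighbor_values)
    # Count random neighborhoods that look at least as outlier-like.
    as_extreme = sum(
        statistic(rng.sample(other_values, k)) <= observed
        for _ in range(permutations)
    )
    return (as_extreme + 1) / (permutations + 1)

others = [2, 3, 2, 4, 3, 5, 2, 4, 3, 2, 3]  # made-up values
p = pseudo_p_value(40, [2, 3, 2], others, permutations=99)
print(p)
```

With this formula the smallest possible pseudo p-value is 1 / (permutations + 1), so 99 permutations cannot report a value below 0.01, while 999 permutations can go as low as 0.001 at the cost of roughly ten times as many recalculations.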

Override Options

The tool will calculate default values for Cell Size and Distance Band based on the characteristics of your data. However, if a particular Cell Size or Distance Band makes sense for your analysis, you can use the Override Options to set those values.

The Override Options are also useful when running analysis on different datasets, allowing you to keep Distance Band and Cell Size consistent across multiple datasets. You can then compare the results (for example, obesity and diabetes rates or even crime rates for two different years).

Cell Size

The size of the grid cells within which points are counted.

When points are counted within a hexagon grid, this distance is used as the height of the hexagons.

Distance Band

Each feature is analyzed within the context of those neighboring features located within the distance you specify. The tool will calculate a default distance for you or you may use this option to set a particular distance that makes sense for your analysis.

For example, if you are studying commuting patterns and you know that the average journey to work is 15 miles, you may want to use a 15-mile distance band.
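The neighbor search that a distance band implies can be sketched as follows. The example assumes planar coordinates in the same units as the distance band and straight-line distances; the function name is hypothetical, and the tool's own distance calculation may differ.

```python
from math import dist

def neighbors_within(points, index, distance_band):
    """Return the indices of features within distance_band of points[index]."""
    origin = points[index]
    return [
        j
        for j, point in enumerate(points)
        if j != index and dist(origin, point) <= distance_band
    ]

# Made-up feature locations, in miles.
points = [(0, 0), (3, 4), (10, 0), (0, 15)]
near = neighbors_within(points, 0, distance_band=5)
far = neighbors_within(points, 0, distance_band=15)
print(near, far)
```

A larger distance band gives each feature more neighbors, so each feature is compared with a broader area.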

Result layer name

Provide a name for the layer that will be created in My Content and added to the map. This result layer will show you statistically significant outliers of high and low values or point counts. If the result layer name already exists, you will be asked to rename it.

Using the Save result in drop-down box, you can specify the name of a folder in My Content where the result will be saved.