ENTS 120: Introduction to Geospatial Analysis

Professor John Berini - Spring 2024

Strategies for Finding Geospatial Data

Searching for data doesn't really happen as a step-by-step process. It's more circular and iterative than that. However, having a process in mind can help you notice the decisions you're making and the possibilities you haven't yet pursued.

Step 1: Decide What You're Looking For

Take a moment to decide what kind of data you are hoping to find. You may want data about a specific subject or topic. Or you may want data for a specific geographic place. Or you may want data from a specific time period. Which of these (topic, place, or time) is most important to you?

Step 2: Consider Who Might Produce These Data

Think about how data are distributed and shared. Some data, such as educational data, are collected by schools and distributed through governmental agencies. Environmental data are collected by scientists and shared alongside research articles or by agencies across different levels of government. GIS data across a wide range of topics are shared by GIS practitioners in repositories. You might not have a clear sense of who would collect the data you want, but it helps to keep the question in mind as you're searching, especially when you reach the next step.

Step 3: Decide How and Where to Start Searching

There isn't just one way to look for data, and whatever is most important to you (see Step 1) will affect where you start your search. If you are looking for shapes within a US state, maybe start with a state data repository. If you want to see what is available in shapefile format, start with a GIS repository. This guide offers several different but overlapping ways to find data. The following questions can help you decide where to start.

1. Could your data be found in a location-based collection of data? If so, then use the Explore by Place tab.

2. Could your data be found in a format-specific collection of data? If so, then start with the Explore by Format tab.

3. Could your data be found in a topical collection of data? If yes, then try the Explore by Topic tab.

4. Can you go directly to the source and bypass collections? Are your data likely to be part of a specific project or process you already know about? If so, a Google search may be sufficient.

Step 4: Document Your Progress and Seek Assistance

Keep notes for yourself of where you look and what you find, so you can retrace your steps. Something you rule out early might lead you to a dataset you need later. 

When you feel you could benefit from some guidance in your searching, get in touch with the librarian.

Formats to Look For

Formatted for GIS Software

.shp - shapefiles - These are formatted for GIS software and will be easy for you to work with. A shapefile is really a set of files that work together, so when you see one, be sure to download all accompanying files, too (typically .shx, .dbf, and .prj). Keep an eye out for metadata files (usually .xml or .gml).

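.gdb - geodatabase - An Esri format that stores multiple datasets and their associated files together in one, more portable container. A file geodatabase appears on disk as a folder whose name ends in .gdb.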

.geojson or .json - Geographic JavaScript Object Notation - Vector points, lines, and polygons, along with their tabular information, can all be saved in GeoJSON format. You're most likely to find this format if you download from a web-based map.

.kml or .kmz - Keyhole Markup Language - Google Earth and Google Maps export in this format (a .kmz is a zipped .kml). It is readable in ArcGIS Online and can be converted into a shapefile.
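
After a download, it can help to take a quick inventory of what you actually have. The following Python sketch is only an illustration, not part of any GIS package: it sorts the files in a folder by the extensions listed above and flags any shapefile whose companion files are missing. The function name `inventory` and the split between required companions (.shx, .dbf) and recommended ones (.prj) are assumptions made for this example.

```python
from pathlib import Path

# Extensions from the list above, mapped to a short label.
GIS_FORMATS = {
    ".shp": "shapefile",
    ".gdb": "geodatabase",
    ".geojson": "GeoJSON",
    ".json": "JSON (may be GeoJSON)",
    ".kml": "KML",
    ".kmz": "KMZ",
}

# Assumption for this sketch: .shx (index) and .dbf (attribute table) are
# treated as required; .prj (coordinate system) is treated as recommended.
REQUIRED_SIDECARS = (".shx", ".dbf")
RECOMMENDED_SIDECARS = (".prj",)


def inventory(folder):
    """Return (found, problems) for GIS files directly inside `folder`.

    found    -- dict mapping file name to a format label
    problems -- list of warnings about shapefiles with missing companions
    """
    entries = sorted(Path(folder).iterdir())
    # Lowercased names so ROADS.SHP is matched with roads.dbf and vice versa.
    names = {p.name.lower() for p in entries}

    found = {}
    problems = []
    for path in entries:
        suffix = path.suffix.lower()
        label = GIS_FORMATS.get(suffix)
        if label is None:
            continue  # not one of the formats in the list above
        found[path.name] = label

        if suffix == ".shp":
            base = path.stem.lower()
            for ext in REQUIRED_SIDECARS + RECOMMENDED_SIDECARS:
                if base + ext not in names:
                    kind = "required" if ext in REQUIRED_SIDECARS else "recommended"
                    problems.append(f"{path.name}: missing {kind} {ext} file")
    return found, problems
```

Calling `inventory("downloads")` on a hypothetical `downloads` folder returns the recognized files and a list of warnings; an empty warning list suggests each shapefile arrived with its companions. The check only looks at file names, so it cannot tell whether a .json file actually contains GeoJSON.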

Tabular Data with Shapes and Points

.csv - comma separated values - The most flexible option for tabular data

.xls or .xlsx - Excel format, which can be converted to CSV

.tab - tab separated values - Similar to CSV

When collecting point data in these formats, make sure one of the columns contains latitude/longitude.
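
One way to check a table before loading it into GIS software is sketched below in Python, using only the standard library. It looks for latitude and longitude columns in delimited text and lists any rows whose coordinates are missing, non-numeric, or out of range. The function name `check_points` and the column names it recognizes are assumptions for this example; real datasets may label these columns differently.

```python
import csv
import io

# Column names this sketch recognizes (an assumption; extend as needed).
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "long", "lng", "longitude"}


def find_coordinate_columns(header):
    """Return the (latitude, longitude) column names, or None for each if absent."""
    lat = next((h for h in header if h.strip().lower() in LAT_NAMES), None)
    lon = next((h for h in header if h.strip().lower() in LON_NAMES), None)
    return lat, lon


def check_points(text, delimiter=","):
    """Check delimited text for usable latitude/longitude values.

    Returns (lat_column, lon_column, bad_rows), where bad_rows lists row
    numbers (the header counts as row 1) with missing, non-numeric, or
    out-of-range coordinates. Raises ValueError if no coordinate columns exist.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    lat_col, lon_col = find_coordinate_columns(reader.fieldnames or [])
    if lat_col is None or lon_col is None:
        raise ValueError("no latitude/longitude columns found")

    bad_rows = []
    for row_no, row in enumerate(reader, start=2):
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (TypeError, ValueError):
            bad_rows.append(row_no)  # empty or non-numeric value
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            bad_rows.append(row_no)  # outside valid degree ranges
    return lat_col, lon_col, bad_rows
```

For a .csv file, read the file's text and pass it in directly; for a .tab file, pass `delimiter="\t"`. The range check assumes coordinates in decimal degrees, so a table that stores projected coordinates (for example, meters) would need a different test.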

 

Terms to Watch For

Data Catalog: A structured list of data that helps users find available information. It often includes descriptions and links to access the data, and it focuses on collections that match the interests of the organization that maintains it (such as a university or NGO).

Example: World Bank Microdata Catalog

Data Portal: A website where users can search for data, making it easier for data producers to share their information. Portals can take different forms, such as catalogs or listings, and usually provide links to download data from various sources.

Example: Data.gov

Data Repository: A centralized place for storing, preserving, and providing access to research data. Repositories are tailored to specific communities, based on factors like location, topic, discipline, or format of the data.

Examples: University of Michigan's Deep Blue Data and Data Repository for University of Minnesota (DRUM)

Metadata: Information about the data itself, such as descriptions of datasets, variable meanings, units, creators, and conditions of use. Metadata are crucial for understanding and using datasets accurately and ethically.

Query Tools: Features on data websites that allow users to create customized subsets or tables of data. These tools may be called "custom tables," "online analysis," "interactive data," or "data query."