Searching for data doesn't really happen step by step. It's less linear and more intuitive than that. However, having a process in mind can help you notice the decisions you're making and the possibilities you haven't yet pursued.
Take a moment to define what kind of data you are hoping to find. You may want data about a topic, for a specific kind of place, or from a specific time period. Consider which of these qualities is most important to you: subject matter, place, time, and so on.
Think about how data are distributed and shared. Some data, such as educational data, are collected by schools and distributed through governmental agencies. Environmental data are collected by scientists and shared alongside research articles or by agencies at various levels of government. GIS data across a wide array of topics are shared by GIS practitioners in repositories. You might not have a clear sense of who would collect the data you want, but it helps to keep the question in mind as you're searching, especially when you reach the next step.
There is no single way to look for data, and the qualities most important to you will affect where it is most fruitful to search. If you are looking for shapes within a US state, maybe start with a state data repository. If you want to see what is available in shapefile format, start with a GIS repository. This guide helps you explore several different but overlapping ways to find data. The following questions can help you decide where to start.
1. Could your data be found in a location-based collection of data? If so, then use the Explore by Place tab.
2. Could your data be found in a format-specific collection of data? If so, then start with the Explore by Format tab.
3. Could your data be found in a topical collection of data? If yes, then try the Explore by Topic tab.
4. Can you go directly to the source and bypass collections? Is your data likely to be part of a specific project or process you already know about? If so, Google searching will likely be sufficient.
5. Check the "other data-searching tools" tab to help you answer some of these questions. Some of these tools let you search by keyword and then sort by source or type.
Keep notes for yourself of where you look and what you find, so you can retrace your steps. Something you rule out early might be a lead you need later.
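If you are comfortable with a little Python, one way to keep these notes is a small search log saved as a CSV file. The sketch below is only an illustration: the function name `log_search`, the column names, and the example file name are assumptions made for this example, and a notebook or spreadsheet works just as well.

```python
import csv
from datetime import date
from pathlib import Path

# Hypothetical column names for the log; rename or extend them to suit your project.
LOG_FIELDS = ["date", "source", "search_terms", "result", "notes"]


def log_search(log_path, source, search_terms, result, notes=""):
    """Append one row to a CSV search log, writing the header row on first use."""
    path = Path(log_path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "date": date.today().isoformat(),
            "source": source,
            "search_terms": search_terms,
            "result": result,
            "notes": notes,
        })


if __name__ == "__main__":
    # Example entry; the file name here is a placeholder.
    log_search(
        "data_search_log.csv",
        source="Data.gov",
        search_terms="school enrollment by district",
        result="ruled out",
        notes="County level only; revisit if district data cannot be found.",
    )
```

Recording what you ruled out, and why, is what makes it possible to return to an early lead later.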
When you feel you could benefit from some guidance in your searching, get in touch with me, Kristin, your librarian. I'm happy to meet with you in person, on Zoom, over email, the phone -- whatever works well for you. Follow the tab "Consult with a librarian" to find my contact information.
.shp - shape files - These are formatted for GIS software and will be easy for you to work with. When you see a shape file, be sure to download any accompanying files, too (commonly .shx, .dbf, and .prj). They work together, so you need all of them. Keep an eye out for metadata files (usually .xml).
.gdb - geodatabase - Stores spatial layers and all of their associated files together in a single, more portable package.
.kml or .kmz - Keyhole Markup Language (.kmz is the zipped version) - Google Earth and Google Maps export in this format, which is readable in ArcGIS Online and can be converted into a shape file.
.csv - comma separated values - The most flexible option for tabular data
.xls or .xlsx - Excel format, which can be converted to CSV
.tab - tab separated values - Similar to CSV
When collecting point data in these formats, make sure the columns include latitude and longitude.
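Two of the checks described above can be done by eye, but they are also easy to script. The following Python sketch is a rough illustration under stated assumptions: the function names are made up for this example, the list of latitude and longitude column spellings is a guess that real datasets will not always match, and the file-extension check is case-sensitive.

```python
import csv
from pathlib import Path

# Companion files that normally travel with a .shp file.
REQUIRED_SIDECARS = [".shx", ".dbf"]   # index and attribute table
RECOMMENDED_SIDECARS = [".prj"]        # coordinate system definition

# Assumed spellings of coordinate column names; extend as needed.
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "long", "lng", "longitude"}


def missing_shapefile_parts(shp_path):
    """Return the companion extensions not found next to the given .shp file."""
    shp = Path(shp_path)
    return [
        ext
        for ext in REQUIRED_SIDECARS + RECOMMENDED_SIDECARS
        if not shp.with_suffix(ext).exists()
    ]


def find_lat_lon_columns(table_path, delimiter=","):
    """Return (lat_column, lon_column) from the header row; None where not found."""
    # utf-8-sig quietly drops the byte-order mark that Excel often adds to CSV exports.
    with open(table_path, newline="", encoding="utf-8-sig") as f:
        header = next(csv.reader(f, delimiter=delimiter), [])
    lat = next((h for h in header if h.strip().lower() in LAT_NAMES), None)
    lon = next((h for h in header if h.strip().lower() in LON_NAMES), None)
    return lat, lon
```

For a tab-separated file, pass `delimiter="\t"`. A result of `(None, None)` does not prove the coordinates are absent, only that the header uses names this sketch does not recognize, so open the file and look before ruling it out.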
Data Catalog: an organized listing of data, which can be searched in order to discover available data. Often the content of a data catalog is descriptive, with links to where the data can be accessed. Data catalogs are usually created around collections that reflect the interests of the organization doing the collecting. Brainstorming organizations that might collect and distribute the data you need is an important part of searching for data (e.g., universities, non-governmental organizations). When searching a data catalog, use keywords that are likely to be used in the study-level description of a dataset. Example: World Bank Microdata Catalog
Data Portal: a web site that allows users to search for data, and which allows data producers to make their data more easily discoverable. Portal is a broader concept and can take the form of a data catalog, a simple listing, or even a data repository. Like catalogs, portals usually point to where the data can be downloaded. When using a portal, expect to hop across various web sites. Example: Data.gov
Data Repository: in the context of research, a data repository is a centralized place (usually itself a database infrastructure accessible via a website) to store, preserve, organize, and provide access to data of interest to researchers. Data repositories are created around communities, which might be defined by place (e.g., a state data repository), topic of interest (e.g., a snow and ice data repository), disciplinary focus (e.g., health sciences), governmental or organizational mission (e.g., meteorological data), or format (e.g., GIS data). Brainstorming and identifying potential repositories that might hold data of interest to you is a major part of searching for data.
Metadata: data require documentation to be usable. Metadata are data about data -- structured descriptions of datasets. They tell you what variables mean, what units are used, and who created the dataset and under what conditions. Metadata are key to determining whether you will be able to use a dataset accurately and ethically.
Query Tools: Many data websites let you build your own subset or your own table of data instead of requiring you to download the whole thing or scour premade tables. This feature goes by many names. Look for labels such as "custom tables," "online analysis," "interactive data," or "data query."