Data, Datasets, and Statistical Resources

Factors to Consider When Evaluating Statistics


  • Who collected it?
  • Was it an individual or organization or agency? 
  • The data source and the reporter or citer are not always the same. For example, advocacy organizations often publish data that were produced by some other organization. When feasible, it is best to go to the original source (or at least know and evaluate the source).
  • If the data are repackaged, is there proper documentation to lead you to the primary source? Would it be useful to get more information from the primary source? Could there be anything missing from the secondary version?


  • How widely known or cited is the producer? Who else uses these data?
  • Is the measure or producer contested?
  • What are the credentials of the data producer?
  • If an individual, are they an expert on the subject?
  • If an individual, what organizations are they associated with? Could that association affect the work?

Objectivity & Purpose

  • Who sponsored the production of these data?
  • What was the purpose of the collection/study?
  • Who was the intended audience for or users of the data?
  • Was it collected as part of the mission of an organization? Or for advocacy? Or for business purposes?


  • When were the data collected? Collection is not always close to the release or publication date: there is often a time lag between collection and reporting because of the time required to analyze the data.
  • Are these the newest figures? Sometimes the newest available figures are a few years old. That is okay, as long as you can verify that there isn't something newer.

Collection Methods & Completeness

  • How are the data collected? Count, measurement or estimation?
  • Even a reputable source and collection method can introduce bias. For example, crime data come from many sources, from victim reports to arrest records, and each source captures a different subset of incidents.
  • If a survey, what was the sample size -- how does that compare to the size of the population it is supposed to represent?
  • If a survey, what methods were used to select the people included? How was the total population sampled?
  • If a survey, what was the response rate?
  • What populations were included? Excluded?
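
The survey questions above reduce to a few simple ratios. The following is a minimal sketch in Python, not a standard method: the function name `survey_coverage` and every figure in the example are hypothetical, and real surveys often define response rates in more detail (for instance, by excluding ineligible contacts).

```python
def survey_coverage(population_size, invited, responded):
    """Return the response rate and the share of the population that responded.

    All three arguments are counts of people. The names are illustrative
    assumptions, not terms taken from any particular survey's documentation.
    """
    if population_size <= 0 or invited <= 0:
        raise ValueError("population_size and invited must be positive")
    if responded < 0 or responded > invited or invited > population_size:
        raise ValueError("expected 0 <= responded <= invited <= population_size")
    return {
        # Of the people asked, how many answered?
        "response_rate": responded / invited,
        # Of the people the survey is meant to represent, how many answered?
        "share_of_population": responded / population_size,
    }


# Hypothetical example: 2,000 people invited from a population of 50,000; 800 answered.
result = survey_coverage(population_size=50_000, invited=2_000, responded=800)
print(f"Response rate: {result['response_rate']:.1%}")
print(f"Share of population: {result['share_of_population']:.2%}")
```

A low response rate is a prompt to check the documentation for how nonresponse was handled, rather than an automatic reason to reject the data.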

Consistency / Verification

  • Do other sources provide similar numbers?
  • Can the numbers be verified?
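
One rough way to check whether two sources "provide similar numbers" is to express the gap between them as a share of their average. The sketch below assumes two hypothetical figures for the same measure; the function name `relative_difference` is illustrative, and what counts as "similar" depends on the measure and on how each source defines it.

```python
def relative_difference(a, b):
    """Relative difference between two figures, as a fraction of their mean."""
    mean = (a + b) / 2
    if mean == 0:
        raise ValueError("cannot compare two figures that average to zero")
    return abs(a - b) / abs(mean)


# Hypothetical figures for the same measure reported by two different sources.
source_a = 1_250
source_b = 1_310
diff = relative_difference(source_a, source_b)
print(f"The two sources differ by {diff:.1%} of their average")
```

A large gap does not say which source is wrong; it is a signal to compare definitions, time periods, and populations covered.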

Researching Your Data

To fully understand your data, what it can tell you, and how much it will strengthen your argument, look in the following places.

  1. Data Documentation
  2. The Research Literature
  3. Specialized Bibliographies

Note: some data will be much easier to research than others, and not all three suggestions above will apply to all datasets.

Data Documentation

Find the website of the institution that creates, disseminates, or hosts the data you are using and look for documentation in the form of:

  • User Guides
  • Codebooks
  • Questionnaires or "survey instruments"
  • Statistical overviews, which can be helpful (e.g., "How Australians Use Their Time")

These types of publications will give you the following information about your data (and much more):

  • Purpose / overview / background of survey
  • Methodology
    • Scope and coverage (who's included, who's not?)
    • Sample design
    • Data collection methods
    • Response rates
  • Data quality

Research Literature

Use article indexes to search for research articles that use the data you're researching. Three main kinds of searches will help you:

  • Search for the name of the data in the abstracts
  • Do a citation search for the data in the Social Sciences Citation Index (SSCI)
  • Search on the topic and look at the data used by the research you find. How do their data compare to yours?

Specialized Bibliographies

1. The web site that hosts a large survey often also contains a bibliography of research that is about or uses those data. Look for these on the project web sites for the data.

2. ICPSR maintains a Bibliography of Data-related Literature, which you can search from their web site.

3. Topical entries in subject encyclopedias often have sections on data. For example, the entry for Recidivism in the Encyclopedia of Crime and Justice has a discussion of data commonly used to measure recidivism.

Helpful Articles Evaluating Data Sources