(Note: the subtopic Psychological Health Outcomes is on a separate page.)


Having focused so far in this report on the environmental data (source data, environmental monitoring data, and bio-monitoring) that serve in the present discussion to help determine human exposure, we turn now to examine the available data related to health outcomes.

All health outcomes, including mental health outcomes and those diseases primarily genetic in origin, bear some relationship to environmental factors. These environmental factors may be both physical (including the built environment) and social. Environmental factors may, for example, be the primary cause of a particular disease (e.g., asthma, certain cancers), act along with non-environmental factors to cause a disease (e.g., heart disease, depression), influence the course of a disease (e.g., asthma), or affect access to medical care for a disease (e.g., diabetes, schizophrenia).

One central goal of environmental health is to understand and control the influence of environmental factors on the health of human beings. Yet we are exposed daily to a multitude of environmental factors, and these may affect our health, both now and in the future, in a multitude of ways. If we confine the discussion to chemical toxins in the environment, we find that, of the 80,000 man-made chemicals available today in the U.S., only a few hundred have been systematically tested for toxicity. Chemical toxins may affect every organ system in the body.  An alphabetical table of diseases and a listing of chemicals suspected to directly contribute to their causation, along with an assessment of the strength of the scientific evidence for this causation, is given in Appendix I: Diseases and Environmental Toxins Suspected to Cause Them.

Environmental risks to communities from particular toxic chemicals are often extrapolated from occupational studies looking at the risks of such chemicals to workers. This is because (1) workplace exposures, usually higher than community exposures, often lead to detectable health effects in much smaller populations, and (2) in a controlled workplace situation it is often possible to do continuous or intermittent ambient and/or personal monitoring and thus to calculate both peak dosages and averages over time.

It is often very difficult to predict how likely it is that a given environmental factor will result in a particular health outcome for a particular person, or to know whether a given health outcome is due, in part or in full, to a given environmental exposure. Likewise, at a population level, it is typically difficult to map environmental exposure-health outcome relationships geographically, or to track trends in them over time. These difficulties are often due both to (1) the inherent complexity of exposure-outcome relationships themselves (e.g., variable disease expression, diseases caused by many factors, long periods of time separating exposures and diseases) and to (2) inadequacies in the available exposure and outcome data.

The three main types of epidemiological studies used to examine exposure-outcome relationships are (1) case-control studies (comparing a group with a certain health outcome with another group without that outcome in terms of past environmental exposures), (2) cross-sectional studies (comparing groups at a single point in time in terms of both exposures and outcomes), and (3) cohort studies (following groups with different exposures over time to see which individuals develop disease). All three study types depend on clear definitions and accurate measurements of both exposures and outcomes. Good exposure and outcome data, when available, can often allow for meaningful answers to environmental health questions even in the face of the complexities of a particular exposure-outcome relationship. Conversely, if such data are not available, such answers are typically impossible to obtain.
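As a rough illustration of why the case-control and cohort designs described above depend on accurate counts of exposures and outcomes, the sketch below computes the two summary measures most commonly associated with them: the odds ratio (case-control) and the relative risk (cohort). The report itself does not name these measures, and all counts are hypothetical values chosen only to make the arithmetic visible.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a case-control 2x2 table.

    Compares the odds of past exposure among cases with the odds
    of past exposure among controls.
    """
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def relative_risk(exposed_ill, exposed_total, unexposed_ill, unexposed_total):
    """Relative risk from a cohort study: risk in exposed / risk in unexposed."""
    risk_exposed = exposed_ill / exposed_total
    risk_unexposed = unexposed_ill / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical counts, for illustration only (not taken from any study).
or_estimate = odds_ratio(exposed_cases=40, unexposed_cases=60,
                         exposed_controls=20, unexposed_controls=80)
rr_estimate = relative_risk(exposed_ill=30, exposed_total=1000,
                            unexposed_ill=10, unexposed_total=1000)

print(f"Odds ratio (case-control): {or_estimate:.2f}")
print(f"Relative risk (cohort): {rr_estimate:.2f}")
```

Misclassifying even a few exposed cases as unexposed changes every cell of the table, which is one concrete way that poor exposure data can distort the result.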

General Nature of Health Information

Although it might seem that it would be easier to obtain health information about a group of people living in a certain place than to obtain environmental information about that place, this is not always the case. Some environmental data, as for example the levels of certain chemicals in the air, can be monitored mechanically, whether continuously or at periodic intervals. This is not possible for health outcomes, which must be reported or detected in order to be known. If people get sick with a certain disease, but neither seek health care nor die, the disease will not appear on any information “radar screen”.

A general model of the steps in the pathway that health data must follow in order to be available as information, from the individual through a health care system to an information system, is this:

Risk of exposure or disease (e.g., based on residence, as in a census) → (Possible) biomarker of exposure → (Possible) marker of sub-clinical disease → Clinical disease (or death) → Contact with health facility → Accurate diagnosis → Adequate record-keeping → Reporting of health facility to a database → Availability of the database → Analysis of and reporting from the database.
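One way to see why this pathway matters is to treat each step as a filter that passes only some fraction of true cases on to the next. The sketch below does that for the later steps of the pathway. The pass probabilities are invented for illustration and are not estimates from this report or any other source.

```python
# Hypothetical probability that a true clinical case passes each step.
# These numbers are assumptions made up for illustration only.
PATHWAY_STEPS = [
    ("Contact with health facility", 0.70),
    ("Accurate diagnosis", 0.85),
    ("Adequate record-keeping", 0.90),
    ("Reporting to a database", 0.80),
    ("Availability of the database", 0.95),
]


def cases_remaining(true_cases, steps):
    """Return a list of (step name, expected cases still visible) pairs."""
    remaining = float(true_cases)
    trail = []
    for name, pass_probability in steps:
        remaining *= pass_probability  # cases lost here never reach later steps
        trail.append((name, remaining))
    return trail


trail = cases_remaining(1000, PATHWAY_STEPS)
for name, remaining in trail:
    print(f"{name}: about {remaining:.0f} of 1000 true cases remain")
```

Because the losses multiply, modest attrition at each step can leave well under half of the true cases visible to an analyst at the end of the pathway.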


In addition to this pathway through the health system, wherein health outcomes are passively detected depending on whether individuals seek treatment, it is also possible to conduct surveys, screenings, or studies that actively detect risk, exposure, sub-clinical disease, or clinical disease. These surveys, screenings, and studies are sometimes the only way that the incidence and prevalence of an exposure or health outcome can be known in a population. They are especially important to understanding disease in medically underserved populations, such as low-income and minority groups, whose disease profiles may be underrepresented in data from health facilities. Yet it should be borne in mind that surveys, for example, are expensive in terms of the required time (e.g., 9-12 months for planning, collecting, processing, and analyzing a personal interview type survey, 4-6 months for a telephone interview survey), staff (for survey design and data collection, management, and analysis), and money involved (e.g., $100 per personal interview and $70 per telephone interview for the National Health Interview Survey several years ago).[324]  Many organizations that may be interested in investigating environmental health issues simply do not have these resources available.

It should also be mentioned here that personal health data, unlike environmental data, often involve issues of privacy and confidentiality. Sharing of health-related data by health facilities requires adherence to such laws as HIPAA, the Health Insurance Portability and Accountability Act,[325] whereas collection and use of health data from schools requires adherence to the Family Educational Rights and Privacy Act (FERPA).[326] In addition, gathering health information about an individual may require that individual’s signed informed consent and, if done for research purposes, pre-approval by the Institutional Review Board (IRB) of the responsible university or organization.

Figure 5: Common Sources of Health Data

Hospital records (e.g., discharge data)

Emergency department records

Ambulatory care records

School nurse records

Health insurance company records

Medication sales records (including over-the-counter medications)

Birth certificates

Death certificates

Disease tracking systems and registries





Health Outcomes Data Sources and Systems

This section will review the major health data available to public health departments, researchers, and (sometimes) the public at large. Locally, the Allegheny County Health Department (ACHD) gathers and maintains its own public health information databases, subsequently reporting these data to the state Department of Health (PaDOH), while data from surrounding counties are gathered and maintained directly by PaDOH. It is possible for the public to readily query a few of these databases online. EpiQMS (Epidemiology Query and Mapping System), for example, is an interactive health statistics website developed by the Pennsylvania Department of Health in collaboration with the Washington State Department of Health. The system uses state, regional, and county population, birth, death, and cancer datasets and SVG (scalable vector graphics) technology. Users of the system “can produce numbers, rates, graphs, charts, maps, and county profiles using various demographic variables (age, sex, race, etc.)”. [327]

 Reports from public health departments themselves often make use of multiple data sources. The Pennsylvania Healthy People 2010 Report,[328] for example, is based on analyses of data from the United States Bureau of the Census, the National Immunization Program (NIP), the CDC’s National Center for Health Statistics, the PA Health Care Cost Containment Council (PHC4), the ChildLine and Abuse Registry of the PA Department of Public Welfare, the Behavioral Risk Factor Surveillance System (BRFSS), and the PA Department of Health’s own Bureau of Health Statistics and Research, Bureau of Epidemiology, and Division of School Health.

PaDOH also makes a very useful “Electronic Guide to Health Statistics” available online. This website gives direct links to internet health data report sources related to many health and disease subjects for Pennsylvania and its Counties and Communities. Information is also supplied on the content and geographical detail available for these sources. A few of these data sources are examined below. They include:

·         Birth and death records and infant mortality

·         Hospital discharge data

·         Pennsylvania's National Electronic Disease Surveillance System (PA-NEDSS)

·         Cancer Registry Data

·         Chronic Disease Tracking in Pennsylvania

·         Real-time Outbreak Disease Surveillance (RODS) System

·         Behavioral Risk Factor Surveillance System (BRFSS)

Birth and Death Records and Infant Mortality

The county and state health departments maintain both birth and death records, including an infant mortality database with death records matched to birth certificates. Since 2002, both primary and underlying causes of death have been listed by ICD-10 code. These data, when de-identified, are coded down to the census tract level and could thus be spatially and temporally correlated, for example, with census data or environmental data. There are a few practical applications of these data, but their usefulness for environmental health studies is limited chiefly by the coarseness of the geographical unit and by the lack of information about potential confounders (e.g., smoking) and about length of time at place of residence.

Hospital Discharge Data

The Pennsylvania Health Care Cost Containment Council (PHC4) is a state agency that collects data quarterly on every inpatient discharge from all hospitals (including general acute care, psychiatric, rehabilitation and long term acute care hospitals) in the state of Pennsylvania, as well as data on procedures such as surgeries, endoscopies, chemotherapies and certain cardiovascular procedures from ambulatory surgical centers. PHC4 checks the data for inaccuracies and provides feedback to facilities, but data accuracy is of course ultimately reliant on the facilities themselves. PHC4 users include hospitals, state government agencies, university researchers, commercial vendors, and other non-commercial users. Data fields in the datasets include clinical information such as Diagnosis Related Groups (DRGs), Major Diagnostic Categories (MDCs), diagnosis and procedure codes, and utilization data. De-identified datasets are available from PHC4 on a fee schedule. Standard data sets include data for a single quarter for each hospital and ambulatory surgical facility in the state, each of the 9 state regions, and the state as a whole. Customized data sets are available as well (e.g., for all cases of breast cancer in Allegheny County), either in customized reports (in Excel format) or as data sets for further analysis.

There are a few major limitations to the PHC4 data. Spatial information is limited to the ZIP code of the hospital or residential address, which makes it impossible to tease out individuals within a very small radius of a pollution source. In addition, in a rural area a person with a post office box in one ZIP code may live in another. Besides these limitations on spatial accuracy, patients who visit a hospital multiple times for the same condition add error to incidence and prevalence calculations.[329]
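The effect of repeat visits on counts can be sketched as follows. The example compares raw discharge counts with counts of distinct patients per ZIP code. It assumes that each record carries some de-identified patient key that links repeat visits, which is a hypothetical field and may not be present in an actual PHC4 dataset. The records and ZIP codes are invented for illustration.

```python
from collections import defaultdict

# Hypothetical de-identified discharge records: (patient_key, zip_code, diagnosis).
# patient_key is an assumed linking field, not a documented PHC4 data field.
discharges = [
    ("p1", "15213", "asthma"),
    ("p1", "15213", "asthma"),  # readmission of the same patient
    ("p2", "15213", "asthma"),
    ("p3", "15217", "asthma"),
    ("p3", "15217", "asthma"),
    ("p3", "15217", "asthma"),
]


def count_by_zip(records):
    """Return {zip_code: (number of discharges, number of distinct patients)}."""
    visits = defaultdict(int)
    patients = defaultdict(set)
    for patient_key, zip_code, _diagnosis in records:
        visits[zip_code] += 1
        patients[zip_code].add(patient_key)
    return {z: (visits[z], len(patients[z])) for z in visits}


counts = count_by_zip(discharges)
for zip_code, (n_visits, n_patients) in sorted(counts.items()):
    print(f"{zip_code}: {n_visits} discharges, {n_patients} distinct patients")
```

In this invented example both ZIP codes show three discharges, yet one reflects two patients and the other a single patient, so a rate based on discharges alone would overstate how many people are affected.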

Pennsylvania's National Electronic Disease Surveillance System (PA-NEDSS)

Certain diseases, especially infectious diseases (e.g., HIV/AIDS, STDs, vaccine-preventable diseases, tuberculosis, rabies), are reportable.  This means that it is mandatory for all health personnel encountering patients with these diseases to report cases to a health department. To track these diseases, Pennsylvania participates in the National Electronic Disease Surveillance System (NEDSS), wherein diagnoses and case histories from all laboratories in the state feed into a common database via electronic laboratory reports (ELR).  NEDSS is an evolving initiative whose mission “is to design and implement seamless surveillance and information systems that take advantage of the best information and surveillance technology”.[330]  NEDSS is intended to be a system for the continuous automatic capture and analysis of electronic data, and to assist with monitoring disease trends, informing policy, identifying research needs, and guiding prevention and intervention programs at the local, state, and national levels.

In PA-NEDSS, along with its infectious disease surveillance functions, blood lead levels are also reported and analyzed in the Lead Program. The system allows for an ongoing electronic record of case management and investigations management and has analysis and reporting (A&R) functionality that allows public health staff to easily create reports, charts, and graphs containing disease data that is updated on a daily basis. Although the Pennsylvania Lead Program is still limited by the uneven coverage and frequency of screening exams, PA-NEDSS allows for excellent capture and reporting of those exams that do occur.

Cancer Registry Data

Sometimes a larger-than-expected number of a certain adverse health outcome (such as cancer or birth defects) occurs in a group of people living in the same community or employed at a common workplace. This is known as a disease cluster. If the group of people in which the cluster occurs is thought to have had a common exposure, a cluster investigation may take place, wherein environmental exposures in the population are examined retrospectively (backwards in time). In 1997, for example, Pennsylvania registered 25 cancer cluster investigation requests, the 11th highest number in the U.S.[331] The Commonwealth has a standard protocol for responding to these requests, but the availability of baseline incidence data for the community is critical for a successful cluster investigation. These community incidence data are available for cancers through a tracking system known as a disease registry. Such a registry, to be most useful, should be statewide and updated continuously. It should also integrate data from multiple sources, including passive case detection by health facilities as well as active surveillance by public health officials.[332]
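To make "larger-than-expected" concrete, the sketch below computes a standardized incidence ratio (SIR): observed cases divided by the number expected if the community experienced reference (for example, statewide registry) age-specific rates. The SIR is one common measure for this purpose and is offered here as an assumption, not as a description of the Commonwealth's protocol. The populations, rates, and observed count are hypothetical.

```python
def expected_cases(population_by_age, reference_rate_by_age):
    """Cases expected if the community had the reference age-specific rates."""
    return sum(population_by_age[age] * reference_rate_by_age[age]
               for age in population_by_age)


def standardized_incidence_ratio(observed, expected):
    """Observed / expected; values above 1 mean more cases than expected."""
    return observed / expected


# Hypothetical community population and reference rates (cases per person per year).
population = {"0-39": 6000, "40-64": 3000, "65+": 1000}
reference_rates = {"0-39": 0.0005, "40-64": 0.004, "65+": 0.015}

expected = expected_cases(population, reference_rates)
sir = standardized_incidence_ratio(observed=45, expected=expected)

print(f"Expected cases: {expected:.1f}")
print(f"Standardized incidence ratio: {sir:.2f}")
```

Without registry-based reference rates there is no denominator for this ratio, which is why baseline incidence data are critical to a cluster investigation. A real analysis would also need a confidence interval, since small communities produce unstable ratios.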

Registry data are very helpful in elucidating relationships between environmental factors and health outcomes. Cancer registries are the prototypical health registry, and in many parts of the country cancer is still the only chronic disease health outcome that has a registry available for examining its relationship with environmental factors. Cancer registries gather data from hospitals, outpatient facilities, and laboratories into a single repository at the state level. This data, for a given case, will include such information as cancer type, stage at diagnosis, and treatment received. When combined with patient demographic information, cancer incidence and prevalence rates can be mapped geographically, measured over time as trends, and estimated for various sub-populations, such as ethnic groups and age groups.

Since 1994, in an effort to support the development of uniform high-quality cancer registry data at a national level, the United States CDC (Centers for Disease Control and Prevention) has administered the National Program of Cancer Registries (NPCR). This program supports improvements in the quality and use of state cancer registry data through (1) financial assistance, (2) technical assistance such as the development of data transmission software for health facilities, (3) training and information sharing meetings for state registry personnel, and (4) assistance with data analysis, reporting, and research. NPCR has operated a Nationwide Cancer Surveillance System since 2001, although it does not yet integrate data from all states. This system will have greater statistical power than individual state registries and so potentially be able to uncover more relationships between environmental factors and cancer outcomes, including less common exposures, rarer cancers, and cancers in population subgroups such as ethnic minorities.

In 1997, NAACCR (North American Association of Central Cancer Registries), in cooperation with the CDC, began reviewing the completeness, accuracy, and timeliness of state cancer registry data. In 2003, 35 states achieved Certification Awards from NAACCR (based on their 2001 data). Pennsylvania was not among them that year, although the Commonwealth has earned the Certification Award in the past.[333]

The Allegheny County Health Department wishes to link the Cancer Registry to county birth and death data, as this connection doesn’t currently exist.[334]

Chronic Disease Tracking in Pennsylvania

Pennsylvania was recently awarded $600,000 from the CDC to begin developing a tracking system for asthma, and the development of this system is underway[335]. There is particular interest in developing better school-based asthma surveillance systems. According to the PA Department of Health, more than 180,000 students (8.6% of all students) in Southwestern PA were diagnosed with asthma in 2001-2002.[336]  The PaDOH Bureau of Epidemiology’s Division of Environmental Health Assessment is currently investigating the two school districts with the highest asthma prevalence in the state (in McKean and Berks Counties).  As part of this work, parents and school nurses fill out questionnaires that include questions about a number of environmental factors that may be contributing to children’s asthma. 

Like most other states, Pennsylvania has no accurate tracking systems, other than the one under development for asthma, for non-cancer chronic diseases in which environmental factors may play an important role. These diseases include learning disabilities in children, neurological problems in the elderly such as Alzheimer’s and Parkinson’s, very common diseases such as heart disease and diabetes, and rarer diseases such as lupus and sarcoidosis. Tracking systems for these and other diseases, if available, could potentially uncover relationships between exposures and disease, facilitate more timely and effective cluster investigations, allow better identification of disease trends, and provide more accurate information to communities about specific environmental health risks that they face.[337]

Finally, it should be mentioned that Pennsylvania is now developing a birth defects registry, but lags behind many other states in this process[338].

Real-time Outbreak Disease Surveillance (RODS) System

Developed by the RODS Lab, a collaboration between University of Pittsburgh and Carnegie Mellon University staff, RODS is a public health surveillance system that has collected de-identified clinical data from hospitals in Pennsylvania since 1999, and allows real-time monitoring for particular health complaints or syndromes (i.e., sets of symptoms). Currently collected data include age, time/date of visit, gender, home/work ZIP codes, and the patient’s chief complaint.[339] Although developed and expanded chiefly as an early warning system against bioterrorism, certain components of RODS might also be useful for environmental health applications. In the opinion of the project’s director, the real-time data currently collected would be of limited use, because the health outcomes of interest are often too rare or too delayed to support generalizations. In addition, the “chief complaint” may capture only part of the patient’s problem, and may or may not reflect their later diagnosis. However, the expertise that the lab has in designing algorithms to work around data limitations might be useful for environmental health applications. The data are currently available only to health department officials who have permission to access the system.[340]

Behavioral Risk Factor Surveillance System (BRFSS)

We focus on the BRFSS as an example of an important survey-based source of local health data. Since 1989, Pennsylvania has participated in the BRFSS, a cross-sectional random telephone survey of non-institutionalized adults conducted on a monthly basis by all state health departments, with financial and technical assistance from the CDC. For the survey, states utilize standard procedures wherein BRFSS interviewers ask questions from standardized questionnaires related to behaviors associated with preventable chronic diseases (including smoking, obesity, etc.), injuries, infectious diseases, clinical preventive practices, and health care access and use. For a given state in a given year, the BRFSS questionnaire comprises core questions, optional modules, and state-added questions. In 2002, for example, Pennsylvania used modules for arthritis, folic acid, heart attack and stroke, and tobacco indicators, and also added questions for injury, lead poisoning, oral health, osteoporosis, skin cancer, smoke detectors, and chlamydia awareness. States forward completed responses to the CDC, where they are aggregated into monthly data for each state and published online.[341] The Pennsylvania state reports for 1997-2002 are at (WEBSITE[342]). Annual BRFSS questionnaires dating back to 1991 are also available online.[343]

For Allegheny County in 2002 the Office of Health Survey Research of the Department of Behavioral & Community Health Sciences at University of Pittsburgh's Graduate School of Public Health, under contract from the Allegheny County Health Department, conducted an expanded behavioral health risk survey, modeled on the BRFSS, of 4,750 adult County residents. Interestingly, this questionnaire also asked about perceived risks from environmental hazards ranging from crime and violence to lead-based paint. Results are available on the ACHD website.[344]

Case Study #5: Is Autism Related to Industrial Mercury Releases?

Autism is a developmental disability that begins in childhood. People with autism typically exhibit repetitive behaviors and problems with certain social and communication skills. Both genetic and environmental factors are thought to play a causative role in autism, but the exact role of individual factors is still largely speculative.[345] For example, a few years ago, thimerosal, an ethylmercury-containing compound used as a preservative in vaccines since the 1930s, was suspected to be contributing to rising autism rates. Although subsequent evidence seems to refute the association,[346] the Public Health Service agencies and the American Academy of Pediatrics recommended in 1999 that, as a precaution, thimerosal no longer be used in vaccines.

There are indeed indications that autism has been on the rise in the U.S. and elsewhere. For example, according to Individuals with Disabilities Education Act (IDEA) data, new cases of autism among persons aged 6-22 increased from 15,880 in 1992 to 141,022 in 2003.[347] This increase in persons identified as eligible for services under IDEA, however, does not necessarily mean that autism is more common: it may simply reflect changing medical and legal standards for diagnosis. According to the CDC, the actual number of people with autism in the U.S. is not known, nor is it known whether there has been a true increase in recent years. Population-based prevalence studies are expensive and time-consuming. [348] [349] For example, in late 1997, in response to concerns expressed by a citizens’ group in Brick Township, the New Jersey Department of Health and Senior Services, along with the CDC and the ATSDR (Agency for Toxic Substances and Disease Registry), conducted a study to determine the true prevalence of autism in Brick Township and its relationship to environmental factors. This study actively sought out suspected cases of autism from special education records, records from local clinicians, lists from community parent groups, and volunteers, and then conducted an extensive clinical assessment of these suspected cases to verify or rule out the diagnosis.

A study published in 2005 [350] found a correlation between industrial mercury releases and autism counts in Texas school districts. The study linked TRI data on mercury releases from industrial facilities with administrative data from the Texas Education Agency (TEA) from school years 2000–2001 detailing autism counts from 1184 school districts in 254 Texas counties. This type of study is known as an ecological study because it looks at data on exposures and outcomes at the aggregate rather than the individual level. While ecological studies cannot prove that a certain factor causes a certain disease, they can provide valuable clues about links between environmental factors and health outcomes, and can also suggest areas where more detailed research is needed. And yet, in order for researchers conducting such studies to even begin to link health outcomes in a group of people to environmental exposures, they must first of all be able to map good data about the incidence and prevalence of those health outcomes in time and space. To be useful, these data should be reliable and complete, or at least representative. The fact is, however, that for many health outcomes such data have not yet been collected. In the case of autism and mercury, before attempting to answer questions about whether they are linked, we ought to first look at the quality of the data that we could possibly use to try to answer those questions. Moreover, if the data are found to be insufficient, we can try to improve their quality so that we can then be in a position to answer such questions.
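The aggregate-level logic of an ecological study can be sketched with a simple correlation across districts. This is only an illustration of the idea and not the method of the cited study. The district values below are hypothetical, and the choice of a Pearson correlation coefficient is an assumption made for simplicity.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical district-level aggregates (one value per school district).
# Units are arbitrary; no individual-level data are involved.
mercury_release = [0.0, 10.0, 20.0, 30.0, 40.0]
autism_rate_per_1000 = [1.0, 1.5, 1.4, 2.1, 2.5]

r = pearson_r(mercury_release, autism_rate_per_1000)
print(f"District-level correlation: r = {r:.2f}")
```

A positive coefficient here says only that districts with higher releases tend to report higher rates. It says nothing about whether the individual children counted were themselves exposed, which is why such a result is a clue for further research and not proof of causation.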

In an effort to improve autism tracking, CDC now funds ten ADDM (Autism and Developmental Disabilities Monitoring) Network projects in eleven states (Alabama, Arizona, Arkansas, Florida, Illinois, Missouri, New Jersey, South Carolina, Utah, West Virginia, and Wisconsin). In Pennsylvania, the University of Pennsylvania/The Children's Hospital of Philadelphia Center for Autism and Developmental Disabilities Epidemiology is the CDC CADDRE (Centers of Excellence for Autism and Developmental Disabilities Research and Epidemiology) program studying autism. This Center is now using multiple sources to obtain a more complete estimate of the number of 8-year-olds in Philadelphia County with autism. It is also part of the National CADDRE Study, a case-control study looking at possible environmental causes of autism.[351]
