Having focused so far in this report on the environmental data (source data, environmental monitoring data, and bio-monitoring) that serves in the present discussion to help determine exposure to human beings, we turn now to examine the available data related to health outcomes.
All health outcomes, including mental health outcomes and those diseases primarily genetic in origin, bear some relationship to environmental factors. These environmental factors may be both physical (including the built environment) and social. Environmental factors may, for example, be the primary cause of a particular disease (e.g., asthma, certain cancers), act along with non-environmental factors to cause a disease (e.g., heart disease, depression), influence the course of a disease (e.g., asthma), or affect access to medical care for a disease (e.g., diabetes, schizophrenia).
One central goal of environmental health is to understand and control the influence of environmental factors on the health of human beings. Yet we are exposed daily to a multitude of environmental factors, and these may affect our health, both now and in the future, in a multitude of ways. If we confine the discussion to chemical toxins in the environment, we find that, of the 80,000 man-made chemicals available today in the U.S., only a few hundred have been systematically tested for toxicity. Chemical toxins may affect every organ system in the body. An alphabetical table of diseases and a listing of chemicals suspected to directly contribute to their causation, along with an assessment of the strength of the scientific evidence for this causation, is given in Appendix I: Diseases and Environmental Toxins Suspected to Cause Them.
Environmental risks to communities from particular toxic chemicals are often extrapolated from occupational studies looking at the risks of such chemicals to workers. This is because (1) workplace exposures, usually higher than community exposures, often lead to detectable health effects in much smaller populations, and (2) in a controlled workplace situation it is often possible to do continuous or intermittent ambient and/or personal monitoring and thus to calculate both peak dosages and averages over time.
It is often very difficult to predict how likely it is that a given environmental factor will result in a particular health outcome for a particular person, or to know whether a given health outcome is due, in part or in full, to a given environmental exposure. Likewise, at a population level, it is typically difficult to map environmental exposure-health outcome relationships geographically, or to track trends in them over time. These difficulties are often due both to (1) the inherent complexity of exposure-outcome relationships themselves (e.g., variable disease expression, diseases caused by many factors, long periods of time separating exposures and diseases) and to (2) inadequacies in the available exposure and outcome data. The three main types of epidemiological studies used to examine exposure-outcome relationships are (1) case-control studies (comparing a group with a certain health outcome with another group without that outcome in terms of past environmental exposures), (2) cross-sectional studies (comparing groups at a single point in time in terms of both exposures and outcomes), and (3) cohort studies (following groups with different exposures over time to see which individuals develop disease). All three study types depend on clear definitions and accurate measurements of both exposures and outcomes. Good exposure and outcome data, when available, can often allow for meaningful answers to environmental health questions even in the face of the complexities of a particular exposure-outcome relationship. Conversely, if such data are not available, such answers are typically impossible to obtain.
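As a rough illustration of how these study designs turn exposure and outcome data into an answer, the sketch below computes two standard summary measures from a 2x2 table: the odds ratio (typically reported for case-control studies) and the relative risk (typically reported for cohort studies). Neither measure is named in this report, and the counts are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for a 2x2 exposure-outcome table (all values hypothetical)."""
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int


def odds_ratio(t: TwoByTwo) -> float:
    """Case-control measure: cross-product ratio of the 2x2 table."""
    return (t.exposed_cases * t.unexposed_noncases) / (
        t.exposed_noncases * t.unexposed_cases
    )


def relative_risk(t: TwoByTwo) -> float:
    """Cohort measure: risk of disease among the exposed vs. the unexposed."""
    risk_exposed = t.exposed_cases / (t.exposed_cases + t.exposed_noncases)
    risk_unexposed = t.unexposed_cases / (
        t.unexposed_cases + t.unexposed_noncases
    )
    return risk_exposed / risk_unexposed


# Hypothetical counts, not drawn from any real study.
table = TwoByTwo(exposed_cases=30, exposed_noncases=70,
                 unexposed_cases=10, unexposed_noncases=90)
print(f"odds ratio:    {odds_ratio(table):.2f}")
print(f"relative risk: {relative_risk(table):.2f}")
```

Both measures depend entirely on how the four cells are counted, which is why unclear exposure definitions or incomplete outcome data undermine all three study types.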
Although it might seem that it would be easier to obtain health information about a group of people living in a certain place than to obtain environmental information about that place, this is not always the case. Some environmental data, for example the levels of certain chemicals in the air, can be monitored mechanically, whether continuously or at periodic intervals. This is not possible for health outcomes, which must be reported or detected in order to be known. If people get sick with a certain disease but neither seek health care nor die, the disease will not appear on any information “radar screen”.
A general model of the steps in the pathway that health data must follow in order to be available as information, from the individual through a health care system to an information system, is this:
Risk of exposure or disease (e.g., based on residence, as in a census) → (Possible) biomarker of exposure → (Possible) marker of sub-clinical disease → Clinical disease (or death) → Contact with health facility → Accurate diagnosis → Adequate record-keeping → Reporting of health facility to a database → Availability of the database → Analysis of and reporting from the database.
In addition to this pathway through the health system, wherein health outcomes are passively detected depending on whether individuals seek treatment, it is also possible to conduct surveys, screenings, or studies that actively detect risk, exposure, sub-clinical disease, or clinical disease. These surveys, screenings, and studies are sometimes the only way that the incidence and prevalence of an exposure or health outcome can be known in a population. They are especially important to understanding disease in medically underserved populations, such as low-income and minority groups, whose disease profiles may be underrepresented in data from health facilities. Yet it should be borne in mind that surveys, for example, are expensive in terms of the required time (e.g., 9-12 months for planning, collecting, processing, and analyzing a personal interview type survey, 4-6 months for a telephone interview survey), staff (for survey design and data collection, management, and analysis), and money involved (e.g., $100 per personal interview and $70 per telephone interview for the National Health Interview Survey several years ago). Many organizations that may be interested in investigating environmental health issues simply do not have these resources available.
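To make the scale of these costs concrete, the sketch below multiplies the per-interview figures quoted above by a sample size. The sample size is hypothetical, and the calculation covers interviewing costs only; it does not attempt to price the planning, staff, and analysis time also mentioned above.

```python
# Per-interview costs quoted in the text for the National Health Interview
# Survey "several years ago"; treat them as rough, dated figures.
COST_PER_INTERVIEW = {"personal": 100, "telephone": 70}


def interview_cost(mode: str, completed_interviews: int) -> int:
    """Return the interviewing cost in dollars for a survey of the given size."""
    if completed_interviews < 0:
        raise ValueError("completed_interviews must be non-negative")
    return COST_PER_INTERVIEW[mode] * completed_interviews


# Hypothetical sample size, chosen only for illustration.
sample_size = 1_000
for mode in COST_PER_INTERVIEW:
    print(f"{mode:>9}: ${interview_cost(mode, sample_size):,}")
```

Even at this modest hypothetical size, interviewing alone runs to tens of thousands of dollars, which is consistent with the point that many interested organizations cannot fund such surveys themselves.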
It should also be mentioned here that personal health data, unlike environmental data, often involve issues of privacy and confidentiality. Sharing of health-related data by health facilities requires adherence to such laws as HIPAA, the Health Insurance Portability and Accountability Act, whereas collection and use of health data from schools requires adherence to the Family Educational Rights and Privacy Act (FERPA). In addition, gathering health information about an individual may require that individual’s signed informed consent and, if done for research purposes, pre-approval by the Institutional Review Board (IRB) of the responsible university or organization.
Hospital records (e.g., discharge data)
Emergency department records
Ambulatory care records
School nurse records
Health insurance company records
Medication sales records (including over-the-counter medications)
Disease tracking systems and registries
This section will review the major health data available to public health departments, researchers, and (sometimes) the public at large. Locally, the Allegheny County Health Department (ACHD) gathers and maintains its own public health information databases, subsequently reporting these data to the state Department of Health (PaDOH), while data from surrounding counties are gathered and maintained directly by PaDOH. It is possible for the public to readily query a few of these databases online. EpiQMS (Epidemiology Query and Mapping System), for example, is an interactive health statistics website developed by the Pennsylvania Department of Health in collaboration with the Washington State Department of Health. The system uses state, regional, and county population, birth, death, and cancer datasets and SVG (scalable vector graphics) technology. Users of the system “can produce numbers, rates, graphs, charts, maps, and county profiles using various demographic variables (age, sex, race, etc.)”. 
Reports from public health departments themselves often make use of multiple data sources. The Pennsylvania Healthy People 2010 Report, for example, is based on analyses of data from the United States Bureau of the Census, the National Immunization Program (NIP), the CDC’s National Center for Health Statistics, the PA Health Care Cost Containment Council (PHC4), the ChildLine and Abuse Registry of the PA Department of Public Welfare, the Behavioral Risk Factor Surveillance System (BRFSS), and the PA Department of Health’s own Bureau of Health Statistics and Research, Bureau of Epidemiology, and Division of School Health.
PaDOH also makes a very useful “Electronic Guide to Health Statistics” available online. This website gives direct links to internet health data report sources related to many health and disease subjects for Pennsylvania and its Counties and Communities. Information is also supplied on the content and geographical detail available for these sources. A few of these data sources are examined below. They include:
· Birth and death records and infant mortality
· Hospital discharge data
· Pennsylvania's National Electronic Disease Surveillance System (PA-NEDSS)
· Cancer Registry Data
· Chronic Disease Tracking in Pennsylvania
· Real-time Outbreak Disease Surveillance (RODS) System
· Behavioral Risk Factor Surveillance System (BRFSS)
The county and state health departments maintain both birth and death records, including an infant mortality database with death records matched to birth certificates. Since 2002, both primary and underlying causes of death have been listed by ICD-10 code. These data, when de-identified, are coded down to the census tract level and could thus be spatially and temporally correlated, for example, with census data or environmental data. There are a few practical applications of these data, but the coarseness of the geographical unit and the lack of information about potential confounders (e.g., smoking) and about length of time at place of residence are some of the major limitations on their usefulness for environmental health studies.
The Pennsylvania Health Care Cost Containment Council (PHC4) is a state agency that collects data quarterly on every inpatient discharge from all hospitals (including general acute care, psychiatric, rehabilitation, and long-term acute care hospitals) in the state of Pennsylvania, as well as data on procedures such as surgeries, endoscopies, chemotherapies, and certain cardiovascular procedures from ambulatory surgical centers. PHC4 checks the data for inaccuracies and provides feedback to facilities, but data accuracy is of course ultimately reliant on the facilities themselves. PHC4 users include hospitals, state government agencies, university researchers, commercial vendors, and other non-commercial users. Data fields in the datasets include clinical information such as Diagnosis Related Groups (DRGs), Major Diagnostic Categories (MDCs), diagnosis and procedure codes, and utilization data. De-identified datasets are available from PHC4 on a fee schedule. Standard data sets include data for a single quarter for each hospital and ambulatory surgical facility in the state, each of the 9 state regions, and the state as a whole. Customized data sets are available as well (e.g., for all cases of breast cancer in Allegheny County), either in customized reports (in Excel format) or as data sets for further analysis.
There are a few major limitations to the PHC4 data. Spatial information is limited to the ZIP code of the hospital or residential address, which makes it impossible to tease out individuals within a very small radius of a pollution source. In addition, in a rural area a person with a post office box in one ZIP code may live in another. Besides these limitations on spatial accuracy, patients who visit a hospital multiple times for the same condition add error to incidence and prevalence calculations.
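The repeat-visit problem can be sketched as follows. The records, ZIP labels, and the `patient_key` field are all hypothetical; in particular, whether a de-identified PHC4 dataset carries any key that allows visits by the same patient to be linked is an assumption made here only for illustration.

```python
from collections import Counter

# Hypothetical de-identified discharge records: (patient_key, zip_label, diagnosis).
# "patient_key" is an assumed pseudonymous identifier, not a documented PHC4 field.
discharges = [
    ("p1", "ZIP-A", "asthma"),
    ("p1", "ZIP-A", "asthma"),  # same patient, second admission
    ("p2", "ZIP-A", "asthma"),
    ("p3", "ZIP-B", "asthma"),
    ("p3", "ZIP-B", "asthma"),  # same patient, second admission
    ("p3", "ZIP-B", "asthma"),  # same patient, third admission
]


def discharges_by_zip(records):
    """Count every discharge, so repeat admissions are counted repeatedly."""
    return Counter(zip_label for _, zip_label, _ in records)


def patients_by_zip(records):
    """Count each patient once per ZIP label, regardless of admissions."""
    unique_pairs = {(patient, zip_label) for patient, zip_label, _ in records}
    return Counter(zip_label for _, zip_label in unique_pairs)


print("discharges:", dict(discharges_by_zip(discharges)))
print("patients:  ", dict(patients_by_zip(discharges)))
```

In this toy data both ZIP labels show three discharges, yet one has two affected patients and the other only one. Without a way to link visits, discharge counts alone would overstate how many people are affected.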
Certain diseases, especially infectious diseases (e.g., HIV/AIDS, STDs, vaccine-preventable diseases, tuberculosis, rabies), are reportable. This means that it is mandatory for all health personnel encountering patients with these diseases to report cases to a health department. To track these diseases, Pennsylvania participates in the National Electronic Disease Surveillance System (NEDSS), wherein diagnoses and case histories from all laboratories in the state feed into a common database via electronic laboratory reports (ELR). NEDSS is an evolving initiative whose mission “is to design and implement seamless surveillance and information systems that take advantage of the best information and surveillance technology”. NEDSS is intended to be a system for the continuous automatic capture and analysis of electronic data, and to assist with monitoring disease trends, informing policy, identifying research needs, and guiding prevention and intervention programs at the local, state, and national levels.
In PA-NEDSS, along with its infectious disease surveillance functions, blood lead levels are also reported and analyzed in the Lead Program. The system allows for an ongoing electronic record of case management and investigations management and has analysis and reporting (A&R) functionality that allows public health staff to easily create reports, charts, and graphs containing disease data that is updated on a daily basis. Although the Pennsylvania Lead Program is still limited by the uneven coverage and frequency of screening exams, PA-NEDSS allows for excellent capture and reporting of those exams that do occur.
Sometimes a larger-than-expected number of a certain adverse health outcome (such as cancer or birth defects) occurs in a group of people living in the same community or employed at a common workplace. This is known as a disease cluster. If the group of people in which the cluster occurs is thought to have had a common exposure, a cluster investigation may take place, wherein environmental exposures in the population are examined retrospectively (backwards in time). In 1997, for example,
Registry data are very helpful in elucidating relationships between environmental factors and health outcomes. Cancer registries are the prototypical health registry, and in many parts of the country cancer is still the only chronic disease health outcome that has a registry available for examining its relationship with environmental factors. Cancer registries gather data from hospitals, outpatient facilities, and laboratories into a single repository at the state level. This data, for a given case, will include such information as cancer type, stage at diagnosis, and treatment received. When combined with patient demographic information, cancer incidence and prevalence rates can be mapped geographically, measured over time as trends, and estimated for various sub-populations, such as ethnic groups and age groups.
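As a minimal sketch of the kind of calculation registry data make possible, the code below computes a crude incidence rate per 100,000 population for two areas. The area names, case counts, and populations are hypothetical, and the rate is deliberately unadjusted; published registry rates are generally age-adjusted, which this sketch does not attempt.

```python
def crude_rate_per_100k(new_cases: int, population: int) -> float:
    """Crude incidence rate: new cases per 100,000 population at risk."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases * 100_000 / population


# Hypothetical one-year case counts and populations, for illustration only.
areas = {
    "Area A": (120, 250_000),
    "Area B": (45, 60_000),
}

for name, (cases, population) in areas.items():
    rate = crude_rate_per_100k(cases, population)
    print(f"{name}: {rate:.1f} per 100,000")
```

Area B has fewer cases but a higher rate, which is why counts must be combined with demographic denominators before areas, time periods, or sub-populations can be compared.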
Since 1994, in an effort to support the development of uniform high-quality cancer registry data at a national level, the United States CDC (Centers for Disease Control and Prevention) has administered the National Program of Cancer Registries (NPCR). This program supports improvements in the quality and use of state cancer registry data through (1) financial assistance, (2) technical assistance such as the development of data transmission software for health facilities, (3) training and information sharing meetings for state registry personnel, (4) assistance with data analysis, reporting, and research. Although not yet integrating data from all states, a Nationwide Cancer Surveillance System has been instituted by NPCR since 2001. This system will have greater statistical power than individual state registries and so potentially be able to uncover more relationships between environmental factors and cancer outcomes, including less common exposures, rarer cancers, and cancers in population subgroups such as ethnic minorities.
In 1997, NAACCR (North American Association of Central Cancer Registries), in cooperation with the CDC, began reviewing the completeness, accuracy, and timeliness of state cancer registry data. In 2003, 35 states achieved Certification Awards from NAACCR (based on their 2001 data).
The Allegheny County Health Department wishes to link the Cancer Registry to county birth and death data, as this connection doesn’t currently exist.
As is true of most other states, Pennsylvania has no accurate tracking systems, other than for asthma, for non-cancer chronic diseases in which environmental factors may play an important role. These diseases include learning disabilities in children, neurological problems in the elderly such as Alzheimer’s and Parkinson’s, very common diseases such as heart disease and diabetes, and rarer diseases such as lupus and sarcoidosis. Tracking systems for these and other diseases, if available, could potentially uncover relationships between exposures and disease, facilitate more timely and effective cluster investigations, allow better identification of disease trends, and provide more accurate information to communities about specific environmental health risks that they face.
Finally, it should be mentioned that
Developed by the RODS Lab, a collaboration between University of Pittsburgh and Carnegie Mellon University staff, RODS is a public health surveillance system that has collected de-identified clinical data from hospitals in Pennsylvania since 1999, and allows real-time monitoring for particular health complaints or syndromes (i.e., sets of symptoms). Currently collected data include age, time/date of visit, gender, home/work ZIP codes, and the patient’s chief complaint. Although developed and expanded chiefly as an early warning system against bioterrorism, certain components of RODS might also be useful for environmental health applications. In the opinion of the project’s director, the real-time data currently collected would be of limited use, because outcomes may be too rare or too delayed to support generalizations. In addition, the “chief complaint” may capture only part of the patient’s problem, and may or may not reflect their later diagnosis. However, the expertise that the lab has in designing algorithms to work around data limitations might be useful for environmental health. The data are currently available only to health department officials who have permission to access the system.
We focus on the BRFSS as an example of an important survey-based source of local health data. Since 1989, Pennsylvania has participated in the BRFSS, a cross-sectional random telephone survey of non-institutionalized adults conducted on a monthly basis by all state health departments, with financial and technical assistance from the CDC. For the survey, states utilize standard procedures wherein BRFSS interviewers ask questions from standardized questionnaires related to behaviors associated with preventable chronic diseases (including smoking, obesity, etc.), injuries, infectious diseases, clinical preventive practices, and health care access and use.
For a given state in a given year, the BRFSS questionnaire consists of core questions, optional modules, and state-added questions. In 2002, for example, Pennsylvania used modules for arthritis, folic acid, heart attack and stroke, and tobacco indicators, and also added questions for injury, lead poisoning, oral health, osteoporosis, skin cancer, smoke detectors, and chlamydia awareness. States forward completed responses to the CDC, where they are aggregated into monthly data for each state and published
For Allegheny County in 2002 the Office of Health Survey Research of the Department of Behavioral & Community Health Sciences at University of Pittsburgh's Graduate School of Public Health, under contract from the Allegheny County Health Department, conducted an expanded behavioral health risk survey, modeled on the BRFSS, of 4,750 adult County residents. Interestingly, this questionnaire also asked about perceived risks from environmental hazards ranging from crime and violence to lead-based paint. Results are available on the ACHD website.