4. Data management
Vegetation scientists, faced with the complexity of nature, often collect lots of data. The more data you have, the more important data management becomes. This section of the course suggests effective ways to record data in the field, organize your data in the laboratory, and keep data quality high.
You have the choice of several ways to record your data in the field. The simplest way, and the way I prefer, is to write your results in pencil on paper data forms. But first, consider some fancier alternatives.
Some vegetation scientists like to enter their results into a tape recorder. Using a tape recorder means you need not bother with pencils, paper, and clipboards in the field. Instead, the recorded results are transcribed into computer files back in the laboratory. Tape recordings are difficult to review and correct in the field. But tape recording is the best choice if your field conditions are so arduous that you need to use your hands to keep your balance, or you need to keep your eyes on what you are measuring.
More and more vegetation scientists are using portable computers, handheld computers, specialized data loggers, or even PDAs ("personal digital assistants") to record their data. The great advantage of these electronic devices is that the recorded data can later be transferred directly into files for statistical analysis, eliminating the chance for transcription errors. Well-designed data-entry programs also allow you to review your data, compare samples, and make corrections. The U.S. Forest Service, for example, has made good use of these devices.
Despite the advantages of computerized data entry, I continue to prefer manual entry of data onto field data forms. Data forms have the great advantage of being the easiest format in which to review your data, compare samples, and make corrections. Paper and pencil are also more reliable in the field than electronic devices. The main drawback of data forms is that they require an extra transcription step back in the laboratory, which adds time and the chance for errors. But transcription is much faster than data collection, and errors can be eliminated by careful proofreading.
There is another option: recording data in a bound field notebook. This classical approach to data collection, the way you were probably taught to record data in your chemistry lab courses, has the advantage of generating an archive of your research. But bound field notebooks work poorly in vegetation science, because the data sets we collect are usually too big for a notebook page. It remains a good idea, however, to maintain a field notebook for unstructured observations, comments, and ideas that occur to you during field work.
Data forms should contain all essential information, be easy to use in the field, be easy to transcribe from, and help minimize errors. The type of field data form I frequently use for collecting vegetation data contains a heading section, a data section, and a section for notes. To see a printable example of this type of data form, partially filled out, click here.
The header should clearly identify the project and the type of data collected, the names of the crew members (in case questions arise later), and the date. Make sure that somewhere you state what the units of measurement are!
The data section for vegetation data usually takes the form of a matrix of attributes within entities. For sampling communities by area, the entities are often quadrats and the attributes are often cover values of species. A typical organization would put species along rows and quadrats along columns. To conserve space, species names are usually abbreviated. In the example, the abbreviations consist of the first three letters of the genus and the first three letters of the specific epithet. For example, Agogra is the abbreviation for Agoseris grandiflora. (In case you are curious, the other species are Agrostis exarata, Beckmannia syzigachne, Camassia quamash, Carex densa, Carex scoparia, Deschampsia cespitosa, Grindelia integrifolia, Juncus tenuis, Plectritis congesta, Veronica arvensis, Eleocharis palustris, Boisduvalia densiflora, and Galium verum.) Putting species names in alphabetical order makes them easier to find in the field. One problem with this format, if you are making observations on many species, is that a single page will not hold all the species names. My preferred solution is to enter most of the abundant species in alphabetical order, but leave several blank rows so you can write in the name of the occasional other species you encounter.
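The abbreviation rule described above (first three letters of the genus plus first three letters of the second word of the name) is simple enough to automate when preparing a form. The following is a minimal sketch, not part of the original course materials; the function names `abbreviate` and `build_code_table` are hypothetical. It also checks for a risk the rule carries: two different species can end up with the same six-letter code.

```python
def abbreviate(scientific_name: str) -> str:
    """Build a six-letter code from the first three letters of the genus
    and the first three letters of the second word, e.g. 'Agogra'."""
    parts = scientific_name.split()
    if len(parts) < 2:
        raise ValueError(f"expected 'Genus species', got {scientific_name!r}")
    genus, second_word = parts[0], parts[1]
    return (genus[:3] + second_word[:3]).capitalize()


def build_code_table(names):
    """Map each code to its full name, refusing ambiguous codes."""
    table = {}
    for name in names:
        code = abbreviate(name)
        if code in table and table[code] != name:
            raise ValueError(f"{code} is ambiguous: {table[code]} / {name}")
        table[code] = name
    return table


# Three of the species named in the text above.
species = ["Carex densa", "Agoseris grandiflora", "Agrostis exarata"]
codes = build_code_table(species)

# Print in alphabetical order, as the rows would appear on the form.
for code in sorted(codes):
    print(code, "=", codes[code])
```

Keeping the code-to-name table alongside the data also gives you the list of full species names that the documentation should contain.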
Be sure to fill in each cell of the table. If the species was absent, enter something like a dash. (Entering a zero is technically correct, but can be confused with "present in very low abundance.")
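When the forms are later transcribed, the dash convention has to be turned back into numbers. One possible way to handle this, sketched here under the assumption that cells are typed in exactly as written on the form, is to treat a dash as zero and to treat an empty cell as an error, since an empty cell cannot be told apart from a quadrat that was never checked. The name `parse_cover_cell` and the choice of `-` as the absence mark are assumptions for illustration.

```python
ABSENT_MARK = "-"   # assumed symbol for "species absent" on the form


def parse_cover_cell(cell: str) -> float:
    """Convert one transcribed cell to a numeric cover value.

    A dash means the species was absent, so it becomes 0.0.
    An empty cell raises an error, because every cell on the
    form should have been filled in.
    """
    text = cell.strip()
    if text == "":
        raise ValueError("empty cell: was this species checked in this quadrat?")
    if text == ABSENT_MARK:
        return 0.0
    return float(text)


# Placeholder row for illustration; not values from the sample form.
row = ["5", "-", "0.5", "12"]
values = [parse_cover_cell(c) for c in row]
print(values)
```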
It is good practice to include the sample location on your data sheets. For example, include the coordinates of the quadrat or Bitterlich point if you are using the coordinate system.
The notes section is important as a space to clarify methods or to record unstructured observations related to the data being collected.
Your hard-collected field data are extremely valuable. One of the first things you should do upon returning from the field is photocopy your field data forms and put the copies in a safe place.
You will need to enter your field data into a computer file for analysis. Details of what format to use depend on the type of program: spreadsheet (like Quattro or Excel), statistical analysis software (like S-Plus or Statgraphics), or specialized packages (like PC-ORD or the Cornell Ecology Programs). For most applications, the matrix format I suggested for the field data form works well.
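As one illustration of the matrix format in a computer file, the sketch below stores species as rows and quadrats as columns in a comma-separated text file, which spreadsheets and most analysis programs can import. The file contents, the quadrat labels `Q1` to `Q3`, and the function name `read_matrix` are all assumptions made for the example; the numbers are placeholders, not data from the sample form.

```python
import csv
import io

# Placeholder data for illustration only.
EXAMPLE_CSV = """species,Q1,Q2,Q3
Agogra,5,0,10
Agrexa,0,20,0
Carden,15,0,5
"""


def read_matrix(handle):
    """Read a species-by-quadrat table into {species: {quadrat: value}}."""
    reader = csv.reader(handle)
    header = next(reader)
    quadrats = header[1:]
    matrix = {}
    for line_number, row in enumerate(reader, start=2):
        # A short or long row usually signals a transcription slip.
        if len(row) != len(header):
            raise ValueError(
                f"line {line_number}: expected {len(header)} fields, got {len(row)}"
            )
        matrix[row[0]] = {q: float(v) for q, v in zip(quadrats, row[1:])}
    return matrix


# io.StringIO stands in for an open file so the example is self-contained.
matrix = read_matrix(io.StringIO(EXAMPLE_CSV))
print(matrix["Agrexa"]["Q2"])
```

Checking that every row has the same number of fields is a cheap first test of a newly transcribed file, although it does not replace proofreading against the paper forms.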
You should accompany your computer data files with documentation that describes the structure of the data. For example, for the data collected on the sample field data form, the documentation should include the location and size of the study area, the location and size of the quadrats, the size of the data matrix, the full names of the species, and the units of measurement.
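One possible way to keep this documentation with the data is a small structured file saved next to each data file. The sketch below is only an illustration: the field names and the function `check_documentation` are hypothetical, and every value is a placeholder to be replaced with your own details. The check simply lists any of the items named above that have been left empty.

```python
import json

# The items the documentation should cover, as listed in the text above.
REQUIRED_FIELDS = [
    "study_area_location", "study_area_size",
    "quadrat_locations", "quadrat_size",
    "matrix_size", "species_names", "units",
]


def check_documentation(doc: dict) -> list:
    """Return the required fields that are missing or left empty."""
    return [name for name in REQUIRED_FIELDS if not doc.get(name)]


# All values below are placeholders, not real study details.
doc = {
    "study_area_location": "PLACEHOLDER: site name and coordinates",
    "study_area_size": "PLACEHOLDER: area with units",
    "quadrat_locations": "PLACEHOLDER: coordinate system and origin",
    "quadrat_size": "PLACEHOLDER: dimensions with units",
    "matrix_size": {"species_rows": 3, "quadrat_columns": 3},
    "species_names": {"Agogra": "Agoseris grandiflora"},
    "units": "",   # deliberately left blank to show what the check reports
}

print("missing:", check_documentation(doc))
print(json.dumps(doc, indent=2))
```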
Don't skimp on documentation. Even include things that seem obvious at the time. If for some reason you have to put the data aside for a few months (or years!), you will be thankful for any documentation you leave behind. And if someone else will be using your data, complete documentation is essential.
Transcription is a step where errors can creep into your data set. The best way to avoid transcription errors is to proofread your entered data against the original field data forms. The proofreading step, although sometimes tedious, is essential and should never be skipped. Proofreading is more efficient (and more fun) when done with a partner. Someone other than the person who entered the data should read from the field data forms during proofreading. Be sure to document that the proofreading was done by recording who proofread the data, when they did so, and when any errors they discovered were corrected.
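A proofreading record can be as simple as a dated note in the documentation folder. If you prefer to keep it in electronic form, the following sketch shows one way it might look. The class name `ProofreadingRecord`, its fields, and the names and dates in the usage lines are all hypothetical. The only rule it enforces is the one stated above: the person reading from the field forms should not be the person who entered the data.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ProofreadingRecord:
    data_file: str
    entered_by: str
    read_by: str            # person who read from the field data forms
    proofread_on: date
    corrections: list = field(default_factory=list)

    def __post_init__(self):
        same = self.read_by.strip().lower() == self.entered_by.strip().lower()
        if same:
            raise ValueError(
                "the reader should not be the person who entered the data"
            )

    def add_correction(self, description: str, corrected_on: date) -> None:
        """Log what was wrong and when it was fixed."""
        self.corrections.append((corrected_on.isoformat(), description))


# Placeholder names, file name, and dates for illustration only.
record = ProofreadingRecord(
    data_file="cover_data.csv",
    entered_by="Person A",
    read_by="Person B",
    proofread_on=date(2007, 6, 1),
)
record.add_correction("Agogra, Q2: 50 should be 5", date(2007, 6, 1))
print(record)
```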
Back up your data at all stages. This includes copying paper forms and backing up computer files.
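For computer files, a backup is only useful if the copy is intact. The sketch below shows one way to make a dated copy of a data file and confirm that it matches the original by comparing checksums. The function names `sha256_of` and `back_up`, the file name, and the folder layout are assumptions for illustration; the example runs in a temporary folder so it does not touch real data.

```python
import hashlib
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return a checksum that changes if any byte of the file changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir under a time-stamped name and verify it."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    shutil.copy2(source, target)   # copy2 also preserves the file's timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target


# Demonstration in a temporary folder with a placeholder data file.
with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "cover_data.csv"
    data_file.write_text("species,Q1\nAgogra,5\n")
    copy = back_up(data_file, Path(tmp) / "backups")
    print("backed up to", copy.name)
```

In practice the backup folder should be on a different disk or in a different building from the original, for the same reason that copies of the paper forms are kept in a separate location.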
Finally, designate a spot in your office or laboratory to store data documentation. Include the field data forms, field notes, back-up disks of computer files, and explanations of the data structure. Keep copies of the paper forms in a separate location.
© 2007 Mark V. Wilson and Oregon State University