This page addresses some of the common data management questions that come up when working on metadata.

Multiple datasets containing the same information

A very common issue you may face is what to do with multiple versions of the same dataset. Do you need to create metadata for every one? Not necessarily. Determine which version is the best - the one you have the most information about, the one that is the most accurate, the one that is the most up-to-date - and create metadata for that dataset. Of course, if you've gone through the instructional pages on this website, you know that you can use that metadata document as a template to quickly create metadata for all of the other versions of the dataset. Just remember to identify the differences between the versions in the metadata so that you can find that "best" dataset when you need it.

Not having the "best" dataset

But what if you don't have a "best" dataset? What if the dataset you have is just "okay"? You may be wondering if you should keep this dataset or remove it from your database. If this is your only version, keeping it may be better than throwing it away. If the dataset is the only one you have, but the information is wrong, is keeping it worthwhile? Can you do anything to make the data better? Is there any information available that will make the data more usable? It's a good idea to do some research before you delete any data.

Legacy datasets

A significant data management issue for many parks is legacy data -- that backlog of existing datasets without metadata. Is it worthwhile, or required, to create metadata for all of these datasets? If any of these data will be posted to the Data Store or sent to other GIS users, they do need metadata.

The best way to deal with legacy data is to spend some time prioritizing your data. Which datasets would be most suited to the Data Store? Which ones do you use most often? Which ones do others request from you most often? Document these in-demand datasets first and address the others as you have time. Perhaps a strategy of creating metadata for one legacy dataset a week would help you get over this organizational hurdle.
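Before prioritizing, it can help to see how large the backlog is. The following is a minimal sketch, not an official tool: it assumes your legacy data are shapefiles and that a shapefile's metadata sits beside it in a file named `<name>.shp.xml`. The function name `shapefiles_missing_metadata` and the example folder path are hypothetical.

```python
from pathlib import Path


def shapefiles_missing_metadata(root):
    """Return the shapefiles under `root` that have no metadata sidecar.

    Assumption: metadata for 'roads.shp' is stored as 'roads.shp.xml'
    in the same folder.
    """
    missing = []
    for shp in sorted(Path(root).rglob("*.shp")):
        sidecar = shp.with_name(shp.name + ".xml")
        if not sidecar.exists():
            missing.append(shp)
    return missing


if __name__ == "__main__":
    # Hypothetical folder; replace with the root of your own GIS data.
    for path in shapefiles_missing_metadata("gis_data"):
        print(path)
```

The resulting list is only a starting point. You still need to decide which of the undocumented datasets are used or requested most often.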

Organizing your datasets

What is the best way to organize all of the datasets you have? The answer depends largely on your personal preferences and your particular situation, but here are some basic recommendations.

  1. Document your data and keep your metadata embedded in the datasets. ArcCatalog automatically embeds metadata within a shapefile, geodatabase feature class, or other dataset type. When you move data around, be sure to include the .xml file. If you use ArcCatalog to move files, you won't have to worry about losing metadata. Be very careful when moving GIS data with Windows Explorer, where it is easy to miss one or more of the files associated with a GIS dataset.

  2. Make a copy of your dataset before you make edits. If you only edit the working copy, you have the luxury of reverting to an older version if you make a mistake. As you work, make notes so you can fill in the Process Steps in your metadata; alternatively, fill in the Process Steps as you go so you won't have to remember what you did later.

  3. Organize your data logically. Many parks organize their data by theme or by project. Keep in mind that another GIS Specialist may someday take over at your park, and your data will be vitally important to them as well. So create an organizational structure that you could easily explain to someone else. This will also help you if you need to share a segment of your database with another park, office, or NPS cooperator. The ISO or NPS Theme Category keywords may provide a good foundation for a new database folder structure.

  4. Establish a strategy for backing up your database on a regular basis. Keep your backup copies in a location that is physically separate from the working copy. A backup on the same machine will not help you if the entire machine crashes. An external hard drive, a server, or tape media are some of your options, depending on the size of your database and the hardware infrastructure available at your park.
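Recommendations 1 and 2 both come down to copying a dataset without leaving any of its files behind. If you ever need to do this outside ArcCatalog, one possible approach is sketched below. It assumes a shapefile-style dataset whose component files all share one base name (for example `roads.shp`, `roads.dbf`, `roads.shp.xml`). The function name `copy_dataset` and the paths in the usage example are hypothetical.

```python
import shutil
from pathlib import Path


def copy_dataset(shp_path, dest_dir):
    """Copy a shapefile and every file sharing its base name into dest_dir.

    Assumption: all parts of the dataset sit in one folder and are named
    '<base>.<extension>', including the '<base>.shp.xml' metadata file.
    Any other file whose name starts with '<base>.' is copied as well.
    """
    shp = Path(shp_path)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    prefix = shp.stem + "."  # 'roads.shp' gives 'roads.'
    copied = []
    for part in sorted(shp.parent.iterdir()):
        if part.is_file() and part.name.startswith(prefix):
            # copy2 also preserves file timestamps where possible.
            copied.append(Path(shutil.copy2(part, dest)))
    return copied


if __name__ == "__main__":
    # Hypothetical paths; adjust to your own folder structure.
    for path in copy_dataset("gis_data/roads.shp", "backup/roads_before_edit"):
        print(path.name)
```

Geodatabases and other dataset types are stored differently, so this sketch does not apply to them.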