
Preparing a Data Matrix

Each of you should become familiar with database manipulation and graphing applications. You may decide to use one application (like Microsoft Works) for database manipulation and another (like Cricket Graph or Delta Graph) for display. Or you may have access to one package that performs all of these functions. This exercise is designed around Works and Cricket Graph -- these are not necessarily the best, but they work reasonably well and are available on the departmental clusters. If you have any questions about other applications, please ask. Chances are they will be suitable.

For most of our work we will use DataDesk, which accepts Works files as imports.

We will use a stratigraphic data set assembled by William Fox as part of his dissertation studies; more about the Fox data set later on. The data are arranged in matrix form. Each row of the matrix contains information about a specific geographic location in the study area. The columns contain, for each sample locality, (1) the sample number, (2) the U coordinate, (3) the V coordinate, (4) sandstone thickness, (5) shale thickness, (6) carbonate thickness and (7) evaporite thickness. A map showing the spatial distribution of these samples is provided on the handout sheet. Keep the handout sheet and save the files that you prepare in this exercise, as you will use them later on.
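For readers who think more easily in code, the matrix layout described above can be sketched in Python. This is only an illustration: the numbers are hypothetical placeholders, not values from the Fox handout, and the helper name `column` is invented here.

```python
# Column (variable) names, in the order given in the text.
COLUMNS = ["Sample", "U", "V", "Sandstone", "Shale", "Carbonate", "Evaporite"]

# Hypothetical placeholder values, NOT taken from the Fox handout.
rows = [
    [1, 10.0, 5.0, 120.0, 80.0, 40.0, 0.0],
    [2, 20.0, 15.0, 60.0, 210.0, 30.0, 10.0],
    [3, 30.0, 25.0, 150.0, 150.0, 0.0, 20.0],
]

# Each row is one sample locality; each column is one variable.
matrix = [dict(zip(COLUMNS, row)) for row in rows]


def column(matrix, name):
    """Return every value of one variable (one column of the matrix)."""
    return [record[name] for record in matrix]


print(column(matrix, "Sandstone"))
```

Reading across a record gives everything known about one locality; reading down a column gives one variable at every locality.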

Open Works and select the database icon from the menu. Enter the names of the columns (variables) as specified above. If you use the Command-L operation you will have a form in which you can enter the data. Enter the values as given on the handout. It is worthwhile to check each value against the handout. When you are finished, save the file (Command-S). Note that there are several options for saving: you can save your file as a Works file or as an ASCII (Text) file. Save your file in both formats; you might use the names Fox.Wks and Fox.Text. It is good practice to keep a backup copy of your files. The Text file will be imported into Cricket and DataDesk.

Most Macintosh programs are similar, and once you have used one application it becomes easier to use another. Double click on one of the fields in the matrix. Note that you can change the format and the position of the value in the field. You should make sure that the thickness variables are in number format. In general, when you import a text file, all of the columns arrive in text format.
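The point about text format applies outside Works as well. The sketch below assumes a tab-delimited text file (an assumption about the export format, not something the handout specifies) and uses hypothetical values. It shows that every field is a string after import and must be converted before it can be treated as a number.

```python
import csv
import io

# Hypothetical tab-delimited text standing in for an exported text file.
text = "Sample\tSandstone\tShale\n1\t120\t80\n2\t60\t210\n"

# After import, every value is a string, including the thicknesses.
raw = list(csv.DictReader(io.StringIO(text), delimiter="\t"))

# Columns that should be treated as numbers.
NUMERIC = ["Sandstone", "Shale"]

records = []
for row in raw:
    converted = dict(row)
    for name in NUMERIC:
        converted[name] = float(row[name])  # text -> number
    records.append(converted)

print(type(raw[0]["Sandstone"]), type(records[0]["Sandstone"]))
```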

If you point at the top of the file (where the labels are located) you can drag the field boundaries. If you point at a column and hold down the mouse button, you can drag the field and arrange the columns any way you want.

Spend some time getting familiar with the application.

Under Data you can select a filter. Request all Sandstone values greater than 100. Remember that you have not eliminated the other rows of the matrix; Show All will display the entire matrix. Sort the data by ascending values of Sandstone thickness. Change the filter and obtain all samples with more than 100 feet of sandstone and less than 200 feet of shale. If you had a large data set and wanted to create a subset (a new file) of just those samples, you could use the Save As command: click on the box that says Save Selected Records Only and enter a suitable name. Do NOT use the name of the original file, as this would replace the whole data set with the subset.
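The same filter and sort operations can be sketched in Python. The sample values are hypothetical, not from the handout. As in Works, filtering produces a view of the matching rows and leaves the full data set intact.

```python
# Hypothetical thickness values in feet, NOT taken from the Fox handout.
samples = [
    {"Sample": 1, "Sandstone": 120.0, "Shale": 80.0},
    {"Sample": 2, "Sandstone": 60.0, "Shale": 210.0},
    {"Sample": 3, "Sandstone": 150.0, "Shale": 150.0},
    {"Sample": 4, "Sandstone": 105.0, "Shale": 250.0},
]

# Filter: all samples with Sandstone greater than 100.
thick_sand = [s for s in samples if s["Sandstone"] > 100]

# Sort by ascending Sandstone thickness (returns a new list).
by_sandstone = sorted(samples, key=lambda s: s["Sandstone"])

# Combined filter: more than 100 ft of sandstone AND less than 200 ft of shale.
subset = [s for s in samples if s["Sandstone"] > 100 and s["Shale"] < 200]

# The original list is unchanged, like choosing Show All.
print(len(samples), len(thick_sand), len(subset))
```

If the subset were written to disk, it would go under a new file name for the reason given above.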

You can create new fields which are derived from existing fields. Under Form select New Field and call the field Total Thickness. Double click on the new field and select Number, General, 2 decimal places and Formula. Enter Sandstone + Shale + Carbonate + Evaporite. Create another new field which contains the sandstone/shale ratio. Save this new file as Fox8.wks and save another version as Fox8.Text (making sure that you select the text format).
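The two derived fields amount to simple arithmetic on each row. The sketch below uses hypothetical values and an invented field name, "Sand/Shale". Returning None when the shale thickness is zero is a choice made here to avoid dividing by zero; the handout does not say how to treat that case.

```python
def add_derived_fields(record):
    """Return a copy of one row with the two derived fields added."""
    out = dict(record)
    total = (record["Sandstone"] + record["Shale"]
             + record["Carbonate"] + record["Evaporite"])
    out["Total Thickness"] = round(total, 2)  # 2 decimal places, as in the form
    shale = record["Shale"]
    # Assumption: leave the ratio undefined (None) where there is no shale.
    out["Sand/Shale"] = record["Sandstone"] / shale if shale != 0 else None
    return out


# Hypothetical row, NOT taken from the Fox handout.
row = {"Sandstone": 120.0, "Shale": 80.0, "Carbonate": 40.0, "Evaporite": 0.0}
derived = add_derived_fields(row)
print(derived["Total Thickness"], derived["Sand/Shale"])
```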

Look at the map. Note that the U values are given so that they increase downwards on the U (Y) axis. Typically, scatter diagrams are constructed so that both the X-axis and Y-axis variables increase from the origin -- up and to the right. Create a new variable to transform the U-coordinate so that when a scatter diagram is prepared it will display the samples (observations) in their correct spatial positions. Save this file as fox.text. You can throw away the remaining files.
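One possible transform, offered as a sketch and not as the required answer, is to subtract each U value from the largest U value, so that the locality drawn lowest on the map gets the smallest plotted Y value. The function name `flip_axis` and the numbers are hypothetical. Plotting the negative of U would reverse the direction just as well.

```python
def flip_axis(values):
    """Reverse the direction of an axis: the largest value maps to 0."""
    top = max(values)
    return [top - v for v in values]


# Hypothetical U coordinates that increase downwards on the map.
u = [0.0, 10.0, 30.0]
u_flipped = flip_axis(u)
print(u_flipped)
```

The spacing between samples is preserved; only the direction of increase changes.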

Open Cricket Graph. Select Open and click on the box "show all text files". Click on the icon for Fox6.Text. Proper column labels need to replace Column 1, Column 2, etc.; in Cricket, just click on each heading and type in the proper label. Use Command-D to delete the first row containing the labels. You will need to convert the data to a numeric format: Command-A will select all of the matrix and Command-F will let you change the format to decimal. The Graph heading lets you select the type of graph to produce. Plot Sandstone versus Shale (a scatter diagram). Plot the sample locations and confirm that you have the correct spatial positioning. Check out the options. You can change the symbol for the plotted points. You can do a least squares regression and get the equation of the best-fit straight line. If one point plots well away from the rest, do a best fit with and without that point; you will see a marked difference in the square of the correlation coefficient (r²). The closer r² is to 1, the stronger the linear fit. Outliers (and how to treat them) are an important consideration in analyzing all data sets.
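The effect of a single outlier on a least squares fit can be sketched in a few lines of Python. This uses the standard formulas for the slope, intercept and squared correlation coefficient of a straight-line fit; it is not a description of how Cricket Graph computes them. The data are hypothetical and chosen so that one point lies far from an otherwise perfect line.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: five points on a straight line plus one outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 6, 8, 10, 0]  # the last point is the outlier

fit_all = linear_fit(x, y)
fit_trimmed = linear_fit(x[:-1], y[:-1])  # same data without the outlier

print("with outlier:    r^2 =", fit_all[2])
print("without outlier: r^2 =", fit_trimmed[2])
```

Comparing the two fits shows why it is worth checking the scatter diagram by eye before trusting the regression.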

Click on the label for the Y-axis. Click on the values plotted on the X-axis. In general, you can change most of what you see on the screen. In Cricket you can save a graph or plot. Type your name on the plot of sandstone versus shale, perhaps in the title. Use the PICT format and send me your plot as proof of completion.
