
Appendix A: The preparation of the genealogy for analysis

The raw data was in the form of a stack of sheets of A4 paper, each one (or in some cases more than one sheet) representing a male member of the population. An example is shown in Table A.1.

An advantage of this representation of the information was that it incorporated a degree of redundancy: the same information often appeared in several places. A man first appeared as a child on his father's page, which also showed who his mother was, his place in the birth order, his date of birth and so on. Then he appeared on a page of his own, which again gave his parents, clan affiliation, date of birth etc., along with information on his marriages and children. He may also have appeared on other pages: as a father, as the man who passed on a wife, and so on.

A woman never appeared on a page of her own. She first appeared as a daughter on her father's page, usually along with information on her first marriage. She then appeared on her husbands' pages as wife, where her children were also found.

Pages were numbered and grouped by clan. No cross-referencing of the multiple appearances of people was built in: this would have been useful. While the degree of redundancy proved to be a great advantage, it did mean that a lot of work was involved in 'debugging' the data — making sure that information appearing in several places was consistent. This was a particular problem for women, where it was not possible to rely on their information appearing in 'master' form on one page.

Table A.1: The original form of the genealogy

[Image of an original genealogy sheet; its contents are not recoverable as text.]

There were many instances of different people with the same name, sometimes distinguished as John 1, John 2, etc., though not always consistently. Some people had more than one name; the names were separated by a slash, and such people were sometimes referred to by different parts of the name in different places. The orthography of names was also inconsistent.

Ensuring that all the information on a person was consistent and that names uniquely identified the individuals in the population was a protracted process, and one which could be automated only to a limited degree. Some of the inconsistency had already been resolved by Ian Keen in his work on the genealogy. In addition to the sheet form of the genealogy I referred to a copy of Keen's large tree-form chart of the genealogy, and a set of index cards for the individuals.

The procedure for translating the information from the sheets into a form in which computer analysis could be conducted on it was as follows.

First of all, a simple encoding of the information on the sheets was carried out. The basic information in Table 2.1 would be converted to the form of Table 2.2 for entry in the computer. This data entry format was designed to minimise the duplication of information while retaining the contents of the pages in unambiguous form. It was also designed to avoid the repetitive entry of information which stayed the same for a group of people. For example, the clan of a group of people was entered on a line beginning with '%', and this was then taken by the computer to mean that all subsequent male page entries were of this clan, and hence that their children were also of this clan (unless they were marked as children who came with a wife from her previous husband). Any annotation of a commentary nature contained on the sheets which might be important (and at this stage it was really not known what might turn out to be important) was entered on the line preceded by '\'.
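The way a '%' line carries its clan forward to the pages that follow can be sketched in C. This is a minimal illustration under stated assumptions, not the program actually used for data entry: the names `struct page` and `assign_clans` and the buffer sizes are hypothetical, and only the '%' and '#' codes are handled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CLAN_LEN 8
#define NAME_LEN 32

/* One male page entry together with the clan it inherited (hypothetical layout). */
struct page {
    int number;
    char name[NAME_LEN];
    char clan[CLAN_LEN];
};

/*
 * Scan data-entry lines. A line starting with '%' sets the current clan;
 * a line starting with '#' opens a new page, which takes the current clan.
 * Returns the number of pages written to out (at most max_pages).
 */
int assign_clans(const char *lines[], int nlines, struct page out[], int max_pages)
{
    char current[CLAN_LEN] = "";
    int i, npages = 0;

    assert(lines != NULL && out != NULL);
    for (i = 0; i < nlines; i++) {
        const char *ln = lines[i];
        if (ln[0] == '%') {
            /* the clan abbreviation is the first word after '%' */
            sscanf(ln + 1, "%7s", current);
        } else if (ln[0] == '#' && npages < max_pages) {
            struct page *p = &out[npages];
            if (sscanf(ln + 1, "%d %31s", &p->number, p->name) == 2) {
                strcpy(p->clan, current);
                npages++;
            }
        }
    }
    return npages;
}
```

Every page opened after a '%' line receives that clan until the next '%' line replaces it, which is what removes the need to retype the clan on each page.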

Table A.2: The form of the genealogy as entered into the computer

#43 buranday .14 f djapul m djapurru
w wurrapali .22 > rarrtji f ? c gup
s djatjamirrilil .35
d yililpawuy .37 = rraywala 2 c gup
s burritnja 2/wayilil .41
s milika .46
w marrtjala .15 < gulundharr < yalanjgurrurr < yipiti
s dhayininyawuy .29 < gulundharr = milgiyawuy 2
\dhaminu,langgurrk ch
d djarrakurramawuy .35 < gulundharr = manjindirri * > gawulay
d yanggana .37 < dhawurrpurr = mamulanhawuy c birr
d wayngali .40 < dhawurrpurr = djembanju
s gungayala .46 < yalanjgurrurr \not married
w gapany 1 \not living with him
f garrawarrpa/melmarimirri m yungbinil
s yambal 2/larri .56
w gorrgay
s milmuru .51
d gakararr .56 = marangguy > milinginbilil
s buwanba .59
s waypunba .64 < marrmarrpa
d guruminnga .65
w djukulul .42
d burrana .59
d borruwa .63
d warrayi .65
d rrarraypum .69
d barambitj .71 < muduk
w biyay'ngu 2 .54
s wanygaypum

Where it was known that a person existed, but no name was given, this was entered as '?'; where a description rather than a name was given, this was entered in square brackets, for example '[mirrinjgars F]', for later clarification. The specification of the data entry format is given in Table A.3.

Table A.3: Specification of data entry format

The general format is one of a code (one character) which modifies and gives meaning to that which follows it, and which usually refers to the person last named. The syntax is order-based.

code   followed by                                refers to

%      3-4 character abbreviation of clan name    clan name to be used for following people. Same line can contain other information (full clan name, moiety etc.)
#      number, then a name                        new page: gives the page number, and then the man's name
.      number                                     date of birth of last named person
*      number or blank                            date of death of last named person, or, if date not known, just notification that the person is dead
f      name                                       father of last named person
m      name                                       mother of last named person
w      name                                       wife of current man (i.e. man whose page this is)
c      clan name                                  clan of last named person (this, as well as place name, helps to identify wives, WFs etc. who appear elsewhere in the genealogy)
p      place name                                 place name of last named person
>      name                                       used for a woman to mean 'went to' after husband's death
<      name                                       for a wife: 'came from' (previous husband)
                                                  for a child: 'genitor was' (sometimes children appear on the pages of their M's later husbands, especially where she was inherited while they were small children)
s      name                                       son of man whose page this is and last named wife
d      name                                       daughter of man whose page this is and last named wife
\      textual comment                            rest of this line is a comment
=      name                                       following daughter's name, indicates who she married
e      name or ?
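The order-based syntax of Table A.3 can be illustrated with a small C routine that splits one data-entry line into (code, value) pairs. This is a sketch under stated assumptions rather than a reconstruction of the original SNOBOL programs: the names `struct field` and `parse_line` are hypothetical, the codes f, m, w, c, p, s, d, <, >, =, * and e are assumed to stand alone as one-character words, and '.', '#' and '%' are assumed to be attached directly to what follows them. A one-letter personal name would be misread as a code.

```c
#include <assert.h>
#include <string.h>

#define VALUE_LEN 64

struct field {
    char code;              /* one-character code from Table A.3 */
    char value[VALUE_LEN];  /* text that follows the code, possibly empty */
};

/* Codes that appear as a one-character word in front of their value. */
static int is_word_code(const char *tok)
{
    return tok[0] != '\0' && tok[1] == '\0' && strchr("fmwcpsd<>=*e", tok[0]) != NULL;
}

/* Append a word to a field value, separated by a single space. */
static void append_word(struct field *f, const char *word)
{
    size_t used = strlen(f->value);
    if (used > 0 && used + 1 < VALUE_LEN) {
        f->value[used++] = ' ';
        f->value[used] = '\0';
    }
    strncat(f->value, word, VALUE_LEN - 1 - used);
}

/* Split one line into fields; returns the number of fields stored in out. */
int parse_line(const char *line, struct field out[], int max_fields)
{
    char buf[256];
    char *tok;
    int n = 0;

    assert(line != NULL && out != NULL);
    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    for (tok = strtok(buf, " \t\n"); tok != NULL; tok = strtok(NULL, " \t\n")) {
        int starts_field = tok[0] == '\\' || strchr(".#%", tok[0]) != NULL
                           || is_word_code(tok);
        if (starts_field) {
            if (n == max_fields) break;
            out[n].code = tok[0];
            out[n].value[0] = '\0';
            n++;
            if (tok[0] == '\\') {
                /* comment: take the rest of the original line verbatim */
                strncat(out[n - 1].value, line + (tok - buf) + 1, VALUE_LEN - 1);
                break;
            }
            if (!is_word_code(tok)) append_word(&out[n - 1], tok + 1);
        } else if (n > 0) {
            /* continuation of the current value, e.g. the "2" in "milgiyawuy 2" */
            append_word(&out[n - 1], tok);
        }
    }
    return n;
}
```

Because each code usually refers to the person last named, a caller would walk the returned fields in order, keeping track of the most recent page, wife and child as it goes.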

Cleaning up of the data involved several phases:

1. Extraction of all the person names and the reference line numbers in which they appear. The names were then collated, and the list of reference line numbers for each name showed all the places that this name had appeared.

2. The information on a person/name was ’disassembled’ and compiled, so that the consistency of the information could be checked.

At this point all the people whose names were not known were given names generated as xxaa, xxab, and so on (there were more than 300 of these). In this way it was possible to sometimes put together information in separate places on these people and to use them as links between other people, for example between siblings whose mother's name was not known. These people were not used in the analysis as

3. For everyone whose mother's and father's names were both specified, it was verified that the mother was in the list of the father's spouses and vice versa. It was also checked that spouses were of opposite sex.

4. Finally a check was conducted, mainly to verify clan, that the clans of spouses were of opposite moiety.
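Two of the steps above lend themselves to a short illustration in C: the generation of placeholder names for unnamed people, and the phase 3 consistency checks. The sketch below is hedged throughout. The record layout, the function names and the use of 0 for "not specified" are assumptions made for the example, and the continuation of the placeholder sequence after xxaz (to xxba) is a guess, since the text gives only "xxaa, xxab, and so on".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SPOUSES 16

struct person_rec {
    char sex;                  /* 'm', 'f' or '?' */
    int father, mother;        /* IDs into the people array; 0 = not specified */
    int nspouses;
    int spouses[MAX_SPOUSES];  /* IDs of spouses, unordered */
};

/* Write the n-th placeholder name (0 -> "xxaa", 1 -> "xxab", ...) into buf. */
void placeholder_name(int n, char buf[5])
{
    assert(n >= 0 && n < 26 * 26);
    buf[0] = 'x';
    buf[1] = 'x';
    buf[2] = (char)('a' + n / 26);
    buf[3] = (char)('a' + n % 26);
    buf[4] = '\0';
}

static int has_spouse(const struct person_rec *p, int id)
{
    int i;
    for (i = 0; i < p->nspouses; i++)
        if (p->spouses[i] == id) return 1;
    return 0;
}

/* Phase 3a: if both parents are specified, each must list the other as a spouse. */
int parents_consistent(const struct person_rec people[], int id)
{
    const struct person_rec *p = &people[id];
    if (p->father == 0 || p->mother == 0) return 1;  /* nothing to verify */
    return has_spouse(&people[p->father], p->mother)
        && has_spouse(&people[p->mother], p->father);
}

/* Phase 3b: no spouse may have the same known sex as the person. */
int spouses_opposite_sex(const struct person_rec people[], int id)
{
    const struct person_rec *p = &people[id];
    int i;
    for (i = 0; i < p->nspouses; i++) {
        char other = people[p->spouses[i]].sex;
        if (p->sex != '?' && other != '?' && p->sex == other) return 0;
    }
    return 1;
}
```

Two letters after the "xx" prefix give 676 distinct placeholder names, which is enough for the 300 or so unnamed people mentioned above.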

Within this verification process, a variety of problems arose:

• Cases where the father or mother was not the same, the clan was different, or the date of birth or of death did not match from one occurrence of the name to another. Sometimes this was due to misspelling of the parent's name; sometimes it was due to the conflation of information on two distinct people with the same name.

• A listing of the names of spouses and children revealed some cases in which the name of one of the spouses or children was spelt slightly differently in two places. This was a useful check as sometimes the different spellings were not alphabetically near, since many of the names started with the same two or three letters.

This process of data verification was highly iterative: once it was established that the same person was referred to by names spelt slightly differently and the identification made, the listing was rerun until no further anomalies were thrown up.
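The spelling variants described above were found by inspecting listings of spouses and children. A present-day alternative, which was not used in the original work, is to compare names by Levenshtein edit distance and flag pairs that differ by only one or two characters. The sketch below shows that swapped-in technique; the function names and the threshold of 2 are assumptions chosen for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_NAME 64

/* Levenshtein distance: minimum number of single-character insertions,
 * deletions and substitutions needed to turn a into b. */
int edit_distance(const char *a, const char *b)
{
    int prev[MAX_NAME + 1], cur[MAX_NAME + 1];
    int la = (int)strlen(a), lb = (int)strlen(b);
    int i, j;

    assert(la <= MAX_NAME && lb <= MAX_NAME);
    for (j = 0; j <= lb; j++) prev[j] = j;
    for (i = 1; i <= la; i++) {
        cur[0] = i;
        for (j = 1; j <= lb; j++) {
            int sub = prev[j - 1] + (a[i - 1] != b[j - 1]);
            int del = prev[j] + 1;
            int ins = cur[j - 1] + 1;
            int best = sub < del ? sub : del;
            cur[j] = best < ins ? best : ins;
        }
        memcpy(prev, cur, (size_t)(lb + 1) * sizeof(int));
    }
    return prev[lb];
}

/* Flag two different spellings that are close enough to deserve a manual check. */
int possibly_same_name(const char *a, const char *b)
{
    int d = edit_distance(a, b);
    return d > 0 && d <= 2;
}
```

A comparison of this kind is independent of alphabetical order, so it would also catch variants that differ in their first letter. Any pair it flags would still need to be confirmed by hand against the genealogy.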

Format of the main data file at this stage

A record was held in the database for each person, containing:

ID:        sequence number of this person in alphabetical order
Name:      person's name or names (separated by '/')
Clan:      abbreviated clan name
Sex:       m, f or ?
Birth:     year of birth, if known
Death:     year of death, if known, or -1 if it is known that this person is dead but not when
Father:    father [stored as an ID number]
Mother:    mother [stored as an ID number]
Spouses:   list of spouses (not ordered)
Children:  list of children (not ordered)
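The role of the -1 marker in the Death field can be made concrete with a small C sketch. It is an illustration only: the struct below keeps just the numeric fields of the record, the names `struct person_record`, `classify_death` and `age_at_death` are hypothetical, and the use of 0 for "not known" is an assumption, since the text specifies only the -1 convention.

```c
#include <assert.h>
#include <stddef.h>

enum death_status { DEATH_NOT_RECORDED, DEAD_YEAR_UNKNOWN, DEAD_YEAR_KNOWN };

/* Numeric part of a person record; name, spouses and children are omitted. */
struct person_record {
    short id;
    short birth;   /* year of birth; 0 if not known (assumed convention) */
    short death;   /* year of death; -1 if dead but year unknown; 0 if none recorded (assumed) */
    short father;  /* ID of father; 0 if not known (assumed convention) */
    short mother;  /* ID of mother; 0 if not known (assumed convention) */
    char sex;      /* 'm', 'f' or '?' */
};

/* Interpret the Death field, distinguishing the -1 marker from a real year. */
enum death_status classify_death(const struct person_record *p)
{
    assert(p != NULL);
    if (p->death == -1) return DEAD_YEAR_UNKNOWN;
    if (p->death == 0) return DEATH_NOT_RECORDED;
    return DEAD_YEAR_KNOWN;
}

/* Age at death in years, or -1 when either year is missing. */
int age_at_death(const struct person_record *p)
{
    if (classify_death(p) != DEAD_YEAR_KNOWN || p->birth == 0) return -1;
    return p->death - p->birth;
}
```

Keeping "dead, year unknown" distinct from "no death recorded" means that a person known to have died is not mistaken for someone still living when the year of death is missing.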

Up to this point the master listing containing the information for all the people was in a highly redundant and space-consuming form. The main printout gave references back to the original data pages, and all names of father, mother, spouses and children were printed in full. For over 1800 people, at about 18 people to a page, the listing totalled about 100 pages. This listing was kept as a hard-copy record of the master version of the data, and from this point on any minor changes needed were written onto this listing by hand as well as being entered onto the computer.

At this point the subsection information was gathered from the Register of Wards; for more information on this, see Chapter 3.

The computer systems used

Initial data entry was done on the DECsystem-10 at the Australian Defence Force Academy Computer Centre where the author was then employed. Programs for data checking and preliminary analysis were written in SNOBOL, a high-level computer language designed to optimise the manipulation of strings of characters. During the greater part of the analysis a Macintosh Plus computer and programs purpose-written by the author in the C programming language (Consulair Professional Development Tools: MacC Jr.) were used.

Other programming languages and database management products for the Macintosh were evaluated, such as Omnis 3+ and dBase Mac, but these products were found to be not only very expensive but also poorly suited to the task of genealogy analysis. Commercially available relational database packages, typically designed for business applications, are not sufficiently flexible to facilitate the type of genealogical analysis that is of interest to anthropologists. No single system that the author is aware of would have been capable of providing all of the analyses conducted in this study, although all but the tracing of genealogical relationships would have been possible with a relational database along with a statistical package such as SPSS.

The program used to trace genealogical relationships between spouses is reprinted as Table A.4 of this Appendix. Permission is granted for the use of this program, provided that the source is acknowledged.

Table A.4: Program to trace genealogical relationships between couples

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/******************************************************************
 * C program to trace genealogical relationship between pairs of  *
 * people (e.g. spouses)                                          *
 * Elaine Lally - March 1988                                      *
 ******************************************************************/

struct record
{
    short id, dob, dod, mother, father;
    char sex, moiety, subsection, subcal;
} person[1900], *ptr_person;

struct relnode
{
    short id;
    char mf[2];
    char rel[8];
} htree[50], wtree[50];

int ja, jb, k, ll, lt, last, flag, marcnt;
int max_person, *mxp;
short husb, wife;
FILE *infile, *outfile, *prfile;

main()
{
    mxp = &max_person;
    if (outfile = fopen("output.prt", "W"))    /* file for printed output */
    {
        setFileType("output.prt", 'TEXT');
        setFileCreator("output.prt", 'EDIT');
        if (read_data())
        {
            /* read in pairs to be tested */
            printf("\nPlease select file of pairs\n");
            contin();
            marcnt = 0;
            if (prfile = fopen(".SFin", "R"))
            {
                while (readpr() != EOF) checkgen();    /* test pairs one by one */
                fclose(prfile);
            };
            contin();
        }
        else
        {
            printf("bombed out ...");
            contin();
        };
        fclose(outfile);
    }
    else { printf("couldn't open output\n"); contin(); };
}

read_data()
{
    int count;

    if (infile = fopen("data.bin", "R"))    /* main genealogy in binary */
    {
        if (fread(mxp, sizeof(int), 1, infile))
        {
            fread(person, sizeof(person[0]), max_person+1, infile);
        }
        else { printf("couldn't read max_person\n"); contin(); };
        fclose(infile);
        return(1);
    }
    else { printf("Could not open input\n"); return(0); };
}

contin()
{
    /* small routine to stop the program while you look at the screen */
    char c;

    printf("\nEnter any character to continue ...");
    c = getchar();
    if (c == 'q') exit(1);    /* Allows user to 'bomb out' the program */
    else return(c);
}

/* Reads the next (possibly negative) number from the file of pairs */
getnum()
{
    int c, num=0, sign=1;

    do
    {
        c = fgetc(prfile);
        if (isdigit(c))
        {
            num = (num*10) + (c-'0');
        }
        else if (c == '-') sign = -1;
    } while (c != EOF && c != ' ' && c != '\n');
    num *= sign;
    return(c == EOF ? EOF : num);
}

/* Reads one pair: the first number on the line is skipped, then husband and wife */
readpr()
{
    int c;    /* int rather than char, so that EOF can be detected reliably */

    getnum();
    husb = getnum();
    wife = getnum();
    do c = fgetc(prfile); while ((c != '\n') && (c != EOF));
    marcnt++;
    if ((husb == EOF) || (wife == EOF)) return(EOF);
    else return(1);
}

/******************************************************************
 * Routines which do the work:
 * Builds up, for each of the couple, a 'tree' of ancestors, until
 * no more can be traced. Then checks each tree to see if there is
 * a common ancestor. Each branch of the tree ('relnode') includes
 * a label which, when concatenated with one on the other tree,
 * gives the standard anthropological abbreviation for the
 * relationship.
 *
 * N.B. Although the program is framed in terms of husbands and
 * wives, it will also work for other couples, for example to
 * trace the genealogical relationships between people in two
 * clans or subsections
 ******************************************************************/

checkgen()
{
    int count;

    /* fill tree on husband's side first */
    ll = 1; lt = 2;
    htree[1].id = husb;
    strcpy(htree[1].mf, "");
    strcpy(htree[1].rel, "");
    /* call routine which fills in one level of husband's tree */
    while (ll != lt) hlevel();

    /* now do wife's side, checking for match at each level */
    last = ll - 1;
    flag = 0;
    wtree[1].id = wife;
    if (person[wife].sex == 'f')
    {
        strcpy(wtree[1].rel, "D");
        strcpy(wtree[2].rel, "Z");
        strcpy(wtree[3].rel, "Z");
    }
    else
    {
        strcpy(wtree[1].rel, "S");
        strcpy(wtree[2].rel, "B");
        strcpy(wtree[3].rel, "B");
    };
    ll = 1; lt = 2;
    /* call routine which fills in one side of wife's tree */
    while (ll != lt) wlevel();

    if (flag == 1) { fprintf(outfile, "\n"); printf("\n"); }
    if (flag == 0) return(0); else return(1);
}

hlevel()
{
    int k, count, ja, jb;

    k = lt - 1;    /* k is pointer to new branches at next level */
    for (count = ll; count <= lt - 1; count++)
    {
        ja = htree[count].id;
        jb = person[ja].father;
        if (jb != 0)
        {
            k = k + 1;
            /* Check that tree is not too big for memory allocated */
            if (k >= 50)
            {
                printf("\nk=50 ... returning. Please expand array");
                return(1);
            }
            htree[k].id = jb;
            strcpy(htree[k].rel, htree[count].rel);
            strcat(htree[k].rel, htree[count].mf);
            strcpy(htree[k].mf, "F");
        }
        jb = person[ja].mother;
        if (jb != 0)
        {
            k = k + 1;
            /* Check that tree is not too big for memory allocated */
            if (k >= 50) { printf("\nk=50 ... returning"); return(0); }
            htree[k].id = jb;
            strcpy(htree[k].rel, htree[count].rel);
            strcat(htree[k].rel, htree[count].mf);
            strcpy(htree[k].mf, "M");
        }
    }
    ll = lt; lt = k + 1;
    return(1);
}

wlevel()
{
    int count;

    k = lt - 1;    /* k is pointer to new branches at next level */
    for (count = ll; count <= lt - 1; count++)
    {
        ja = wtree[count].id;
        if (ja == 0) break;
        jb = person[ja].father;
        if (jb != 0)
        {
            k = k + 1;
            if (k >= 50) { printf("\nk=50 ... returning"); return(0); };
            wtree[k].id = jb;
            strcpy(wtree[k].mf, "B");
            if (count != 1)
            {
                strcpy(wtree[k].rel, wtree[count].mf);
                strcat(wtree[k].rel, wtree[count].rel);
                if (wtree[k].rel[1] == 'B') wtree[k].rel[1] = 'S';
                else if (wtree[k].rel[1] == 'Z') wtree[k].rel[1] = 'D';
            }
            chmatch();    /* call routine which tests for common ancestors */
        }
        jb = person[ja].mother;
        if (jb != 0)
        {
            k = k + 1;
            if (k >= 50) { printf("\nk=50 ... returning"); return(1); };
            wtree[k].id = jb;
            strcpy(wtree[k].mf, "Z");
            if (count != 1)
            {
                strcpy(wtree[k].rel, wtree[count].mf);
                strcat(wtree[k].rel, wtree[count].rel);
                if (wtree[k].rel[1] == 'B') wtree[k].rel[1] = 'S';
                else if (wtree[k].rel[1] == 'Z') wtree[k].rel[1] = 'D';
            }
            chmatch();    /* call routine which tests for common ancestors */
        }
    }
    ll = lt; lt = k + 1;
    return(1);
}

/* Routine to test for common ancestors and print the relationship */
chmatch()
{
    int count;

    for (count = 1; count <= last; count++)
    {
        if (jb == htree[count].id)
        {
            wtree[k].id = 0;    /* won't need to follow this branch further */
            flag = 1;
            printf("%4d: (%4d,%4d) %s%s (%s)\n", marcnt, husb, wife,
                   htree[count].rel, wtree[k].rel, htree[count].mf);
            fprintf(outfile, "%4d: (%4d %4d) %s%s (%s)\n", marcnt, husb, wife,
                    htree[count].rel, wtree[k].rel, htree[count].mf);
            return(1);
        };
    }
    return(0);
}
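The core idea of the program above (build each person's ancestors level by level, then look for an ancestor that appears on both sides) can be shown in a much reduced form. The sketch below is a simplified illustration and not a replacement for the listing: it omits the relationship labels and the printed output, it reports only the first common ancestor found, and the names `struct parents`, `collect_ancestors` and `common_ancestor` are hypothetical.

```c
#include <assert.h>

#define MAX_NODES 64

struct parents { int father, mother; };  /* IDs; 0 = not recorded */

/*
 * Store id followed by all of its traceable ancestors, level by level,
 * in out[]. Returns the number of entries stored (at most max_nodes).
 */
int collect_ancestors(const struct parents p[], int id, int out[], int max_nodes)
{
    int n = 0, next = 0;

    assert(max_nodes > 0);
    out[n++] = id;
    while (next < n) {
        int cur = out[next++];
        if (p[cur].father != 0 && n < max_nodes) out[n++] = p[cur].father;
        if (p[cur].mother != 0 && n < max_nodes) out[n++] = p[cur].mother;
    }
    return n;
}

/*
 * Returns the ID of the first ancestor of b, searching level by level,
 * who is also an ancestor of a, or 0 if no common ancestor is traceable.
 */
int common_ancestor(const struct parents p[], int a, int b)
{
    int ta[MAX_NODES], tb[MAX_NODES];
    int na = collect_ancestors(p, a, ta, MAX_NODES);
    int nb = collect_ancestors(p, b, tb, MAX_NODES);
    int i, j;

    for (j = 1; j < nb; j++)        /* index 0 is b itself */
        for (i = 1; i < na; i++)    /* index 0 is a itself */
            if (ta[i] == tb[j]) return tb[j];
    return 0;
}
```

For two people whose fathers are brothers, the function returns the shared grandfather. The full program additionally records the path to him on each side, which is what yields an abbreviation such as FBD.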
