UK DATA ARCHIVE DOCUMENTATION 4490 - Family Expenditure Survey, 2000-2001

Data Archive processing standards
---------------------------------

The data were processed to the UK Data Archive's 'A' standard. A rigorous and comprehensive series of checks was carried out to ensure the quality of the data and documentation. The most important procedures were as follows. Firstly, checks were made that the number of cases and variables matched the depositor's records. Secondly, checks were made that all variables had variable labels and all nominal (categorical) variables had value labels. Where possible, absent labels were created with reference to the documentation and/or in communication with the depositor. Thirdly, logical checks were performed to ensure that nominal (categorical) variables had values within the range defined (either by value labels or in the depositor's documentation). Lastly, any data or documentation that breached confidentiality rules were altered or suppressed to preserve anonymity.

All notable and/or outstanding problems discovered are detailed under the Data and documentation problems heading below.

Conversion of data and documentation formats
--------------------------------------------

DATA

The data were supplied as 181 ASCII .dat files, with 181 .cmd syntax files. The .cmd files were edited and converted to SPSS syntax .sps files, and then run on the 181 .dat files to create SPSS portable files. Frequencies, descriptives, list and labels files were created for each data file and checked.

DOCUMENTATION

The documentation was supplied in ASCII .txt, MS Excel and MS Word format. The files supplied were converted to Adobe PDF format using Acrobat 5.0, and compiled into a 9-part User Guide. Bookmarks and an index file were added to aid navigation.

Data and documentation problems
-------------------------------

1. Files variousi.dat and vused.dat

The code files supplied by the depositor for files variousi.dat and vused.dat proved problematic to run in Unix, and the SPSS versions of these files were generated using the 'Text Wizard' utility in SPSS 10 for Windows. The code files generated by this utility have been edited and are available to users. Users should be aware that if they choose to use the two syntax files variousi.sps and vused.sps, they may need to edit these files further before use on their own systems. They are strongly advised to use the ready-constructed SPSS versions of these files, variousi.por and vused.por, instead.

2. Repeated variable names

Users should note that variable names are repeated in many of the tables. This may have implications for file linkage during analysis. The documentation lists repeated variable names with the table name in front of the variable name, but for compatibility with the 8-character variable name format of SPSS, the table name is omitted in the data. Some long variable names listed in the documentation are also truncated to 8 characters in the data.

3. Missing variables

Some files have variables missing according to the table specifications in the documentation:

Table       Missing variable(s)
-----       -------------------
cc          cardbrnd
edf2/edf3   subject
eout        eocode, payeout
expg        giveby, itemtypo
ilo         prgtypo
life        linsoth, linstype
set87       litern
stord       pabstor0, bstorpur

4. Confidentiality issues

Several files were found to contain textual variables (mostly supplementary information covered elsewhere in the data files) which could have compromised respondent confidentiality. These variables have been removed from the dataset. The files affected and variables removed are as follows:

person.dat/person.por
    Variables NAME (respondents' first names) and BIRTH (dates of birth) removed.
jobmain.dat/jobmain.por
    Variables INDD, INDT, OCCT and OCCD (specific job and occupational details) removed.
itemdea.dat/itemdea.por
    LITEMPUR 'item acquired with loan' removed.
lastmth.dat/lastmth.por
    DESCRIP 'what were these club goods' removed.
medins.dat/medins.por
    MINSOTH 'Insurance policy - undefined in minstype' removed.
oddjob.dat/oddjob.por
    ODDJDESC 'Oddjob: What was the job?' removed.
pay2o.dat/pay2o.por
    DEDOTYPE 'PAY2o:What was purpose of other deductn?' removed.
paymaino.dat/paymaino.por
    DEDOTYPE 'What was the purpose of the other dedctn' removed.
sejob.dat/sejob.por
    EXPO 'Please describe the other expenses' removed.
variousi.dat/variousi.por
    INCSRCE 'What was the taxable incomes source?' removed.
vused.dat/vused.por
    OTHPERS 'other person' removed, and variable MODEL retained but one case anonymised.

5. Date Format

The following tables contained date variables in the format dd/mm/yyyy. These were altered so that dates are now in the format ddmmyyyy, i.e. the date 10/11/1999 has become 10111999.

HHOLD1  - startdat, dstart, vintdate
PAY2    - paydat
ILO     - dtjbl
PAYMAIN - paydat
SEJOB   - date variables se1 and se2 have been set as string variables in file sejob.por, as characters other than '/' were found in these variables in file sejob1.dat.

6. EXPEND and SET130

Users should note that although tables EXPEND and SET130 are mentioned in the documentation, they were not supplied to the UK Data Archive. EXPEND contains confidential information, and SET130 is a copyright table with codes set up for private marketing companies.

Weighting/Differential Grossing Information
-------------------------------------------

Users should note that the weighting variable WEIGHT can be found in file set1.por/set1.dat.

The following information was received with the 1998-1999 FES, but is still relevant for the 2000-2001 survey:

Weights are now included in the database. Their use is strongly recommended because they compensate, to some extent, for lower response rates among some kinds of household. They also allow grossed up estimates to be produced by an agreed standard method. Information on the weights is in "Family Spending" 2000-2001, Appendix F.

NORTHERN IRELAND

Northern Ireland has for many years had an enhanced sample to allow for separate analysis. The data supplied to users, however, included only the so-called UK sample, in which Northern Ireland households were represented in approximately the same proportion as in GB. FROM 1998-99 ONWARDS THE DATA SUPPLIED INCLUDES THE FULL NORTHERN IRELAND SAMPLE. If users do not take appropriate steps, Northern Ireland will be over-represented in any analyses, by a factor of 5. Users should adopt one of the following measures:

* Use the weights supplied on the database for all analyses. This automatically compensates for the over-representation of Northern Ireland.
* If unweighted analysis is wanted, avoid the enhanced element of the Northern Ireland sample by filtering out case numbers of 9000000 (9 million) and over.
* Or, adopt a simple weighting scheme with GB cases given a weight of 1.0 and Northern Ireland cases a weight of 0.2.

Links and Publications
----------------------

Further information about the Family Expenditure Survey and links to publications may be found on the ONS website at:
http://www.statistics.gov.uk/ssd/surveys/family_expenditure_survey.asp

The 'Family Spending' report for the 2000-2001 survey is downloadable in PDF format from:
http://www.statistics.gov.uk/downloads/theme_social/Family_Spending_2000-01/Family_Spending_2000-01.pdf

Notes from data delivery and post-order corrections
---------------------------------------------------

Variable SURVYR (in HHOLD1 file):

Following a post-order query, and after consultation with the depositor, the variable SURVYR was set to 2000 for all cases. This is because, in contrast to the years prior to 1998-99, SURVYR relates to the FES year and NOT the calendar year in which the interview took place. Thus, ALL cases in the 2000-01 data have a SURVYR value of 2000.
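
Illustration for the Northern Ireland measures listed under Weighting/Differential Grossing Information: the second and third options can be sketched in Python as below. This is a minimal illustration only and is not part of the deposited files. The record layout, the field name "case" and all function names are hypothetical; only the 9000000 case-number threshold and the 1.0/0.2 weights are taken from the notes.

```python
# Hypothetical sketch; field and function names are assumptions.

# Threshold stated in the notes: case numbers of 9000000 and over
# belong to the enhanced element of the Northern Ireland sample.
NI_ENHANCED_MIN_CASE = 9_000_000


def drop_enhanced_ni(cases):
    """Option 2: keep only cases below the enhanced-sample threshold.

    `cases` is a list of dicts with a hypothetical "case" key holding
    the case number.
    """
    return [c for c in cases if c["case"] < NI_ENHANCED_MIN_CASE]


def simple_weight(is_northern_ireland):
    """Option 3: weight 0.2 for Northern Ireland cases, 1.0 for GB cases."""
    return 0.2 if is_northern_ireland else 1.0


def weighted_mean(values, weights):
    """Weighted mean of `values` using the matching `weights`."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight
```

How Northern Ireland cases are identified for the third option is not stated in these notes, so simple_weight takes that flag as an input rather than deriving it. The first option (using the supplied WEIGHT variable from set1) needs no such helper beyond a weighted statistic like weighted_mean.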

Variable P396 (in SET8 file):

Following a user query, it was established that to obtain the correct values for age of HOH, the variable P396 should be multiplied by 1000. When the new variable is computed (P396*1000) and then banded to match the categories in A065 (banded age of HOH, also in SET8), the new variable matches A065 perfectly.
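
The P396 correction and the check against A065 can be sketched in Python as below. This is a hedged illustration, not the procedure used by the Archive or the depositor. The function names are hypothetical, and the band lower bounds shown are placeholders only: the actual A065 categories are not given in these notes and must be taken from the documentation. The result is rounded to a whole number of years on the assumption that ages are integers, which also guards against floating-point error.

```python
from bisect import bisect_right

# PLACEHOLDER band lower bounds, for illustration only. Replace with the
# real A065 category boundaries from the FES documentation.
EXAMPLE_LOWER_BOUNDS = [0, 30, 50, 65]


def rescale_p396(p396):
    """Return age of HOH in years: P396 * 1000, rounded to an integer."""
    return round(p396 * 1000)


def band_age(age, lower_bounds):
    """Return the 1-based band number for `age`.

    `lower_bounds` must be sorted ascending; the band chosen is the one
    with the largest lower bound that is <= age.
    """
    return bisect_right(lower_bounds, age)


def find_mismatches(rows, lower_bounds):
    """Return (p396, a065, computed_band) for rows where the bands differ.

    `rows` is an iterable of (p396, a065) pairs.
    """
    mismatches = []
    for p396, a065 in rows:
        computed = band_age(rescale_p396(p396), lower_bounds)
        if computed != a065:
            mismatches.append((p396, a065, computed))
    return mismatches
```

With the correct A065 boundaries supplied, an empty list from find_mismatches would correspond to the perfect match reported above.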