UK DATA ARCHIVE: IMPORTANT STUDY INFORMATION

SN: 4736 - Labour Force Survey, 2002 : Teaching Dataset

New Edition Information
-----------------------

For the second edition, approximately 2300 cases were deleted from the
dataset in order to avoid anomalies during use, and the documentation
has been updated.

Data Archive Processing Standards
---------------------------------

The data were processed to the UK Data Archive's 'A*' standard. This is
the Archive's highest standard, and means that an extremely rigorous and
comprehensive series of checks was carried out to ensure the quality of
the data and documentation.

Briefly, the most important procedures were as follows. Firstly, checks
were made that the number of cases and variables matched the depositor's
records. Secondly, checks were made that all variables had
comprehensible variable labels and all nominal (categorical) variables
had comprehensible value labels. Where possible, either with reference
to the documentation and/or in communication with the depositor, labels
were accordingly edited or created. Thirdly, logical checks were
performed to ensure that nominal (categorical) variables had values
within the range defined (either by value labels or in the depositor's
documentation). Lastly, any data or documentation that breached
confidentiality rules were altered or suppressed to preserve anonymity.

All notable and/or outstanding problems discovered are detailed under
the 'Data and documentation problems' heading below.

Data and Documentation Problems
-------------------------------

No unresolved problems were encountered during processing.

Conversion of Documentation
---------------------------

All electronic and paper documentation supplied with this study is
normally incorporated into the UKDA User Guide (in PDF format). The
conversion programmes used are the latest versions of Adobe PDF Writer
for electronic documentation and Adobe Paper Capture (Acrobat 'plugin'
version) for paper documentation.
Occasionally, some or all of the electronic documentation cannot be
usefully converted to PDF (e.g. MS Excel files with wide worksheets) and
this is supplied in other formats. All User Guides are fully bookmarked.

Conversion of Data
------------------

Ingest format(s) of the data = SPSS .sav, Stata

From January 2003 onwards, almost all data conversions have been
performed using software developed by the UKDA. This enables
standardisation of the conversion methods and ensures optimal data
quality. In addition to its own data processing/conversion code, this
software uses the SPSS and Stat/Transfer command processors to perform
certain format translations. Although data conversion is automated using
quality checked processes, all data files created are also subject to
visual checking by a UKDA data processor.

NOTE: With some format conversions, data and, more especially, internal
metadata (i.e. variable labels, value labels, missing value definitions,
data type information) will inevitably be lost or truncated. To see the
internal metadata as it existed in the originating format, you will
normally have been provided with a UKDA Data Dictionary file
corresponding to each data file. These are called:

    _UKDA_Data_Dictionary.rtf

Additional comments about each data format are given below. Please read
the wording corresponding to the data format you have been supplied
with.

SPSS portable

If SPSS portable was not the ingest format, this format will generally
have been created via the SPSS command processor (e.g. if the ingest
format was SPSS .sav, SAS, Excel, or dBase) or, if the ingest format was
STATA, via the Stat/Transfer command processor. If the ingest format was
undelimited text, the data will have been read into SPSS using a command
file.

Issues: There is very seldom any loss of data or internal metadata when
importing data files into SPSS.
Any problems will have been listed above in the 'Data and Documentation
Problems' section of this file.

STATA

If STATA is not the ingest format, all STATA files will have been
created from SPSS .sav format via the Stat/Transfer command processor.
Importantly, Stat/Transfer's optimisation routine is run so that
variables with SPSS write formats narrower than the data (e.g. numeric
variables with 10 decimal places of data formatted to FX.2) are not
rounded upon conversion to STATA, because they are converted to
"doubles" rather than floats. "User missing" values are copied across
into STATA (as opposed to being collapsed into a single system missing
code).

Issues: There are a number of data handling mismatches between SPSS and
STATA. Where any data or internal metadata has been lost or truncated,
this will have been automatically logged in the following file, which
will have been supplied to you:

    _SPSS_to_STATA_conversion.rtf

Note that the complete internal metadata has been supplied to you in the
UKDA Data Dictionaries:

    _UKDA_Data_Dictionary.rtf

Tab-delimited text

If tab-delimited text is not the ingest format, tab-delimited files are
created from SPSS portable files via the SPSS command processor, and
also from Excel and MS Access files. When exporting from Access data
tables to tab-delimited text, the many undesirable embedded special
characters allowed by Access memo and text fields (tabs, carriage
returns, line feeds, etc.) are stripped out by the UKDA software.

Issues: Date formats in SPSS are always exported to mm/dd/yyyy in
tab-delimited text format, so you may note a mismatch with the
documentation on such variables. Variables that include both date and
time, such as mm-dd-yyyy hh:mm:ss (e.g. 18-JUN-2001 13:28:00), will lose
the time information and become mm/dd/yyyy. If the time information is
critical, a new variable will have been created in the tab-delimited
data file by the UKDA.
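
As an illustration of the date handling described above, the following
Python sketch takes a date-time value written in the style of the
example (18-JUN-2001 13:28:00), produces the mm/dd/yyyy text used in
tab-delimited exports, and keeps the time as a separate value. The
function name, the month lookup table and the two-value result are
assumptions made for illustration only; this is not the UKDA conversion
software.

```python
from datetime import date, datetime

# Hypothetical lookup table; avoids locale-dependent month parsing.
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def split_datetime_for_export(value):
    """Split 'dd-MMM-yyyy hh:mm:ss' into (mm/dd/yyyy text, hh:mm:ss text)."""
    date_part, time_part = value.split(" ")
    day, month, year = date_part.split("-")

    # The date column keeps only the calendar date, as mm/dd/yyyy.
    date_text = date(int(year), MONTHS[month.upper()], int(day)).strftime("%m/%d/%Y")

    # The time is validated and kept separately, so it is not lost.
    time_text = datetime.strptime(time_part, "%H:%M:%S").strftime("%H:%M:%S")
    return date_text, time_text


date_text, time_text = split_datetime_for_export("18-JUN-2001 13:28:00")
print(date_text)  # 06/18/2001
print(time_text)  # 13:28:00
```

Keeping the time in its own value mirrors the idea of a separate
variable for time information; how that variable is named or formatted
in any particular data file is not specified here.
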
All users of the data in tab-delimited format should consult the UKDA
Data Dictionary files:

    _UKDA_Data_Dictionary.rtf

Note that these are only created when the data has been exported from
SPSS. More limited information will have been supplied if the data was
exported from MS Access; this will be in the files:

    _variableinformation.rtf

These files may also contain SQL setup information.

MS Excel

If MS Excel is not the ingest format, Excel files are created via the
SPSS command processor. The date and time issues noted under
tab-delimited format above apply to SPSS to Excel conversion via the
SPSS command processor.

MS Access

Due to the substantial incompatibilities between versions of MS Access,
the UKDA only make data available in MS Access format if this is the
ingest format and the database contains important information in
addition to the data tables (coding information, forms, queries, etc.).

Other formats

Data are only made available in other formats on the rare occasion when
there is no reliable method of extracting the data into a more
accessible format. Other formats (e.g. SAS) will be created upon
request.