
UK DATA ARCHIVE: IMPORTANT STUDY INFORMATION

Study Number 5698 - Entering the e-Society: Young Children's Development of e-Literacies, 2005
DATA PROCESSING NOTES
Data Archive Processing Standards

The data were processed to the UK Data Archive's B standard. A substantial series of checks was carried out to ensure the quality of the data and documentation. Firstly, checks were made that the number of cases and variables matched the depositor's records. Secondly, logical checks were performed on a sample of 30 + 10% of the remaining nominal (categorical) variables to ensure they had values within the range defined (either by value labels or in the depositor's documentation). Thirdly, any data or documentation that breached confidentiality rules were altered or suppressed to preserve anonymity.
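As a rough illustration of the sampling rule in the second check, the sketch below computes how many nominal variables would be examined under one reading of "30 + 10% of the remaining". The function name, the rounding-up behaviour, and the handling of studies with fewer than 30 nominal variables are assumptions made for the example, not documented UKDA practice.

```python
def nominal_check_sample_size(n_nominal: int, base: int = 30, percent: int = 10) -> int:
    """Hypothetical helper: variables checked under '30 + 10% of the remaining'.

    Assumptions (not stated in the notes): if there are no more than `base`
    nominal variables, all are checked; the 10% share is rounded up.
    """
    if n_nominal <= base:
        return n_nominal
    remaining = n_nominal - base
    # Integer ceiling of remaining * percent / 100, avoiding float rounding.
    extra = (remaining * percent + 99) // 100
    return base + extra


if __name__ == "__main__":
    for n in (20, 30, 130, 135):
        print(n, "nominal variables ->", nominal_check_sample_size(n), "checked")
```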

All notable and/or outstanding problems discovered are detailed under the 'Data and documentation problems' heading below.

Data and documentation problems

No problems found.

Data conversion information

From January 2003 onwards, almost all data conversions have been performed using software developed by the UKDA. This enables standardisation of the conversion methods and ensures optimal data quality. In addition to its own data processing/conversion code, this software uses the SPSS and Stat/Transfer command processors to perform certain format translations. Although data conversion is automated, all data files are also subject to visual inspection by a UKDA data processing officer.

With some format conversions, data, and more especially internal metadata (i.e. variable labels, value labels, missing value definitions, data type information), will inevitably be lost or truncated owing to the differential limits of the proprietary formats. A UKDA Data Dictionary file (in rich text format), corresponding to each data file, is usually provided for viewing and searching the internal metadata as it existed in the originating format. These files are called: [data file name]_UKDA_Data_Dictionary.rtf

Important information about the data format supplied

The sections below provide important information about the format in which you have been supplied the data. Some of this information is specific to the ingest format of the data, that is, the format in which the UKDA was supplied the data. The ingest format for this study was SPSS.

Please refer to the section below that corresponds to the format in which you have been supplied the data.

SPSS (*.por)

SPSS portable (*.por files)
If SPSS portable was not the ingest format, this format will generally have been created via the SPSS command processor (e.g. if the ingest format is SPSS .sav, SAS, Excel, or dBase) or, if the ingest format was STATA, via the Stat/Transfer command processor. If the ingest format was undelimited text, the data will have been read into SPSS using an SPSS command file.

Issues: There is very seldom any loss of data or internal metadata when importing data files into SPSS. Any problems will have been listed above in the Data and Documentation Problems section of this file.

STATA (*.dta)

STATA (*.dta files)
If STATA was not the ingest format, all STATA files will have been created from SPSS .sav format via the Stat/Transfer command processor. Importantly, Stat/Transfer's optimisation routine is run so that variables with SPSS write formats narrower than the data (e.g. numeric variables with 10 decimal places of data formatted to FX.2) are not rounded upon conversion to STATA because they are converted to 'doubles' rather than floats. User missing values are copied across into STATA (as opposed to being collapsed into a single system missing code).
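The reason 'doubles' matter here can be sketched in a few lines of Python. The example below is only an illustration of 4-byte versus 8-byte floating-point storage, not a reproduction of what Stat/Transfer does internally; the sample value and the helper names are made up for the example.

```python
import struct


def round_trip_float32(x: float) -> float:
    """Store x in a 4-byte IEEE 754 float and read it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_trip_float64(x: float) -> float:
    """Store x in an 8-byte IEEE 754 double and read it back."""
    return struct.unpack("<d", struct.pack("<d", x))[0]


# Hypothetical value carrying 10 decimal places of data.
value = 0.1234567891

as_float = round_trip_float32(value)
as_double = round_trip_float64(value)

if __name__ == "__main__":
    print("original :", repr(value))
    print("as float :", repr(as_float))   # differs from the original
    print("as double:", repr(as_double))  # identical to the original
```

A 4-byte float holds roughly seven significant decimal digits, so the trailing decimal places change on storage, whereas the 8-byte double returns the value unchanged.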

Issues: There are a number of data and metadata handling mismatches between SPSS and STATA. Where any data or internal metadata has been lost or truncated, this will have been automatically logged in this file: 5698_SPSS_to_STATA_conversion.rtf. Note that the complete internal metadata has been supplied in the UKDA Data Dictionary file(s): [data file name]_UKDA_Data_Dictionary.rtf


Tab-delimited text (*.tab)

If tab-delimited text was not the ingest format, tab-delimited files will have been created from SPSS portable files via the SPSS command processor, and also from Excel and MS Access files. When exporting from Access data tables to tab-delimited text, the potentially problematic special characters (tabs, carriage returns, line feeds, etc.) allowed by Access memo and text fields are stripped out by the UKDA.
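The need to strip these characters can be sketched as follows: an embedded tab or line break inside a field would otherwise be read as a column or row boundary. This is a minimal illustration, not the UKDA's own code; the function names are hypothetical, and replacing each run of special characters with a single space (rather than deleting it outright) is an assumption.

```python
import re

# Tabs, carriage returns and line feeds would break a tab-delimited layout.
_SPECIAL = re.compile(r"[\t\r\n]+")


def clean_field(text: str) -> str:
    """Replace each run of tabs/CRs/LFs with one space (assumed behaviour)."""
    return _SPECIAL.sub(" ", text).strip()


def to_tab_row(fields: list[str]) -> str:
    """Join cleaned fields into one tab-delimited record."""
    return "\t".join(clean_field(f) for f in fields)


if __name__ == "__main__":
    memo = "line one\r\nline\ttwo"
    print(repr(to_tab_row(["42", memo])))
```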

Issues: Date formats in SPSS are always exported to mm/dd/yyyy in tab-delimited text format, so there may be a mismatch with the documentation on such variables. Variables that include both date and time, such as dd-mm-yyyy hh:mm:ss (e.g. 18-JUN-2001 13:28:00), will lose the time information and become mm/dd/yyyy. If the time information is critical, a new variable will have been created in the tab-delimited data file by the UKDA. All users of the data in tab-delimited format should consult the UKDA Data Dictionary file(s): [data file name]_UKDA_Data_Dictionary.rtf
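The effect on date-time variables can be illustrated with the example value quoted above. The sketch below is written in Python purely for illustration and does not reproduce the SPSS export itself; the function names are hypothetical, and the second function shows one possible way of keeping the time as a separate variable.

```python
from datetime import datetime

MONTHS = {
    name: number
    for number, name in enumerate(
        ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
         "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"],
        start=1,
    )
}


def parse_spss_datetime(text: str) -> datetime:
    """Parse a value such as '18-JUN-2001 13:28:00'."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    hour, minute, second = (int(p) for p in time_part.split(":"))
    return datetime(int(year), MONTHS[month.upper()], int(day), hour, minute, second)


def export_date_only(text: str) -> str:
    """mm/dd/yyyy export: the time component is dropped."""
    return parse_spss_datetime(text).strftime("%m/%d/%Y")


def export_date_and_time(text: str) -> tuple[str, str]:
    """Keep the time in a second (hypothetical) variable alongside the date."""
    moment = parse_spss_datetime(text)
    return moment.strftime("%m/%d/%Y"), moment.strftime("%H:%M:%S")


if __name__ == "__main__":
    print(export_date_only("18-JUN-2001 13:28:00"))
    print(export_date_and_time("18-JUN-2001 13:28:00"))
```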

If the data were exported from MS Access, more limited 'data documenter' information is supplied in the file(s): [data table name]_variableinformation.rtf. These files may also contain SQL setup information.


MS Excel (*.xls files)

If MS Excel was not the ingest format, Excel files will have been created via the SPSS command processor. The date and time issues noted under tab-delimited format apply to SPSS to Excel conversion via the SPSS command processor.

SAS (supplied as *.dat and *.sas)

If SAS was not the ingest format, all SAS files will have been created from SPSS .sav format via the Stat/Transfer command processor. The data files are provided as a fixed-width text file (*.dat) and a SAS command file (*.sas), which when run will create a SAS dataset. This enables the user to recreate the SAS dataset and formats library in almost all versions of SAS and all operating systems.

Issues: The main loss of information when converting from SPSS to SAS is user-missing value definitions. By editing the .sas file, the user can choose whether to collapse all user-missing values into system missing or preserve the value and lose the user-missing definition. To achieve the latter, the following section of the .sas file should be removed before running it:

/* User Missing Value Specifications */

Note that the complete internal metadata has been supplied in the UKDA Data Dictionary file(s): [data file name]_UKDA_Data_Dictionary.rtf
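The two choices described for user-missing values can be pictured with a small Python sketch. It does not reproduce the SAS command file; the codes -9 and -8 and the function names are hypothetical, and None stands in for a system-missing value.

```python
from typing import Optional


def collapse_user_missing(values: list[int], user_missing: set[int]) -> list[Optional[int]]:
    """Option 1: turn every user-missing code into system missing (None here)."""
    return [None if v in user_missing else v for v in values]


def keep_raw_values(values: list[int]) -> list[int]:
    """Option 2: keep the codes as ordinary values; the 'missing' definition is lost."""
    return list(values)


if __name__ == "__main__":
    responses = [1, -9, 3, -8]   # hypothetical data
    missing_codes = {-9, -8}     # hypothetical user-missing codes
    print(collapse_user_missing(responses, missing_codes))
    print(keep_raw_values(responses))
```

Under the second option, analyses must exclude the codes explicitly, since they are no longer flagged as missing.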


MS Access (*.mdb files)

Due to the substantial incompatibilities between versions of MS Access, the UKDA only makes data available in MS Access format if this is the ingest format and the database contains important information in addition to the data tables (coding information, forms, queries, etc.).



Conversion of documentation formats

Electronic and paper documentation supplied with this study is usually incorporated into the UKDA User Guide (in PDF format). The conversion programmes used are the latest versions of Adobe PDF Writer for electronic documentation and Adobe Paper Capture (Acrobat 'plugin' version) for paper documentation. Occasionally, some of the electronic documentation cannot be usefully converted to PDF (e.g. MS Excel files with wide worksheets) and this is supplied in a more appropriate format. All User Guides are fully bookmarked.