UK DATA ARCHIVE: IMPORTANT STUDY INFORMATION

Study Number 6487 - Internal Marketing and Performance Surveys of India Call Centres, 2007-2009


DATA PROCESSING NOTES


Data Archive Processing Standards

The data were processed to the UK Data Archive's B standard. A substantial series of checks was carried out to ensure the quality of the data and documentation. Firstly, checks were made that the number of cases and variables matched the depositor's records. Secondly, logical checks were performed on a sample of 30 + 10% of the remaining nominal (categorical) variables to ensure they had values within the range defined (either by value labels or in the depositor's documentation). Thirdly, any data or documentation that breached confidentiality rules were altered or suppressed to preserve anonymity.
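The second check can be illustrated with a minimal sketch. It is not the Archive's own processing code. It assumes that "30 + 10% of the remaining" means 30 variables plus one tenth of any nominal variables beyond the first 30. The function names, the data layout (a dict of lists) and the random selection of variables are all hypothetical.

```python
import random


def sample_size(n_nominal, base=30, fraction=0.10):
    """How many nominal variables to check (assumed reading of '30 + 10%')."""
    if n_nominal <= base:
        return n_nominal
    return base + round(fraction * (n_nominal - base))


def out_of_range(values, valid_codes):
    """Return the sorted distinct values that have no defined code."""
    return sorted({v for v in values if v not in valid_codes})


def check_variables(data, value_labels, seed=0):
    """Check a sample of labelled variables and report undefined values.

    data         : dict mapping variable name -> list of coded values
    value_labels : dict mapping variable name -> dict of code -> label
    """
    names = sorted(value_labels)
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    chosen = rng.sample(names, sample_size(len(names)))
    problems = {}
    for name in chosen:
        bad = out_of_range(data[name], value_labels[name])
        if bad:
            problems[name] = bad
    return problems
```

A variable appears in the result only if it holds at least one value with no defined label, so an empty result corresponds to the "None" reported under 'Data and documentation problems'.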

All notable and/or outstanding problems discovered are detailed under the 'Data and documentation problems' heading below.

Data and documentation problems

None.

Useful Notes

Missing data imputation
Users should note that the data contain some missing values within the scales for each variable. The depositor notes that the datasets have been treated for missing data and that all negatively worded items have been reverse coded. Hence, the frequency tables will most likely reflect the missing data imputations.

In order to maximise the use of completed questionnaires, the missing data were treated using the technique of expectation maximisation (EM). EM is a two-step procedure which alternates between an E (Expectation) step and an M (Maximisation) step (Collins et al., 2001). In the E step, missing data are predicted and these predictions are used to estimate sums, sums of squares and cross products. In the M step, the parameters are re-estimated from those quantities, i.e. the covariance matrix is calculated (Graham and Hofer, 2000). The updated covariance matrix from the M step is then used to estimate the missing values during the next E step. This cycle continues until the elements of the covariance matrix stop changing substantially (Graham and Hofer, 2000). The major advantage of this technique is that parameter estimates based on analysis of the EM covariance matrix are unbiased and efficient (Graham and Hofer, 2000).

Once the missing data had been imputed, all negatively worded items were reverse coded and all subsequent analysis was undertaken. Hence, the frequency tables produced account for missing data. The frequencies of missing data are negligible and well within acceptable rates.
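
The E and M steps described above can be sketched for the simplest case: two variables, one fully observed (x) and one with gaps (y, with None marking a missing value). This is an illustration under those assumptions, not the depositor's procedure, which would have been run across full scales in statistical software. The function names are hypothetical, and the 1 to 5 response range used for reverse coding is likewise an assumption.

```python
def em_bivariate(x, y, tol=1e-8, max_iter=1000):
    """EM estimates of means and covariances when some y values are None."""
    n = len(x)
    observed = [v for v in y if v is not None]
    # x is fully observed, so its mean and variance are fixed throughout.
    mu_x = sum(x) / n
    s_xx = sum((v - mu_x) ** 2 for v in x) / n
    # Starting values for y come from the complete cases only.
    mu_y = sum(observed) / len(observed)
    s_yy = sum((v - mu_y) ** 2 for v in observed) / len(observed)
    s_xy = 0.0
    for _ in range(max_iter):
        # E step: predict missing y and accumulate expected sums,
        # sums of squares and cross products.
        slope = s_xy / s_xx
        resid_var = s_yy - slope * s_xy
        sum_y = sum_yy = sum_xy = 0.0
        for xi, yi in zip(x, y):
            if yi is None:
                ey = mu_y + slope * (xi - mu_x)
                ey2 = ey * ey + resid_var  # adds the prediction uncertainty
            else:
                ey, ey2 = yi, yi * yi
            sum_y += ey
            sum_yy += ey2
            sum_xy += xi * ey
        # M step: re-estimate the mean of y and the covariance elements.
        new_mu_y = sum_y / n
        new_s_yy = sum_yy / n - new_mu_y ** 2
        new_s_xy = sum_xy / n - mu_x * new_mu_y
        change = max(abs(new_mu_y - mu_y),
                     abs(new_s_yy - s_yy),
                     abs(new_s_xy - s_xy))
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:  # covariance elements have stopped changing
            break
    # Fill each gap with its predicted value under the final estimates.
    slope = s_xy / s_xx
    imputed = [yi if yi is not None else mu_y + slope * (xi - mu_x)
               for xi, yi in zip(x, y)]
    params = {"mu_x": mu_x, "mu_y": mu_y,
              "s_xx": s_xx, "s_xy": s_xy, "s_yy": s_yy}
    return params, imputed


def reverse_code(value, low=1, high=5):
    """Reverse a negatively worded item on an assumed low-to-high scale."""
    return low + high - value
```

In this sketch reverse coding is applied after imputation, matching the order given in the depositor's note.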

References:

Collins, L.M., Schafer, J.L. and Kam, C. (2001) 'A comparison of inclusive and restrictive strategies in modern missing data procedures', Psychological Methods, 6, pp. 330-351.

Graham, J.W. and Hofer, S.M. (2000) 'Multiple imputation in multivariate research', in T.D. Little, K.U. Schnabel and J. Baumert (eds.) Modeling longitudinal and multilevel data: practical issues, applied approaches and specific examples, Mahwah, NJ: Lawrence Erlbaum Associates, pp. 201-218.

Data conversion information

From January 2003 onwards, almost all data conversions have been performed using software developed by the UKDA. This enables standardisation of the conversion methods and ensures optimal data quality. In addition to its own data processing/conversion code, this software uses the SPSS and Stat/Transfer command processors to perform certain format translations. Although data conversion is automated, all data files are also subject to visual inspection by a UKDA data processing officer.

With some format conversions, data, and more especially internal metadata (i.e. variable labels, value labels, missing value definitions, data type information), will inevitably be lost or truncated owing to the differing limits of the proprietary formats. A UKDA Data Dictionary file (in rich text format), corresponding to each data file, is usually provided for viewing and searching the internal metadata as it existed in the originating format. These files are called: [data file name]_UKDA_Data_Dictionary.rtf

Important information about the data format supplied

The links below provide important information about the format in which you have been supplied the data. Some of this information is specific to the ingest format of the data, that is, the format in which the data were supplied to the UKDA. The ingest format for this study was SPSS.

Please click below to find out information about the format in which you have been supplied the data.

SPSS (*.por)
STATA (*.dta)
Tab-delimited text (*.tab)
MS Excel (*.xls files)
SAS (supplied as *.dat and *.sas)
MS Access (*.mdb files)

Conversion of documentation formats

Electronic and paper documentation supplied with this study is usually incorporated into the UKDA User Guide (in PDF format). The conversion programmes used are the latest versions of Adobe PDF Writer for electronic documentation and Adobe Paper Capture (Acrobat 'plugin' version) for paper documentation. Occasionally, some of the electronic documentation cannot be usefully converted to PDF (e.g. MS Excel files with wide worksheets) and this is supplied in a more appropriate format. All User Guides are fully bookmarked.