Introduction
What Stat/Transfer does
Stat/Transfer is designed to simplify the transfer of statistical data between different programs.
Data generated by one program is often needed in another context, either for analysis, for cleaning and correction, or for presentation. However, not only must the data be transferred, but in addition, the variables generally must be re-described for each program with additional information, such as variable names, missing values and value and variable labels. This process is not only time-consuming, it is error-prone. For those in possession of data sets with many variables, it represents a serious impediment to the use of more than one program.
Stat/Transfer removes this barrier by providing an extremely fast, reliable and automatic way to move data. Stat/Transfer will automatically read statistical data in the internal format of one of the supported programs and will then transfer as much of the information as is present and appropriate to the internal format of another.
Stat/Transfer preserves all of the precision in your data by storing it internally in double precision format. However, on output, it will, where possible, automatically minimize the size of your output data set by intelligently choosing data storage types that are only as large as necessary to preserve the input precision. Stat/Transfer also allows precise and easy manual control over the storage format of your output variables, in case this is necessary.
In addition to converting the formats of variables, Stat/Transfer also processes missing values automatically.
Stat/Transfer can save hours and even days of manual labor, while at the same time eliminating error. Furthermore, you gain this speed and accuracy without losing flexibility, since Stat/Transfer allows you to select just the variables and cases you want to transfer.
In addition to the standard Windows interface, a command processor allows you to run a transfer in batch mode, using a command file. This makes it straightforward to set up fully automatic batch procedures for repetitive tasks.
File Types Supported by Stat/Transfer
Version 9 of Stat/Transfer will support the following file types:
· 1-2-3
· Access
· ASCII - Delimited
· ASCII - Fixed Format
· dBASE and compatible formats
· SAS for Windows and OS/2
· SAS for Unix
· SAS CPORT (read only)
· SAS Transport
What’s New in Stat/Transfer
· SAS CPORT data sets and catalogs (read only)
· SAS PC/DOS 6.04 (read only)
· S-Plus through Version 7
· Delimited ASCII with a Stat/Transfer SCHEMA file
Other New Features
· Worksheet pages can be concatenated into a single output file
· The command processor now allows multiple input files to be combined into a single output file
· Value label tags and sets can be preserved
· SAS value labels can now be read from transport files, CPORT files, data sets and catalogs
· JMP support has been expanded
· Users can specify any delimiter for ASCII files and can combine adjacent blank delimiters
· Generated programs and ASCII files can now preserve input widths
· In R and S-Plus, factors can now be converted to numeric variables with labels or to string variables
· The command processor now has a flexible syntax for specifying input and output tables simultaneously
· Support for longer string variables and longer value and variable labels has been added, so that Stat/Transfer is compatible with the limits of all supported programs
· A built-in logging and FTP facility is now available for troubleshooting
· Full compatibility with Microsoft Vista
Because we make every effort to keep up with changes in the file formats of popular software, be sure to check the READ.ME file for the latest information on which versions of these programs are supported. There is a shortcut to the READ.ME file from the Windows Start menu, in the Stat/Transfer group.
You can also get current information about Stat/Transfer by visiting our Web site, which you can reach from the About Stat/Transfer screen.
Installation
Installing Stat/Transfer
System requirements
· Any computer capable of running a 32 bit version of Windows, such as Windows 98, Windows 2000, Windows XP or Vista
· 5MB of free disk space
· At least 128MB of memory
· Any Windows-compatible display
Installation
To install Stat/Transfer, place the Stat/Transfer CD-ROM in your drive. The Setup program should start automatically. If it does not, then from the Start menu, choose Run, and then Browse. Go to your CD-ROM drive and select the SETUP file from the disk.
You will next be asked for the drive and directory in which to install Stat/Transfer. The default is C:\PROGRAM FILES\STATTRANSFER9. If you accept this directory and the directory does not exist, the installation program will create it for you. If you wish to install the program on another drive or in a directory with a different name, you can do so by clicking on Browse and then entering the drive and directory.
By default, the installation program will install not only the base Stat/Transfer application, but also the Microsoft components that are necessary to support ODBC and Microsoft Access. If you do not use either format, you can choose not to install these components. If you decide later that you would like to use ODBC or Access, you can run the installation program again and add the necessary components.
Downloading Stat/Transfer from our Website - Trial Version
If you have downloaded Stat/Transfer from our Web site, or otherwise obtained a trial version, it will work as if it were a full version, except that one out of approximately sixteen cases will be deleted from your output file. A message box will warn you of this, and the border at the top of the Stat/Transfer window will say “Trial Mode”.
To turn the trial version into a fully functioning copy of the software, go to our website and purchase a copy with the appropriate license. As soon as the order is processed, you will be sent an activation code by email.
Activation
After installation is complete, you may activate your software, using a code which will be found on your CD envelope or emailed to you. You must have an active internet connection during the activation process.
First, go to the About tab and press the Activate Online button. On the next screen, enter your activation code. Then press Next. You will be asked to enter your name, your organization and your email address. After you press Next again, you will be asked to enter a password, which will be used if you re-activate your software on another computer (see below). You should not use a valuable password and you should write it down in your software manual or another place where you can find it if you need it.
Finally, when you press Next again, your information will be sent to our server and, if your serial number is valid, the activation information will be written to your computer. Once activation is complete, you must restart Stat/Transfer.
The activation process will also send a “machine fingerprint” that identifies your particular computer.
If you have a single-computer license for Stat/Transfer, you are permitted to install it on up to two computers, as long as you are the primary user of both computers (for instance on a home and office computer).
In order to install Stat/Transfer on another machine using the same activation code, you will be asked to re-activate the software after you go through the installation procedure. The re-activation process will ask you for your old password and a new password that identifies your second installation. If you have forgotten or misplaced your old password, you can have it sent to the email address you gave in your initial activation session by clicking on the Forgot your Password? button. Remember to write down both your old and new passwords.
If you have any trouble with the activation process, please contact us for assistance.
The READ.ME File
The installation procedure may also copy a file called READ.ME, which will be a supplement to the documentation. There is a shortcut to the READ.ME file from the Windows Start menu, in the Stat/Transfer group.
We make every effort to keep up with changes in the file formats of popular software and the READ.ME file will contain the latest information on which versions of these programs are supported. The file will also contain the latest information on other improvements to Stat/Transfer.
Demo Files
The distribution disk contains sample files in many of the supported formats, which you may find useful in learning about Stat/Transfer’s capabilities.
The file name indicates which program format each file corresponds to. In addition there is a file, DEMO.WK1, that illustrates the way Stat/Transfer treats different kinds of variables. The installation program will copy these files to the same directory chosen for installation of Stat/Transfer.
Web Update
We periodically post maintenance releases of Stat/Transfer on our Web site to support new file formats, add features, or to fix problems that have come to our attention. However, we have found that many people are not taking advantage of these releases, so that they are using software that is older than it should be. To address this problem, Stat/Transfer will automatically check the Web for updates.
By default, the program will check for new versions once every month, but you can change this option. You can check immediately, daily, weekly, monthly, quarterly, or (if you are running on a computer that is not connected to the Web) never.
In order to change the interval at which Stat/Transfer checks the Web, click on the About tab and select one of the options.
Suppose you choose Every Month (the default). Each time you start Stat/Transfer, the program will compare the current date to the date at which the version was last checked. If the difference is less than thirty days, nothing will happen. If it is 30 days or more, the update program will ask you if you would like to check the Web.
If you choose to do so, the program will check our Web site for the latest version. If it finds a version that is newer than yours, it will download a descriptive file for you to read. You can then choose whether or not to download the latest release (these are generally quite small, less than 800K). If you choose to do so, it will be installed on your computer.
The READ.ME file will also be downloaded, so that you can check back to see what new features have been added.
If you wish, you can tell Stat/Transfer to do an immediate check for a newer version rather than wait for an automatic check.
To do so, go to the About tab and select Right Now. The update program will then check our Web site for updates as described above.
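The interval test described in this section can be pictured with a short Python sketch. It is only an illustration, not Stat/Transfer's own code: the function name, the setting names and the day counts for the daily, weekly and quarterly settings are assumptions. Only the 30-day threshold for the monthly setting comes from the text above.

```python
from datetime import date, timedelta

# Hypothetical day counts for each setting. Only the 30-day monthly
# threshold is stated in the manual; the others are assumptions.
INTERVAL_DAYS = {
    "daily": 1,
    "weekly": 7,
    "monthly": 30,
    "quarterly": 90,
}


def should_offer_update_check(setting, last_checked, today):
    """Return True if the user should be asked whether to check the Web."""
    if setting == "never":
        return False
    if setting == "right now":
        return True
    # Compare the time elapsed since the last check with the interval.
    elapsed = today - last_checked
    return elapsed >= timedelta(days=INTERVAL_DAYS[setting])
```

With the monthly setting, a last check 29 days ago produces no prompt, while a last check 30 days ago does.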
Uninstall Program
If for some reason you would like to remove Stat/Transfer from your hard disk, simply select the Uninstall option from the Stat/Transfer folder on the Start menu.
Technical Support
You can reach our website from the About screen. We also have a general email address, should you want to reach us about anything other than support.
Before you seek support, please check the online help or look in the online manual and see if the solution to your problem can be found there. Be sure to check the “Frequently Asked Questions” section. You can also check to see if your problem is addressed in the Support section of our website.
If you have a problem that you cannot resolve by these methods, the best way to seek help is by email to our support address. Please describe your problem and include any error messages you encounter.
You can also seek support by using the Log menu tab. This method is particularly helpful if you think you have found a bug in Stat/Transfer, because it is possible to automatically send us a compressed and encrypted copy of the input file that was causing you problems, as well as a complete description of your environment and your own description of the problem.
It is always good to make sure you are running the latest version of Stat/Transfer. You can go to the About menu tab and look up the exact version of Stat/Transfer that you are using. You can also check for updates from here.
Using the Stat/Transfer Menus
Version 9 of Stat/Transfer has a standard Windows interface and is extremely simple to use.
If you are going to transfer all of the variables and cases in a file, with default output types, then you can run the transfer from a single dialog box in which you need specify only the input and output file names.
If you wish to transfer specific variables or cases, or change output types, you are guided by additional dialog boxes.
Using Windows
In order to use Stat/Transfer for Windows productively, you should be familiar with basic Windows techniques. This manual assumes that you know how to work in the Windows environment and particularly that you are familiar with the dialog boxes for managing files. If you need help with Windows, see your Windows user’s guide.
Stat/Transfer Online Help
The Stat/Transfer online help contains all of the information found in the manual. You can access the online help by pressing the Help buttons or the ? buttons on the Stat/Transfer menus.
Starting Stat/Transfer
The installation procedure will install a folder for Stat/Transfer on the Programs menu, and a shortcut to Stat/Transfer. A shortcut to Stat/Transfer will also be installed on your desktop.
You can start Stat/Transfer by clicking on the shortcut on your desktop or by clicking on the Start button, then pointing to Programs. Point to the Stat/Transfer folder and when the folder contents appear, click on the StatTransfer shortcut.
You will see the Transfer dialog box.
Transfer Dialog Box
When you start up Stat/Transfer, you will see the Transfer dialog box. It is shown below with a typical transfer job entered:
Selecting the Input File Format
The input file format is selected in the first line of the Transfer dialog box, the Input File Type line. Click on the Input File Type control arrow and you can browse through the list of supported file types. Select an input file type by clicking on it. The file type will be entered in the Input File Type line.
You can obtain information on a given file type by clicking on the ? button.
Selecting the Input Data File
The File Specification Line
The input data file is chosen using the second line of the Transfer dialog box, the File Specification line.
Click on the Browse button to open a standard Windows file Open dialog box.
If your files are named using the Stat/Transfer standard file extensions, given below, you can use the Browse control to select a file.
To select the input file, first make sure that the drive and directory are the correct ones for your input file. If not, change to the correct ones.
Next, you need to select the correct file. Note that a wildcard file specification, ‘*.ext’, has been created for the File Name entry, where ‘ext’ is the Stat/Transfer standard extension for the type of input data file you have selected.
All of the files in the current directory with this extension will appear in a list box below the File Name line. Use this list and click on the name of the file you wish to use.
If the file you wish to use as input does not have a standard extension, then it will not automatically appear and you will need to type the name on the File Name line.
Standard File Extensions
Selecting Worksheet Pages
Whenever you select a worksheet as input, Stat/Transfer will check to see if multiple pages are present. If more than one page is found, Stat/Transfer will display a Worksheet Page selection line below the input File Specification line of the Transfer dialog box. If your worksheet pages are named, as they are in Excel, for example, these names will be used. Otherwise, dummy names, ‘Sheetn’, will be displayed, where n gives the number of the page.
The name of the first page of the worksheet will appear on the Worksheet Page line and, unless you select another one, will be the page used as the input data set by Stat/Transfer. If the data you wish to use are on a different page, click on the control arrow and select the appropriate page from the list that appears.
The option Concatenate Worksheet Pages in the Options(3) dialog box allows you to combine worksheet pages into a single output file. If you check this option, when you name one page, then all of the pages will be read and concatenated into an output file of any type. This option is appropriate if your worksheet file contains many sheets that are identical in structure. These can then be combined into a single output file.
Selecting Tables for Access and ODBC Input
Whenever you select either an Access file or an ODBC data source as input, Stat/Transfer will display a Table selection line below the input File Specification line of the Transfer dialog box.
The name of the first table will appear on the Table line and, unless you select another one, will be the table Stat/Transfer uses as the input data set. If the data you wish to use are in a different table, click on the control arrow and select the appropriate table from the list that appears.
Selecting Members of SAS CPORT and Transport Files
Whenever you select a SAS CPORT or Transport file as input, Stat/Transfer will display a Member selection line below the input File Specification line of the Transfer dialog box.
The name of the first member will appear on the Member line and, unless you select another one, will be the member of the SAS file used as the input data set by Stat/Transfer. If the data you wish to use are in a different member, click on the control arrow and select the appropriate member from the list that appears.
Most Recently Used File Lists
If you often use the same input file, you can use the “Most Recently Used” file list to select the file. For each different file type, Stat/Transfer will maintain a list of the last ten files that have been opened.
You can select any one of these files by first clicking on the control arrow of the File Specification input field to display the list and then clicking on the file you wish to use.
View Input Data
You can now preview your input data by pressing the View button in the Transfer dialog box. Your data will appear in a scrollable grid.
The data can be sorted by any variable by clicking on the variable name. You can navigate to any row by entering the row number in the Quick Navigation box and then pressing Go.
Columns can be moved by clicking and holding the column heading and then dragging the column to the new location.
To return to the Transfer screen, press Close Viewer.
Variable Selection Indicator
When the input file has been specified, Stat/Transfer by default selects all of the variables for transfer. A message will appear below the input File Specification line, telling you that all of the variables in the data set have been selected and giving the total number of variables.
If you wish to transfer all of the variables of the input data set, you need do nothing more to specify them. If you want to select only some of the variables in the input data set, click on the Variables tab at the top of the Transfer dialog box.
Selecting the Output File Format
The output file format is selected in the third line of the dialog box, Output File Type. It is always advisable to give the input file type first, before selecting the output file type.
Click on the Output File Type control to obtain the list of supported file types. Scroll through the list and select a file type by clicking on it.
Available Output Formats
The list of output file formats will be the same as the list of input file formats, but with more choices of version, and with the following exceptions:
· HTML tables will appear on the output format list, since they are written by Stat/Transfer, although they cannot be used as input.
· OSIRIS files will not appear, since they are only read by Stat/Transfer.
· SAS CPORT files will not appear, since they are only read by Stat/Transfer.
· When a worksheet has been chosen as input, then worksheets will not appear in the output format list. These types of conversions, such as a Lotus 1-2-3 worksheet to an Excel worksheet, are not supported since it is usually possible to do them within your spreadsheet program.
· Conversions from one xBASE file type to another are not supported since the file formats of dBASE and FoxPro are identical. Thus if a dBASE file is chosen as input, then FoxPro will not appear on the output format list and vice versa.
Stata Output
The two types of Stata files, Stata (Standard) and Stata/SE, appear in the list of output file formats. By default the latest version of each type will be chosen. You can change the version to be output by using Output Options in the Options(4) dialog box. See Page 33.
SAS Output
SAS V6 and SAS V7-9 will appear in the list of output file types. You can specify the platform you wish for the output by using Output Options in the Options(4) dialog box. See Page 33.
Fixed Format ASCII Choices
If you wish to write fixed format ASCII files, you will see several choices in the list of output file types:
ASCII - Fixed Format (S/T Schema)
ASCII - Fixed Format + All Programs …
SAS Program + ASCII Data File …
SPSS Program + ASCII Data File …
Stata Program + ASCII Data File
These are described on Pages 94 - 97.
Naming the Output File
The output file name is given on the fourth line of the Transfer dialog box. Since Stat/Transfer supplies a default specification for the output file using the input file specification, it is important that you always specify the input file name before the output file name.
Default File Specifications
Once the input file is chosen, Stat/Transfer will construct an output, or destination, file specification which has the same drive, directory and name as the input file but which has the standard extension appropriate for the output file type. (See Page 10 for Stat/Transfer standard extensions.) This name will appear in the fourth line, the output File Specification, of the Transfer dialog box.
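The way a default output specification is derived from the input specification can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program's own logic: the function name is invented, and the small extension table is hypothetical and stands in for the manual's list of standard extensions.

```python
from pathlib import PureWindowsPath

# Hypothetical extension table, for illustration only. It stands in
# for the manual's own list of standard extensions.
STANDARD_EXTENSIONS = {
    "Stata": ".dta",
    "SPSS": ".sav",
    "Excel": ".xls",
}


def default_output_spec(input_spec, output_type):
    """Keep drive, directory and name; swap in the output type's extension."""
    path = PureWindowsPath(input_spec)
    return str(path.with_suffix(STANDARD_EXTENSIONS[output_type]))
```

Under these assumptions, an input file C:\data\survey.xls with an output type of Stata gives the default specification C:\data\survey.dta. Because the default is built from the input specification, the input file has to be named first.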
Changing the Name
If you do not wish to use the default name supplied by Stat/Transfer but instead want the destination file to have a different name or extension, you can use the Browse control to call up the Save As dialog box, or you can type the name directly into the Transfer dialog box.
Changing the Directory - the Most Recently Used List
If you wish to be on a different drive or directory, you can type in the drive or directory directly. However, Stat/Transfer maintains a most recently used list of the directories to which you have transferred files.
You can retrieve this list (which will show the output file name that appears in the output File Specification edit box) by clicking on the down arrow to the left of the Output File Specification box.
Table Names for Access and ODBC
Whenever you select either an Access file or an ODBC data source as output, Stat/Transfer will display a Table selection line below the output File Specification line of the Transfer dialog box.
The default name of the output table will be taken from the input file name. If you wish to use another name, type it in the Table line.
Naming Members of SAS Transport Files
Whenever you select a SAS Transport file as output, Stat/Transfer will display a Member selection line below the output File Specification line of the Transfer dialog box.
The default name of the output member will be taken from the input file name. To use another name, type it in the Member line.
Running the Program
The Transfer Button
When you have specified input and output file types and names (with information on table, member or page, when needed) and, if you wish, you have chosen variables and selected cases, click on the Transfer button and the data will be transferred.
Overwriting Output Files
By default, Stat/Transfer will check to see if the destination file already exists and warn you that an existing file is about to be overwritten. You can suppress this warning in the Options(1) dialog box.
Simple Transfers
If you wish to transfer everything in the input data set and you use the output target types assigned by Stat/Transfer, you need only specify the input and output file types and names in the Transfer dialog box. You can then click on the Transfer button and run the job. You do not need to enter anything in either the Variables or the Observations dialog box.
Stopping a Transfer
While data are being transferred, the Transfer button is labeled Stop. If you click on it, the transfer job will be aborted. This is useful if you start a lengthy transfer and then realize that something is amiss.
When the transfer is complete, a message will appear at the bottom of the Transfer dialog box indicating that the transfer is finished and telling you how many cases were transferred.
Resetting Stat/Transfer
Use the Reset control at the bottom of the Transfer dialog box when you wish to do more than one transfer during a Stat/Transfer session.
Once a transfer has been completed, simply click on the Reset control and the input and output file specifications will be removed, while the input and output file types remain.
Since Stat/Transfer will supply a default specification for the output file using the input file specification, it is important that you always specify the input file name before the output file name.
If you wish to change the input and output file types for a new data transfer, it is advisable to change the input file type first and then the output file type.
Variables Dialog Box
Variable Selection
Automatic Selection of All Variables in the Data Set
When the input file has been specified in the Transfer dialog box, by default Stat/Transfer selects all of the variables for transfer. A message will appear in the Transfer dialog box below the input File Specification line, telling you that all of the variables in the data set have been selected and giving you the total number of variables.
If you wish to transfer all of the variables of the input data set, you need do nothing more to specify them.
Manually Selecting Particular Variables
If you want to select only some of the variables in the input data set, click on the Variables tab at the top of the Transfer dialog box. The Variables dialog box will appear with a list of all of the variables in the input data set, as shown below.
By default, all of the variables are selected. Control buttons SelectAll and UnSelectAll allow you to select or unselect all of the variables.
You can select only some of the variables by going to a particular variable and toggling selection either on or off for that variable.
You can select a range of variables by holding down the SHIFT key, then clicking on the check box of the first variable of the range and then clicking on the check box of the last variable of the range.
Quick Variable Selector
The box in the upper right corner enables you to specify selection criteria for the variables displayed in the listbox at the left of the page. This is considerably less tedious for long lists of variables than manually checking or unchecking them.
Selection conditions can take the form of the wildcard characters ‘*’ or ‘?’ or you can use variable ranges. The question mark matches exactly one character, while the asterisk matches more than one.
Unlike DOS wildcards, more than one asterisk can be included in a specification. For instance: ‘*inc*’ will match any variable with the string ‘inc’ in any position. Ranges of contiguous variables can be specified with a dash (without spaces) between two variable names. For instance ‘distance-a9’ will select (or drop) variables ‘distance’ through ‘a9’, inclusive.
Space or comma delimited lists of conditions can be entered at one time. For example:
factor1,cluster,a2-a10,L1*
followed by a click on the Drop button, will uncheck the variables ‘factor1’, ‘cluster’, ‘a2’ through ‘a10’, and any variable which starts with the string ‘l1’.
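The selection conditions above can be approximated with a small Python sketch. It is not the Quick Variable Selector's actual implementation. The function names are hypothetical, and matching is assumed to be case-insensitive because the example matches ‘L1*’ against names starting with ‘l1’. Variable names are assumed not to contain dashes. Python's fnmatch is used for the wildcards, so ‘*’ here also matches an empty string and ‘[’ has a special meaning, which may differ from Stat/Transfer.

```python
import re
from fnmatch import fnmatchcase


def match_conditions(variables, conditions):
    """Return the variables matched by a space- or comma-delimited list."""
    lowered = [name.lower() for name in variables]
    matched = set()
    for cond in re.split(r"[,\s]+", conditions.strip().lower()):
        if not cond:
            continue
        if "*" in cond or "?" in cond:
            # Wildcard condition: '?' is one character, '*' is any run.
            hits = [i for i, n in enumerate(lowered) if fnmatchcase(n, cond)]
        elif "-" in cond:
            # Range of contiguous variables, inclusive, in data set order.
            # An unknown name raises ValueError in this sketch.
            first, last = cond.split("-", 1)
            start, end = lowered.index(first), lowered.index(last)
            hits = list(range(start, end + 1))
        else:
            # Plain variable name.
            hits = [i for i, n in enumerate(lowered) if n == cond]
        matched.update(hits)
    return [name for i, name in enumerate(variables) if i in matched]


def drop(variables, conditions):
    """Uncheck (remove) every variable matched by the conditions."""
    dropped = set(match_conditions(variables, conditions))
    return [name for name in variables if name not in dropped]
```

A Keep operation would simply return the result of match_conditions. Note that a range follows the order of variables in the data set, not alphabetical order.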
If needed, you can successively refine your selection by entering conditions and then clicking on either the Drop or Keep buttons, or, alternatively, by manually checking or unchecking variables in the list box.
Select all of the variables you want to transfer. When you have finished, you can click on the Transfer tab at the top of the dialog box and you will return to the Transfer dialog box, where you will see a message telling you how many variables have been selected.
Con trol ling the Types of Out put Vari ables
Tar get Out put Vari able Type
Sys tems dif fer widely in the num ber and va ri ety of vari able types they sup port When data are trans ferred from one file type to an other, a vari able type in the out put for mat must be as signed to each ofthe vari ables be ing trans ferred
Note that with Stat/Trans fer, nu mer i cal pre ci sion is never lost in the trans fer pro cess, since all nu mer
-i cal var-i ables are stored -in ter nally as dou ble pre c-i s-ion float -ing po-int num bers and are then wr-it ten out
ac cord ing to the as signed vari able type
Stat/Trans fer will au to mat i cally as sign out put types when you se lect vari ables for trans fer or whenyou choose to have the out put types op ti mized, as de scribed be low In most cases it is ap pro pri ate to
ac cept the out put types that Stat/Trans fer chooses How ever, there are times that you may wish toover ride these de faults and set the out put types man u ally
Target Types Assigned by Stat/Transfer
When assigning default output variable types, Stat/Transfer attempts to use all of the information at its disposal about the input data variables in order to preserve numeric precision and, at the same time, minimize the size of the output data set. If you do not choose to have output types optimized, then information about the variables generally comes from the input file “dictionary,” which describes the variables. If output types are optimized, which is the default behavior, then additional information is obtained by examining the values of variables. This is discussed below.
When reading numerical variables, Stat/Transfer selects a target output variable type based on the information available to it. This target variable type is not used for internal storage during the transfer, but is simply the preferred output type. If this type is not supported in the chosen output file type, the best approximation will be chosen.
The various target output variable types used by Stat/Transfer are given below.
The target type assigned to each variable can be seen in the Variables dialog box.
To see the target type for a particular variable, click once on the variable name, so that the variable is the active one. The variable name will appear above the list of target types and a black dot will appear next to the assigned target type.
If you turn off optimization of target types, then when insufficient information is given about the variables in your input data set to make a specific assignment, Stat/Transfer will generally assign ‘float’ as the output variable type. This is discussed below.
Remember that the target type will not necessarily be the actual output type. If the target type assigned to a variable by Stat/Transfer is available as one of the variable types of the output file format, then that type will be used for the output. If the assigned target type is not one of the available output types, then a format of the next larger size will be used.
Optimizing Target Types
Stat/Transfer attempts to produce the smallest possible output data set. On the first pass through the data, information from the data file dictionary will be used. Unfortunately, for some input data types, this information is not sufficient to do anything other than set all of the output variable types to ‘float’.
By default, Stat/Transfer will make an additional optimization pass through your data to determine more information about each variable. This pass will only be performed if the selected output file type is such that the output file could be made smaller by optimization. Some output file types, such as Stata, have a rich assortment of storage types and benefit from optimization. Others, such as worksheets or files that have only one numeric type (SPSS, for example), do not benefit.
Stat/Transfer can determine whether any variables can be represented as integers, and, for those cases, it can determine the smallest possible integral type that can be used to represent the data. Further, if a variable cannot be represented by an integral type, Stat/Transfer can automatically determine whether it can be represented by a float instead of a double without a loss of information. Information on the maximum length of string variables is also accumulated, so that these can be stored in variables of the smallest possible length.
You can change this behavior so that Stat/Transfer does not perform an optimization pass by setting the option Automatically Optimize Target Types in the Options(1) dialog box to Off.
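The kind of decision made during an optimization pass can be illustrated with a short Python sketch. This is not Stat/Transfer's actual code: the function names are hypothetical, the ‘long’ range is assumed to be the full four-byte signed range (some output formats reserve part of it), and “without a loss of information” is interpreted here as an exact round trip through four-byte IEEE single precision.

```python
import struct

# Integer ranges for the byte/int/long target types. The 'long' range is
# an assumption (generic four-byte signed integer).
INT_TYPES = [
    ("byte", -128, 127),
    ("int", -32768, 32767),
    ("long", -2**31, 2**31 - 1),
]

def fits_float(x):
    """True if x survives a round trip through four-byte single precision."""
    try:
        return struct.unpack("<f", struct.pack("<f", float(x)))[0] == x
    except OverflowError:
        # Magnitude too large for single precision.
        return False

def smallest_numeric_type(values):
    """Pick the smallest target type that holds every non-missing value.

    Missing values are represented by None in this sketch.
    """
    present = [v for v in values if v is not None]
    if all(float(v).is_integer() for v in present):
        lo = min(present, default=0)
        hi = max(present, default=0)
        for name, tmin, tmax in INT_TYPES:
            if tmin <= lo and hi <= tmax:
                return name
    if all(fits_float(v) for v in present):
        return "float"
    return "double"

print(smallest_numeric_type([0, 1, None, 100]))   # byte
print(smallest_numeric_type([123456789]))         # long
print(smallest_numeric_type([0.5, 1.25]))         # float
```

Under this strict criterion a value such as 0.1 needs a ‘double’, because it has no exact single precision representation; a real implementation may apply a looser rule.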
Stat/Transfer Target Output Variable Types
byte        one-byte signed integer (-128 to 127)
int         two-byte signed integer (-32768 to 32767)
long        four-byte signed integer
float       four-byte IEEE single precision floating point number
double      eight-byte IEEE double precision floating point number
date        date stored as serial day number (the number of days since December 30, 1899)
time        fraction of a day (12:00 noon = .5)
date/time   floating point number (integer part - serial day number, fractional part - time)
string      character string of a maximum length specified by the input file
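The date, time and date/time encodings in the table can be made concrete with a small Python sketch. The function names are hypothetical; only the epoch (December 30, 1899) and the “fraction of a day” convention come from the table.

```python
from datetime import date, datetime

EPOCH = date(1899, 12, 30)  # serial day 0, as stated in the table

def serial_day(d):
    """Number of days since December 30, 1899."""
    return (d - EPOCH).days

def day_fraction(hour, minute=0, second=0):
    """Time of day expressed as a fraction of a day (noon = 0.5)."""
    return (hour * 3600 + minute * 60 + second) / 86400

def serial_datetime(dt):
    """Integer part is the serial day number, fractional part is the time."""
    return serial_day(dt.date()) + day_fraction(dt.hour, dt.minute, dt.second)

print(serial_day(date(1900, 1, 1)))                 # 2
print(day_fraction(12))                             # 0.5
print(serial_datetime(datetime(1900, 1, 1, 18, 0))) # 2.75
```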
In versions of Stat/Transfer prior to Version 7, the default setting for the Automatically Optimize Target Types option was ‘Off’. In order to read ten digit numbers such as Social Security numbers correctly, the option had to be changed to ‘On’, or the Optimize button had to be clicked, or the output variable type had to be changed manually.
To eliminate this problem, automatic optimization of target types is the default behavior in Version 7 and above. This should cause very little difference in Stat/Transfer’s performance. It will simply take a little longer (usually a few seconds) to transfer your data, as Stat/Transfer has to read it twice. However, with automatic optimization, you are assured that you will never lose precision in your transfer. Furthermore, Stat/Transfer will only make an optimization pass when output file variable types will benefit from the additional information. For file types such as worksheets or delimited ASCII, it will not bother. Therefore, we suggest you leave optimization turned on.
Note that if you do set Automatically Optimize Target Types to ‘Off’, you can still optimize for any given transfer job by clicking on the Optimize button in the Variables dialog box.
Use Doubles Option
Whether you choose to optimize automatically or to do it by pressing the Optimize button in the Variables dialog box, you still need to decide whether to check the Use Doubles option in order to tell Stat/Transfer to put variables with fractional parts into ‘double’ or ‘float’ on output.
If you choose to use doubles, Stat/Transfer will still evaluate each variable to see if it can be represented as a ‘float’ without a loss of information and will put only those variables that require it into a ‘double’. However, unless your data are measured with more than eight or nine digits of precision (survey data, for example, never are), this is an idle exercise and you should not check the Use Doubles option.
Automatic Dropping of Constants from Output File
You can tell Stat/Transfer to automatically drop variables that are constant or missing for a selected subset of data. You select this option by checking the Drop Constants check box and then pressing the Optimize button.
This feature is useful when the part of a data set selected for transfer contains variables with values that are either constant or missing (such as a pregnancy variable when only male subjects are selected, or variables in yearly surveys where the same questions do not appear for each year).
This feature is not likely to be used often, but is extremely valuable when it is needed, since if the data set has a large number of variables, it can be exceedingly tedious to select only the meaningful ones manually.
Changing the Types of the Output Variables
In most cases it is appropriate to accept the output type that Stat/Transfer chooses. However, there may be times when you wish to specify the output types for some variables, since sometimes Stat/Transfer does not have enough information to make the best assignment of an output variable. For example, if social security numbers are stored as numbers instead of strings in the input file, Stat/Transfer will generally convert them into floats on output, possibly resulting in the loss of several digits in the output data set. You can avoid this loss of key values by specifying that social security numbers be stored as longs or doubles on output.
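The loss of digits described above is a property of four-byte IEEE single precision, which carries a 24-bit significand, and can be reproduced outside Stat/Transfer. The following Python sketch uses a made-up nine-digit identifier and a hypothetical helper, as_float32, that rounds a number to single precision and back.

```python
import struct

def as_float32(x):
    """Round x to four-byte IEEE single precision and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

ssn = 123456789                 # made-up nine-digit identifier
stored = int(as_float32(ssn))   # what survives storage as a 'float'
print(stored)                   # 123456792

# Integers up to 2**24 = 16777216 are stored exactly.
print(int(as_float32(16777216)))  # 16777216
```

Storing the same value as a ‘long’ or a ‘double’ preserves every digit, which is why those types are recommended for identifiers of this length.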
In some transfers, you may prefer a larger data set than Stat/Transfer chooses, with more precision in some or all of the variables. For instance, in the absence of specific information to the contrary, Stat/Transfer will usually choose ‘float’ (four-byte, floating-point) format for numeric variables. However, you may wish to convert these into double precision numbers if you know that they represent large monetary amounts.
In other cases, you may be able to create a smaller data set than Stat/Transfer chooses. For example, if you know that your data represent small integers, you may wish to put them into ‘byte’ or ‘integer’ variables.
Manually Changing the Output Types
To choose the output storage type of selected variables yourself, rather than have it automatically assigned, click on the Variables tab at the top of the Transfer dialog box. The Variables dialog box will appear. This screen displays a list of all of the variables in the input data set.
When you choose any one of these variables, the output type automatically assigned by Stat/Transfer is displayed on the buttons on the right of the screen. If you wish to change the output type for a particular variable, click on the new type you want to assign that variable.
Output variable types can be changed freely for ASCII files and worksheet files. For all other file formats, you can change freely among the numeric types of ‘byte’, ‘integer’, ‘long’, ‘float’ and ‘double’ and you can change among the time types. However, conversions between any of the numeric types and dates or strings are not supported.
You should be careful not to choose a smaller type than that chosen by Stat/Transfer unless you are sure you know more about your data than Stat/Transfer does.
Remember that you are selecting a “target” type. If the output data format does not support the specific type you have selected, then Stat/Transfer will use the best match to the type you have selected. You can determine the output variable types supported for each output file type by consulting the table given in the section of this manual describing that program.
Handling Mixed Data
If you have mixed data in which some variables need doubles and others do not (for example, you might have precisely measured dollar amounts, which should be in doubles, along with scales of survey items, which should be in floats), you should press the Optimize button in order to designate integers for the right variables and then designate floats and doubles to reflect the appropriate level of measurement for each variable.
Value Labels for Strings
Both SAS and SPSS support the labeling of string variables. Stat/Transfer will automatically transfer such value labels, both to the internal file formats of SAS or SPSS and to the program files written by Stat/Transfer to create fixed format ASCII files.
Note: Stat/Transfer stores all numbers internally as eight-byte double precision numbers, so that numerical precision of double precision input will be retained if you manually change a target type from ‘float’ to ‘double’.
Observations Dialog Box
To reach the dialog box that allows you to select specific cases or records from your data set, click on the Observations tab at the top of the dialog boxes. This will bring the Observations dialog box to the front, shown below with an example data set.
Selecting Cases from the Input File
The scrolling text box in the upper left corner provides brief, onscreen documentation on how to select particular data records based on conditions that you specify. The variables of the input data set are listed in the box at the right of the screen.
At the bottom of the screen is the case-selection field in which you enter the case selection, or WHERE, expression that will specify cases. This expression gives the conditions on the variables that will define the subgroup of the data set that you wish to select.
Variable names can be entered in this field by selecting their names from the variable list box. When you double-click on a variable name it will be copied to the case-selection box.
Case-Selection Expressions
The WHERE statement is used to give the conditions on the variables that will define the subgroup of the data set that you wish to select.
The case-selection, or WHERE, expression has the following form:
WHERE variable expression relational operator selection condition
Here, variable expression consists of a single variable or an expression involving several variables, relational operator is one of the operators listed below, while selection condition gives specifications for the variables to be selected.
Variable Expression
All of the usual arithmetic operators [+ - / * ( ) ] are available for use in this expression.
If variable names used in WHERE expressions contain embedded blanks or characters such as relational or arithmetic operators like ‘/’, then they must be enclosed in single quotes.
Internal Variable
An internal variable, ‘_rownum’, is available which allows specific rows or records of the data set to be selected.
<=    less than or equal
>=    greater than or equal
&     and
,     or (used in a series)
The modulus operator is also available:
%     the remainder after division by the operand following
Selection Conditions
If variable values consist of strings, then when they contain blanks or characters such as ‘/’, they must be enclosed in double quotes.
Examples
Examples of selection conditions given by WHERE expressions are:
where educ = 12 & rate > 2
where (income1 + income2)/famsize < 20000
where income1 >= 20000 | income2 >= 20000
where acct != 2001
where name = smith
where 'dept-sales' = "auto loan"
where id % 2 = 0 (which selects all even values of ‘ID’)
where _rownum < 200 (which selects rows 1 - 199)
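For readers who think in terms of a general-purpose language, several of these WHERE expressions can be mirrored as Python predicates over a small, made-up table. This is only an analogy: the helper where and the sample rows are hypothetical, and Stat/Transfer evaluates WHERE expressions itself rather than through Python.

```python
rows = [
    {"id": 1, "educ": 12, "rate": 3, "income1": 25000, "income2": 0},
    {"id": 2, "educ": 12, "rate": 1, "income1": 9000, "income2": 21000},
    {"id": 3, "educ": 16, "rate": 4, "income1": 8000, "income2": 5000},
    {"id": 4, "educ": 12, "rate": 5, "income1": 30000, "income2": 30000},
]

def where(rows, predicate):
    """Keep rows for which predicate(row, rownum) is true; rownum starts at 1."""
    return [r for n, r in enumerate(rows, start=1) if predicate(r, n)]

# where educ = 12 & rate > 2
a = where(rows, lambda r, n: r["educ"] == 12 and r["rate"] > 2)
# where income1 >= 20000 | income2 >= 20000
b = where(rows, lambda r, n: r["income1"] >= 20000 or r["income2"] >= 20000)
# where id % 2 = 0
c = where(rows, lambda r, n: r["id"] % 2 == 0)
# where _rownum < 3
d = where(rows, lambda r, n: n < 3)

print([r["id"] for r in a])  # [1, 4]
print([r["id"] for r in b])  # [1, 2, 4]
print([r["id"] for r in c])  # [2, 4]
print([r["id"] for r in d])  # [1, 2]
```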
Wildcards in Selection Conditions
Wildcards ( * or ? ) are available to select subgroups of string variables. For example:
where account = ?3*
where name = mc* | name = mac*
Note that when the wildcard ‘?’ is used, it replaces a single character, while the wildcard ‘*’ replaces an unspecified number of characters. Thus the specification ‘?3*’ will select account numbers of any length that have a three in the second place.
Comma Operator
The comma operator ‘,’ is used to list different values of the same variable name that will be used as selection criteria. It allows you to bypass potentially lengthy OR expressions when selecting lists of values. For example, the WHERE expression above can be more easily written:
where name = mc*,mac*
Other examples are:
where age = 21,31,41,51,61
which will select only the listed ages, and
where caseid != 22*,30??,4?00
which will select all cases except those id’s starting with ‘22’, or four character id’s starting with ‘30’, or starting with ‘4’ and ending with ‘00’.
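The wildcard and comma semantics described above resemble shell-style pattern matching, so they can be approximated with Python's standard fnmatch module. The helper matches_any is hypothetical, and two points are assumptions rather than documented Stat/Transfer behavior: matching is treated as case-sensitive, and fnmatch additionally gives special meaning to square brackets.

```python
from fnmatch import fnmatchcase

def matches_any(value, patterns):
    """True if value matches any pattern in a comma-separated list.

    '?' stands for exactly one character, '*' for any number of characters.
    """
    return any(fnmatchcase(str(value), p) for p in patterns.split(","))

# where account = ?3*
print(matches_any("1345", "?3*"))   # True  (three in the second place)
print(matches_any("3145", "?3*"))   # False

# where name = mc*,mac*
names = ["mcdonald", "macarthur", "smith"]
print([n for n in names if matches_any(n, "mc*,mac*")])
# ['mcdonald', 'macarthur']

# where caseid != 22*,30??,4?00
ids = ["2201", "3012", "305", "4100", "4101", "5000"]
print([c for c in ids if not matches_any(c, "22*,30??,4?00")])
# ['305', '4101', '5000']
```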
Missing Values
You can test to see if the value of any variable is missing by comparing it to the special internal variable ‘_missing’.
For example:
where income != _missing & age != _missing
Preserving WHERE expressions
Ordinarily the WHERE expression is cleared after a transfer operation. If you wish to apply the same expression to several input files, you can check the box Preserve expression between transfers and your expression will be available for re-use or editing for your next transfer run.
Sampling Functions
Three functions are available for sampling.
Random Samples
The first function
samp_rand(prop)
allows for simple random sampling. Each case is selected with a probability equal to prop.
For example, for a random sample of one tenth of a data set, use:
where samp_rand(.1)
Random Samples of Fixed Size
The second function
samp_fixed(sample_size,total_observations)
allows a random sample of fixed size to be drawn. When using this function, the first case is drawn with a probability of sample_size/total_observations, and the succeeding i’th case is drawn with a probability of (sample_size - hits) / (total_observations - i).
For example, if you had a data set of 1000 cases and wished for a random sample of 25 cases, you would specify:
where samp_fixed(25,1000)
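The probability rule for samp_fixed is the classic sequential selection scheme, and a short Python sketch shows why it always yields exactly the requested number of cases. The Python function below merely borrows the name; the rng parameter, the 1-based case numbers it returns, and the reading of i as the number of cases already examined are assumptions, and the random number generator is Python's own rather than the one Stat/Transfer uses.

```python
import random

def samp_fixed(sample_size, total_observations, rng=None):
    """Return the 1-based case numbers chosen by sequential selection."""
    rng = rng or random.Random()
    hits = 0
    selected = []
    for i in range(total_observations):
        # i cases have been examined so far and 'hits' of them were drawn.
        # The probability reaches 1 when every remaining case is needed
        # and drops to 0 once the sample is complete.
        if rng.random() < (sample_size - hits) / (total_observations - i):
            selected.append(i + 1)
            hits += 1
    return selected

sample = samp_fixed(25, 1000, random.Random(12345))
print(len(sample))  # 25
```

Passing a generator built from a fixed seed, as in the usage line, gives the same sample on every run.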
Systematic Random Samples
Finally, a third function
samp_syst(interval)
performs a systematic sample of every n’th case after a random start. For instance, to take every 6’th case, use:
where samp_syst(6)
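Systematic sampling with a random start can be sketched in a few lines of Python. The function below is an illustration only: unlike the one-argument samp_syst in a WHERE expression, it takes the number of cases explicitly, and it assumes the random start falls within the first interval.

```python
import random

def samp_syst(interval, total_observations, rng=None):
    """Return 1-based case numbers: a random start, then every interval'th case."""
    rng = rng or random.Random()
    start = rng.randrange(1, interval + 1)  # start somewhere in 1..interval
    return list(range(start, total_observations + 1, interval))

cases = samp_syst(6, 100, random.Random(3))
# Consecutive selected cases are always 6 apart.
print(all(b - a == 6 for a, b in zip(cases, cases[1:])))  # True
```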
Sampling Subsets of the Input Data
Expressions are evaluated from left to right. You can thus sample from a subset of your cases by subsetting them first and then sampling. For example, to take a random half of high school graduates, use:
where schooling >= 12 & samp_rand(.5)
Sampling Seed and Reproducible Samples
The random number generator that provides the basis of these sampling routines is ‘rand_port()’ in Jerry Dwyer, “Quick and Portable Random Number Generators.” C Users Journal, June, 1995, pp. 33-44. By default, it is seeded using a permutation of the time of day, and will yield a different sample on each run.
If you need a reproducible sample, you can generate it by using the same seed each time. The seed is entered in the Options(1) dialog box and should be a positive integer in the range of one through 2,147,483,646.
Options(1) Dialog Box
To reach the first of the four dialog boxes that allow you to set different options, click on the Options(1) tab at the top of the dialog boxes.
General Options
Ask Permission before Overwriting Files
The option Ask Permission Before Overwriting Files is on by default. If a file or a database table already exists, you will be prompted for permission before it is overwritten.
If you wish to suppress these warning messages, click on the box to remove the check mark.
Preserve Variable Name and Label Case if Possible
Stat/Transfer always follows the variable-naming rules of the output file type and will convert input names so that they will conform to those rules. It also, by default, tries to convert labels in keeping with the “spirit” of the target package. This means that for packages such as S-Plus and Stata, Stat/Transfer will write out variable names and labels in lower case.
For S-Plus and Stata only, if you want to override this behavior, click on the box to select the option Preserve Variable Name and Label Case if Possible and the case of your input variables will be preserved on output.
Write New, Numeric Variable Names
When you go from one format to another, by default Stat/Transfer will create legal variable names for you, based as much as possible on the original names. In particular, when you transfer from systems such as Paradox or JMP, which allow long variable names with embedded spaces, to a system such as SPSS, which restricts variable names to eight characters, by default Stat/Transfer will truncate for you. However, these truncated names often have little resemblance to the names you started with. Stat/Transfer will use the variable names as variable labels, so that your original names are available.
If you check the option Write new, numeric variable names (VN), instead of the default variable names, Stat/Transfer will create new variable names of the form V1 ... VN. This is chiefly useful when dealing with truncated names. If your output system supports variable labels, it is sometimes better to check this option and have Stat/Transfer simply create numeric names for your variables. You can then use the variable labels for the description.
Because this option is likely to be useful only in special circumstances, it reverts to the default between sessions.
Preserve value label tags and sets
Many software packages allow users to assign the same set of value labels to more than one variable. (In SAS, the term for value labels is “user-defined format”.) For example, a survey with a list of questions with “Yes” and “No” responses could use the same set of value labels for the variables associated with each of these questions.
If the option Preserve value label tags and sets is checked, the mapping of value label sets to multiple variables will be preserved on output. If tags are used in the input file to identify value label sets, these will be preserved. Otherwise, tags will be constructed by Stat/Transfer (LABA-LABZ and so on).
If this option is not checked, each labeled variable will have a unique value label set and the tag used to identify the set will be constructed from the name of the variable.
Automatically Optimize Target Types
The option Automatically Optimize Target Types allows you to choose whether or not Stat/Transfer will perform a separate optimization pass on your data before it is actually transferred to your output data set. This pass will only be performed if the selected output file type is such that the output file could be made smaller by optimization. Optimization should cause very little difference in Stat/Transfer’s performance. It will simply take a little longer (usually a few seconds) to transfer your data, as Stat/Transfer has to read it twice.
This option is set to ‘On’ by default. If you do not wish to have appropriate output file types automatically optimized, uncheck the Automatically Optimize Target Types box.
If you turn the Automatically Optimize Target Types option to ‘Off’, then numbers with many significant digits may lose precision and your output file may be larger than necessary.
With automatic optimization, you are assured that you will never lose precision in your transfer. Furthermore, Stat/Transfer will only make an optimization pass when output file variable types will benefit from the additional information. For file types such as worksheets or delimited ASCII, it will not bother. Therefore, we suggest you leave optimization turned on.
Note that if you do set Automatically Optimize Target Types to Off, you can still optimize for any given transfer job by clicking on the Optimize button in the Variables dialog box.
Use Doubles
The Use Doubles box can be checked here only if you choose the Automatically Optimize Target Types option. By default it is off.
Check the Use Doubles option if the precision of measurement of any of your variables is greater than eight or nine decimal digits.
Seed for Sampling Functions
By default, the Seed for Sampling Functions option has the value ‘Autogenerate’. In this case, the sampling functions in WHERE expressions will generate a starting seed randomly based on the clock time. This means that each time you run a transfer on a given file you will select a different sample.
If, in contrast, you need a reproducible sample, you can enter a seed for the random sampling process. The seed should be a positive integer in the range of one through 2,147,483,646.
User Missing Values
You have some control over the way missing values are treated for input files containing more than one type. At present the User Missing Values options apply to SPSS files (both Data and Portable) and OSIRIS. The options are selected with the buttons Use All, Use First or Use None.
Some statistical systems distinguish between “system missing,” such as the result of a divide by zero, and “user-missing,” a numeric value which is defined as a missing value by the user. Further, particularly in survey research, distinctions are made between user-defined missing values that represent structurally missing data (such as answers to pregnancy history questions from male respondents), and those that represent categories of non-response or simply the failure of the interviewer to properly collect the data.
Conventionally, zero is used to represent “inapplicable” missing values, and higher numbers are used to represent such responses as “don’t know,” “refused” and “not ascertained”. While inapplicable data is analytically equivalent to “system missing”, there can be legitimate research interest in the patterns of non-response represented by the other categories of missing data.
Use All
By default, when multiple missing values are allowed (as in SPSS, for example) they are all mapped onto missing values on output. This corresponds to selection of the Use All button.
The mapping to extended missing values on output is determined by the option below, Map to extended (a-z) missing.
Use First
If you select Use First, the first user-defined missing value will be mapped to a missing value and the rest will be treated as data and transferred intact to the target data set. Use First will often be the most useful of the options, since it will allow tabulations in the target package of patterns of non-response.
The mapping to a missing value on output is determined by the option below, Map to extended (a-z) missing.
Use None
If you choose Use None, then all of the user-defined missing values will be transferred to the target data set retaining their input values.
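The difference between the three settings can be summarized in a few lines of Python. This is a sketch under stated assumptions: the function name and the mode strings are hypothetical, a missing value on output is represented by None, and the sample codes follow the convention described above (0 for “inapplicable”, higher codes for kinds of non-response).

```python
def apply_user_missing(values, user_missing, mode="all"):
    """Map user-defined missing codes to None according to the chosen option.

    mode "all"   -> every user-missing code becomes missing (Use All)
    mode "first" -> only the first user-missing code becomes missing (Use First)
    mode "none"  -> all codes are kept as data (Use None)
    """
    if mode == "all":
        to_missing = set(user_missing)
    elif mode == "first":
        to_missing = set(user_missing[:1])
    elif mode == "none":
        to_missing = set()
    else:
        raise ValueError("mode must be 'all', 'first' or 'none'")
    return [None if v in to_missing else v for v in values]

values = [0, 3, 8, 9, 5]   # made-up responses
codes = [0, 8, 9]          # 0 = inapplicable, 8 = don't know, 9 = refused

print(apply_user_missing(values, codes, "all"))    # [None, 3, None, None, 5]
print(apply_user_missing(values, codes, "first"))  # [None, 3, 8, 9, 5]
print(apply_user_missing(values, codes, "none"))   # [0, 3, 8, 9, 5]
```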
Note that we believe these options are potentially dangerous. To avoid the chance of users checking one of these options and then forgetting about it, Stat/Transfer does not save the settings when options are automatically saved at the end of a session.
Map to extended (a-z) missing values
By default (when this option is left unchecked), all user missing values that are selected according to the options above (Use All/Use First/Use None) will go to a single missing value which will then be converted to the “system” missing value in the target package (‘.’ in SAS or Stata, for example).
If the option Map to extended (a-z) missing is checked, user missing values will be mapped, if possible, to extended missing values in formats that support them (SAS, ASCII, or Stata).
If possible, the first letter of the value label will be used as the missing value. For instance, if the value ‘0’ is a user missing value and is labeled as “inapplicable”, it will be mapped to ‘.I’. This mapping will only occur for missing values that are computed with an equal operator.
If there is no label, or if the missing letter has already been used, the missing value will be mapped sequentially to ‘.a’ - ‘.z’.
Date/Time Formats
Date/Time Formats - Writing
Stat/Transfer gives you considerable control over how dates and times are written to output ASCII files (see below for controlling how dates and times are read). You can control the formatting for date values, time values and combined date/time values in the Date, Time and Date/Time edit boxes.
Output formats that you supply are used to convert date and time values to character strings. Each date or time part of the output format has the form ‘%char’. A leading zero in the specification causes the value to be printed with leading zeros. For example, ‘%0d’ will print the day of the month with a leading zero.
The characters below are used to create the output formats. Anything to be printed in the output character string that is not in the list below, such as commas, spaces or other delimiters, must be given explicitly in the output format.
%a     abbreviated weekday
%A     full name of weekday
%b     abbreviated name of month
%B     full name of month
%d     day of the month (1 - 31)
%D     day of the year (1 - 366)
%H     hour (24 hour clock) (0 - 23)
%I     hour (12 hour clock) (1 - 12)
%m     month as number (1 - 12)
%M     minutes (0 - 59)
%N     milliseconds (0 - 999)
%1N    tenths of seconds (0 - 9)
%2N    hundredths of seconds (0 - 99)
%S     seconds (0 - 59)
%y     year as two digits
%Y     year as four digits
%%     % character
The default formats for converting dates and times to strings are:
Date: %m/%d/%Y (5/18/1945)
Time: %0H:%0M:%0S (14:05:48)
Date/Time: %m/%d/%Y %0H:%0M:%0S (10/1/1990 02:20:09)
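The effect of the leading-zero modifier can be checked with a small Python formatter that mimics a subset of the output specifiers. This is a hypothetical re-implementation for illustration only: it handles just %d, %m, %Y, %H, %M, %S and %%, and the zero-padded widths (two digits, four for %Y) are assumptions. Note that these specifiers differ from Python's own strftime, where %d is always zero padded.

```python
import re
from datetime import datetime

# Subset of the output specifiers: (value getter, width used when '0' is given).
FIELDS = {
    "d": (lambda t: t.day, 2),
    "m": (lambda t: t.month, 2),
    "Y": (lambda t: t.year, 4),
    "H": (lambda t: t.hour, 2),
    "M": (lambda t: t.minute, 2),
    "S": (lambda t: t.second, 2),
}

def format_datetime(t, fmt):
    """Expand '%char' and '%0char' specifiers; other text is copied as is."""
    def expand(match):
        zero, char = match.group(1), match.group(2)
        if char == "%":
            return "%"
        getter, width = FIELDS[char]
        text = str(getter(t))
        return text.zfill(width) if zero else text
    return re.sub(r"%(0?)([dmYHMS%])", expand, fmt)

print(format_datetime(datetime(1945, 5, 18), "%m/%d/%Y"))
# 5/18/1945
print(format_datetime(datetime(1990, 10, 1, 2, 20, 9), "%m/%d/%Y %0H:%0M:%0S"))
# 10/1/1990 02:20:09
```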
Date/Time Formats - Reading
You can use these options to control how Stat/Transfer reads dates and times for delimited ASCII input files. In most cases, users will not need to change the default settings in order to read date and time variables. However, if you do need to do so, you provide a general “scanning” input format in the Scan edit box, which is used when an ASCII file is opened for reading. The input format given in Scan is used in the initial look at the file, which will determine variable types and also specific formats for dates, times, and date/time variables. The default input format is constructed so that it will decipher a number of different date and time possibilities.
Note that if you are using a SCHEMA file to describe your data, the date and time formats given there will override the formats set here.
The separate input formats for date, time and date/time variables are given in the Date:, Time:, and Date/Time: edit boxes and must match those given in the general scanning input format. (The entries in these edit boxes are used when the data file is being used in a transfer and allow for much more efficient reading of the file.)
The input format strings given in the Scan:, Date:, Time:, and Date/Time: edit boxes are used to convert character strings to date/time variables. The format strings are read from left to right. If a width is given explicitly for a particular variable, that will be used when reading the character string. Otherwise, the width will be determined by the presence of delimiter characters. Characters in the list below allow characters in a string to be skipped, if necessary.
If the entire input string is not matched by the format string or if the resulting time or date is not valid, the variable will be set to missing.
Each date and time part of the input format, as well as some special characters, have the form ‘%char’ or ‘%Xchar’, where the modifier X is used to determine field widths.
The two different cases of ‘%Xchar’ are:
%numberchar   When a number precedes the specification character, char, it specifies the field width to be used. The next number characters in the input string are scanned for the specified date or time.
%:delimchar   If there is a colon and any single character, delim, preceding the specification character, char, then the field to be read is taken to be all the characters up to but not including the given delimiter character, delim. The delimiter itself is not scanned or skipped by the format, and therefore must be entered explicitly in the input format or explicitly skipped. (Note that the modifier :delim need not be used routinely, since numeric and alpha formats will automatically stop reading when they reach a delimiter.)
The char ac ters used to cre ate the in put for mats are listed be low Any thing to be read from the in putchar ac ter string that is not in the list be low, such as com mas or other de lim it ers, must be given ex plic -itly in the in put for mat White space (spaces, tabs, car riage re turns, and so on) is ig nored in the in putfor mat string
%c          skip a single character (see also %w)
%Nc         skip N characters
%$c         skip the rest of the input string
%d          input day of the month
%H          input hour
%m          input month, as integer or as alpha string
            (If alpha string, case does not matter, and any substring of a month that distinguishes it from the other months will be accepted.)
%M          input minute
%n          input milliseconds
%N          input milliseconds or tenths or hundredths of seconds
            (If no field width is given and the input string has a field width of three, then input will be milliseconds. A field width of 1, either given explicitly or inferred from the input string, will cause input of 10ths of a second; a width of 2 will cause input of 100ths of a second.)
%p          input strings defining ‘am’ and ‘pm’
            (Matching is the same as for months.)
%S          input seconds
%w          skip a whitespace delimited word (see also %c)
%y          input year
            (If less than 100, the century changeover year is used to determine the actual year.)
%Y          input year as found in the input string
%%,%[,%]    input the ‘%’, ‘[’, and ‘]’ characters from the input string
[ ]         optional specification
            (Text and specifications within the brackets will be read if present in the input string, but need not be there.)
The default format for scanning times in text files is:
[%m[/]%d[/][,]%y] [%H:%M[:%S[.%N]][%p][[(]%3c[)]]]
which will recognize such diverse strings as:
May 18 1945 May 18, 1945 5/18/45 05/18/45 2:16 PM
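The field-width rule given for %N in the list above (one digit means tenths, two digits mean hundredths, three digits mean milliseconds) can be sketched in Python as follows. The function name fraction_to_milliseconds is hypothetical, and the sketch assumes the fractional digits have already been isolated from the input string.

```python
def fraction_to_milliseconds(digits):
    """Interpret the digits read by %N according to their field width.

    Width 1 is read as tenths of a second, width 2 as hundredths,
    and width 3 as milliseconds. Other widths are rejected in this sketch.
    """
    scale = {1: 100, 2: 10, 3: 1}
    if not digits.isdigit() or len(digits) not in scale:
        raise ValueError("expected one to three digits, got %r" % digits)
    return int(digits) * scale[len(digits)]
```

For example, the fractional part of a time such as 2:16:07.5 would be read as 500 milliseconds, while 2:16:07.125 would be read as 125 milliseconds.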
Century Changeover Year
When you are reading two-digit years, some may fall in the twenty-first century and some in the twentieth. You can use this option to control how two-digit years are read. Any two-digit year less than the changeover year will have the first two digits of the complete four-digit year set to 20. Any year greater than or equal to the changeover year will have the first two digits set to 19.
The default for the option is ‘20’, so that the changeover year from one century to another is 1920. Thus the date 1/1/01 will be interpreted as January 1, 2001, while the date 1/1/31 will be interpreted as January 1, 1931.
If your data refer to dates earlier than 1920, such as birth dates, you will need to override the default behavior and specify a different changeover year. If, for example, you specify ‘00’, this would cause all two-digit dates to be interpreted as years in the twentieth century.
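The changeover rule can be written out as a short Python sketch. The function name expand_two_digit_year is hypothetical; only the rule itself (years below the changeover value get the prefix 20, all others get 19) comes from the description above.

```python
def expand_two_digit_year(yy, changeover=20):
    """Expand a two-digit year using a century changeover year.

    Years less than `changeover` are placed in the 2000s; years greater
    than or equal to it are placed in the 1900s. The default of 20
    mirrors the default described in the text.
    """
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year, got %r" % yy)
    century = 2000 if yy < changeover else 1900
    return century + yy
```

With the default, 01 becomes 2001 and 31 becomes 1931. With a changeover of 0, no two-digit year is less than the changeover value, so every year lands in the twentieth century.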
Restoring and Saving Options
Restore Defaults
Restore Defaults resets all of the options listed in the Options(1), Options(2), Options(3), and Options(4) dialog boxes to their default values.
Restore Saved
Replaces the currently selected options in the Options(1), Options(2), Options(3), and Options(4) dialog boxes with those stored the last time the options were saved, either from an explicit save with the Save button or from your last exit from Stat/Transfer.
Save
Saves all of the current options in the Options(1), Options(2), Options(3), and Options(4) dialog boxes, with the exception of the Write New, numeric variable name option, the User Missing Value selections, and the options for Input Worksheets: Data Range and Input Worksheets: Field Name Row. This is the same behavior that occurs when you quit a Stat/Transfer session.
Help
Calls context-sensitive help.
Options(2) Dialog Box
To reach the second of the four dialog boxes that allow you to set different options, click on the Options(2) tab at the top of the dialog boxes.
ASCII File Read
Delimiter
This option will give you a list of possible delimiters for input ASCII files. By default, Stat/Transfer will automatically sense which delimiter to use, or you can choose one from the list: commas, tabs, spaces or semicolons. If you have a delimiter that is not on the list, click on ‘Other’ and enter the delimiter you wish to use.
Combine adjacent blanks
This option is available for space delimited files only. It is useful if data values are delimited by one or more blanks or tabs.
The default for Combine adjacent blanks is ‘off’. If you turn this option on, you can select ‘Spaces’, in which case multiple blanks are treated as one blank, or you can select ‘Spaces and tabs’, in which case multiple instances of tabs and blanks are converted to a single space.
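The effect of the three settings can be illustrated with a small Python sketch. It is not Stat/Transfer's own code; the function name split_space_delimited and the setting strings are hypothetical stand-ins for the dialog choices, and leading or trailing blanks are deliberately left unhandled.

```python
import re


def split_space_delimited(line, combine="off"):
    """Split one space-delimited line under the three settings.

    'off'             every single space separates two fields
    'spaces'          a run of spaces counts as one delimiter
    'spaces and tabs' a run of spaces and tabs counts as one delimiter
    """
    if combine == "spaces":
        return re.split(r" +", line)
    if combine == "spaces and tabs":
        return re.split(r"[ \t]+", line)
    return line.split(" ")
```

With the option off, a line such as "1  2" (two spaces) yields an empty field between the two values, which is usually the symptom that suggests turning the option on.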
Variable Names
By default, Stat/Transfer will sense whether the first line of your input data set contains field names or data. You may, if you wish, explicitly override this default.
AutoSense: If this option is set to ‘AutoSense’, Stat/Transfer will look at the first and second rows of data. If there is a change from a string to a number for one or more variables between these rows, Stat/Transfer will use the first row as the field names. This will fail if your first row contains the field names, and all of your variables are of the string type. In that case you should choose one of the following two options:
First Row: When this option is set to ‘First Row’, Stat/Transfer uses the data found in your first row as the field names.
Make Up: When this option is set to ‘Make Up’, Stat/Transfer treats the first row in your file as data and assigns the field names ‘col1’ … ‘coln’.
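The AutoSense rule, and the reason it fails for all-string data, can be seen in a minimal Python sketch. The function names are hypothetical, and treating "a number" as anything Python's float() accepts is an assumption made for illustration, not a statement about how Stat/Transfer classifies values.

```python
def looks_numeric(value):
    """Return True if the text can be read as a number (assumed test)."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def first_row_is_names(first_row, second_row):
    """Apply the AutoSense idea to the first two rows of a file.

    The first row is taken as field names if at least one column
    changes from a string in row one to a number in row two.
    """
    return any(
        not looks_numeric(top) and looks_numeric(below)
        for top, below in zip(first_row, second_row)
    )
```

For the rows ['id', 'name'] and ['1', 'Ann'] the check succeeds because the first column changes from string to number. For ['name', 'city'] and ['Ann', 'Oslo'] no column changes type, so the header goes undetected, which is the case where ‘First Row’ should be chosen explicitly.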
Numeric Missing Value
It is possible to specify a string that will be interpreted as a missing value when Stat/Transfer reads ASCII files. For example, your input data set may use the string ‘NA’ to represent missing values or it may use a period.
Enter the string that represents missing values in the input data in the Numeric Missing Value field.
If you wish to read extended missing values for either delimited or fixed ASCII files, use the option below or enter the word ‘extended’.
Convert extended (a-z) missing values
If this option is checked, the keyword ‘extended’ will be entered into the Numeric Missing Value field. When ‘extended’ is entered, extended missing values (‘.a’ - ‘.z’, ‘.’, and ‘._’) in either delimited or fixed ASCII files will be read from the file. Note that reading missing values is case-insensitive (that is, ‘.a’ and ‘.A’, for example, are equivalent).
These extended missing values will be automatically written to the output file for output formats that support them (SAS and Stata).
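As an illustration of which tokens count as extended missing values, the following Python sketch recognizes ‘.’, ‘._’, and ‘.a’ through ‘.z’ without regard to case. The function name read_numeric and the tuple return convention are hypothetical; the sketch only demonstrates the matching rule described above.

```python
import re

# A period alone, a period plus underscore, or a period plus one letter.
_EXTENDED = re.compile(r"\.(?:[a-z]|_)?", re.IGNORECASE)


def read_numeric(token):
    """Classify one text token as an extended missing value or a number.

    Returns ('missing', code) with the code lower-cased, reflecting the
    case-insensitive matching, or ('value', number) otherwise.
    """
    token = token.strip()
    if _EXTENDED.fullmatch(token):
        return ("missing", token.lower())
    return ("value", float(token))
```

Because the whole token must match, an ordinary value such as ‘.5’ is still read as the number 0.5 rather than as a missing value.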
String Quote Character
This is the character that is used to enclose string fields in the input data set. The default character is set as double quotes. You can choose the appropriate character for the input data. However, if string variables are not enclosed by any character, you can leave this option set at the default double quote.
Maximum Number of Lines to Examine
Stat/Transfer first reads your ASCII data to determine what type of variable is present in each delimited position. By default it will read your entire data set. If your data are consistent, so that the first few lines suffice to show each variable type, and your data have enough rows that it actually takes more than a few seconds to examine them all, you might want to set this option to a numeric limit, such as 50.
Decimal Point
If your data set uses a symbol other than the default, ‘period’, to indicate the decimal point in a number (a comma, for example), enter the character on the Decimal Point line.
Thousands Separator
If your data set uses a symbol other than the default, ‘comma’, to mark thousands in a number (a period, for example), enter the character on the Thousands Separator line.
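The way the Decimal Point and Thousands Separator settings interact can be sketched in Python. The function name parse_number is hypothetical and the approach (drop the thousands separator, then normalize the decimal point) is one simple way to honor both settings, not a description of Stat/Transfer's internals.

```python
def parse_number(text, decimal_point=".", thousands_sep=","):
    """Parse a number written with custom decimal and thousands symbols.

    The defaults correspond to the defaults described in the text:
    a period for the decimal point and a comma for thousands.
    """
    if decimal_point == thousands_sep:
        raise ValueError("decimal point and thousands separator must differ")
    cleaned = text.strip().replace(thousands_sep, "")
    cleaned = cleaned.replace(decimal_point, ".")
    return float(cleaned)
```

A value written European-style, such as ‘1.234,5’, needs both settings swapped (decimal point set to a comma, thousands separator set to a period) to be read as 1234.5.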
ASCII File Write
Delimiter
The Delimiter option will give you a list of possible delimiters for output delimited ASCII files. You can choose a delimiter from the list: commas, tabs, spaces, and semicolons, or if you have a delimiter that is not on the list, click on ‘Other’ and enter the delimiter you wish to use. The default is a
String Quote Character
The character specified here will be written before and after string variables on output. It is typically a double quote. This character is only strictly necessary if your fields have an embedded delimiter. If you enter a blank here, string fields will not be enclosed by any character.
Numeric Missing Value
It is possible to specify the string that will be used to represent missing values when Stat/Transfer writes ASCII files. For example, you may want to create an output file for a program that expects a specific missing value, such as a period for SAS.
If you want to write a string other than the default blank for missing values, then enter that string in the Numeric Missing Value field.
If you wish to write extended missing values for either delimited or fixed ASCII files, use the option below, or enter the word ‘extended’.
Write extended (a-z) missing values
If this option is checked, the keyword ‘extended’ will be entered into the Numeric Missing Value field. When ‘extended’ is entered, extended missing values (‘.a’ - ‘.z’, ‘.’, and ‘._’) will be written to either delimited or fixed format ASCII files.
Line Endings
The line endings of ASCII files for Windows differ from those for Unix or OS-X. Windows files have a carriage return and a line feed at the end of each line, while Unix and Mac files have only a line feed. If you wish to write an output file for use on a Unix machine or a Mac, then you must tell Stat/Transfer to write the correct kind of line ending.
The default is ‘Windows’. To write Unix and Mac files, select ‘Unix & OS-X’ from the drop-down.
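The difference between the two line-ending choices is easy to see when delimited text is produced programmatically. The Python sketch below uses the standard csv module; the function name render_delimited and the setting names 'windows' and 'unix' are hypothetical labels for the two choices described above.

```python
import csv
import io


def render_delimited(rows, line_ending="windows", delimiter=","):
    """Render rows as delimited text with the chosen line ending.

    'windows' ends each line with carriage return plus line feed;
    'unix' ends each line with a line feed only.
    """
    terminator = {"windows": "\r\n", "unix": "\n"}[line_ending]
    # newline="" keeps the buffer from translating line endings.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator=terminator)
    writer.writerows(rows)
    return buffer.getvalue()
```

When writing such text to a real file in Python, opening the file with newline="" likewise prevents the platform from altering the line endings that were chosen.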
Write variable names in first row
The option Write Variable Names in First Row is on by default. If you turn it off, field names will not be written in the first row of delimited ASCII output files.
Restoring and Saving Options
See page 29 for a description of Restore Defaults, Restore Saved, and Save.