--------------------------------------------------------------------------------------------------
      log:  e:\stata611.log
 log type:  text
opened on:  15 Sep 2010, 10:27:13

.
end of do-file

. do "C:\DOCUME~1\tgindlin\LOCALS~1\Temp\STD02000000.tmp"

. *INTRODUCTION TO STATA, ECON611, UMBC, T. H. GINDLING, Fall, 2010 (using Stata10)
.
. *STATA CAN BE USED INTERACTIVELY OR RUN FROM AN OUTSIDE PROGRAM.
. *LET'S START INTERACTIVELY.
. *THE INTERACTIVE COMMANDS MUST BE WRITTEN IN THE STATA COMMAND WINDOW.
.
. *COMMENTS BEGIN WITH A *
. *IF THE LINE BEGINS WITH A * IT WILL NOT BE EXECUTED
.
. *STATA COMMANDS must be in lower case letters.
.
. *THE FIRST THING YOU NEED TO DO IS TO CREATE A LOG FILE. IF YOU DO NOT,
. *THEN THERE WILL BE NO RECORD OF YOUR WORK!!!!
.
end of do-file

. do "C:\DOCUME~1\tgindlin\LOCALS~1\Temp\STD02000000.tmp"

. *STATA PUTS ALL DATA IT USES IN MEMORY, AND YOU NEED TO MAKE SURE
. *THAT IT HAS ENOUGH MEMORY AVAILABLE FOR THE DATA YOU ARE TO USE.
.
. set memory 34m

Current memory allocation

                    current                                 memory usage
    settable          value     description                 (1M = 1024k)
    --------------------------------------------------------------------
    set maxvar         5000     max. variables allowed           1.909M
    set memory          34M     max. data space                 34.000M
    set matsize         400     max. RHS vars in models          1.254M
                                                            -----------
                                                                37.163M

.
. *INPUTTING DATA: YOU MAY INPUT DATA DIRECTLY, INPUT DATA FROM AN EXTERNAL
. *FILE, OR USE THE DATA EDITOR. LET'S BEGIN BY INPUTTING DIRECTLY.
.
. *USE THE "DO" FILE WINDOW
. *WITH "DO" THE COMMANDS SHOW UP IN THE RESULTS WINDOW
. *WITH "RUN" YOU DO NOT SEE THE COMMANDS IN THE RESULTS WINDOW
.
. input family person salary hours tall

        family     person     salary      hours       tall
  1. 1 1 10 5 4
  2. 1 2 20 5 5
  3. 2 1 30 6 5.5
  4. 2 2 30 7 5
  5. end

.
. *OTHER WINDOWS
. *RESULTS WINDOW
. *   PRINT RESULTS (ON FILE MENU)
. *VARIABLES WINDOW
. *REVIEW WINDOW (IF YOU CLICK ON A COMMAND IN ANY WINDOW, IT SHOWS UP IN
. *THE COMMAND WINDOW).
. *HELP MENU
. *   CONTENTS
. *   SEARCH
. *   STATA COMMAND
. *   WEB SITE: USER SUPPORT, RESOURCES AND CLASSES FOR LEARNING MORE,
. *   tech-support@stata.com IS VERY GOOD.
.
. *DATA EDITOR
. *YOU CAN USE THE DATA EDITOR TO EXAMINE THE DATA, AND TO
.
. save data1.dta
file data1.dta saved

.
. *SAVE WILL SAVE THE DATA AS DIR:\FNM.FTP (IT WILL
. *   NOT WRITE OVER THE CURRENT DATA SET).
.
. *USE EXPLORE TO SEE LOCATION OF data1.dta
.
. *CHANGING THE DEFAULT DIRECTORY WHERE STATA LOOKS FOR AND WRITES DATA FILES, LOG FILES,
. *AND PROGRAM FILES CAN SAVE YOU SOME TYPING.
.
. dir e:\*.dta
file not found

.
. cd e:\
e:\

. save data1
file data1.dta saved

.
. *TO OVER-WRITE AN EXISTING DATA FILE, YOU MUST USE THE REPLACE OPTION
.
. save data1, replace
file data1.dta saved

.
. dir e:\*.dta
   1.2k   9/15/10 10:27  data1.dta

.
. *YOU CANNOT INPUT 2 DATA SETS AT ONCE. YOU MUST CLEAR THE DATA SET YOU ARE
. *WORKING WITH FROM MEMORY BEFORE INPUTTING A NEW DATA SET.
. *NOTE THAT USING CLEAR WILL GET RID OF ANY CHANGES THAT YOU MADE TO THE DATA
. *SINCE THE LAST "SAVE."
.
. clear

.
.
. *INPUTTING A STATA DATA SET FROM AN EXISTING FILE
. *(--THE DEFAULT FTP IS .DTA)
.
. use e:\data1.dta

.
. *YOU DO NOT NEED .dta. SINCE e:\ IS THE DEFAULT DIRECTORY, YOU DO NOT NEED e:\
.
. clear

. use data1

.
. *OR, YOU CAN USE THE "OPEN" COMMAND ON THE FILE MENU
.
. *EXAMINING THE DATA
.
. describe

Contains data from data1.dta
  obs:             4
 vars:             5                          15 Sep 2010 10:27
 size:            96 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
family          float  %9.0g
person          float  %9.0g
salary          float  %9.0g
hours           float  %9.0g
tall            float  %9.0g
-------------------------------------------------------------------------------
Sorted by:

.
. list

     +-----------------------------------------+
     | family   person   salary   hours   tall |
     |-----------------------------------------|
  1. |      1        1       10       5      4 |
  2. |      1        2       20       5      5 |
  3. |      2        1       30       6    5.5 |
  4. |      2        2       30       7      5 |
     +-----------------------------------------+

.
. *FIRST, USE PULL-DOWN "STATISTICS" MENU
. *NOTE THAT IN THE RESULTS WINDOW YOU WILL BE SHOWN THE FORMAT OF THE COMMAND,
. *THIS IS USEFUL, BECAUSE YOU CAN USE IT TO SEE WHAT YOU NEED TO WRITE IN THE .DO FILES
.
. summarize salary

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      salary |         4        22.5    9.574271         10         30

. summarize salary, detail

                           salary
-------------------------------------------------------------
      Percentiles      Smallest
 1%           10             10
 5%           10             20
10%           10             30       Obs                   4
25%           15             30       Sum of Wgt.           4

50%           25                      Mean               22.5
                        Largest       Std. Dev.      9.574271
75%           30             10
90%           30             20       Variance       91.66667
95%           30             30       Skewness      -.4933822
99%           30             30       Kurtosis       1.628099

. summarize salary hours

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      salary |         4        22.5    9.574271         10         30
       hours |         4        5.75    .9574271          5          7

.
. tabulate salary

     salary |      Freq.     Percent        Cum.
------------+-----------------------------------
         10 |          1       25.00       25.00
         20 |          1       25.00       50.00
         30 |          2       50.00      100.00
------------+-----------------------------------
      Total |          4      100.00

. tabulate salary tall

           |               tall
    salary |         4          5        5.5 |     Total
-----------+---------------------------------+----------
        10 |         1          0          0 |         1
        20 |         0          1          0 |         1
        30 |         0          1          1 |         2
-----------+---------------------------------+----------
     Total |         1          2          1 |         4

. tabulate salary hours

           |              hours
    salary |         5          6          7 |     Total
-----------+---------------------------------+----------
        10 |         1          0          0 |         1
        20 |         1          0          0 |         1
        30 |         0          1          1 |         2
-----------+---------------------------------+----------
     Total |         2          1          1 |         4

.
. tabulate salary, summarize(hours)

            |          Summary of hours
     salary |        Mean   Std. Dev.       Freq.
------------+------------------------------------
         10 |           5           0           1
         20 |           5           0           1
         30 |         6.5   .70710678           2
------------+------------------------------------
      Total |        5.75   .95742711           4

.
. *CORRELATION COEFFICIENTS
. corr salary tall
(obs=4)

             |   salary     tall
-------------+------------------
      salary |   1.0000
        tall |   0.8992   1.0000

.
. *CREATING GRAPHS (AND PLOTS)
. *GRAPHING IS COMPLEX; HERE ARE SOME EXAMPLES
.
. *THE BASIC GRAPH IS A FREQUENCY DISTRIBUTION OR HISTOGRAM
.
. hist salary
(bin=2, start=10, width=10)

.
. *OR, USE THE PULL-DOWN "GRAPHICS" MENU
.
. *YOU CAN SPECIFY THE NUMBER OF CATEGORIES (AT MOST 50)
.
. hist salary, bin(5)
(bin=5, start=10, width=4)

.
. *SCATTER PLOTS
.
. twoway scatter salary tall

. *OR
. plot salary tall

    30 +
       |                                          *                   *
       |
       |
       |
       |
       |
     s |
     a |
     l |
     a |
     r |                                          *
     y |
       |
       |
       |
       |
       |
       |
       |
    10 + *
       +----------------------------------------------------------------+
         4                           tall                           5.5

.
. *LINE GRAPHS
. twoway line salary tall

. *I DO NOT LIKE THE WAY THAT GRAPH LOOKS: LINE CONNECTS THE POINTS IN THE
. *ORDER OF THE DATA, SO SORT BY THE X VARIABLE FIRST.
. sort tall

. twoway line salary tall

.
. *YOU CAN PRINT THE GRAPH FROM THE GRAPH MENU
. *SAVE GRAPH FROM GRAPH MENU (SAVE GRAPH1)
. *OR YOU CAN USE THE COMMAND
.
. graph save graph1
(file graph1.gph saved)

.
. *TO SEE GRAPH AGAIN
. graph use graph1

.
. *YOU CAN ALSO USE PLOT TO GRAPH TWO VARIABLES
. *(STATA ALMOST ALWAYS GIVES YOU TWO OR MORE WAYS TO DO ANYTHING)
. plot salary tall

    30 +
       |                                          *                   *
       |
       |
       |
       |
       |
     s |
     a |
     l |
     a |
     r |                                          *
     y |
       |
       |
       |
       |
       |
       |
       |
    10 + *
       +----------------------------------------------------------------+
         4                           tall                           5.5

.
.
. *CREATING NEW VARIABLES
.
. gen wage=salary/hours

. l wage salary hours

     +---------------------------+
     |     wage   salary   hours |
     |---------------------------|
  1. |        2       10       5 |
  2. |        4       20       5 |
  3. | 4.285714       30       7 |
  4. |        5       30       6 |
     +---------------------------+

.
. save, replace
file data1.dta saved

.
. *TO DISCOVER WHAT YOU CAN DO WITH GEN,
. *LOOK IN THE HELP MENU, SEARCH FOR FUNCTIONS.
.
. gen big=1

. *I DID NOT WANT TO DO THAT
. drop big

.
. gen big=0

. replace big=1 if tall==5.5
(1 real change made)

.
. *NOTE THE DOUBLE EQUALS SIGN AFTER THE "IF" STATEMENT
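The gen/replace pattern above can also be written as a single command, because a logical expression in Stata evaluates to 1 (true) or 0 (false). The following is only a sketch: it assumes the data set from this session is still in memory, and the variable names big2 and big3 are hypothetical names chosen for the illustration.

```stata
* Sketch only: assumes the session's data (with the variable tall) is in memory.
* big2 and big3 are hypothetical names; they are not part of the original log.

* One gen instead of gen + replace. The if !missing() guard matters because
* Stata treats a missing value as larger than any number, so tall > 5 would
* otherwise be true for observations where tall is missing.
gen byte big2 = tall > 5 if !missing(tall)

* tall is stored as a float. 5.5 is exactly representable in binary, so
* tall == 5.5 works as written. For a value such as 5.1, which is not,
* wrap the constant in float() so both sides have the same precision.
gen byte big3 = tall == float(5.5)

list tall big2 big3
```

In this data set no value of tall is missing, so big2 should agree with the variable big created by the two-step version.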
. l

     +----------------------------------------------------------+
     | family   person   salary   hours   tall       wage   big |
     |----------------------------------------------------------|
  1. |      1        1       10       5      4          2     0 |
  2. |      1        2       20       5      5          4     0 |
  3. |      2        2       30       7      5   4.285714     0 |
  4. |      2        1       30       6    5.5          5     1 |
     +----------------------------------------------------------+

.
. drop big

. gen big=0

. replace big=1 if tall>5
(1 real change made)

. l

     +----------------------------------------------------------+
     | family   person   salary   hours   tall       wage   big |
     |----------------------------------------------------------|
  1. |      1        1       10       5      4          2     0 |
  2. |      1        2       20       5      5          4     0 |
  3. |      2        2       30       7      5   4.285714     0 |
  4. |      2        1       30       6    5.5          5     1 |
     +----------------------------------------------------------+

.
. drop if family==3
(0 observations deleted)

. l

     +----------------------------------------------------------+
     | family   person   salary   hours   tall       wage   big |
     |----------------------------------------------------------|
  1. |      1        1       10       5      4          2     0 |
  2. |      1        2       20       5      5          4     0 |
  3. |      2        2       30       7      5   4.285714     0 |
  4. |      2        1       30       6    5.5          5     1 |
     +----------------------------------------------------------+

.
. *SORT AND BY
.
. sort person

. l

     +----------------------------------------------------------+
     | family   person   salary   hours   tall       wage   big |
     |----------------------------------------------------------|
  1. |      2        1       30       6    5.5          5     1 |
  2. |      1        1       10       5      4          2     0 |
  3. |      1        2       20       5      5          4     0 |
  4. |      2        2       30       7      5   4.285714     0 |
     +----------------------------------------------------------+

. sort family

. l

     +----------------------------------------------------------+
     | family   person   salary   hours   tall       wage   big |
     |----------------------------------------------------------|
  1. |      1        2       20       5      5          4     0 |
  2. |      1        1       10       5      4          2     0 |
  3. |      2        1       30       6    5.5          5     1 |
  4. |      2        2       30       7      5   4.285714     0 |
     +----------------------------------------------------------+

.
. by family: summarize wage

--------------------------------------------------------------------------------------------------
-> family = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |         2           3    1.414214          2          4

--------------------------------------------------------------------------------------------------
-> family = 2

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        wage |         2    4.642857    .5050764   4.285714          5

.
. *YOU CAN USE THE "by" COMMAND WITH MOST OTHER STATA COMMANDS ALSO.
. *SOMETIMES THE COMMAND GOES FIRST, SOMETIMES LAST; YOU NEED
. *TO CONSULT THE DOCUMENTATION.
.
.
. *USING "EGEN" PROVIDES ANOTHER WAY TO CALCULATE MEAN WAGE
. *BY FAMILY. EGEN SAVES THE VALUES WHILE SUMMARIZE DOES NOT.
.
. sort family

.
. egen msal=mean(wage), by(family)

. tabulate msal family

           |        family
      msal |         1          2 |     Total
-----------+----------------------+----------
         3 |         2          0 |         2
  4.642857 |         0          2 |         2
-----------+----------------------+----------
     Total |         2          2 |         4

.
. *ANOTHER WAY TO DO THIS, USING "COLLAPSE"
.
. collapse (mean) wage, by(family)

. l

     +-------------------+
     | family       wage |
     |-------------------|
  1. |      1          3 |
  2. |      2   4.642857 |
     +-------------------+

.
. *EGEN VS. GEN
. *EGEN CAN DO SUMS, MINIMUMS, MAXIMUMS, ETC.
. *SOMETIMES EGEN AND GEN HAVE THE SAME FUNCTION NAMES.
. *BE CAREFUL: EGEN AND GEN OFTEN DO DIFFERENT THINGS EVEN
. *THOUGH THE FUNCTION NAMES ARE THE SAME--READ THE MANUALS
. *BEFORE USING EGEN!!!!
. *EXAMPLE
.
. clear

. use data1

.
. gen twage1=sum(wage)

. egen twage2=sum(wage)

. l twage1 twage2

     +---------------------+
     |   twage1     twage2 |
     |---------------------|
  1. |        2   15.28571 |
  2. |        6   15.28571 |
  3. | 10.28571   15.28571 |
  4. | 15.28571   15.28571 |
     +---------------------+

. clear

.
.
. *MORE COMMENTS ON DO-FILES
.
. *YOU CAN EDIT THE FILE USING ANY WORD
. *PROCESSOR, BUT YOU MUST SAVE IT AS A TEXT FILE.
.
. *A USEFUL FEATURE OF STATA--I CAN SAVE THE LOG FILE, EDIT IT,
. *   AND THEN USE THE EDITED FILE AS A PROGRAM FILE.
.
. *WITHIN A PROGRAM FILE,
. *THE END OF THE LINE IS THE DEFAULT FOR THE END OF THE COMMAND. YOU SHOULD NOT
. *END YOUR COMMAND WITH A ";", AS YOU DID IN SAS. IF YOUR COMMAND GOES BEYOND THE
. *END OF THE LINE, YOU NEED TO SET A DIFFERENT END-OF-COMMAND DELIMITER.
. *FOR EXAMPLE, <#DELIMIT ;> WILL TELL STATA THAT THE CHARACTER ";" INDICATES THE
. *END OF A COMMAND. AFTER THIS COMMAND IS INPUT, YOU WILL NEED TO END EACH
. *COMMAND WITH A ";" (AS IN SAS). TO CHANGE BACK, USE THE COMMAND <#DELIMIT CR>.
.
. log close
      log:  e:\stata611.log
 log type:  text
closed on:  15 Sep 2010, 10:27:47
--------------------------------------------------------------------------------------------------
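The delimiter switch described in the closing comments can be sketched as a short do-file. This is an illustration under assumptions, not part of the logged session: it assumes data1.dta from this session is in the current directory, and it must be saved as a do-file and run with do, because #delimit works only inside do-files and ado-files. The second half shows the three-slash line continuation, an alternative that leaves the delimiter unchanged.

```stata
* Sketch only: assumes data1.dta (saved earlier in this session) is in the
* current directory. Save these lines in a do-file and run it with do.
use data1, clear

#delimit ;
* From here on every command, and every comment like this one, runs until a semicolon ;
summarize salary hours
    tall, detail ;
#delimit cr

* Back to the default: one command per line.
* Three slashes at the end of a line continue the command on the next line.
summarize salary hours ///
    tall
```

Both summarize commands cover the same three variables; the first spans two lines because of the semicolon delimiter, the second because of the line continuation.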