*STATA CAN BE USED INTERACTIVELY OR RUN FROM AN OUTSIDE PROGRAM.
*LET'S START INTERACTIVELY.
*THE INTERACTIVE COMMANDS MUST BE WRITTEN IN THE STATA COMMAND WINDOW.
*COMMENTS BEGIN WITH A *
*IF THE LINE BEGINS WITH A * IT WILL NOT BE EXECUTED
*STATA COMMANDS must be in lower case letters.

*THE FIRST THING YOU NEED TO DO IS TO CREATE A LOG FILE. IF YOU DO NOT,
*THEN THERE WILL BE NO RECORD OF YOUR WORK!!!!
log using e:\class.log

*STATA PUTS ALL DATA IT USES IN MEMORY, AND YOU NEED TO MAKE SURE
*THAT IT HAS ENOUGH MEMORY AVAILABLE FOR THE DATA YOU ARE TO USE.
set memory 34m

*INPUTTING DATA: YOU MAY INPUT DATA DIRECTLY, INPUT DATA FROM AN EXTERNAL
*FILE, OR USE THE DATA EDITOR. LET'S BEGIN BY INPUTTING DIRECTLY.
input family person salary hours
1 1 10 5
1 2 20 5
2 1 30 6
2 2 30 7
end

*WINDOWS
*COMMAND WINDOW
*RESULTS WINDOW
* PRINT RESULTS (ON FILE MENU)
*VARIABLES WINDOW
*REVIEW WINDOW (IF YOU CLICK ON A COMMAND IN ANY WINDOW, IT SHOWS UP IN
*THE COMMAND WINDOW).
*HELP MENU
* CONTENTS
* SEARCH
* STATA COMMAND
* WEB SITE: USER SUPPORT, RESOURCES AND CLASSES FOR LEARNING MORE,
* tech-support@stata.com IS VERY GOOD.

*DATA EDITOR
*YOU CAN USE THE DATA EDITOR TO EXAMINE THE DATA, AND TO
*CHANGE DATA ON A CASE-BY-CASE BASIS
*"SORT" AND "PRESERVE" WITHIN THE DATA EDITOR ALLOW YOU TO SORT AND SAVE
*THE CHANGES YOU MAKE
* EDITING
* SORT
* DELETE
* RESTORE (an "undo" command)
* PRESERVE
save data1.dta
* WILL SAVE THE DATA AS DIR:\FNM.FTP (IT WILL
* NOT WRITE OVER THE CURRENT DATA SET).
*USE EXPLORE TO SEE LOCATION OF data1.dta

*CHANGING THE DEFAULT DIRECTORY WHERE STATA LOOKS FOR AND WRITES DATA FILES, LOG FILES,
*AND PROGRAM FILES CAN SAVE YOU SOME TYPING.
dir e:\*.dta
cd e:\
save data1
dir e:\*.dta

*YOU CANNOT INPUT 2 DATA SETS AT ONCE. YOU MUST CLEAR THE DATA SET YOU ARE
*WORKING WITH FROM MEMORY BEFORE INPUTTING A NEW DATA SET.
*NOTE THAT USING CLEAR WILL GET RID OF ANY CHANGES THAT YOU MADE TO THE DATA
*SINCE THE LAST "SAVE."
clear

*COMBINING DATA SETS WITH APPEND AND MERGE
* ADDING A NEW OBSERVATION WITH "APPEND."
* "APPEND" APPENDS OR STACKS THE DATA (ADDS NEW OBSERVATIONS TO
* DATA SETS WITH THE SAME VARIABLES).
input family person salary hours
3 1 60 25
end
save data2
clear

*INPUTTING A STATA DATA SET FROM AN EXISTING FILE (THE DEFAULT FILE TYPE IS .dta)
use e:\data1.dta
*YOU DO NOT NEED .dta. SINCE e:\ IS THE DEFAULT DIRECTORY, YOU DO NOT NEED e:\
clear
use data1.dta
*OR, YOU CAN USE THE "OPEN" COMMAND ON THE FILE MENU
append using data2
save
*OOPS. SAVE WILL NOT AUTOMATICALLY WRITE OVER THE CURRENT DATA SET.
save, replace
*LOOK AT DATA WITH DATA EDITOR, OR USE l OR list

* "MERGE" ADDS VARIABLES TO THE OBSERVATIONS IN THE DATA SETS.
clear
input family income
1 100
2 300
3 100
end
merge family using data1
*OOPS. BOTH DATA SETS MUST BE SORTED BY THE MERGE VARIABLE(S).
*REMEMBER, THE MERGE VARIABLES MUST HAVE THE SAME NAME ON BOTH DATA SETS.
sort family
save data3
clear
use data1
sort family
merge family using data3
l
*MERGE CREATES A VARIABLE CALLED _merge THAT YOU MUST
*DROP BEFORE YOU CAN MERGE ANY OTHER DATA SET.
*THE "drop" COMMAND ERASES A VARIABLE OR VARIABLES.
*"drop" works with the "if" statements that we will talk about later.
*THE "keep" COMMAND WILL KEEP ONLY THE VARIABLES YOU PUT AFTER "keep."
drop _merge
save, replace

*YOU CAN MERGE BY MORE THAN ONE VARIABLE
clear
input family person tall
1 1 4
1 2 5
2 1 5.5
2 2 4.5
3 1 6
end
save data4
sort family person
save, replace
clear
use data1
sort family person
merge family person using data4
drop _merge
save, replace
l

*CREATING GRAPHS
*GRAPHING IS COMPLEX; HERE ARE SOME EXAMPLES
*THE BASIC GRAPH IS A FREQUENCY DISTRIBUTION OR HISTOGRAM
graph salary
* OR
graph salary, hist
*YOU CAN SPECIFY THE NUMBER OF CATEGORIES (AT MOST 50)
graph salary, bin(2)
*SLIGHTLY MORE COMPLICATED GRAPHS
graph salary tall
graph salary tall, c(ll)
*I DO NOT LIKE THE WAY THAT GRAPH LOOKS.
sort tall
graph salary tall, c(ll)
*YOU CAN PRINT THE GRAPH FROM THE FILE MENU
*SAVE GRAPH FROM FILE MENU (SAVE GRAPH1)
*OR YOU CAN USE THE COMMAND

clear
*TO SEE GRAPH AGAIN
graph using graph1

* EXAMINING THE DATA--describe, summarize, AND tabulate
clear
use data1
describe
summarize salary hours
tabulate salary
tabulate salary tall
tabulate salary hours

*SIMPLE OLS REGRESSIONS
reg salary hours tall

*TESTING HYPOTHESES (VERY EASY AND LOGICAL IN MOST STATA APPLICATIONS)
test hours
test hours=0
test hours=1
test hours=tall
test hours tall
testparm hours tall

*RUNNING A PROGRAM FROM AN EXTERNAL FILE (A .do FILE).
*OPEN DO-FILE EDITOR
*COPY AND PASTE IN THE LAST 7 LINES OF STATA COMMANDS
*DO AND RUN WILL BOTH RUN COMMANDS.
*WITH DO, THE COMMANDS AND RESULTS SHOW UP IN THE RESULTS WINDOW.
*WITH RUN, THEY DO NOT SHOW UP IN THE RESULTS WINDOW.
*SAVE THIS FILE AS
*EXIT DO-FILE EDITOR
*YOU CAN RUN A DO-FILE DIRECTLY FROM THE COMMAND WINDOW
do e:\test.do
*OR
do test

*MORE COMMENTS ON DO-FILES
*YOU CAN EDIT THE FILE USING ANY WORD
*PROCESSOR, BUT YOU MUST SAVE IT AS A TEXT FILE.
*A USEFUL FEATURE OF STATA--I CAN SAVE THE LOG FILE, EDIT IT,
* AND THEN USE THE EDITED FILE AS A PROGRAM FILE.
*WITHIN A PROGRAM FILE,
*THE END OF THE LINE IS THE DEFAULT FOR THE END OF THE COMMAND. YOU SHOULD NOT
*END YOUR COMMAND WITH A ";", AS YOU DID IN SAS. IF YOUR COMMAND GOES BEYOND THE
*END OF THE LINE, YOU NEED TO SET A DIFFERENT END-OF-COMMAND DELIMITER.
*FOR EXAMPLE, <#delimit ;> WILL TELL STATA THAT THE CHARACTER ";" INDICATES THE
*END OF A COMMAND. AFTER THIS COMMAND IS INPUT, YOU WILL NEED TO END EACH
*COMMAND WITH A ";" (AS IN SAS). TO CHANGE BACK, USE THE COMMAND <#delimit cr>.

*CREATING NEW VARIABLES--gen, replace, drop, if
gen wage=salary/hours
l wage salary hours
*TO DISCOVER WHAT YOU CAN DO WITH GEN,
*LOOK IN THE HELP MENU, SEARCH FOR FUNCTIONS.
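*A FEW FUNCTIONS THAT WORK WITH GEN, AS A MINIMAL SKETCH. THESE LINES ARE AN
*ILLUSTRATION ADDED TO THE SESSION, AND THE VARIABLE NAMES lnwage, rtwage, AND
*wage1 ARE MADE UP FOR THE EXAMPLE.
*ln() IS THE NATURAL LOG, sqrt() IS THE SQUARE ROOT, AND round(x,y) ROUNDS x
*TO UNITS OF y.
gen lnwage=ln(wage)
gen rtwage=sqrt(wage)
gen wage1=round(wage,.1)
l wage lnwage rtwage wage1
*DROP THE EXAMPLE VARIABLES SO THE DATA ARE AS THEY WERE.
drop lnwage rtwage wage1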
gen big=1
*I DID NOT WANT TO DO THAT
drop big
gen big=0
replace big=1 if tall==6
*NOTE THE DOUBLE EQUALS SIGN AFTER THE "IF" STATEMENT
l
drop big
gen big=0
replace big=1 if tall>5
l
drop if family==3
l

*sort, by, egen, AND collapse
sort person
l
sort family
l
by family: summarize wage
*YOU CAN USE THE "by" COMMAND WITH MOST OTHER STATA COMMANDS ALSO.
*SOMETIMES "by" GOES FIRST (AS A PREFIX), SOMETIMES LAST (AS AN OPTION). YOU NEED
*TO CONSULT THE DOCUMENTATION.
*USING "EGEN" PROVIDES ANOTHER WAY TO CALCULATE MEAN WAGE
*BY FAMILY. EGEN SAVES THE VALUES WHILE SUMMARIZE DOES NOT.
sort family
egen msal=mean(wage), by(family)
tabulate msal family
*ANOTHER WAY TO DO THIS, USING "COLLAPSE"
collapse (mean) wage, by(family)
l

*EGEN VS. GEN
*EGEN CAN DO SUMS, MINIMUMS, MAXIMUMS, ETC.
*SOMETIMES EGEN AND GEN HAVE THE SAME FUNCTION NAMES.
*BE CAREFUL, EGEN AND GEN OFTEN DO DIFFERENT THINGS EVEN
*THOUGH THE FUNCTION NAMES ARE THE SAME--READ THE MANUALS
*BEFORE USING EGEN!!!!
*EXAMPLE: WITH GEN, sum() IS A RUNNING SUM; WITH EGEN, sum() IS THE TOTAL
*OVER ALL OBSERVATIONS.
gen twage1=sum(wage)
egen twage2=sum(wage)
l twage1 twage2
exit
*OOPS. STATA WILL NOT LET YOU EXIT WITH DATA IN MEMORY.
clear
exit