Scenario
You are employed as a research assistant at the United States Department of Transportation, and your supervisor, S. Anestori, asks you to study commuter characteristics in two major cities, St. Louis and Atlanta. For this purpose, you take a random sample of 1,000 commuters, 500 in each city.
For each commuter, you collect data on several variables (see the Variable INFO tab), but for the reports you will be preparing you decide to focus on one key numerical variable, Time (the commuting time in minutes), and one key categorical variable, Gender (whether the commuter is male or female). You also decide to use City as a grouping categorical variable, since each trip was made in either Atlanta or St. Louis. This will enable you to compare both commuting time and commuter gender by the city in which the commuter resides (Atlanta or St. Louis). In addition, you have selected the numerical variable Distance (the commuting distance in miles) as the predictor in a simple linear regression model of commuting time in minutes.
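
As a rough illustration of the regression step, the sketch below fits Time on Distance by ordinary least squares. It is only a minimal sketch: the function names `fit_simple_linear_regression` and `predict_time` are hypothetical, and the four distance/time pairs are made-up placeholder values chosen to lie on an exact line, not observations from the commuter sample.

```python
def fit_simple_linear_regression(distances, times):
    """Return (slope, intercept) of the least-squares line Time = intercept + slope * Distance."""
    if len(distances) != len(times) or len(distances) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(distances)
    mean_x = sum(distances) / n
    mean_y = sum(times) / n
    # Sum of cross-deviations and sum of squared deviations of the predictor
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, times))
    sxx = sum((x - mean_x) ** 2 for x in distances)
    if sxx == 0:
        raise ValueError("all distances are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_time(distance, slope, intercept):
    """Predicted commuting time in minutes for a given distance in miles."""
    return intercept + slope * distance


# Placeholder values for illustration only (NOT the survey data):
# times are constructed as 5 + 3 * distance.
example_distances = [2, 4, 6, 8]   # miles
example_times = [11, 17, 23, 29]   # minutes

slope, intercept = fit_simple_linear_regression(example_distances, example_times)
predicted = predict_time(10, slope, intercept)
```

With the real sample, the Distance and Time columns would take the place of the placeholder lists, and the fit could be run separately for Atlanta and St. Louis if a per-city model is wanted.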
