Project_C
Zagat’s Restaurant Ratings
Zagat publishes restaurant ratings for various locations in the United States. The file RESTRATE.xls contains the Zagat ratings for food, décor, and service, and the price per person, for a sample of 53 restaurants located in New York City and 53 restaurants located on Long Island. Suppose you want to develop a regression model to predict the price per person based on a variable that represents the sum of the ratings for food, décor, and service. Using Excel or PHStat2, answer the questions below. They are a minimum guideline for what you should analyze; a grade better than a C requires more in-depth analysis. For example, you may need tools such as confidence interval estimates or one- or two-sample tests on the data to improve the quality of your report.
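As a rough illustration of the summated rating and of a two-sample comparison of price between the two locations, here is a minimal Python sketch. The assignment itself asks for Excel or PHStat2, so this is only a cross-check under stated assumptions: the rows are made-up values, not data from RESTRATE.xls, the helper names summated_rating and welch_t are hypothetical, and the test shown is the unequal-variance (Welch) version.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical (food, decor, service, price) rows; NOT taken from RESTRATE.xls.
nyc = [(22, 18, 21, 50), (20, 17, 19, 38), (23, 20, 22, 58),
       (19, 16, 18, 35), (24, 22, 23, 65)]
long_island = [(21, 17, 20, 40), (19, 15, 18, 30), (22, 19, 21, 45),
               (18, 14, 17, 28), (23, 20, 22, 48)]


def summated_rating(row):
    """Sum of the food, decor, and service ratings for one restaurant."""
    food, decor, service, _price = row
    return food + decor + service


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    va = variance(sample_a) / len(sample_a)  # squared standard error, group A
    vb = variance(sample_b) / len(sample_b)  # squared standard error, group B
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(sample_a) - 1)
                           + vb ** 2 / (len(sample_b) - 1))
    return t, df


nyc_prices = [row[3] for row in nyc]
li_prices = [row[3] for row in long_island]
t_stat, df = welch_t(nyc_prices, li_prices)

print("summated ratings (NYC):", [summated_rating(r) for r in nyc])
print("Welch t:", round(t_stat, 3), "df:", round(df, 1))
```

The t statistic would then be compared with a critical value for the computed degrees of freedom, for example from a t table or Excel's T.INV.2T function.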
a) State your statistical objective for this data set.
b) Perform exploratory data analysis for this data set, such as numerical descriptive measures or a box-and-whisker plot.
c) Construct a scatter diagram of price against summated rating. Describe the relationship that you see. Does there appear to be some association (linear or nonlinear)?
d) Construct a scatter diagram of price against each of food, décor, and service separately. Describe the relationship that you see in each diagram. Do any of these appear to have some association?
e) From (a) and (b), does any simple linear model appear to hold? You may want to run some tests to substantiate why or why not.
f) Does a multiple regression model appear to hold? You may want to run some tests to substantiate why or why not. If so, find the regression equation to predict price from location.
g) Suppose now that you want to develop a regression model to predict the price per person based on a variable that represents the sum of the ratings for food, décor, and service, and on location: New York City (Locate = 0) or Long Island (Locate = 1).
Is the regression significant? Report the results of the appropriate test, and interpret its meaning.
Are any other variables (Neighborhood or Cuisine) useful for the regression analysis? For example, if you classify Cuisine as Asian, American, or Other, can you use those categories as dummy variables?
h) Does summated rating have a significant impact on price after adjusting for location? In particular, are New York City’s restaurants significantly more or less expensive on average than those on Long Island?
i) Include an interaction term in the model and, at the 0.05 level of significance, determine whether it makes a significant contribution to the model.
j) Summarize and comment on your results.
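Parts (g) through (i) can be sketched in Python as a cross-check on the Excel or PHStat2 output. The sketch below fits price on summated rating and the Locate dummy, then adds the interaction column (rating times Locate) and computes a partial F statistic for that single added term. Everything here is an assumption made for illustration: the rows are made-up values rather than data from RESTRATE.xls, and the helpers ols and sse are hypothetical names for a plain normal-equations solver.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def sse(X, y, b):
    """Sum of squared residuals for coefficients b."""
    return sum((yi - sum(bj * xj for bj, xj in zip(b, row))) ** 2
               for row, yi in zip(X, y))


# Hypothetical (summated rating, Locate, price) rows; NOT from RESTRATE.xls.
data = [(50, 0, 38), (55, 0, 44), (60, 0, 52), (65, 0, 57), (70, 0, 66),
        (50, 1, 33), (55, 1, 37), (60, 1, 43), (65, 1, 46), (70, 1, 52)]
y = [price for _, _, price in data]

# Reduced model: intercept, rating, Locate.
X_reduced = [[1, rating, loc] for rating, loc, _ in data]
# Full model adds the interaction column rating * Locate.
X_full = [[1, rating, loc, rating * loc] for rating, loc, _ in data]

b_reduced = ols(X_reduced, y)
b_full = ols(X_full, y)
sse_reduced = sse(X_reduced, y, b_reduced)
sse_full = sse(X_full, y, b_full)

# Partial F statistic for the one added term (numerator df = 1,
# denominator df = n minus the number of full-model coefficients).
n, k_full = len(y), len(b_full)
partial_f = (sse_reduced - sse_full) / (sse_full / (n - k_full))

print("full-model coefficients:", [round(b, 3) for b in b_full])
print("partial F for interaction:", round(partial_f, 2))
```

In the full model, the Locate coefficient is the difference in intercepts between the two locations and the interaction coefficient is the difference in slopes. The partial F would be compared with a critical value at the 0.05 level, for example from Excel's F.INV.RT function with 1 and n - 4 degrees of freedom.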
Project_D
Salary Survey after an MBA
The file salaryof newmba.xls contains a recent survey of new MBA graduates from the top 40 MBA schools in the U.S. Suppose you want to develop a regression model to predict the salary/bonus based on variables that represent various factors about each MBA school. For example, if a student wants to attend a specific MBA institution and finance the degree through student loans, is it worth it? Or, looking at the employment rate within 6 months after graduation, is it worth it? Using Excel or PHStat2, answer the questions below. They are a minimum guideline for what you should analyze; a grade better than a C requires more in-depth analysis. For example, you may need tools such as confidence interval estimates or one- or two-sample tests on the data to improve the quality of your report.
a) State statistical objective(s) for the project.
b) Perform EDA (Section 3.4) including numerical descriptive measures.
c) Construct scatter diagrams for pairs of variables. Do any of these appear to have some association?
d) From (b) and (c), does any simple linear model appear to hold? You may want to run some tests to substantiate your findings.
e) Does a multiple regression model appear to hold? You may want to run some tests to substantiate why or why not. If so, is there more than one variable that may be used as a dependent variable?
f) Is the regression significant? Report the results of the appropriate test, and interpret its meaning.
g) Suppose now that you want to develop a regression model of your chosen dependent variable against various independent variables (of your choice). Do the data on region play any role in your model? Did you have to modify the region data so that the location of each school makes a significant contribution to your model? What about including the data on the type of school (public vs. private)?
h) Does any interaction term make a significant contribution to the model?
i) Summarize and comment on your results.
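For part (g), one common way to modify a categorical column such as region is to recode it into 0/1 dummy variables, leaving out one reference level so the dummies are not perfectly collinear with the intercept. The Python sketch below shows that recoding under stated assumptions: the region labels are hypothetical (the survey file's actual categories are not given here) and dummy_code is a made-up helper name.

```python
def dummy_code(values, reference):
    """Map each category to 0/1 indicators, omitting the reference level.

    Returns the list of coded levels and one indicator row per observation.
    An observation in the reference level gets all zeros.
    """
    levels = sorted(set(values) - {reference})
    rows = [[1 if value == level else 0 for level in levels]
            for value in values]
    return levels, rows


# Hypothetical region labels; NOT taken from the survey file.
regions = ["Northeast", "South", "West", "Midwest", "South"]
levels, rows = dummy_code(regions, reference="Northeast")

for region, row in zip(regions, rows):
    print(region, dict(zip(levels, row)))

# A two-level variable such as school type needs a single dummy column.
type_levels, type_rows = dummy_code(["public", "private"], reference="public")
print(type_levels, type_rows)
```

Each dummy coefficient in the fitted model is then read as the average difference from the reference level, holding the other independent variables fixed.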