Quantitative Data Analysis: Statistics

Sherlock Holmes
"... while man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician."

Overview
- General Statistics
- The Normal Distribution
- Z-Tests
- Confidence Intervals
- T-Tests

General Statistics

~ THE GOLDEN RULE ~
Statistics NEVER replace the judgment of the expert.

Approach to Statistical Research
1. Formulate a hypothesis
2. State predictions of the hypothesis
3. Perform experiments or observations
4. Interpret experiments or observations
5. Evaluate results with respect to the hypothesis
6. Refine the hypothesis and start again
(Basically the same as all other research.)

Hypothesis Testing
- H0: null hypothesis, the status quo
- HA: alternative hypothesis, the research question
So the conclusion is either "The data do not support H0" or "We fail to reject H0".

Types of Data
- Continuous: height, age, time
- Discrete: number of days worked this week, number of leaves on a tree
- Ordinal: {Good, O.K., Bad}
- Nominal: {Yes/No}, {Teacher/Chemist/Haberdasher}

Picturing The Data

Pie Charts
- Nominal/Ordinal
- Only suitable for data that add up to 1
- Hard to compare values in the chart

Bar Charts
- Nominal/Ordinal
- Easier to compare values than in a pie chart
- Suitable for a wider range of data

Dot Plots
- Nominal/Ordinal
- Represents all the data
- Difficult to read

Box Plots
- Nominal/Ordinal
- 1 IQR, 3 IQR
- Outliers

Scatter Plots
- Excellent for examining the association between two variables

Histograms
- Continuous data
- Divide the data into ranges

Time-Series Plots
- Time-related data, e.g. stock prices

Question 1
In a telephone survey of 68 households, when asked whether they have pets, the responses were:
- 16: No pets
- 28: Dogs
- 32: Cats
Draw the appropriate graphic to illustrate the results!

Question 1 - Solution
- Total number surveyed = 68
- Number with no pets = 16, so total with pets = 68 - 16 = 52
- But 28 dogs + 32 cats = 60, so some households have both cats and dogs
- How many? It must be 60 - 52 = 8 households

  No pets      16
  Dogs only    20
  Cats only    24
  Both          8
  ---------------
  Total        68

Graphic: pie chart or bar chart

The Literary Digest Poll
1936 US Presidential Election: Alf Landon (R) vs. Franklin D. Roosevelt (D)

- Literary Digest had been conducting successful presidential election polls since 1916.
- It had correctly predicted the outcomes of the 1916, 1920, 1924, 1928, and 1932 elections.
- These polls were a lucrative venture for the magazine: readers liked them; newspapers played them up; and each "ballot" included a subscription blank.

The magazine sent out 10 million ballots to two groups of people:
- prospective subscribers, "who were chiefly upper- and middle-income people"
- a list designed to "correct for bias" from the first list, consisting of names selected from telephone books and motor vehicle registries

Results:
- Response rate: approximately 24%, or 2,376,523 responses
- Poll result: Landon in a landslide (predicted 57% of the vote; Roosevelt predicted 40%)
- Election result: Roosevelt received approximately 60% of the vote

POSSIBLE CAUSES OF ERROR

Selection Bias
- By taking names and addresses from telephone directories, the survey systematically excluded poor voters.
- Republicans were markedly overrepresented in 1936: Democrats did not have as many phones, were not as likely to drive cars, and did not read the Literary Digest.
- The "sampling frame" is the actual population of individuals from which a sample is drawn. Selection bias results when the sampling frame is not representative of the population of interest.

Non-response Bias
- Because only about 24% of the 10 million people returned surveys, non-respondents may have had different preferences from respondents.
- Indeed, respondents favored Landon.
- Greater response rates reduce the odds of biased samples.

Terminology
- Population: a set of entities concerning which statistical inferences are to be drawn.
- Sample: a number of independent observations from the same probability distribution.
- Parameter: a value that characterizes a distribution. The distribution of a random variable is treated as belonging to a family of probability distributions, whose members are distinguished from each other by the values of a finite number of parameters.
- Bias: a factor that causes some members of a population to be less represented in a statistical sample than others.

Outliers (and their treatment)
An "outlier" is an observation that does not fit the pattern in the rest of the data.
- Check the data.
- Check with the measurer.
- If there is reason to believe it is NOT real, change it if possible; otherwise leave it out (but note it).
- If there is reason to believe it is real, leave it in and note it.

The Mean
The (arithmetic) mean is defined as the sum of all the elements divided by the number of elements. The statistical mean of a set of observations is the average of the measurements in the data set.

The Variance
Individual elements can differ a lot from the mean, e.g. teacher salaries:
- Average = €22,000
- Lowest = €12,000
- Difference = 12,000 - 22,000 = -10,000

The differences (sample - average) always sum to 0, so they cannot measure spread on their own; this is why we define the variance. The variance of a set of data is the sum of the squared differences of all the data values from the mean, divided by the sample size minus one:
  $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$

Standard Deviation
The standard deviation of a set of data is the positive square root of the variance:
  $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$

Question 2
Find the mean and variance of the following sample values: 36, 41, 43, 44, 46

Question 2 - Solution
Mean: (36 + 41 + 43 + 44 + 46) / 5 = 42
Variance:
  Value   Difference      Square
  36      36 - 42 = -6    36
  41      41 - 42 = -1     1
  43      43 - 42 =  1     1
  44      44 - 42 =  2     4
  46      46 - 42 =  4    16
  ------------------------------
  Sum of squares          58

  Variance = 58 / (5 - 1) = 58 / 4 = 14.5

The Normal Distribution

Density Curves: Properties
- The graph has a single peak at the center; this peak occurs at the mean.
- The graph is symmetrical about the mean.
- The graph never touches the horizontal axis.
- The area under the graph is equal to 1.

Characterization
A normal distribution is bell-shaped and symmetric. The distribution is determined by the mean μ and the standard deviation σ. The mean μ controls the center and σ controls the spread.

If a variable is normally distributed, then:
- approximately 68% of the data lie within one standard deviation of the mean
- approximately 95% of the data lie within two standard deviations of the mean
- approximately 99.7% of the data lie within three standard deviations of the mean

Why?
One reason the normal distribution is important is that many psychological and organisational variables are distributed approximately normally. Measures of reading ability, introversion, job satisfaction, and memory are among the many psychological variables that are approximately normally distributed. Although the distributions are only approximately normal, they are usually quite close.

A second reason the normal distribution is so important is that it is easy for mathematical statisticians to work with. This means that many kinds of statistical tests can be derived for normal distributions. Almost all statistical tests discussed in this text assume normal distributions. Fortunately, these tests work very well even if the distribution is only approximately normal. Some tests work well even with very wide deviations from normality.

One Tail / Two Tail
Imagine we undertook an experiment where we measured staff productivity before and after we introduced a computer system to help record solutions to common issues at work.
- Average productivity before = 6.4
- Average productivity after = 9.2
(Figure: both averages marked on a scale from 0 to 10.)

Is this a significant difference, or is it more likely a sampling variation? How many standard deviations (σ) from the mean is this, and is it statistically significant?

  One-tailed:  H0: μ1 ≥ μ2   HA: μ1 < μ2
  Two-tailed:  H0: μ1 = μ2   HA: μ1 ≠ μ2

STANDARD NORMAL DISTRIBUTION
- A normal distribution is written N(mean, (std dev)^2).
- The standard normal distribution is N(0, 1^2).

The following formula converts a value x from a normal distribution with mean μ and standard deviation σ into a standard normal value z, so that a standard normal table can be used:
  $z = \frac{x - \mu}{\sigma}$

Exercise
If the average IQ in a given population is 100, and the standard deviation is 15, what percentage of the population has an IQ of 145 or higher?

Answer
  P(X ≥ 145) = P(Z ≥ (145 - 100)/15) = P(Z ≥ 3)
From tables: 99.87% are less than 3, so 0.13% of the population has an IQ of 145 or higher.

Trends in Statistical Tests used in Research Papers
(Columns run from historical practice on the left to current practice on the right.)

  Approach:     Testing             Testing             Estimation
  Method:       Hypothesis tests    Quoting p-values    Confidence intervals
  Results in:   Accept/Reject       p-value             Approx. mean

Confidence Intervals
A confidence interval is used to express the uncertainty in a quantity being estimated. There is uncertainty because inferences are based on a random sample of finite size from a population or process of interest. To judge the statistical procedure we can ask what would happen if we were to repeat the same study, over and over, getting different data (and thus different confidence intervals) each time.

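The "repeat the same study over and over" idea can be tried out with a small simulation. The sketch below is only an illustration under stated assumptions: the population is normal with a known standard deviation, and the function name `coverage` and its default values (mean 100, standard deviation 15, samples of 30, 2000 repetitions) are hypothetical choices, not part of the original slides.

```python
import random
from statistics import NormalDist, mean


def coverage(true_mean=100.0, sigma=15.0, n=30, trials=2000, level=0.95, seed=1):
    """Return the fraction of simulated studies whose interval contains true_mean.

    Assumes a normal population with known standard deviation sigma.
    """
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        # The interval (sample mean +/- half_width) covers the true mean
        # exactly when the sample mean is within half_width of it.
        if abs(mean(sample) - true_mean) <= half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Fraction of intervals covering the true mean: {coverage():.3f}")
```

With these settings the printed fraction should come out close to 0.95: each individual interval either contains the true mean or does not, but the procedure captures it in roughly 95% of repetitions.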
If we know the true population mean μ and sample n individuals, then, if the data are normally distributed, the mean of these n samples has a 95% chance of falling into the interval
  $\mu \pm 1.96 \times SE$
where the standard error SE may be calculated from the population standard deviation σ as follows:
  $SE = \frac{\sigma}{\sqrt{n}}$

Example 1
Does FF-PD-G have more of the popular vote than FG-L? In a random sample of 721 respondents:
- 382 FF-PD-G
- 339 FG-L
Can we conclude that FF-PD-G has more than 50% of the popular vote?

Example 1 - Solution
- Sample proportion: p = 382/721 = 0.53
- Sample size: n = 721
- Standard error: $\sqrt{p(1-p)/n}$ = 0.02
- 95% confidence interval: 0.53 ± 1.96(0.02) = 0.53 ± 0.04 = [0.49, 0.57]
Thus, we cannot conclude that FF-PD-G had more of the popular vote, since this interval spans 50%. So, we say: "the data are consistent with the hypothesis that there is no difference".

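The arithmetic in Example 1 can be reproduced with a few lines of Python. This is a minimal sketch of the normal-approximation interval for a proportion used in the solution; the function name `proportion_ci` is a hypothetical helper, and the code works with the unrounded standard error rather than the rounded 0.02.

```python
from math import sqrt


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion.

    z is the multiplier for the chosen confidence level (1.96 for 95%).
    """
    p = successes / n
    se = sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se


if __name__ == "__main__":
    low, high = proportion_ci(382, 721)
    print(f"95% CI: [{low:.3f}, {high:.3f}]")
    print("Interval spans 50%:", low < 0.5 < high)
```

Without intermediate rounding the interval is slightly narrower than the one on the slide, but it agrees to two decimal places and still contains 0.5, so the conclusion is unchanged.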
Example 2
Did Obama have more of the popular vote than McCain? In a random sample of 1000 respondents:
- 532 Obama
- 468 McCain
Can we conclude that Obama had more than 50% of the popular vote?

Example 2 - 95% CI
- Sample proportion: p = 532/1000 = 0.532
- Sample size: n = 1000
- Standard error: $\sqrt{p(1-p)/n}$ = 0.016
- 95% confidence interval: 0.532 ± 1.96(0.016) = 0.532 ± 0.03136 = [0.5006, 0.56336]
Thus, we can conclude that Obama had more of the popular vote, since this interval does not span 50%. So, we say: "the data are consistent with the hypothesis that there is a difference in a 95% CI".

Example 2 - 99% CI
- Sample proportion: p = 532/1000 = 0.532
- Sample size: n = 1000
- Standard error: $\sqrt{p(1-p)/n}$ = 0.016
- 99% confidence interval: 0.532 ± 2.58(0.016) = 0.532 ± 0.041 = [0.491, 0.573]
Thus, we cannot conclude that Obama had more of the popular vote, since this interval does span 50%. So, we say: "the data are consistent with the hypothesis that there is no difference in a 99% CI".

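The multipliers 1.96 and 2.58 used in these intervals are quantiles of the standard normal distribution. As a rough sketch, they can be recovered with the inverse CDF from Python's standard library instead of a printed table; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist


def z_critical(level):
    """Two-sided critical value: the central `level` of N(0, 1) lies within +/- z."""
    # A central area of `level` leaves (1 - level) / 2 in each tail,
    # so the upper cut-off is the quantile at 0.5 + level / 2.
    return NormalDist().inv_cdf(0.5 + level / 2)


if __name__ == "__main__":
    for level in (0.95, 0.99):
        print(f"{level:.0%} confidence: z = {z_critical(level):.2f}")
```

A higher confidence level pushes the cut-off further into the tails, which is why the 99% interval is wider than the 95% interval for the same data.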
Example 2 - 99.99% CI
- Sample proportion: p = 532/1000 = 0.532
- Sample size: n = 1000
- Standard error: $\sqrt{p(1-p)/n}$ = 0.016
- 99.99% confidence interval: 0.532 ± 3.89(0.016) = 0.532 ± 0.06 = [0.472, 0.592]
Thus, we cannot conclude that Obama had more of the popular vote, since this interval does span 50%. So, we say: "the data are consistent with the hypothesis that there is no difference in a 99.99% CI".

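The same poll supports a difference at 95% confidence but not at 99% or 99.99%. One way to see why is to compute a p-value for the hypothesis that the true proportion is exactly 50%. The sketch below is an assumption-laden illustration rather than the method on the slides: it uses a two-sided z-test with the standard error computed under H0 (p0 = 0.5), which differs slightly from the standard error used in the intervals, and `proportion_z_test` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0=0.5):
    """Two-sided z-test of H0: true proportion == p0 (normal approximation)."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    z, p_value = proportion_z_test(532, 1000)
    print(f"z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```

For the Example 2 data the p-value falls between 0.01 and 0.05, which matches the pattern in the three intervals: the result is significant at the 5% level but not at the 1% level or beyond.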
T-Tests

One Tail / Two Tail
- T-test
- Z-test

T-Tests
- A powerful parametric test for calculating the significance of a small sample mean
- Necessary for small samples because their distributions are not normal
- One first has to calculate the "degrees of freedom"

The t-test is often called the Student's t-test. It was created by a chief brewer named William S. Gosset, who worked for the Guinness Brewery. He discovered this statistic as part of his work in the brewery to compare the different brewing processes for changing raw materials into beer. Guinness did not allow its employees to publish results, but the management decided to allow Gosset to publish it under a pseudonym: Student. Hence we have th