Uncovering Hidden Wealth in Your Data Warehouse
An Introduction to Data Mining

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. The automated, prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools typical of decision support systems. Data mining tools can answer business questions that traditionally were too time consuming to resolve. They scour databases for hidden patterns, finding predictive information that experts may miss because it lies outside their expectations.

Most companies already collect and refine massive quantities of data. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. When implemented on high performance client/server or parallel processing computers, data mining tools can analyze massive databases to deliver answers to questions such as, "Which clients are most likely to respond to my promotional mailing, and why?"

This white paper provides an introduction to the basic technologies of data mining. It illustrates the technology's relevance to the business environment with examples of profitable applications, and describes how data warehouse architectures can evolve to deliver the value of data mining to end users.

Data mining techniques are the result of a long process of research and product development. This evolution began when business data was first stored on computers, continued with improvements in data access, and more recently generated technologies that allow users to navigate through their data in real time. Data mining takes this evolutionary process beyond retrospective data access and navigation to prospective and proactive information delivery. Data mining is ready for application in the business community because it is supported by three technologies that are now sufficiently mature:

• Massive data collection
• Powerful multiprocessor computers
• Data mining algorithms

Commercial databases are growing at unprecedented rates. A recent META Group survey of data warehouse projects found that 19% of respondents are beyond the gigabyte level, while 59% expect to be there by second quarter of 1996.[1] In some industries, such as retail, these numbers can be much larger. The accompanying need for improved computational engines can now be met in a cost-effective manner with parallel multiprocessor computer technology. Data mining algorithms embody techniques that have existed for at least 10 years, but have only recently been implemented as mature, reliable, understandable tools that consistently outperform older statistical methods.

In the evolution from business data to business information, each new step has built upon the previous one. For example, dynamic data access is critical for drill-through in data navigation applications, and the ability to store large databases is critical to data mining. From the user's point of view, the four steps listed in Table 1 were revolutionary because they allowed new business questions to be answered accurately and quickly.

Evolutionary step: Data Collection (1960s)
    Business question: "What was my total revenue in the last five years?"
    Enabling technologies: Computers, tapes, disks
    Product providers: IBM, CDC
    Characteristics: Retrospective, static data delivery

Evolutionary step: Data Access (1980s)
    Business question: "What were unit sales in New England last March?"
    Enabling technologies: Relational databases (RDBMS), Structured Query Language (SQL), ODBC
    Product providers: Oracle, Sybase, Informix, IBM, Microsoft
    Characteristics: Retrospective, dynamic data delivery at record level

Evolutionary step: Data Warehousing & Decision Support (1990s)
    Business question: "What were unit sales in New England last March? Drill down to Boston."
    Enabling technologies: On-line analytic processing (OLAP), multidimensional databases, data warehouses
    Product providers: Pilot, Comshare, Arbor, Cognos, Microstrategy
    Characteristics: Retrospective, dynamic data delivery at multiple levels

Evolutionary step: Data Mining (Emerging Today)
    Business question: "What's likely to happen to Boston unit sales next month? Why?"
    Enabling technologies: Advanced algorithms, multiprocessor computers, massive databases
    Product providers: Pilot, Lockheed, IBM, SGI, numerous startups (nascent industry)
    Characteristics: Prospective, proactive information delivery

Table 1. Steps in the Evolution of Data Mining.

The core components of data mining technology have been under development for decades, in research areas such as statistics, artificial intelligence, and machine learning. Today, the maturity of these techniques, coupled with high-performance relational database engines and broad data integration efforts, makes these technologies practical for current data warehouse environments.

The Scope of Data Mining

Data mining derives its name from the similarities between searching for valuable business information in a large database (for example, finding linked products in gigabytes of store scanner data) and mining a mountain for a vein of valuable ore. Both processes require either sifting through an immense amount of material, or intelligently probing it to find exactly where the value resides.
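The store scanner example can be made concrete with a small sketch. The code below is only an illustration, not a method described in this paper: it counts how often each pair of products appears in the same checkout and keeps the pairs that reach a minimum share of baskets. The function name `linked_products`, the `min_support` parameter, and the sample baskets are all assumptions invented for the example.

```python
from collections import Counter
from itertools import combinations


def linked_products(baskets, min_support=0.5):
    """Return product pairs bought together in at least min_support of baskets."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each unordered pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    n = len(baskets)
    return {pair: count / n for pair, count in pair_counts.items()
            if count / n >= min_support}


# Hypothetical scanner data: each inner list is one checkout.
baskets = [
    ["bread", "milk", "diapers"],
    ["beer", "diapers", "chips"],
    ["beer", "diapers", "milk"],
    ["bread", "milk"],
]
pairs = linked_products(baskets, min_support=0.5)
print(pairs)
```

Counting every pair is the "sifting through an immense amount of material" end of the analogy. On real scanner volumes a tool would more likely prune candidate pairs early rather than enumerate them all.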

Given databases of sufficient size and quality, data mining technology can generate new business opportunities by providing these capabilities:

• Automated prediction of trends and behaviors. Data mining automates the process of finding predictive information in large databases. Questions that traditionally required extensive hands-on analysis can now be answered directly from the data, quickly. A typical example of a predictive problem is targeted marketing. Data mining uses data on past promotional mailings to identify the targets most likely to maximize return on investment in future mailings. Other predictive problems include forecasting bankruptcy and other forms of default, and identifying segments of a population likely to respond similarly to given events.

• Automated discovery of previously unknown patterns. Data mining tools sweep through databases and identify previously hidden patterns in one step. An example of pattern discovery is the analysis of retail sales data to identify seemingly unrelated products that are often purchased together. Other pattern discovery problems include detecting fraudulent credit card transactions and identifying anomalous data that could represent data entry keying errors.

Data mining techniques can yield the benefits of automation on existing software and hardware platforms, and can be implemented on new systems as existing platforms are upgraded and new products developed. When data mining tools are implemented on high performance parallel processing systems, they can analyze massive databases in minutes. Faster processing means that users can automatically experiment with more models to understand complex data. High speed makes it practical for users to analyze huge quantities of data. Larger databases, in turn, yield improved predictions.

Databases can be larger in both depth and breadth:

• More columns. Analysts must often limit the number of variables they examine when doing hands-on analysis due to time constraints. Yet variables that are discarded because they seem unimportant may carry information about unknown patterns. High performance data mining allows users to explore the full depth of a database, without preselecting a subset of variables.

• More rows. Larger samples yield lower estimation errors and variance, and allow users to make inferences about small but important segments of a population.

A recent Gartner Group Advanced Technology Research Note listed data mining and artificial intelligence at the top of the five key technology areas expected to have an impact across a wide range of industries within the next 3 to 5 years.[2] Gartner also listed parallel architectures and data mining as two of the top 10 new technologies in which companies will invest during the next 5 years. According to a recent Gartner HPC Research Note, "With the rapid advance in data capture, transmission and storage, large-systems users increasingly need to implement innovative ways to mine the after-market value of their vast stores of detail data, employing MPP [massively parallel processing] systems to create sources of business advantage (0.9 probability)."[3]

The most commonly used techniques in data mining are:

• Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.

• Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi Square Automatic Interaction Detection (CHAID).

• Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of evolution.

• Nearest neighbor: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1). Sometimes called the k-nearest neighbor technique.

• Rule induction: The extraction of useful if-then rules from data based on statistical significance.

Many of these technologies have been in use for more than a decade in specialized analysis tools that work with relatively small volumes of data. These capabilities are now evolving to integrate directly with industry-standard data warehouse and OLAP platforms. The appendix to this white paper provides a glossary of data mining terms.

How Data Mining Works

How exactly is data mining able to tell you important things that you didn't know, or what is going to happen next? The technique used to perform these feats is called modeling. Modeling is simply the act of building a model in one situation where you know the answer and then applying it to another situation where you don't. For instance, if you were looking for a sunken Spanish galleon on the high seas, the first thing you might do is research the times when Spanish treasure had been found by others in the past. You might note that these ships often tend to be found off the coast of Bermuda, that there are certain characteristics to the ocean currents, and that certain routes were likely taken by the ships' captains in that era. You note these similarities and build a model that includes the characteristics that are common to the locations of these sunken treasures. With this model in hand you sail off looking for treasure where your model indicates it is most likely to be, given a similar situation in the past. Hopefully, if you've got a good model, you find your treasure.

This act of model building is thus something that people have been doing for a long time, certainly before the advent of computers or data mining technology. What happens on computers, however, is not much different from the way people build models. Computers are loaded up with lots of information about a variety of situations where an answer is known, and then the data mining software on the computer must run through that data and distill the characteristics of the data that should go into the model. Once the model is built it can then be used in similar situations where you don't know the answer.
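The modeling loop described here (learn from situations where the answer is known, then apply the result where it is not) can be sketched with the k-nearest neighbor technique mentioned in the list of techniques. This is a minimal illustration under stated assumptions: the function `knn_predict`, the two features (age and income), and every record below are invented for the example and do not come from the paper.

```python
import math
from collections import Counter


def knn_predict(known, query, k=3):
    """Classify `query` by majority vote among its k nearest known records.

    `known` is a list of (features, answer) pairs: the situations where the
    answer is already known. `query` is a feature tuple with no answer yet.
    """
    by_distance = sorted(known, key=lambda rec: math.dist(rec[0], query))
    votes = Counter(answer for _, answer in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical historical records: (age, income in $1000s) -> responded to mailing?
known = [
    ((25, 30), "no"),
    ((30, 35), "no"),
    ((28, 40), "no"),
    ((45, 80), "yes"),
    ((50, 90), "yes"),
    ((48, 85), "yes"),
]

# New situations where the answer is not known.
print(knn_predict(known, (47, 82)))
print(knn_predict(known, (27, 33)))
```

The sketch compares raw feature values, so a feature with a large numeric range would dominate the distance. A practical tool would normally rescale the features first.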

For example, say that you are the director of marketing for a telecommunications company and you'd like to acquire some new long distance phone customers. You could just randomly go out and mail coupons to the general population, just as you could randomly sail the seas looking for sunken treasure. In neither case would you achieve the results you desired, and of course you have the opportunity to do much better than random: you could use the business experience stored in your database to build a model.

As the marketing director you have access to a lot of information about all of your customers: their age, sex, credit history and long distance calling usage. The good news is that you also have a lot of information about your prospective customers: their age, sex, credit history etc. Your problem is that you don't know the long distance calling usage of these prospects (since they are most likely now customers of your competition). You'd like to concentrate on those prospects who have large amounts of long distance usage. You can accomplish this by building a model. Table 2 illustrates the data used for building a model for customer prospecting in a data warehouse.

General information (e.g. demographic data)
    Customers: Known    Prospects: Known
Proprietary information (e.g. customer transactions)
    Customers: Known    Prospects: Target

Table 2 - Data Mining for Prospecting

The goal in prospecting is to make some calculated guesses about the information in the lower right hand quadrant, based on the model that we build going from Customer General Information to Customer Proprietary Information. For instance, a simple model for a telecommunications company might be: my customers who make more than $60,000/year spend more than $80/month on long distance.

This model could then be applied to the prospect data to try to tell something about the proprietary information that this telecommunications company does not currently have access to. With this model in hand, new customers can be selectively targeted.

Test marketing is an excellent source of data for this kind of modeling.
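The simple income rule can be sketched in code to show the two halves of prospecting: first check how well the rule holds for customers, whose usage is known, then apply it to prospects, whose usage is not. Everything below is an assumption made for illustration: the `Person` record, the helper names, and the sample figures are hypothetical, and only the $60,000/year and $80/month cutoffs come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    income: int                             # general information, $/year
    long_distance: Optional[float] = None   # proprietary, $/month; None if unknown


def rule_confidence(customers, income_cutoff=60_000, spend_cutoff=80):
    """Fraction of customers above the income cutoff who also exceed the spend cutoff."""
    matching = [c for c in customers if c.income > income_cutoff]
    if not matching:
        return 0.0
    hits = sum(1 for c in matching if c.long_distance > spend_cutoff)
    return hits / len(matching)


def target_prospects(prospects, income_cutoff=60_000):
    """Apply the rule to prospects, whose long distance usage is not known."""
    return [p.name for p in prospects if p.income > income_cutoff]


# Hypothetical records, invented for illustration.
customers = [
    Person("A", 75_000, 95.0),
    Person("B", 62_000, 88.0),
    Person("C", 90_000, 60.0),
    Person("D", 40_000, 20.0),
]
prospects = [Person("P1", 85_000), Person("P2", 30_000), Person("P3", 61_000)]

print(rule_confidence(customers))
print(target_prospects(prospects))
```

The confidence figure is what tells the marketing director whether the rule is worth acting on. A rule that holds for only a small share of high-income customers would be a poor basis for a mailing.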

Mining the results of a test market representing a broad but relatively small sample of prospects can provide a foundation for identifying good prospects in the overall market. Table 3 shows another common scenario for building models: predict what is going to happen in the future.

Static information and current plans (e.g. demographic data, marketing plans)
    Yesterday: Known    Today: Known    Tomorrow: Known
Dynamic information (e.g. customer transactions)
    Yesterday: Known    Today: Known    Tomorrow: Target

Table 3 - Data Mining for Predictions

If someone told you that he had a model that could predict customer usage, how would you know if he really had a good model? The first thing you might try would be to ask him to apply his model to your customer base, where you already knew the answer. With data mining, the best way to accomplish this is by setting aside some of your data in a vault to isolate it from the mining process.
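The vault idea corresponds to what is commonly called a hold-out set. The sketch below is one possible reading of it, not the paper's procedure: records are shuffled, a fraction is locked away, a toy "mining" step learns an income cutoff from the remaining records only, and the cutoff is then scored against the vault. The function names, the 30% vault size, and the synthetic records are assumptions made for the example.

```python
import random


def split_vault(records, vault_fraction=0.3, seed=0):
    """Shuffle records and set a fraction aside in a 'vault' unseen by mining."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - vault_fraction))
    return shuffled[:cut], shuffled[cut:]


def mine_income_cutoff(training):
    """Toy 'mining' step: pick the income cutoff that best separates heavy users."""
    best_cutoff, best_correct = None, -1
    for cutoff in sorted({income for income, _ in training}):
        correct = sum((income >= cutoff) == heavy for income, heavy in training)
        if correct > best_correct:
            best_cutoff, best_correct = cutoff, correct
    return best_cutoff


def accuracy(cutoff, records):
    """Share of records where the cutoff rule matches the known answer."""
    return sum((income >= cutoff) == heavy for income, heavy in records) / len(records)


# Hypothetical customers: (income in $1000s, is a heavy long distance user?)
records = [(income, income >= 60) for income in range(20, 120, 5)]

training, vault = split_vault(records)
cutoff = mine_income_cutoff(training)
print("cutoff:", cutoff)
print("accuracy on vault:", accuracy(cutoff, vault))
```

Because the vault records played no part in choosing the cutoff, the vault accuracy is an honest estimate of how the rule would fare on data it has never seen. Accuracy on the training records alone would tend to flatter the model.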

Once the mining is complete, the results can be tested against the data held in the vault to confirm the model's validity. If the model works, its observations should hold for the vaulted data.

An Architecture for Data Mining

To be applied most effectively, these advanced techniques must be fully integrated with a data warehouse as well as with flexible, interactive business analysis tools.

Many data mining tools currently operate outside of the warehouse, requiring extra steps for extracting, importing, and analyzing the data. Furthermore, when new insights require operational implementation, integration with the warehouse simplifies the application of results from data mining.
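A rough way to picture what integration buys is to apply a mined rule inside the database itself instead of exporting the rows to a separate tool. The sketch below assumes an in-memory SQLite database standing in for the warehouse and a hypothetical income rule standing in for a mined model. The table name, column names, and cutoff are invented for illustration and say nothing about how any particular product works.

```python
import sqlite3

# An in-memory SQLite database stands in for the warehouse (an assumption
# made for illustration; the paper does not name a specific product here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prospects (name TEXT, income INTEGER, target INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO prospects (name, income) VALUES (?, ?)",
    [("P1", 85000), ("P2", 30000), ("P3", 61000)],
)

# Hypothetical mined rule: prospects with income above a cutoff are targets.
# Applying it as an UPDATE scores rows in place, with no extract/import step.
INCOME_CUTOFF = 60000
conn.execute("UPDATE prospects SET target = 1 WHERE income > ?", (INCOME_CUTOFF,))
conn.commit()

targets = [row[0] for row in conn.execute(
    "SELECT name FROM prospects WHERE target = 1 ORDER BY name"
)]
print(targets)
```

Because the score is written back as a column, any reporting or campaign tool that already reads the table can use it immediately.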

The resulting analytic data warehouse can be applied to improve business processes throughout the organization, in areas such as promotional campaign management, fraud detection, new product rollout, and so on. Figure 1 illustrates an architecture for advanced analysis in a large data warehouse.

Figure 1 - Integrated Data Mining Architecture

The ideal starting point is a data warehouse containing a combination of internal data tracking all customer contact coupled with external market data about competitor activity. Background information on potential customers also provides an excellent basis for prospecting. This warehouse can be implemented in a variety of relational database systems (Sybase, Oracle, Redbrick, and so on) and should be optimized for flexible and fast data access.

An OLAP (On-Line Analytical Processing) server enables a more sophisticated end-user business model to be applied when navigating the data warehouse. The multidimensional structures allow users to analyze the data as they want to view their business, summarizing by product line, region, and other perspectives of their business. The Data Mining Server must be integrated with the data warehouse and the OLAP server to embed ROI-focused business analysis directly into this infrastructure. An advanced, process-centric metadata template defines the data mining objectives for specific business issues like campaign management, prospecting, and promotion optimization. Integration with the data warehouse enables operational decisions to be directly implemented and tracked. As the warehouse grows with new decisions and results, the organization can continually mine the best practices and apply them to future decisions.

This design represents a fundamental shift from conventional decision support systems. Rather than simply delivering data to the end user through query and reporting software, the Advanced Analysis Server applies users' business models directly to the warehouse and returns a proactive analysis of the most relevant information. These results enhance the metadata in the OLAP Server by providing a dynamic metadata layer that represents a distilled view of the data. Reporting, visualization, and other analysis tools can then be applied to plan future actions and confirm the impact of those plans.

Profitable Applications

A wide range of companies have deployed successful applications of data mining.

While early adopters of this technology have tended to be in information-intensive industries such as financial services and direct mail marketing, the technology is applicable to any company looking to leverage a large data warehouse to better manage its customer relationships. Two critical factors for success with data mining are a large, well-integrated data warehouse and a well-defined understanding of the business process within which data mining is to be applied (such as customer prospecting, retention, campaign management, and so on).

Some successful application areas include:

• A pharmaceutical company can analyze its recent sales force activity and the results to improve targeting of high-value physicians and determine which marketing activities will have the greatest impact in the next few months. The data needs to include competitor market activity as well as information about the local health care systems. The results can be distributed to the sales force via a wide-area network that enables the representatives to review the recommendations from the perspective of the key attributes in the decision process. The ongoing, dynamic analysis of the data warehouse allows best practices from throughout the organization to be applied in specific sales situations.

• A credit card company can leverage its vast warehouse of customer transaction data to identify customers most likely to be interested in a new credit product. Using a small test mailing, the attributes of customers with an affinity for the product can be identified. Recent projects have indicated more than a 20-fold decrease in costs for targeted mailing campaigns over conventional approaches.

• A diversified transportation company with a large direct sales force can apply data mining to identify the best prospects for its services. Using data mining to analyze its own customer experience, this company can build a unique segmentation identifying the attributes of high-value prospects. Applying this segmentation to a general business database such as those provided by Dun & Bradstreet can yield a prioritized list of prospects by region.

• A large consumer package goods company can apply data mining to improve its sales process to retailers. Data from consumer panels, shipments, and competitor activity can be applied to understand the reasons for brand and store switching. Through this analysis, the manufacturer can select promotional strategies that best reach its target customer segments.

Each of these examples has a clear common ground. They leverage the knowledge about customers implicit in a data warehouse to reduce costs and improve the value of customer relationships. These organizations can now focus their efforts on the most important (profitable) customers and prospects, and design targeted marketing strategies to best reach them.

Conclusion

Comprehensive data warehouses that integrate operational data with customer, supplier, and market information have resulted in an explosion of information.

Competition requires timely and sophisticated analysis on an integrated view of the data. However, there is a growing gap between more powerful storage and retrieval systems and the users' ability to effectively analyze and act on the information they contain. Both relational and OLAP technologies have tremendous capabilities for navigating massive data warehouses, but brute force navigation of data is not enough. A new technological leap is needed to structure and prioritize information for specific end-user problems. Data mining tools can make this leap. Quantifiable business benefits have been proven through the integration of data mining with current information systems, and new products are on the horizon that will bring this integration to an even wider audience of users.

[1] META Group Application Development Strategies: "Data Mining for Data Warehouses: Uncovering Hidden Patterns.", 7/13/95.
[2] Gartner Group Advanced Technologies and Applications Research Note, 2/1/95.
[3] Gartner Group High Performance Computing Research Note, 1/31/95.

Glossary of Data Mining Terms

analytical model: A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

anomalous data: Data that result from errors (for example, data entry keying errors) or that represent unusual events. Anomalous data should be examined carefully because it may carry important information.

artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.

CART: Classification and Regression Trees. A decision tree technique used for classification of a dataset. Provides a set of rules that you can apply to a new (unclassified) dataset to predict which records will have a given outcome. Segments a dataset by creating 2-way splits. Requires less data preparation than CHAID.

CHAID: Chi Square Automatic Interaction Detection. A decision tree technique used for classification of a dataset. Provides a set of rules that you can apply to a new (unclassified) dataset to predict which records will have a given outcome. Segments a dataset by using chi square tests to create multi-way splits. Predates CART and requires more data preparation.

classification: The process of dividing a dataset into mutually exclusive groups such that the members of each group are as "close" as possible to one another, and different groups are as "far" as possible from one another, where distance is measured with respect to the specific variable(s) you are trying to predict. For example, a typical classification problem is to divide a database into groups that are as homogeneous as possible with respect to a creditworthiness variable with values "Good" and "Bad."

clustering: The process of dividing a dataset into mutually exclusive groups such that the members of each group are as "close" as possible to one another, and different groups are as "far" as possible from one another, where distance is measured with respect to all available variables.

data cleansing: The process of ensuring that all values in a dataset are consistent and correctly recorded.

data mining: The extraction of hidden predictive information from large databases.

data navigation: The process of viewing different dimensions, slices, and levels of detail of a multidimensional database. See OLAP.

data visualization: The visual interpretation of complex relationships in multidimensional data.

data warehouse: A system for storing and delivering massive quantities of data.

decision tree: A tree-shaped structure that represents a set of decisions. These decisions generate rules for the classification of a dataset. See CART and CHAID.

dimension: In a flat or relational database, each field in a record represents a dimension. In a multidimensional database, a dimension is a set of similar entities; for example, a multidimensional sales database might include the dimensions Product, Time, and City.

exploratory data analysis: The use of graphical and descriptive statistical techniques to learn about the structure of a dataset.

genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natura
