{"id":6629,"date":"2022-05-21T14:24:00","date_gmt":"2022-05-21T12:24:00","guid":{"rendered":"http:\/\/joapen.com\/blog\/?p=6629"},"modified":"2022-09-23T12:46:26","modified_gmt":"2022-09-23T10:46:26","slug":"crisp-dm-methodology","status":"publish","type":"post","link":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/","title":{"rendered":"CRISP-DM methodology"},"content":{"rendered":"\n<p>The<strong>&nbsp;cross-industry standard process for data mining<\/strong>&nbsp;or&nbsp;<strong>CRISP-DM<\/strong>&nbsp;is an open standard process framework model for data mining project planning, created in 1996.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"830\" height=\"679\" src=\"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png\" alt=\"\" class=\"wp-image-6630\" srcset=\"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png 830w, https:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21-300x245.png 300w, https:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21-768x628.png 768w\" sizes=\"auto, (max-width: 830px) 100vw, 830px\" \/><\/figure>\n\n\n\n<p id=\"7263\">The CRISP-DM process is divided into 6 phases or components:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Business understanding\u00a0\u2013 What does the business need?<\/li><li>Data understanding\u00a0\u2013 What data do we have \/ need? Is it clean?<\/li><li>Data preparation\u00a0\u2013 How do we organize the data for modeling?<\/li><li>Modeling\u00a0\u2013 What modeling techniques should we apply?<\/li><li>Evaluation\u00a0\u2013 Which model best meets the business objectives?<\/li><li>Deployment\u00a0\u2013 How do stakeholders access the results?<\/li><\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">1. 
Business understanding<\/h2>\n\n\n\n<p>Basic checklist:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Understand business requirements<\/li><li>Form a business question<\/li><li>Turn the business question into one or more ML questions<\/li><li>Define criteria for a successful outcome of the project<\/li><li>Highlight the project&#8217;s critical features<\/li><li>List assumptions<\/li><li>List resources (especially data resources)<\/li><li>List risks and potential mitigation actions<\/li><li>Build a return on investment (ROI) calculation<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">2. Data understanding<\/h2>\n\n\n\n<p>Basic checklist:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Data collection<ul><li>Define steps to extract data<\/li><li>Analyze data to detect additional requirements<\/li><li>Consider other data sources<\/li><\/ul><\/li><li>Data properties<ul><li>Describe data: amount, metadata, properties<\/li><li>Find features and relationships in the data<\/li><\/ul><\/li><li>Quality<ul><li>Verify attributes<\/li><li>Identify missing data<\/li><li>Reveal inconsistencies<\/li><li>List problems<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">3. Data preparation<\/h2>\n\n\n\n<p>Basic checklist:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Final dataset selection<ul><li>Analyze constraints: size, included\/excluded columns, record selection, data types<\/li><\/ul><\/li><li>Final dataset preparation<ul><li>Steps: clean, transform, merge, format<\/li><li>Make decisions about what to do with missing data. Take notes about these decisions.<\/li><li>Make decisions about what to do with missing properties. 
Take notes about these decisions.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">4. Modeling<\/h2>\n\n\n\n<p>Basic checklist:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Select a model and create it<\/li><li>Create a testing plan for that model<ul><li>Split the data into training and testing datasets<\/li><li>Define the model evaluation criteria<\/li><\/ul><\/li><li>Tune and test the different available parameters<ul><li>Tweak the model for better performance<\/li><li>Describe the trained models and report findings<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">5. Evaluation<\/h2>\n\n\n\n<p>Basic checklist:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>What is the accuracy of the model?<\/li><li>How well does the model generalize to unseen\/unknown data?<\/li><li>Perform quality assurance checks<ul><li>Were any important criteria overlooked?<\/li><li>What is the model performance using determined data?<\/li><li>Is data available for future training?<\/li><\/ul><\/li><li>Does the model meet the success criteria? 
yes \/ no<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">6. Deployment<\/h2>\n\n\n\n<p>Basic checklist:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Planning deployment<ul><li>Understand infrastructure deployment requirements and processes<\/li><li>Understand application deployment requirements and processes<\/li><li>Understand the environments required for deployment (testing, QA, production)<\/li><li>Determine how code is going to be managed between environments (probably done already)<\/li><\/ul><\/li><li>Maintenance and monitoring<ul><li>Enable the required monitoring tools, views and reports.<\/li><li>Review data quality thresholds.<\/li><\/ul><\/li><li>Final report<ul><li>Document all processes used in the project<\/li><li>Review all goals defined for the project and whether each was met or not (and why)<\/li><li>Detail findings of the project<\/li><li>Explain the model used and the reason it was selected as the right one.<\/li><li>Explain the models that were tested but ultimately discarded, and the reasons why<\/li><li>Identify the customer groups using this model<\/li><\/ul><\/li><li>Project review<ul><li>Summarize the results and present to stakeholders<\/li><\/ul><\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The&nbsp;cross-industry standard process for data mining&nbsp;or&nbsp;CRISP-DM&nbsp;is an open standard process framework model for data mining project planning, created in 1996. The CRISP-DM process is divided into 6 phases or components: Business understanding\u00a0\u2013 What does the business need? Data understanding\u00a0\u2013 What data do we have \/ need? Is it clean? 
Data preparation\u00a0\u2013 How do we organize &#8230; <a title=\"CRISP-DM methodology\" class=\"read-more\" href=\"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/\" aria-label=\"Read more about CRISP-DM methodology\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[151],"tags":[],"class_list":["post-6629","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>CRISP-DM methodology -<\/title>\n<meta name=\"description\" content=\"The&nbsp;cross-industry standard process for data mining&nbsp;or&nbsp;CRISP-DM&nbsp;is an open standard process framework model for data mining project - joapen projects\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"CRISP-DM methodology -\" \/>\n<meta property=\"og:description\" content=\"The&nbsp;cross-industry standard process for data mining&nbsp;or&nbsp;CRISP-DM&nbsp;is an open standard process framework model for data mining project - joapen projects\" \/>\n<meta property=\"og:url\" content=\"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/\" \/>\n<meta property=\"og:site_name\" content=\"joapen projects\" \/>\n<meta property=\"article:published_time\" content=\"2022-05-21T12:24:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-09-23T10:46:26+00:00\" \/>\n<meta property=\"og:image\" 
content=\"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png\" \/>\n<meta name=\"author\" content=\"joapen\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"joapen\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/\"},\"author\":{\"name\":\"joapen\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\"},\"headline\":\"CRISP-DM methodology\",\"datePublished\":\"2022-05-21T12:24:00+00:00\",\"dateModified\":\"2022-09-23T10:46:26+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/\"},\"wordCount\":456,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\"},\"image\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/09\\\/image-21.png\",\"articleSection\":[\"Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/\",\"url\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/\",\"name\":\"CRISP-DM methodology 
-\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/09\\\/image-21.png\",\"datePublished\":\"2022-05-21T12:24:00+00:00\",\"dateModified\":\"2022-09-23T10:46:26+00:00\",\"description\":\"The&nbsp;cross-industry standard process for data mining&nbsp;or&nbsp;CRISP-DM&nbsp;is an open standard process framework model for data mining project - joapen projects\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#primaryimage\",\"url\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/09\\\/image-21.png\",\"contentUrl\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2022\\\/09\\\/image-21.png\",\"width\":830,\"height\":679},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2022\\\/05\\\/21\\\/crisp-dm-methodology\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/joapen.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"CRISP-DM methodology\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/joapen.com\\\/blog\\\/\",\"name\":\"joapen projects\",\"description\":\"Just a place to 
write\",\"publisher\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/joapen.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\",\"name\":\"joapen\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\",\"url\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\",\"contentUrl\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\",\"width\":400,\"height\":400,\"caption\":\"joapen\"},\"logo\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\"},\"sameAs\":[\"http:\\\/\\\/www.joapen.com\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"CRISP-DM methodology -","description":"The&nbsp;cross-industry standard process for data mining&nbsp;or&nbsp;CRISP-DM&nbsp;is an open standard process framework model for data mining project - joapen projects","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/","og_locale":"en_US","og_type":"article","og_title":"CRISP-DM methodology -","og_description":"The&nbsp;cross-industry standard process for data mining&nbsp;or&nbsp;CRISP-DM&nbsp;is an open standard process framework model for data mining project - joapen projects","og_url":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/","og_site_name":"joapen projects","article_published_time":"2022-05-21T12:24:00+00:00","article_modified_time":"2022-09-23T10:46:26+00:00","og_image":[{"url":"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png","type":"","width":"","height":""}],"author":"joapen","twitter_misc":{"Written by":"joapen","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#article","isPartOf":{"@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/"},"author":{"name":"joapen","@id":"https:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217"},"headline":"CRISP-DM methodology","datePublished":"2022-05-21T12:24:00+00:00","dateModified":"2022-09-23T10:46:26+00:00","mainEntityOfPage":{"@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/"},"wordCount":456,"commentCount":0,"publisher":{"@id":"https:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217"},"image":{"@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#primaryimage"},"thumbnailUrl":"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png","articleSection":["Machine Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/","url":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/","name":"CRISP-DM methodology -","isPartOf":{"@id":"https:\/\/joapen.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#primaryimage"},"image":{"@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#primaryimage"},"thumbnailUrl":"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png","datePublished":"2022-05-21T12:24:00+00:00","dateModified":"2022-09-23T10:46:26+00:00","description":"The&nbsp;cross-industry standard process for data mining&nbsp;or&nbsp;CRISP-DM&nbsp;is an open standard process framework model for data mining project - joapen 
projects","breadcrumb":{"@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#primaryimage","url":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png","contentUrl":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2022\/09\/image-21.png","width":830,"height":679},{"@type":"BreadcrumbList","@id":"https:\/\/joapen.com\/blog\/2022\/05\/21\/crisp-dm-methodology\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/joapen.com\/blog\/"},{"@type":"ListItem","position":2,"name":"CRISP-DM methodology"}]},{"@type":"WebSite","@id":"https:\/\/joapen.com\/blog\/#website","url":"https:\/\/joapen.com\/blog\/","name":"joapen projects","description":"Just a place to 
write","publisher":{"@id":"https:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/joapen.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217","name":"joapen","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg","url":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg","contentUrl":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg","width":400,"height":400,"caption":"joapen"},"logo":{"@id":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg"},"sameAs":["http:\/\/www.joapen.com"]}]}},"_links":{"self":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts\/6629","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/comments?post=6629"}],"version-history":[{"count":2,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts\/6629\/revisions"}],"predecessor-version":[{"id":6636,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts\/6629\/revisions\/6636"}],"wp:attachment":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/media?parent=6629"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/categories?post=6629"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/joapen.c
om\/blog\/wp-json\/wp\/v2\/tags?post=6629"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}