Welcome to the Nanodegree program
02. Meet the Instructors00:00
03. Term 2 Projects00:00
03.2 Term 2 Projects00:00
03.3 Term 2 Projects00:00
03.4 Term 2 Projects00:00
04. Program Structure & Syllabus
05. Learning Plan – First Two Weeks
06. How to Succeed00:00
Words of Encouragement00:00
The Skills That Set You Apart
02. Interview: Robert Chang [AirBnB]00:00
03. Interview: Caroline [BMG]00:00
04. Interview: Dan [Coinbase]00:00
05. Interview: Richard [Starbucks]00:00
06. Outro00:00
The Data Science Process
02. Video: CRISP-DM00:00
03. Video: The Data Science Process – Business & Data00:00
04. Video: Business & Data Understanding – Example00:00
05. Screencast: Using Workspaces00:00
06. Quiz + Notebook: A Look at the Data
07. Screencast: A Look at the Data00:00
08. What Should You Check?
09. Video: Business & Data Understanding00:00
10. Video: Gathering & Wrangling00:00
11. Screencast: How To Break Into the Field?00:00
12. Notebook + Quiz: How To Break Into the Field
13. Screencast: How to Break Into the Field Solution00:00
14. Screencast: Bootcamps00:00
15.1 Quiz: Bootcamp Takeaways
15.2 Quiz: Bootcamp Takeaways
16. Notebook + Quiz: Job Satisfaction
17. Screencast: Job Satisfaction00:00
18. Video: It Is Not Always About ML00:00
19. Video: The Data Science Process – Modeling00:00
20. Video: Predicting Salary00:00
21. Screencast: Predicting Salary00:00
22. Notebook + Quiz: What Happened?
23. Screencast: What Happened Solution00:00
24. Video: Working With Missing Values00:00
25. Video: Removing Data – Why Not?00:00
26. Video: Removing Data – When Is It OK?00:00
27. Video: Removing Data – Other Considerations00:00
28. Quiz: Removing Data
29. Notebook + Quiz: Removing Values
30. Screencast: Removing Data Solution00:00
31. Notebook + Quiz: Removing Data Part II
32. Screencast: Removing Data Part II Solution00:00
33. Video: Imputing Missing Values00:00
34. Notebook + Quiz: Imputation Methods & Resources
35. Screencast: Imputation Methods & Resources Solution00:00
36. Notebook + Quiz: Imputing Values
37. Screencast: Imputing Values Solution00:00
38. Video: Working With Categorical Variables Refresher00:00
39. Notebook + Quiz: Categorical Variables
40. Screencast: Categorical Variables Solution00:00
41. Video: How to Fix This?00:00
42. Notebook + Quiz: Putting It All Together
43. Screencast + Notebook: Putting It All Together Solution00:00
44.1 Text + Quiz: Results
44.2 Text + Quiz: Results
44.3 Text + Quiz: Results
45. Video: The Data Science Process – Evaluate & Deploy00:00
46. Text: Recap
Communicating to Stakeholders
02. Video: First Things First00:00
03. Text: README Showcase
04. Video: Posting to Github00:00
05. Quiz: Github Check
06. Video: Up And Running On Medium00:00
07. Text: Medium Getting Started Post and Links
08. Video: Know Your Audience00:00
09. Who Is The Audience?
10. Video: Three Steps to Captivate Your Audience00:00
11. Video: First Catch Their Eye00:00
12. Picture First, Title Second
13. Video: More Advice00:00
14. More Advice
15. Video: End With A Call To Action00:00
16. End With A Call To Action
17. Video: Other Important Information00:00
18. Text: Recap
19. Video: Conclusion00:00
Project Write A Data Science Blog Post
01. Project Overview00:00
02. Project Motivation and Details
Project Description – Write A Data Science Blog Post
Project Rubric – Write A Data Science Blog Post
Introduction to Software Engineering
02. Course Overview0:58
Software Engineering Practices Pt I
02. Clean and Modular Code4:19
Quiz
03. Refactoring Code2:01
04. Writing Clean Code5:11
05. Quiz: Clean Code
06. Writing Modular Code5:25
07. Refactoring – Wine Quality
08. Solution: Refactoring – Wine Quality
09. Efficient Code1:45
10. Optimizing – Common Books3:35
Documentation1:20
In-line Comments1:38
Docstrings1:14
Project Documentation
19. Documentation
Version Control in Data Science0:41
Scenario #12:39
Scenario #21:19
Scenario #31:18
Model Versioning
Conclusion0:36
Software Engineering Practices Pt II
02. Testing1:03
03. Testing and Data Science1:50
04. Unit Tests2:36
05. Unit Testing Tools1:18
07. Test-Driven Development and Data Science2:23
08. Logging0:50
09. Log Messages
Quiz
11. Code reviews0:47
12. Questions to Ask Yourself When Conducting a Code Review
13. Tips for Conducting a Code Review
14. Conclusion00:27
OOP
01. Introduction1:21
02. Procedural vs. Object-Oriented Programming1:55
02. Quiz
03. Class, Object, Method and Attribute2:37
03. Quiz
04. OOP Syntax5:32
05. Exercise: OOP Syntax Practice – Part 100:00
06. A Couple of Notes about OOP4:39
07. Exercise: OOP Syntax Practice – Part 200:00
08. Commenting Object-Oriented Code
09. Gaussian Class1:33
10. How the Gaussian Class Works3:36
11. Exercise: Code the Gaussian Class00:00
12. Magic Methods1:47
12.1 Magic Methods in Code00:00
13. Exercise: Code Magic Methods00:00
14. Inheritance00:00
14.1 Inheritance Example V100:00
15. Exercise: Inheritance with Clothing00:00
16. Inheritance: Probability Distribution00:00
17. Demo: Inheritance Probability Distributions00:00
18. Advanced OOP Topics
19. Organizing into Modules3:27
20. Demo: Modularized Code00:00
21. Making a Package5:38
22. Virtual Environments2:24
24. Binomial Class00:00
24. Binomial Class 200:00
26. Scikit-learn Source Code00:00
27. Putting Code on PyPi00:00
29. Lesson Summary00:00
Portfolio Exercise: Upload a Package to PyPi
01. Introduction
02. Troubleshooting Possible Errors
03. Workspace00:00
Web Development
02. Lesson Overview1:02
03. Components of a Web App00:00
03. Quiz
04. The Front-End00:00
05. HTML00:00
05. Quiz
06. Exercise: HTML00:00
07. Div and Span00:00
08. IDs and Classes00:00
09. Exercise: HTML Div, Span, IDs, Classes00:00
10. CSS00:00
11. Exercise: CSS00:00
12. JavaScript00:00
13. Exercise: JavaScript00:00
14. Bootstrap Library00:00
15. Exercise: Bootstrap00:00
16. Plotly00:00
17. Exercise: Plotly00:00
18. The Backend00:00
19. The Web00:00
20. Flask4:59
21. Exercise: Flask00:00
22. Flask + Pandas00:00
23. Example: Flask + Pandas00:00
24. Flask+Plotly+Pandas Part 100:00
25. Flask+Plotly+Pandas Part 200:00
26. Flask+Plotly+Pandas Part 300:00
27. Flask+Plotly+Pandas Part 400:00
28. Example: Flask + Plotly + Pandas00:00
29. Exercise: Flask + Plotly + Pandas00:00
30. Deployment00:00
31. Exercise: Deployment00:00
32. Lesson Summary00:00
Portfolio Exercise: Deploy a Data Dashboard
02. Workspace Portfolio Exercise00:00
03. Troubleshooting Possible Errors
04. Congratulations3:32
05. APIs [advanced version]
06. World Bank API [advanced version]00:00
07. Python and APIs [advanced version]
08. World Bank Data Dashboard [advanced version]00:00
Introduction to Data Engineering
ETL Pipelines
02. Lesson Overview1:09
03. World Bank Datasets4:01
03. Quiz
05. Extract0:41
05. Overview of the Extract Part of the Lesson00:00
05. Quiz
06. Exercise: CSV
07. Exercise: JSON and XML
08. Exercise: SQL Databases
09. Extracting Text Data
10. Exercise: APIs
11. Transform3:11
11. Overview of the Transform Part of the Lesson00:00
12. Combining Data00:00
11. Quiz
13. Exercise: Combining Data
14. Cleaning Data1:31
15. Exercise: Cleaning Data
16. Exercise: Data Types
17. Exercise: Parsing Dates
18. Matching Encodings00:00
19. Exercise: Matching Encodings
20. Missing Data – Overview00:00
21. Missing Data – Delete00:00
22. Missing Data – Impute00:00
23. Exercise: Imputation
24. SQL, optimization, and ETL – Robert Chang Airbnb4:25
25. Duplicate Data00:00
26. Exercise: Duplicate Data
27. Dummy Variables00:00
28. Exercise: Dummy Variables
29. Outliers – How to Find Them00:00
30. Exercise: Outliers Part 1
31. Outliers – What to do00:00
32. Exercise: Outliers – Part 2
33. AI and Data Engineering – Robert Chang Airbnb2:09
34. Scaling Data00:00
35. Exercise: Scaling Data
36. Feature Engineering00:00
37. Exercise: Feature Engineering
38. Bloopers00:00
39. Load00:00
39. Overview of the Load Part of the Lesson00:00
40. Exercise: Load
41. Putting It All Together00:00
41. Overview of the Final Exercise00:00
42. Exercise: Putting It All Together
43. Lesson Summary00:00
Introduction to NLP
How NLP Pipelines Work00:00
Text Processing00:00
04. Cleaning00:00
05. Notebook: Cleaning
06. Normalization00:00
06. Quiz
07. Notebook: Normalization
08. Tokenization00:00
09. Notebook: Tokenization
10. Stop Word Removal00:00
11. Notebook: Stop Words
12. Part-of-Speech Tagging00:00
13. Named Entity Recognition00:00
14. Notebook: POS and NER
15. Stemming and Lemmatization00:00
16. Notebook: Stemming and Lemmatization
17. Text Processing Summary00:00
18. Feature Extraction00:00
19. Bag of Words00:00
20. TF-IDF00:00
21. Notebook: Bag of Words and TF-IDF
22. One-Hot Encoding00:00
23. Word Embeddings00:00
24. Modeling00:00
25. [OPTIONAL] Word2Vec00:00
26. [OPTIONAL] GloVe00:00
27. [OPTIONAL] Embeddings for Deep Learning00:00
28. [OPTIONAL] t-SNE00:00
Machine Learning Pipelines
02. Corporate Messaging Case Study4:11
03. Case Study Clean and Tokenize00:00
04. Solution Clean and Tokenize00:00
05. Machine Learning Workflow00:00
06. Case Study Machine Learning Workflow00:00
07. Solution Machine Learning Workflow00:00
08. Using Pipeline00:00
09. Advantages of Using Pipeline00:00
10. Case Study Build Pipeline
11. Solution Build Pipeline
12. Pipelines and Feature Unions00:00
13. Using Feature Union00:00
14. Case Study Add Feature Union
15. Solution Add Feature Union
16. Creating Custom Transformers00:00
17. Case Study Create Custom Transformer00:00
18. Solution Create Custom Transformer
19. Pipelines and Grid Search00:00
20. Using Grid Search with Pipelines2:14
21. Case Study Grid Search Pipeline00:00
22. Solution Grid Search Pipeline
23. Conclusion00:00
Disaster Response Pipeline
02. Building a Sentiment Analysis Model (XGBoost)4:17
03. Building a Sentiment Analysis Model (Linear Learner)00:00
04. Combining the Models00:00
05. Mini-Project: Updating a Sentiment Analysis Model00:00
06. Loading and Testing the New Data00:00
07. Exploring the New Data00:00
08. Building a New Model00:00
09. SageMaker Retrospective00:00
11. SageMaker Tips and Tricks00:00
Project 1: Disaster Response Pipeline
01. Project Introduction00:00
02. Project Overview1:22
03. Project Details
04. Project Workspace – ETL
05. Project Workspace – ML Pipeline
06. Project Workspace IDE00:00
Project Description – Disaster Response Pipelines
Project Rubric – Disaster Response Pipelines
Concepts in Experiment Design
01. Deployment Project1:41
02. Setting up a Notebook Instance
03. SageMaker Instance Utilization Limits
Deploy a Sentiment Analysis Model
Project Rubric – Deploy a Sentiment Analysis Model
Statistical Considerations in Testing
Interview Segment00:00
02 What Applications Are Enabled By Amazon00:00
03 Why Should Students Gain Skills In Sagemaker And Cloud Services00:00
Course Outline, Case Studies
04. Unsupervised v Supervised Learning00:00
Model Design00:00
Population Segmentation00:00
K-means, Overview00:00
Creating a Notebook Instance00:00
09. Create a SageMaker Notebook Instance
10. Pre-Notebook: Population Segmentation
11. Exercise: Data Loading & Processing00:00
12. Solution: Data Pre-Processing00:00
13. Exercise: Normalization
14. Solution: Normalization00:00
15. PCA, Overview00:00
PCA Estimator & Training00:00
Exercise: PCA Model Attributes & Variance00:00
Solution: Variance00:00
Component Makeup00:00
20. Exercise: PCA Deployment & Data Transformation
21. Solution: Creating Transformed Data00:00
22. Exercise: K-means Estimator & Selecting K
23. Exercise: K-means Predictions (clusters)
24. Solution: K-means Predictor00:00
25. Exercise: Get the Model Attributes
26. Solution: Model Attributes00:00
27. Clean Up: All Resources
AWS Workflow & Summary00:00
Statistical Considerations in Testing
01. Lesson Introduction00:00
02. Practice: Statistical Significance
03. Statistical Significance – Solution
04. Practical Significance00:00
05. Experiment Size00:00
06. Experiment Size – Solution
07. Using Dummy Tests00:00
08. Non-Parametric Tests Part I
09. Non-Parametric Tests Part I – Solution
10. Non-Parametric Tests Part II
11. Non-Parametric Tests Part II – Solution
12. Analyzing Multiple Metrics00:00
12.2 Analyzing Multiple Metrics00:00
13. Early Stopping00:00
14. Early Stopping – Solution
15. Lesson Conclusion00:00
AB Testing Case Study
Pre-Notebook: Payment Fraud Detection
Exercise: Payment Transaction Data00:00
Solution: Data Distribution & Splitting00:00
LinearLearner & Class Imbalance00:00
Exercise: Define a LinearLearner
Solution: Default LinearLearner00:00
Exercise: Format Data & Train the LinearLearner
Solution: Training Job00:00
Precision & Recall, Overview
Exercise: Deploy Estimator
Solution: Deployment & Evaluation00:00
Model Improvements00:00
Improvement, Model Tuning00:00
Exercise: Improvement, Class Imbalance
Solution: Accounting for Class Imbalance00:00
Exercise: Define a Model w/ Specifications
One Solution: Tuned and Balanced LinearLearner
Summary and Improvements00:00
A/B Testing Case Study
01. Lesson Introduction00:00
02. Scenario Description
03. Building a Funnel
04. Building a Funnel – Discussion
05. Deciding on Metrics – Part I
06. Deciding on Metrics – Part II
07. Deciding on Metrics – Discussion
08. Experiment Sizing
09. Experiment Sizing – Discussion
10. Validity, Bias, and Ethics – Discussion
11. Analyze Data
12. Draw Conclusions
13. Draw Conclusions – Discussion
14. Lesson Conclusion00:00
Portfolio Exercise Starbucks
Can You Explain the Idea Behind the GitHub Repository00:00
Does SageMaker Work With Certain Products or Use Cases00:00
How Do You Label Data at Scale00:00
What's Your Prediction of What SageMaker Will Prioritize in the Next 1-2 Years00:00
Do You Have Advice for Someone Who Wants to Learn More00:00
Introduction to Recommendation Engines
Pre-Notebook: Custom Models & Moon Data
02. Moon Data & Custom Models4:27
03. Upload Data to S300:00
Exercise: Custom PyTorch Classifier00:00
Solution: Simple Neural Network00:00
Exercise: Training Script00:00
Solution: Complete Training Script00:00
Custom SKLearn Model
PyTorch Estimator00:00
Exercise: Create a PyTorchModel & Endpoint
Solution: PyTorchModel & Evaluation00:00
Clean Up: All Resources
Summary of Skills
Matrix Factorization for Recommendations
Forecasting Energy Consumption00:00
03. Pre-Notebook: Time-Series Forecasting
Processing Energy Data00:00
Exercise Creating Time Series00:00
06. Solution: Split Data
Exercise Convert to JSON00:00
Solution: Formatting JSON Lines & DeepAR Estimator00:00
09. Exercise: DeepAR Estimator
Solution: Complete Estimator & Hyperparameters00:00
Making Predictions00:00
12. Exercise: Predicting the Future
Solution Predicting Future Data00:00
14. Clean Up: All Resources
Recommendation Engines
02. Containment00:00
04. Longest Common Subsequence00:00
05. Dynamic Programming00:00
01. Project Overview
06. Project Files & Evaluation
07. Notebooks
Project Description – Plagiarism Detector
All Required Files and Tests
Upcoming Lesson
Time-Series Prediction00:00
Training & Memory00:00
Hidden State Dimensions
Character-wise RNNs00:00
Sequence Batching00:00
Pre-Notebook: Character-Level RNN00:00
07. Notebook: Character-Level RNN
Implementing a Char-RNN
Batching Data, Solution00:00
Defining the Model00:00
Char-RNN, Solution00:00
Making Predictions00:00
Sentiment Prediction RNN
Pre-Notebook: Sentiment RNN
03. Notebook: Sentiment RNN
04. Data Pre-Processing00:00
Encoding Words, Solution00:00
Getting Rid of Zero-Length00:00
Cleaning & Padding Data00:00
Padded Features, Solution00:00
TensorDataset & Batching Data00:00
Defining the Model00:00
Complete Sentiment RNN00:00
Training the Model00:00
Testing00:00
Inference, Solution
Convolutional Neural Networks
Applications of CNNs00:00
Lesson Outline00:00
MNIST Dataset00:00
How Computers Interpret Images00:00
MLP Structure & Class Scores00:00
Quiz
07. Do Your Research00:00
Loss & Optimization00:00
09. Defining a Network in PyTorch4:28
10. Training the Network00:00
11. Pre-Notebook: MLP Classification, Exercise
12. Notebook: MLP Classification, MNIST
One Solution00:00
14. Model Validation00:00
15. Validation Loss00:00
16. Image Classification Steps00:00
17. MLPs vs CNNs00:00
18. Local Connectivity00:00
19. Filters and the Convolutional Layer00:00
Filters & Edges00:00
21. Frequency in Images
22. High-pass Filters00:00
Quiz: Kernels
Notebook: Custom Filters
OpenCV & Creating Custom Filters
26. Convolutional Layer
27. Convolutional Layer00:00
28. Stride and Padding00:00
29. Pooling Layers
Notebook: Layer Visualization
Capsule Networks
Increasing Depth00:00
33. CNNs for Image Classification
Quiz 33
34. Convolutional Layers in PyTorch
Quiz 34
35. Feature Vector00:00
36. Pre-Notebook: CNN Classification
37. Notebook: CNNs for CIFAR Image Classification
38. CIFAR Classification Example00:00
39. CNNs in PyTorch00:00
40. Image Augmentation00:00
41. Augmentation Using Transformations00:00
42. Groundbreaking CNN Architectures00:00
43. Visualizing CNNs (Part 1)00:00
44. Visualizing CNNs (Part 2)
Summary of CNNs00:00
Transfer Learning
Useful Layers00:00
Fine-Tuning00:00
VGG Model & Classifier00:00
Pre-Notebook: Transfer Learning
06. Notebook: Transfer Learning, Flowers
07. Freezing Weights & Last Layer00:00
Training a Classifier00:00
Weight Initialization
Constant Weights00:00
Random Uniform00:00
General Rule00:00
Normal Distribution00:00
Pre-Notebook: Weight Initialization, Normal Distribution
07. Notebook: Normal & No Initialization
Solution and Default Initialization00:00
Additional Material
Autoencoders
Pre-Notebook: Linear Autoencoder00:00
A Linear Autoencoder00:00
Notebook: Linear Autoencoder
Defining & Training an Autoencoder00:00
A Simple Solution00:00
Learnable Upsampling00:00
Transpose Convolutions00:00
Convolutional Autoencoder00:00
Pre-Notebook: Convolutional Autoencoder
Notebook – Convolutional Autoencoder
Convolutional Solution00:00
Upsampling & Denoising00:00
De-noising00:00
Pre-Notebook: De-noising Autoencoder
Notebook: De-noising Autoencoder
Job Search
Intro00:00
Job Search Mindset00:00
Target Your Application to An Employer00:00
Open Yourself Up to Opportunity00:00
Refine Your Entry-Level Resume
Convey Your Skills Concisely00:00
Effective Resume Components00:00
Resume Structure00:00
Describe Your Work Experiences00:00
Resume Reflection00:00
Resume Review00:00
Craft Your Cover Letter
Get an Interview with a Cover Letter!00:00
Purpose of the Cover Letter00:00
Cover Letter Components00:00
Write the Introduction00:00
Write the Body00:00
Write the Conclusion00:00
Format00:00
Optimize Your GitHub Profile
Introduction00:00
GitHub profile important items00:00
Good GitHub repository00:00
Interview Part 100:00
Identify fixes for example “bad” profile00:00
Identify fixes for example “bad” profile 200:00
Quick Fixes #100:00
Quick Fixes #200:00
Writing READMEs00:00
Interview Part 200:00
Commit messages best practices
Reflect on your commit messages00:00
Participating in open source projects00:00
Interview Part 300:00
Participating in open source projects 2
Starring interesting repositories
Develop Your Personal Brand
Why Network?00:00
Why Use Elevator Pitches?00:00
Personal Branding
Elevator Pitch00:00
Pitching to a Recruiter00:00
Portfolio Exercise: Deploy a Data Dashboard
Personal portfolios are an excellent way to demonstrate your knowledge and creativity. In fact, they are gradually becoming a must-have for people working in the tech industry. In this portfolio-building exercise, you will create a data dashboard using Bootstrap, Plotly, Flask, and Heroku.
Note that a portfolio exercise like this is not reviewed: you will not submit your work, and you do not need to complete this assignment in order to graduate.
Your main job will be to write Python code that reads in data, cleans the data, and then uses the data to make Plotly visualizations. This is your opportunity to show off your Python coding ability and visualization encoding skills.
In the next part of the lesson, you’ll find a workspace where you can develop the web app. Note that there is also an optional advanced version of the project where you’re encouraged to pull data from an API. You’ll see in this lesson that there are a few sections with “[advanced version]” in the title. If you’d like to do the advanced version, then you’ll want to go through this entire lesson before starting to develop your app.
General Instructions
Develop and deploy a data dashboard. The Web Development lesson has all of the information you need. If you are new to web development, you might have to go back to the concepts and rewatch some of the videos. The “deployment” parts of the lesson should be especially helpful. The video in that part of the lesson shows how to deploy a web app to Heroku. And the associated exercise has a complete, functioning web app with visualizations.
Most of the work will involve:
- Wrangling your chosen data set to get the data in the format you want
- Writing Python code to read in the data set and set up Plotly plots
- Tweaking HTML so that the website has the design and information that you want.
We are providing a template that uses the Bootstrap library and Flask framework. The template is the same one used to build the app in the course except the name of the app has been changed. In the template, everything has the generic name “myapp” instead of “worldbankapp”. The template is set up so that you can use pandas for loading the data and Python to create the dictionaries needed for plotly.
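As a rough sketch, the figure-building code in wrangle_data.py might look like the following. Everything here is an illustrative assumption, not the template's exact contents: the function name, the hypothetical data.csv with year and value columns, and the use of the standard csv module (so the sketch stays dependency-free) in place of pandas. The key idea it shows is that Plotly figures can be plain Python dictionaries with a "data" list of traces and a "layout" dictionary.

```python
import csv

def return_figures(path="data.csv"):
    """Read a CSV of (year, value) rows and return a list of
    Plotly figure dictionaries for the front-end to render.

    Assumes a hypothetical data.csv with 'year' and 'value' columns.
    """
    years, values = [], []
    with open(path) as f:
        for row in csv.DictReader(f):
            years.append(int(row["year"]))
            values.append(float(row["value"]))

    # Plotly accepts plain dictionaries: a "data" list of traces
    # plus a "layout" dictionary per figure.
    figure = {
        "data": [{"x": years, "y": values,
                  "type": "scatter", "mode": "lines"}],
        "layout": {"title": "Value over time",
                   "xaxis": {"title": "Year"},
                   "yaxis": {"title": "Value"}},
    }
    return [figure]
```

The back-end can then serialize this list to JSON and hand it to the front-end, where Plotly renders each dictionary as a chart.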
You’ll only need to modify the following files:
- wrangle_data.py
- index.html
Although the front-end is already set up for you, you should change the links and titles in index.html. If you want to add more visualizations or remove visualizations, you’ll need to adjust the front-end code in index.html accordingly. That will involve adding or removing rows and columns in the HTML file.
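For illustration, adding one more chart usually means adding a Bootstrap row/column block like the sketch below. The id here ("chart_new") is hypothetical; it must match whatever id your JavaScript passes to Plotly when rendering the figure, and your template's exact class names may differ.

```
<!-- One possible row/column block for an extra chart.
     The id "chart_new" is a made-up example. -->
<div class="row">
    <div class="col-6" id="chart_new"></div>
</div>
```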
For deployment, you can use a back-end service like Heroku.
How to Build the App
You’ll find a workspace in the next part of the lesson. The workspace already contains the template code with a working web app. The web app has a back-end and front-end. Recall that you can run the web app from the workspace:
To run the app from the workspace, open a terminal and type env | grep WORK. Note the WORKSPACEDOMAIN and WORKSPACEID. To start the web app, type python myapp.py.
You can open a new browser window and go to the address http://WORKSPACEID-3001.WORKSPACEDOMAIN, replacing WORKSPACEID and WORKSPACEDOMAIN with your values.
However, there is no data for the visualizations. You’ll need to write a Python script that reads in the data files of your choosing and sets up the plots for Plotly. The process will be exactly the same as the one presented in the web development course.
If you need to upload any files to the workspace, you can do so by clicking on the plus (+) sign and choosing “add file” or “add folder”.
The template code is also available on GitHub as part of the data scientist nanodegree term 2 repo.
Test your app in the workspace to make sure that everything is working. You’ll see that if you start the app without modifying any of the code, the app currently works.
You should also save your work to a GitHub or GitLab repository so that you can use your code as part of your professional portfolio.
Once you’re ready to deploy the app, don’t forget to remove the app.run() line of code in the myapp.py file (in the web development lesson, myapp.py was called worldbank.py). You’ll need to add a Procfile and requirements.txt file as well. Follow the instructions in the web development lesson to learn how to deploy the app from the classroom. And always comment your code :-)!
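As a sketch, those two deployment files often look something like the fragment below. The gunicorn entry point and the package list are illustrative assumptions, not the exact contents used in the course; check the deployment video for the real ones.

```
# Procfile (a single line): tells Heroku how to start the app.
# "myapp" is the Python module, "app" the Flask object inside it.
web: gunicorn myapp:app

# requirements.txt: the libraries Heroku should install.
flask
pandas
plotly
gunicorn
```

Heroku reads the Procfile to launch a web process and installs everything listed in requirements.txt before starting it.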
Also, at the end of this page you’re reading, you’ll find information about a more advanced version of the data dashboard that you can build.
Steps
Here is a reminder of the steps you’ll need to do:
- find a data set or a few data sets that you’re interested in
- explore and clean the data set
- put the data into a csv file or files – you can use pandas or spreadsheet software to do this
- upload your data sets to the correct folder
- write a Python script to read in the data set and set up the Plotly visualizations
- set up a virtual environment and install the necessary libraries to run your app
- run your web app locally to make sure that everything works
- deploy the app to Heroku or some other back-end service
Where to Build the Web App
We are providing a workspace containing a web app template. You can use this template to build and deploy your web app within the classroom.
The classroom has an Ubuntu Linux environment. Developing the app locally on macOS should be very similar. On a Windows machine, the commands are slightly different and you’ll need to use the command prompt. This link contains a comparison of MS-DOS vs Linux commands.
To install the Heroku command line interface on a Windows machine, follow the instructions here on the Heroku website.
Advanced Version of the Exercise
If you’d like an extra challenge, consider using an API to obtain your data. API stands for Application Programming Interface; an API provides a convenient way for two applications to communicate with each other. To be more concrete, you could pull data directly from the World Bank API, clean the data in the back-end using pandas, and then display the results on your front-end, instead of using a csv file for your data.
The benefit is that if the data ever changes, your web app will automatically have the correct data. Many companies, including Facebook, Twitter, and Google, provide APIs for accessing their data. As an example, here is an API for pulling data about DVDs, movies, books, and games.
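As a minimal, standard-library-only sketch of the API approach: the World Bank v2 API returns a two-element JSON array (a metadata object, then the list of records), and each record carries a date and a value. The field names and response shape below match my understanding of the v2 API, but treat them as assumptions to verify against the API's documentation before building on them.

```python
import json
from urllib.request import urlopen

def indicator_url(country, indicator, per_page=100):
    """Build a World Bank API request URL for one indicator,
    e.g. total population (SP.POP.TOTL) for the US."""
    return ("http://api.worldbank.org/v2/country/{}/indicator/{}"
            "?format=json&per_page={}").format(country, indicator, per_page)

def parse_indicator(payload):
    """Parse the API's JSON: element 0 is metadata, element 1 is
    the record list. Keep only (date, value) pairs with a value."""
    records = json.loads(payload)[1]
    return [(r["date"], r["value"]) for r in records if r["value"] is not None]

# Usage (requires network access):
# pairs = parse_indicator(urlopen(indicator_url("us", "SP.POP.TOTL")).read())
```

Your back-end would call something like this on each request (or on a cache refresh), then feed the resulting pairs into the same figure-building code you would use for a csv file.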
After the workspace, you’ll find a set of concepts that explain how to use the World Bank API. Go through that material if you’d like an extra challenge for building your web app.