Data Science for Professionals
Our Data Science for Professionals program is based on our Applied Labs teaching concept, with 100% hands-on learning. The program runs for 5 days (Monday-Friday) and features more than 100 hours with theDevMasters team (50 hours in-class, 40 hours of PreWork, and 30 hours of post-class mentoring).
What is Applied Labs?
Simply defined, Applied Labs is not your typical classroom, college, or boot camp style of learning. In Applied Labs, every assignment and project is designed to build your programming, analytical, statistical, and domain expertise skills. We focus on peer-to-peer learning, bringing statisticians, programmers, and domain experts or entrepreneurs together to learn from and teach each other.
Who should attend this program?
This program is geared towards professionals from various backgrounds, and no specific background is required other than completing our pre-course material. The program is very fast paced, and while we do everything in our power to ensure all students keep up, we recommend that students who anticipate needing extra time consider one of our other programs, such as Mastering Applied Data Science. As with our other programs, we’re proud to offer our 100% repeat guarantee: you’re welcome to return for another session for free if you’d like to review any of the content.
Prerequisite Learning :
- Mode of delivery (Webinar, In-class Self-Learning, Mentoring)
- Python 101
- Stats 101
- SQL 101
- Web 101
All of the material will be provided to you.
Software, Hardware and Cloud Requirements: A laptop
Any operating system (Windows, Mac, or Linux) is fine.
Amazon Web Services account (requires a credit card) for big data.
Required software will be provided and installed by staff in class.
Session I : Python 101
Whether you are familiar with programming or not, our Python PreWork sessions introduce the fundamentals of Python, such as variables, string fundamentals, if-else statements, try & except statements, for loops, while loops, break & continue statements, & lambda functions, as well as a first exposure to data types relevant to data science, like lists, tuples, dictionaries, & sets. The activities in these sessions will serve as a reference for students’ questions as the classes move forward.
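As a small illustration of the kind of exercise these fundamentals support, the sketch below combines several of the listed constructs in one short script. The function names `count_words` and `safe_ratio` and the sample sentence are made up for illustration and are not taken from the course material.

```python
def count_words(text):
    """Count how often each purely alphabetic word appears, ignoring case."""
    counts = {}  # dictionary: word -> number of occurrences
    for word in text.lower().split():  # for loop over a list of words
        if not word.isalpha():  # skip tokens with digits or punctuation
            continue
        counts[word] = counts.get(word, 0) + 1
    return counts


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when dividing by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


counts = count_words("data science is fun and data is everywhere 42")
unique_words = set(counts)  # set of distinct words

# lambda used as a sort key: most frequent first, then alphabetical
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
top_word, top_count = ranked[0]  # tuple unpacking

print(top_word, top_count, safe_ratio(top_count, len(unique_words)))
```

The dictionary does the counting, the set gives the distinct words, and the lambda keeps the sort rule next to the call that uses it.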
Session II : Statistics 101
The hands-on statistics portion of PreWork establishes a surface-level understanding of concepts such as types of variables, like numerical vs categorical, nominal vs ordinal, continuous vs discrete; measures of central tendency, like when to use the mean, when to consider the median, & when to fall back on the mode; relationships between variables, like correlation & independence; ending with hypothesis testing & p-values, but only to the degree needed to apply that mindset to data science. These concepts will be reviewed in the program to ensure that students’ questions are addressed.
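To make the mean, median, and mode question concrete, here is a minimal sketch using Python's standard `statistics` module. The salary figures and color labels are invented for illustration only.

```python
import statistics

# Numerical data with one extreme value (an outlier).
salaries = [42_000, 45_000, 47_000, 50_000, 52_000, 400_000]

mean_salary = statistics.mean(salaries)      # pulled upward by the outlier
median_salary = statistics.median(salaries)  # middle of the sorted values

# Categorical data has no meaningful mean or median, so the mode is used.
colors = ["red", "blue", "red", "green", "red"]
most_common_color = statistics.mode(colors)

print(mean_salary, median_salary, most_common_color)
```

Here the single large salary drags the mean far above what most people in the list earn, while the median stays close to the typical value. That is the usual reason to prefer the median for skewed data.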
Session III : SQL 101
While some of the tools used in Python will take the place of SQL functions & methods, it is still beneficial to understand the origins of these tools and to be able to replicate them in SQL when future work calls for it. A solid portion of data science job postings ask for experience querying large datasets with SQL databases, like Microsoft SQL Server & PostgreSQL, as well as NoSQL databases, like MongoDB & DynamoDB, so we will look at scenarios for each to further strengthen students’ candidacy.
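A minimal sketch of the kind of SQL query involved follows. It runs against an in-memory SQLite database through Python's built-in `sqlite3` module, chosen here only because it needs no installation; it is not one of the database systems named above. The `orders` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 20.0), ("ben", 40.0), ("ana", 15.0), ("cy", 5.0)],
)

# Total spend per customer, keeping only customers above 10, largest first.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders "
    "GROUP BY customer "
    "HAVING SUM(amount) > 10 "
    "ORDER BY total DESC"
).fetchall()
conn.close()

for customer, total in rows:
    print(customer, total)
```

The same aggregate-then-filter pattern (`GROUP BY` followed by `HAVING`) is the one that data-frame tools in Python reproduce with group-and-aggregate operations.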
Session IV : Web 101
An introduction to HTML & CSS is key to future project building & to publishing blog posts about student progress throughout the program. A proportion of relevant data is out there on the web for us to use. Grabbing that information from HTML & CSS into our Python environments, using open source methods, will be introduced in Day 3. Furthermore, once students are in Project Based Learning, GitHub portfolios are best displayed in themes that students choose & customize with HTML & CSS.
Data Science Applied Labs
Session I : Introduction to Data Science, Big Data and Web Scraping
We will start by introducing Data Science and the popular Cross Industry Standard Process for Data Mining (CRISP-DM) framework, followed by what Big Data is and how it differs from ordinary data, and an introduction to Python libraries for web scraping, e.g. BeautifulSoup. The second half of the day will include an introduction to the cloud-based machine learning platforms Amazon AWS and Microsoft Azure.
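The core idea behind web scraping, parsing HTML and pulling out the parts you need, can be sketched as follows. To stay runnable without installing anything, this sketch uses the standard library's `html.parser` rather than BeautifulSoup, and it parses a hard-coded HTML string instead of fetching a live page. The `LinkCollector` class and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


sample_html = """
<html><body>
  <h1>Course list</h1>
  <a href="/python-101">Python 101</a>
  <a href="/stats-101">Stats 101</a>
  <a>No link here</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(sample_html)
links = collector.links
print(links)
```

A dedicated scraping library wraps this same parse-and-select step in a more convenient interface. Understanding HTML tags and attributes is what makes either approach usable.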
Session II : Exploratory Data Analysis and Visualization
We will work through exploratory data analysis (EDA) using pandas and other data management tools to get a better understanding of the data at hand. EDA will include visualization using Matplotlib, Seaborn and Tableau.
Session III : Supervised and Unsupervised Machine Learning
Session 3 starts with a review of machine learning and of supervised and unsupervised techniques. We begin with regression-based machine learning algorithms: multivariable, ridge and lasso regression. In the second half of the day we will dive into classification algorithms, such as Naïve Bayes, Decision Trees, Random Forest, and Gradient Boosting, along with scoring using precision, recall, sensitivity, specificity, accuracy, and ROC curves with AUC. We will end the day by working with unsupervised machine learning using KMeans.
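The classification scores named above all derive from the four cells of a confusion matrix. The sketch below computes them by hand in plain Python for a binary problem, assuming labels are 0 or 1. The labels and predictions are invented, and `confusion_counts` is a hypothetical helper rather than a library function.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # true positive
        elif actual == 0 and predicted == 1:
            fp += 1  # false positive
        elif actual == 0 and predicted == 0:
            tn += 1  # true negative
        else:
            fn += 1  # false negative: actual 1, predicted 0
    return tp, fp, tn, fn


y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)    # of predicted positives, how many were right
recall = tp / (tp + fn)       # of actual positives, how many were found
specificity = tn / (tn + fp)  # of actual negatives, how many were found
accuracy = (tp + tn) / len(y_true)

print(precision, recall, specificity, accuracy)
```

Sensitivity is another name for recall, so the two always share a value. Precision and specificity can differ sharply when the classes are imbalanced.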
Session IV : Recommendation Systems and Natural Language Processing
Students will review types of recommender systems and work towards building their own recommendation system with the MovieLens dataset. Later in the day we will explore the Natural Language Toolkit to process and extract text data. Students will then start a Natural Language Processing project with Yelp data before we move on to Sentiment Analysis to predict positive versus negative Yelp reviews.
Session V : Deep Learning and Project
We will introduce deep learning, train a neural network, and visualize what a neural network has learned using TensorFlow Playground. In the second half of the day, students will work on a capstone project from Kaggle.