Intro to Python for Computer Science and Data Science

Author Paul Deitel; Harvey M. Deitel

Publisher Pearson Education (US)

Format ePub

Print ISBN 9780135404676

Edition 1

Copyright 2020

7.790 kr.

Table of Contents

  • Contents
  • Intro to Python® for Computer Science and Data Science
  • Deitel® Series Page
  • Preface
  • Python for Computer Science and Data Science Education
  • Modular Architecture
  • Audiences for the Book
  • Key Features
  • Chapter Dependencies
  • Computing and Data Science Curricula
  • Data Science Overlaps with Computer Science
  • Jobs Requiring Data Science Skills
  • Jupyter Notebooks
  • Docker
  • Class Tested
  • “Flipped Classroom”
  • Special Feature: IBM Watson Analytics and Cognitive Computing
  • Teaching Approach
  • Software Used in the Book
  • Python Documentation
  • Getting Your Questions Answered
  • Student and Instructor Supplements
  • Instructor Supplements on Pearson’s Instructor Resource Center
  • Instructor Examination Copies
  • Keeping in Touch with the Authors
  • Acknowledgments
  • About the Authors
  • About Deitel® & Associates, Inc.
  • Before You Begin
  • 1 Introduction to Computers and Python
  • Objectives
  • Outline
  • 1.1 Introduction
  • 1.2 Hardware and Software
  • 1.2.1 Moore’s Law
  • 1.2.2 Computer Organization
  • Input Unit
  • Output Unit
  • Memory Unit
  • Arithmetic and Logic Unit (ALU)
  • Central Processing Unit (CPU)
  • Secondary Storage Unit
  • Self Check for Section 1.2
  • 1.3 Data Hierarchy
  • Self Check
  • 1.4 Machine Languages, Assembly Languages and High-Level Languages
  • Self Check
  • 1.5 Introduction to Object Technology
  • Self Check for Section 1.5
  • 1.6 Operating Systems
  • Self Check for Section 1.6
  • 1.7 Python
  • Self Check
  • 1.8 It’s the Libraries!
  • 1.8.1 Python Standard Library
  • 1.8.2 Data-Science Libraries
  • Self Check for Section 1.8
  • 1.9 Other Popular Programming Languages
  • Self Check
  • 1.10 Test-Drive: Using IPython and Jupyter Notebooks
  • 1.10.1 Using IPython Interactive Mode as a Calculator
  • Entering IPython in Interactive Mode
  • Evaluating Expressions
  • Exiting Interactive Mode
  • Self Check
  • 1.10.2 Executing a Python Program Using the IPython Interpreter
  • Changing to This Chapter’s Examples Folder
  • Executing the Script
  • Creating Scripts
  • Problems That May Occur at Execution Time
  • Self Check
  • 1.10.3 Writing and Executing Code in a Jupyter Notebook
  • Opening JupyterLab in Your Browser
  • Creating a New Jupyter Notebook
  • Renaming the Notebook
  • Evaluating an Expression
  • Adding and Executing Another Cell
  • Saving the Notebook
  • Notebooks Provided with Each Chapter’s Examples
  • Opening and Executing an Existing Notebook
  • Closing JupyterLab
  • JupyterLab Tips
  • More Information on Working with JupyterLab
  • Self Check
  • 1.11 Internet and World Wide Web
  • 1.11.1 Internet: A Network of Networks
  • 1.11.2 World Wide Web: Making the Internet User-Friendly
  • 1.11.3 The Cloud
  • Mashups
  • 1.11.4 Internet of Things
  • Self Check for Section 1.11
  • 1.12 Software Technologies
  • Self Check
  • 1.13 How Big Is Big Data?
  • Self Check
  • 1.13.1 Big Data Analytics
  • 1.13.2 Data Science and Big Data Are Making a Difference: Use Cases
  • 1.14 Case Study—A Big-Data Mobile Application
  • 1.15 Intro to Data Science: Artificial Intelligence—at the Intersection of CS and Data Science
  • Self Check
  • Exercises
  • 2 Introduction to Python Programming
  • Objectives
  • Outline
  • 2.1 Introduction
  • 2.2 Variables and Assignment Statements
  • Self Check
  • 2.3 Arithmetic
  • Self Check
  • 2.4 Function print and an Intro to Single- and Double-Quoted Strings
  • Self Check
  • 2.5 Triple-Quoted Strings
  • Self Check
  • 2.6 Getting Input from the User
  • Self Check
  • 2.7 Decision Making: The if Statement and Comparison Operators
  • Self Check
  • 2.8 Objects and Dynamic Typing
  • Self Check
  • 2.9 Intro to Data Science: Basic Descriptive Statistics
  • Self Check
  • 2.10 Wrap-Up
  • Exercises
  • 3 Control Statements and Program Development
  • Objectives
  • Outline
  • 3.1 Introduction
  • 3.2 Algorithms
  • Self Check
  • 3.3 Pseudocode
  • Self Check
  • 3.4 Control Statements
  • Self Check
  • 3.5 if Statement
  • Self Check
  • 3.6 if…else and if…elif…else Statements
  • Self Check
  • 3.7 while Statement
  • Self Check
  • 3.8 for Statement
  • 3.8.1 Iterables, Lists and Iterators
  • 3.8.2 Built-In range Function
  • Off-By-One Errors
  • Self Check
  • 3.9 Augmented Assignments
  • Self Check
  • 3.10 Program Development: Sequence-Controlled Repetition
  • 3.10.1 Requirements Statement
  • 3.10.2 Pseudocode for the Algorithm
  • 3.10.3 Coding the Algorithm in Python
  • Execution Phases
  • Initialization Phase
  • Processing Phase
  • Termination Phase
  • 3.10.4 Introduction to Formatted Strings
  • Self Check
  • 3.11 Program Development: Sentinel-Controlled Repetition
  • Self Check
  • 3.12 Program Development: Nested Control Statements
  • Self Check
  • 3.13 Built-In Function range: A Deeper Look
  • Self Check
  • 3.14 Using Type Decimal for Monetary Amounts
  • Self Check
  • 3.15 break and continue Statements
  • 3.16 Boolean Operators and, or and not
  • Self Check
  • 3.17 Intro to Data Science: Measures of Central Tendency—Mean, Median and Mode
  • Self Check
  • 3.18 Wrap-Up
  • Exercises
  • 4 Functions
  • Objectives
  • Outline
  • 4.1 Introduction
  • 4.2 Defining Functions
  • Self Check
  • 4.3 Functions with Multiple Parameters
  • Self Check
  • 4.4 Random-Number Generation
  • Self Check
  • 4.5 Case Study: A Game of Chance
  • Self Check
  • 4.6 Python Standard Library
  • Self Check
  • 4.7 math Module Functions
  • 4.8 Using IPython Tab Completion for Discovery
  • Self Check
  • 4.9 Default Parameter Values
  • Self Check
  • 4.10 Keyword Arguments
  • Self Check
  • 4.11 Arbitrary Argument Lists
  • Self Check
  • 4.12 Methods: Functions That Belong to Objects
  • 4.13 Scope Rules
  • Self Check
  • 4.14 import: A Deeper Look
  • Self Check
  • 4.15 Passing Arguments to Functions: A Deeper Look
  • Self Check
  • 4.16 Function-Call Stack
  • Self Check
  • 4.17 Functional-Style Programming
  • Pure Functions
  • 4.18 Intro to Data Science: Measures of Dispersion
  • Self Check
  • 4.19 Wrap-Up
  • Exercises
  • 5 Sequences: Lists and Tuples
  • Objectives
  • Outline
  • 5.1 Introduction
  • 5.2 Lists
  • Self Check
  • 5.3 Tuples
  • Self Check
  • 5.4 Unpacking Sequences
  • Self Check
  • 5.5 Sequence Slicing
  • Self Check
  • 5.6 del Statement
  • Self Check
  • 5.7 Passing Lists to Functions
  • Self Check
  • 5.8 Sorting Lists
  • Self Check
  • 5.9 Searching Sequences
  • Self Check
  • 5.10 Other List Methods
  • Self Check
  • 5.11 Simulating Stacks with Lists
  • Self Check
  • 5.12 List Comprehensions
  • Self Check
  • 5.13 Generator Expressions
  • Self Check
  • 5.14 Filter, Map and Reduce
  • Self Check
  • 5.15 Other Sequence Processing Functions
  • Self Check
  • 5.16 Two-Dimensional Lists
  • Self Check
  • 5.17 Intro to Data Science: Simulation and Static Visualizations
  • 5.17.1 Sample Graphs for 600, 60,000 and 6,000,000 Die Rolls
  • Self Check
  • 5.17.2 Visualizing Die-Roll Frequencies and Percentages
  • Launching IPython for Interactive Matplotlib Development
  • Importing the Libraries
  • Rolling the Die and Calculating Die Frequencies
  • Creating the Initial Bar Plot
  • Setting the Window Title and Labeling the x- and y-Axes
  • Finalizing the Bar Plot
  • Rolling Again and Updating the Bar Plot—Introducing IPython Magics
  • Saving Snippets to a File with the %save Magic
  • Command-Line Arguments; Displaying a Plot from a Script
  • Self Check
  • 5.18 Wrap-Up
  • Exercises
  • Exercises 5.24 through 5.26 are reasonably challenging. Once you’ve done them, you ought to be able to implement many popular card games.
  • 6 Dictionaries and Sets
  • Objectives
  • Outline
  • 6.1 Introduction
  • 6.2 Dictionaries
  • 6.2.1 Creating a Dictionary
  • Determining if a Dictionary Is Empty
  • Self Check
  • 6.2.2 Iterating through a Dictionary
  • Self Check
  • 6.2.3 Basic Dictionary Operations
  • Accessing the Value Associated with a Key
  • Updating the Value of an Existing Key–Value Pair
  • Adding a New Key–Value Pair
  • Removing a Key–Value Pair
  • Attempting to Access a Nonexistent Key
  • Testing Whether a Dictionary Contains a Specified Key
  • Self Check
  • 6.2.4 Dictionary Methods keys and values
  • Dictionary Views
  • Converting Dictionary Keys, Values and Key–Value Pairs to Lists
  • Processing Keys in Sorted Order
  • Self Check
  • 6.2.5 Dictionary Comparisons
  • Self Check
  • 6.2.6 Example: Dictionary of Student Grades
  • 6.2.7 Example: Word Counts
  • Python Standard Library Module collections
  • Self Check
  • 6.2.8 Dictionary Method update
  • 6.2.9 Dictionary Comprehensions
  • Self Check
  • 6.3 Sets
  • Self Check
  • 6.3.1 Comparing Sets
  • Self Check
  • 6.3.2 Mathematical Set Operations
  • Union
  • Intersection
  • Difference
  • Symmetric Difference
  • Disjoint
  • Self Check
  • 6.3.3 Mutable Set Operators and Methods
  • Mutable Mathematical Set Operations
  • Methods for Adding and Removing Elements
  • Self Check
  • 6.3.4 Set Comprehensions
  • 6.4 Intro to Data Science: Dynamic Visualizations
  • Self Check
  • 6.4.1 How Dynamic Visualization Works
  • Animation Frames
  • Running RollDieDynamic.py
  • Sample Executions
  • Self Check
  • 6.4.2 Implementing a Dynamic Visualization
  • Importing the Matplotlib animation Module
  • Function update
  • Function update: Rolling the Die and Updating the frequencies List
  • Function update: Configuring the Bar Plot and Text
  • Variables Used to Configure the Graph and Maintain State
  • Calling the animation Module’s FuncAnimation Function
  • Self Check
  • 6.5 Wrap-Up
  • Exercises
  • 7 Array-Oriented Programming with NumPy
  • Objectives
  • Outline
  • 7.1 Introduction
  • Self Check
  • 7.2 Creating arrays from Existing Data
  • Self Check
  • 7.3 array Attributes
  • Self Check
  • 7.4 Filling arrays with Specific Values
  • 7.5 Creating arrays from Ranges
  • 7.6 List vs. array Performance: Introducing %timeit
  • 7.7 array Operators
  • 7.8 NumPy Calculation Methods
  • 7.9 Universal Functions
  • 7.10 Indexing and Slicing
  • 7.11 Views: Shallow Copies
  • 7.12 Deep Copies
  • 7.13 Reshaping and Transposing
  • 7.14 Intro to Data Science: pandas Series and DataFrames
  • 7.14.1 pandas Series
  • Creating a Series with Default Indices
  • Displaying a Series
  • Creating a Series with All Elements Having the Same Value
  • Accessing a Series’ Elements
  • Producing Descriptive Statistics for a Series
  • Creating a Series with Custom Indices
  • Dictionary Initializers
  • Accessing Elements of a Series Via Custom Indices
  • Creating a Series of Strings
  • Self Check
  • 7.14.2 DataFrames
  • Creating a DataFrame from a Dictionary
  • Customizing a DataFrame’s Indices with the index Attribute
  • Accessing a DataFrame’s Columns
  • Selecting Rows via the loc and iloc Attributes
  • Selecting Rows via Slices and Lists with the loc and iloc Attributes
  • Selecting Subsets of the Rows and Columns
  • Boolean Indexing
  • Accessing a Specific DataFrame Cell by Row and Column
  • Descriptive Statistics
  • Transposing the DataFrame with the T Attribute
  • Sorting Rows by Their Indices
  • Sorting by Column Indices
  • Sorting by Column Values
  • Copy vs. In-Place Sorting
  • Self Check
  • 7.15 Wrap-Up
  • Exercises
  • 8 Strings: A Deeper Look
  • Objectives
  • Outline
  • 8.1 Introduction
  • 8.2 Formatting Strings
  • 8.2.1 Presentation Types
  • Integers
  • Characters
  • Strings
  • Floating-Point and Decimal Values
  • Self Check
  • 8.2.2 Field Widths and Alignment
  • Explicitly Specifying Left and Right Alignment in a Field
  • Centering a Value in a Field
  • Self Check
  • 8.2.3 Numeric Formatting
  • Formatting Positive Numbers with Signs
  • Using a Space Where a + Sign Would Appear in a Positive Value
  • Grouping Digits
  • Self Check
  • 8.2.4 String’s format Method
  • Multiple Placeholders
  • Referencing Arguments By Position Number
  • Referencing Keyword Arguments
  • Self Check
  • 8.3 Concatenating and Repeating Strings
  • 8.4 Stripping Whitespace from Strings
  • 8.5 Changing Character Case
  • 8.6 Comparison Operators for Strings
  • 8.7 Searching for Substrings
  • 8.8 Replacing Substrings
  • 8.9 Splitting and Joining Strings
  • 8.10 Characters and Character-Testing Methods
  • 8.11 Raw Strings
  • 8.12 Introduction to Regular Expressions
  • 8.12.1 re Module and Function fullmatch
  • Matching Literal Characters
  • Metacharacters, Character Classes and Quantifiers
  • Other Predefined Character Classes
  • Custom Character Classes
  • * vs. + Quantifier
  • Other Quantifiers
  • Self Check
  • 8.12.2 Replacing Substrings and Splitting Strings
  • Function sub—Replacing Patterns
  • Function split
  • Self Check
  • 8.12.3 Other Search Functions; Accessing Matches
  • Function search—Finding the First Match Anywhere in a String
  • Ignoring Case with the Optional flags Keyword Argument
  • Metacharacters That Restrict Matches to the Beginning or End of a String
  • Functions findall and finditer—Finding All Matches in a String
  • Capturing Substrings in a Match
  • Self Check
  • 8.13 Intro to Data Science: Pandas, Regular Expressions and Data Munging
  • Self Check
  • 8.14 Wrap-Up
  • Exercises
  • Regular Expression Exercises
  • More Challenging String-Manipulation Exercises
  • 9 Files and Exceptions
  • Objectives
  • Outline
  • 9.1 Introduction
  • 9.2 Files
  • 9.3 Text-File Processing
  • 9.3.1 Writing to a Text File: Introducing the with Statement
  • The with Statement
  • Built-In Function open
  • Writing to the File
  • Contents of accounts.txt File
  • Self Check
  • 9.3.2 Reading Data from a Text File
  • File Method readlines
  • Seeking to a Specific File Position
  • Self Check
  • 9.4 Updating Text Files
  • Self Check
  • 9.5 Serialization with JSON
  • Self Check
  • 9.6 Focus on Security: pickle Serialization and Deserialization
  • 9.7 Additional Notes Regarding Files
  • Self Check
  • 9.8 Handling Exceptions
  • 9.8.1 Division by Zero and Invalid Input
  • Division By Zero
  • Invalid Input
  • 9.8.2 try Statements
  • try Clause
  • except Clause
  • else Clause
  • Flow of Control for a ZeroDivisionError
  • Flow of Control for a ValueError
  • Flow of Control for a Successful Division
  • Self Check
  • 9.8.3 Catching Multiple Exceptions in One except Clause
  • 9.8.4 What Exceptions Does a Function or Method Raise?
  • 9.8.5 What Code Should Be Placed in a try Suite?
  • 9.9 finally Clause
  • Self Check
  • 9.10 Explicitly Raising an Exception
  • Self Check
  • 9.11 (Optional) Stack Unwinding and Tracebacks
  • Self Check
  • 9.12 Intro to Data Science: Working with CSV Files
  • 9.12.1 Python Standard Library Module csv
  • Writing to a CSV File
  • Reading from a CSV File
  • Caution: Commas in CSV Data Fields
  • Caution: Missing Commas and Extra Commas in CSV Files
  • Self Check
  • 9.12.2 Reading CSV Files into Pandas DataFrames
  • Datasets
  • Working with Locally Stored CSV Files
  • 9.12.3 Reading the Titanic Disaster Dataset
  • Loading the Titanic Dataset via a URL
  • Viewing Some of the Rows in the Titanic Dataset
  • Customizing the Column Names
  • 9.12.4 Simple Data Analysis with the Titanic Disaster Dataset
  • 9.12.5 Passenger Age Histogram
  • Self Check
  • 9.13 Wrap-Up
  • Exercises
  • 10 Object-Oriented Programming
  • Objectives
  • Outline
  • 10.1 Introduction
  • 10.2 Custom Class Account
  • 10.2.1 Test-Driving Class Account
  • Importing Classes Account and Decimal
  • Create an Account Object with a Constructor Expression
  • Getting an Account’s Name and Balance
  • Depositing Money into an Account
  • Account Methods Perform Validation
  • Self Check
  • 10.2.2 Account Class Definition
  • Defining a Class
  • Initializing Account Objects: Method __init__
  • Method deposit
  • 10.2.3 Composition: Object References as Members of Classes
  • Self Check
  • 10.3 Controlling Access to Attributes
  • Self Check
  • 10.4 Properties for Data Access
  • 10.4.1 Test-Driving Class Time
  • Creating a Time Object
  • Displaying a Time Object
  • Getting an Attribute Via a Property
  • Setting the Time
  • Setting an Attribute via a Property
  • Attempting to Set an Invalid Value
  • Self Check
  • 10.4.2 Class Time Definition
  • Class Time: __init__ Method with Default Parameter Values
  • Class Time: hour Read-Write Property
  • Class Time: minute and second Read-Write Properties
  • Class Time: Method set_time
  • Class Time: Special Method __repr__
  • Class Time: Special Method __str__
  • Self Check
  • 10.4.3 Class Time Definition Design Notes
  • Interface of a Class
  • Attributes Are Always Accessible
  • Internal Data Representation
  • Evolving a Class’s Implementation Details
  • Properties
  • Utility Methods
  • Module datetime
  • Self Check
  • 10.5 Simulating “Private” Attributes
  • Self Check
  • 10.6 Case Study: Card Shuffling and Dealing Simulation
  • 10.6.1 Test-Driving Classes Card and DeckOfCards
  • Creating, Shuffling and Dealing the Cards
  • Dealing Cards
  • Class Card’s Other Features
  • 10.6.2 Class Card—Introducing Class Attributes
  • Class Attributes FACES and SUITS
  • Card Method __init__
  • Read-Only Properties face, suit and image_name
  • Methods That Return String Representations of a Card
  • 10.6.3 Class DeckOfCards
  • Method __init__
  • Method shuffle
  • Method deal_card
  • Method __str__
  • 10.6.4 Displaying Card Images with Matplotlib
  • Enable Matplotlib in IPython
  • Create the Base Path for Each Image
  • Import the Matplotlib Features
  • Create the Figure and Axes Objects
  • Configure the Axes Objects and Display the Images
  • Maximize the Image Sizes
  • Shuffle and Re-Deal the Deck
  • Self Check
  • 10.7 Inheritance: Base Classes and Subclasses
  • Self Check
  • 10.8 Building an Inheritance Hierarchy; Introducing Polymorphism
  • 10.8.1 Base Class CommissionEmployee
  • All Classes Inherit Directly or Indirectly from Class object
  • Testing Class CommissionEmployee
  • Self Check
  • 10.8.2 Subclass SalariedCommissionEmployee
  • Declaring Class SalariedCommissionEmployee
  • Inheriting from Class CommissionEmployee
  • Method __init__ and Built-In Function super
  • Overriding Method earnings
  • Overriding Method __repr__
  • Testing Class SalariedCommissionEmployee
  • Testing the “is a” Relationship
  • Self Check
  • 10.8.3 Processing CommissionEmployees and SalariedCommissionEmployees Polymorphically
  • Self Check
  • 10.8.4 A Note About Object-Based and Object-Oriented Programming
  • 10.9 Duck Typing and Polymorphism
  • 10.10 Operator Overloading
  • Operator Overloading Restrictions
  • Complex Numbers
  • 10.10.1 Test-Driving Class Complex
  • 10.10.2 Class Complex Definition
  • Method __init__
  • Overloaded + Operator
  • Overloaded += Augmented Assignment
  • Method __repr__
  • Self Check
  • 10.11 Exception Class Hierarchy and Custom Exceptions
  • 10.12 Named Tuples
  • Self Check
  • 10.13 A Brief Intro to Python 3.7’s New Data Classes
  • 10.13.1 Creating a Card Data Class
  • Importing from the dataclasses and typing Modules
  • Using the @dataclass Decorator
  • Variable Annotations: Class Attributes
  • Variable Annotations: Data Attributes
  • Defining a Property and Other Methods
  • Variable Annotation Notes
  • Self Check
  • 10.13.2 Using the Card Data Class
  • Self Check
  • 10.13.3 Data Class Advantages over Named Tuples
  • 10.13.4 Data Class Advantages over Traditional Classes
  • More Information
  • 10.14 Unit Testing with Docstrings and doctest
  • Self Check
  • 10.15 Namespaces and Scopes
  • 10.16 Intro to Data Science: Time Series and Simple Linear Regression
  • Self Check
  • 10.17 Wrap-Up
  • Exercises
  • 11 Computer Science Thinking: Recursion, Searching, Sorting and Big O
  • Objectives
  • Outline
  • 11.1 Introduction
  • 11.2 Factorials
  • 11.3 Recursive Factorial Example
  • Self Check
  • 11.4 Recursive Fibonacci Series Example
  • Self Check
  • 11.5 Recursion vs. Iteration
  • Self Check
  • 11.6 Searching and Sorting
  • 11.7 Linear Search
  • Self Check
  • 11.8 Efficiency of Algorithms: Big O
  • Self Check
  • 11.9 Binary Search
  • Self Check
  • 11.9.1 Binary Search Implementation
  • Function binary_search
  • Function remaining_elements
  • Function main
  • 11.9.2 Big O of the Binary Search
  • 11.10 Sorting Algorithms
  • 11.11 Selection Sort
  • 11.11.1 Selection Sort Implementation
  • Function selection_sort
  • Function main
  • 11.11.2 Utility Function print_pass
  • 11.11.3 Big O of the Selection Sort
  • Self Check
  • 11.12 Insertion Sort
  • 11.12.1 Insertion Sort Implementation
  • Function insertion_sort
  • 11.12.2 Big O of the Insertion Sort
  • Self Check
  • 11.13 Merge Sort
  • 11.13.1 Merge Sort Implementation
  • Function merge_sort
  • Recursive Function sort_array
  • Function merge
  • Function subarray_string
  • Function main
  • 11.13.2 Big O of the Merge Sort
  • Self Check
  • 11.14 Big O Summary for This Chapter’s Searching and Sorting Algorithms
  • 11.15 Visualizing Algorithms
  • 11.15.1 Generator Functions
  • yield Statements
  • 11.15.2 Implementing the Selection Sort Animation
  • import Statements
  • update Function That Displays Each Animation Frame
  • flash_bars Function That Flashes the Bars About to Be Swapped
  • selection_sort Generator Function
  • main Function That Launches the Animation
  • Sound Utility Functions
  • 11.16 Wrap-Up
  • Exercises
  • 12 Natural Language Processing (NLP)
  • Objectives
  • Outline
  • 12.1 Introduction
  • 12.2 TextBlob
  • Self Check
  • 12.2.1 Create a TextBlob
  • Self Check
  • 12.2.2 Tokenizing Text into Sentences and Words
  • Self Check
  • 12.2.3 Parts-of-Speech Tagging
  • Self Check
  • 12.2.4 Extracting Noun Phrases
  • Self Check
  • 12.2.5 Sentiment Analysis with TextBlob’s Default Sentiment Analyzer
  • Getting the Sentiment of a TextBlob
  • Getting the polarity and subjectivity from the Sentiment Object
  • Getting the Sentiment of a Sentence
  • Self Check
  • 12.2.6 Sentiment Analysis with the NaiveBayesAnalyzer
  • Self Check
  • 12.2.7 Language Detection and Translation
  • Self Check
  • 12.2.8 Inflection: Pluralization and Singularization
  • Self Check
  • 12.2.9 Spell Checking and Correction
  • Self Check
  • 12.2.10 Normalization: Stemming and Lemmatization
  • Self Check
  • 12.2.11 Word Frequencies
  • Self Check
  • 12.2.12 Getting Definitions, Synonyms and Antonyms from WordNet
  • Getting Definitions
  • Getting Synonyms
  • Getting Antonyms
  • Self Check
  • 12.2.13 Deleting Stop Words
  • Self Check
  • 12.2.14 n-grams
  • Self Check
  • 12.3 Visualizing Word Frequencies with Bar Charts and Word Clouds
  • 12.3.1 Visualizing Word Frequencies with Pandas
  • Loading the Data
  • Getting the Word Frequencies
  • Eliminating the Stop Words
  • Sorting the Words by Frequency
  • Getting the Top 20 Words
  • Convert top20 to a DataFrame
  • Visualizing the DataFrame
  • 12.3.2 Visualizing Word Frequencies with Word Clouds
  • Installing the wordcloud Module
  • Loading the Text
  • Loading the Mask Image that Specifies the Word Cloud’s Shape
  • Configuring the WordCloud Object
  • Generating the Word Cloud
  • Saving the Word Cloud as an Image File
  • Generating a Word Cloud from a Dictionary
  • Displaying the Image with Matplotlib
  • Self Check
  • 12.4 Readability Assessment with Textatistic
  • Self Check
  • 12.5 Named Entity Recognition with spaCy
  • Self Check
  • 12.6 Similarity Detection with spaCy
  • Self Check
  • 12.7 Other NLP Libraries and Tools
  • 12.8 Machine Learning and Deep Learning Natural Language Applications
  • 12.9 Natural Language Datasets
  • 12.10 Wrap-Up
  • Exercises
  • 13 Data Mining Twitter
  • Objectives
  • Outline
  • 13.1 Introduction
  • Self Check
  • 13.2 Overview of the Twitter APIs
  • Self Check
  • 13.3 Creating a Twitter Account
  • 13.4 Getting Twitter Credentials—Creating an App
  • Self Check
  • 13.5 What’s in a Tweet?
  • Key Properties of a Tweet Object
  • Sample Tweet JSON
  • Twitter JSON Object Resources
  • Self Check
  • 13.6 Tweepy
  • 13.7 Authenticating with Twitter Via Tweepy
  • Self Check
  • 13.8 Getting Information About a Twitter Account
  • Self Check
  • 13.9 Introduction to Tweepy Cursors: Getting an Account’s Followers and Friends
  • 13.9.1 Determining an Account’s Followers
  • Creating a Cursor
  • Getting Results
  • Automatic Paging
  • Getting Follower IDs Rather Than Followers
  • Self Check
  • 13.9.2 Determining Whom an Account Follows
  • Self Check
  • 13.9.3 Getting a User’s Recent Tweets
  • Grabbing Recent Tweets from Your Own Timeline
  • Self Check
  • 13.10 Searching Recent Tweets
  • 13.11 Spotting Trends: Twitter Trends API
  • 13.11.1 Places with Trending Topics
  • Self Check
  • 13.11.2 Getting a List of Trending Topics
  • Worldwide Trending Topics
  • New York City Trending Topics
  • Self Check
  • 13.11.3 Create a Word Cloud from Trending Topics
  • Self Check
  • 13.12 Cleaning/Preprocessing Tweets for Analysis
  • Self Check
  • 13.13 Twitter Streaming API
  • 13.13.1 Creating a Subclass of StreamListener
  • Class TweetListener
  • Class TweetListener: __init__ Method
  • Class TweetListener: on_connect Method
  • Class TweetListener: on_status Method
  • 13.13.2 Initiating Stream Processing
  • Authenticating
  • Creating a TweetListener
  • Creating a Stream
  • Starting the Tweet Stream
  • Asynchronous vs. Synchronous Streams
  • Other filter Method Parameters
  • Twitter Restrictions Note
  • Self Check
  • 13.14 Tweet Sentiment Analysis
  • 13.15 Geocoding and Mapping
  • Self Check
  • 13.15.1 Getting and Mapping the Tweets
  • Get the API Object
  • Collections Required By LocationListener
  • Creating the LocationListener
  • Configure and Start the Stream of Tweets
  • Displaying the Location Statistics
  • Geocoding the Locations
  • Displaying the Bad Location Statistics
  • Cleaning the Data
  • Creating a Map with Folium
  • Creating Popup Markers for the Tweet Locations
  • Saving the Map
  • Self Check
  • 13.15.2 Utility Functions in tweetutilities.py
  • get_tweet_content Utility Function
  • get_geocodes Utility Function
  • Self Check
  • 13.15.3 Class LocationListener
  • 13.16 Ways to Store Tweets
  • 13.17 Twitter and Time Series
  • 13.18 Wrap-Up
  • Exercises
  • 14 IBM Watson and Cognitive Computing
  • Outline
  • 14.1 Introduction: IBM Watson and Cognitive Computing
  • Self Check
  • 14.2 IBM Cloud Account and Cloud Console
  • Self Check
  • 14.3 Watson Services
  • Watson Assistant
  • Visual Recognition
  • Speech to Text
  • Text to Speech
  • Language Translator
  • Natural Language Understanding
  • Discovery
  • Personality Insights
  • Tone Analyzer
  • Natural Language Classifier
  • Synchronous and Asynchronous Capabilities
  • Self Check
  • 14.4 Additional Services and Tools
  • Watson Studio
  • Knowledge Studio
  • Machine Learning
  • Knowledge Catalog
  • Cognos Analytics
  • Self Check
  • 14.5 Watson Developer Cloud Python SDK
  • Modules We’ll Need for Audio Recording and Playback
  • SDK Examples
  • Self Check
  • 14.6 Case Study: Traveler’s Companion Translation App
  • Self Check
  • 14.6.1 Before You Run the App
  • Registering for the Speech to Text Service
  • Registering for the Text to Speech Service
  • Registering for the Language Translator Service
  • Retrieving Your Credentials
  • Self Check
  • 14.6.2 Test-Driving the App
  • Processing the Question
  • Processing the Response
  • Self Check
  • 14.6.3 SimpleLanguageTranslator.py Script Walkthrough
  • Importing Watson SDK Classes
  • Other Imported Modules
  • Main Program: Function run_translator
  • Function speech_to_text
  • Function translate
  • Function text_to_speech
  • Function record_audio
  • Function play_audio
  • Executing the run_translator Function
  • Self Check
  • 14.7 Watson Resources
  • Self Check
  • 14.8 Wrap-Up
  • Exercises
  • 15 Machine Learning: Classification, Regression and Clustering
  • Outline
  • 15.1 Introduction to Machine Learning
  • 15.1.1 Scikit-Learn
  • Which Scikit-Learn Estimator Should You Choose for Your Project
  • 15.1.2 Types of Machine Learning
  • Supervised Machine Learning
  • Datasets
  • Classification
  • Regression
  • Unsupervised Machine Learning
  • K-Means Clustering and the Iris Dataset
  • Big Data and Big Computer Processing Power
  • 15.1.3 Datasets Bundled with Scikit-Learn
  • 15.1.4 Steps in a Typical Data Science Study
  • Self Check
  • 15.2 Case Study: Classification with k-Nearest Neighbors and the Digits Dataset, Part 1
  • Self Check
  • 15.2.1 k-Nearest Neighbors Algorithm
  • Hyperparameters and Hyperparameter Tuning
  • Self Check
  • 15.2.2 Loading the Dataset
  • Displaying the Description
  • Checking the Sample and Target Sizes
  • A Sample Digit Image
  • Preparing the Data for Use with Scikit-Learn
  • Self Check
  • 15.2.3 Visualizing the Data
  • Creating the Diagram
  • Displaying Each Image and Removing the Axes Labels
  • Self Check
  • 15.2.4 Splitting the Data for Training and Testing
  • Training and Testing Set Sizes
  • Self Check
  • 15.2.5 Creating the Model
  • 15.2.6 Training the Model
  • Self Check
  • 15.2.7 Predicting Digit Classes
  • Self Check
  • 15.3 Case Study: Classification with k-Nearest Neighbors and the Digits Dataset, Part 2
  • 15.3.1 Metrics for Model Accuracy
  • Estimator Method score
  • Confusion Matrix
  • Classification Report
  • Visualizing the Confusion Matrix
  • Self Check
  • 15.3.2 K-Fold Cross-Validation
  • KFold Class
  • Using the KFold Object with Function cross_val_score
  • Self Check
  • 15.3.3 Running Multiple Models to Find the Best One
  • Scikit-Learn Estimator Diagram
  • Self Check
  • 15.3.4 Hyperparameter Tuning
  • Self Check
  • 15.4 Case Study: Time Series and Simple Linear Regression
  • Self Check
  • 15.5 Case Study: Multiple Linear Regression with the California Housing Dataset
  • 15.5.1 Loading the Dataset
  • Loading the Data
  • Displaying the Dataset’s Description
  • 15.5.2 Exploring the Data with Pandas
  • Self Check
  • 15.5.3 Visualizing the Features
  • Self Check
  • 15.5.4 Splitting the Data for Training and Testing
  • 15.5.5 Training the Model
  • Self Check
  • 15.5.6 Testing the Model
  • 15.5.7 Visualizing the Expected vs. Predicted Prices
  • 15.5.8 Regression Model Metrics
  • Self Check
  • 15.5.9 Choosing the Best Model
  • 15.6 Case Study: Unsupervised Machine Learning, Part 1—Dimensionality Reduction
  • Loading the Digits Dataset
  • Creating a TSNE Estimator for Dimensionality Reduction
  • Transforming the Digits Dataset’s Features into Two Dimensions
  • Visualizing the Reduced Data
  • Visualizing the Reduced Data with Different Colors for Each Digit
  • Self Check
  • 15.7 Case Study: Unsupervised Machine Learning, Part 2—k-Means Clustering
  • Self Check
  • 15.7.1 Loading the Iris Dataset
  • Checking the Numbers of Samples, Features and Targets
  • 15.7.2 Exploring the Iris Dataset: Descriptive Statistics with Pandas
  • 15.7.3 Visualizing the Dataset with a Seaborn pairplot
  • Displaying the pairplot in One Color
  • Self Check
  • 15.7.4 Using a KMeans Estimator
  • Creating the Estimator
  • Fitting the Model
  • Comparing the Computer Cluster Labels to the Iris Dataset’s Target Values
  • Self Check
  • 15.7.5 Dimensionality Reduction with Principal Component Analysis
  • Creating the PCA Object
  • Transforming the Iris Dataset’s Features into Two Dimensions
  • Visualizing the Reduced Data
  • Self Check
  • 15.7.6 Choosing the Best Clustering Estimator
  • 15.8 Wrap-Up
  • Exercises
  • 16 Deep Learning
  • Objectives
  • Outline
  • 16.1 Introduction
  • Self Check
  • 16.1.1 Deep Learning Applications
  • 16.1.2 Deep Learning Demos
  • 16.1.3 Keras Resources
  • 16.2 Keras Built-In Datasets
  • 16.3 Custom Anaconda Environments
  • Self Check
  • 16.4 Neural Networks
  • Self Check
  • 16.5 Tensors
  • Self Check
  • 16.6 Convolutional Neural Networks for Vision; Multi-Classification with the MNIST Dataset
  • Self Check
  • 16.6.1 Loading the MNIST Dataset
  • Self Check
  • 16.6.2 Data Exploration
  • Visualizing Digits
  • 16.6.3 Data Preparation
  • Reshaping the Image Data
  • Normalizing the Image Data
  • One-Hot Encoding: Converting the Labels From Integers to Categorical Data
  • Self Check
  • 16.6.4 Creating the Neural Network
  • Adding Layers to the Network
  • Convolution
  • Adding a Convolution Layer
  • Dimensionality of the First Convolution Layer’s Output
  • Overfitting
  • Adding a Pooling Layer
  • Adding Another Convolutional Layer and Pooling Layer
  • Flattening the Results
  • Adding a Dense Layer to Reduce the Number of Features
  • Adding Another Dense Layer to Produce the Final Output
  • Printing the Model’s Summary
  • Visualizing a Model’s Structure
  • Compiling the Model
  • Self Check
  • 16.6.5 Training and Evaluating the Model
  • Evaluating the Model
  • Making Predictions
  • Locating the Incorrect Predictions
  • Visualizing Incorrect Predictions
  • Displaying the Probabilities for Several Incorrect Predictions
  • Self Check
  • 16.6.6 Saving and Loading a Model
  • Self Check
  • 16.7 Visualizing Neural Network Training with TensorBoard
  • Self Check
  • 16.8 ConvnetJS: Browser-Based Deep-Learning Training and Visualization
  • 16.9 Recurrent Neural Networks for Sequences; Sentiment Analysis with the IMDb Dataset
  • Self Check
  • 16.9.1 Loading the IMDb Movie Reviews Dataset
  • Self Check
  • 16.9.2 Data Exploration
  • Movie Review Encodings
  • Decoding a Movie Review
  • 16.9.3 Data Preparation
  • Splitting the Test Data into Validation and Test Data
  • Self Check
  • 16.9.4 Creating the Neural Network
  • Adding an Embedding Layer
  • Adding an LSTM Layer
  • Adding a Dense Output Layer
  • Compiling the Model and Displaying the Summary
  • Self Check
  • 16.9.5 Training and Evaluating the Model
  • 16.10 Tuning Deep Learning Models
  • Self Check
  • 16.11 Convnet Models Pretrained on ImageNet
  • 16.12 Reinforcement Learning
  • 16.12.1 Deep Q-Learning
  • 16.12.2 OpenAI Gym
  • 16.13 Wrap-Up
  • Exercises
  • Convolutional Neural Networks
  • Recurrent Neural Networks
  • ConvnetJS Visualization
  • Convolutional Neural Network Projects and Research
  • Recurrent Neural Network Projects and Research
  • Automated Deep Learning Project
  • Reinforcement Learning Projects and Research
  • Generative Deep Learning
  • Deep Fakes
  • Additional Research
  • 17 Big Data: Hadoop, Spark, NoSQL and IoT
  • Objectives
  • Outline
  • 17.1 Introduction
  • Self Check for Section 17.1
  • 17.2 Relational Databases and Structured Query Language (SQL)
  • Self Check
  • 17.2.1 A books Database
  • Self Check
  • 17.2.2 SELECT Queries
  • 17.2.3 WHERE Clause
  • Pattern Matching: Zero or More Characters
  • Pattern Matching: Any Character
  • Self Check
  • 17.2.4 ORDER BY Clause
  • Sorting By Multiple Columns
  • Combining the WHERE and ORDER BY Clauses
  • Self Check
  • 17.2.5 Merging Data from Multiple Tables: INNER JOIN
  • Self Check
  • 17.2.6 INSERT INTO Statement
  • Note Regarding Strings That Contain Single Quotes
  • 17.2.7 UPDATE Statement
  • 17.2.8 DELETE FROM Statement
  • Self Check for Section 17.2
  • 17.3 NoSQL and NewSQL Big-Data Databases: A Brief Tour
  • 17.3.1 NoSQL Key–Value Databases
  • 17.3.2 NoSQL Document Databases
  • 17.3.3 NoSQL Columnar Databases
  • 17.3.4 NoSQL Graph Databases
  • 17.3.5 NewSQL Databases
  • Self Check for Section 17.3
  • 17.4 Case Study: A MongoDB JSON Document Database
  • 17.4.1 Creating the MongoDB Atlas Cluster
  • Creating Your First Database User
  • Whitelist Your IP Address
  • Connect to Your Cluster
  • 17.4.2 Streaming Tweets into MongoDB
  • Use Tweepy to Authenticate with Twitter
  • Loading the Senators’ Data
  • Configuring the MongoClient
  • Setting up Tweet Stream
  • Starting the Tweet Stream
  • Class TweetListener
  • Counting Tweets for Each Senator
  • Show Tweet Counts for Each Senator
  • Get the State Locations for Plotting Markers
  • Grouping the Tweet Counts by State
  • Creating the Map
  • Creating a Choropleth to Color the Map
  • Creating the Map Markers for Each State
  • Displaying the Map
  • Self Check for Section 17.4
  • 17.5 Hadoop
  • 17.5.1 Hadoop Overview
  • HDFS, MapReduce and YARN
  • Hadoop Ecosystem
  • Hadoop Providers
  • Hadoop 3
  • 17.5.2 Summarizing Word Lengths in Romeo and Juliet via MapReduce
  • 17.5.3 Creating an Apache Hadoop Cluster in Microsoft Azure HDInsight
  • Creating an HDInsight Hadoop Cluster
  • 17.5.4 Hadoop Streaming
  • 17.5.5 Implementing the Mapper
  • 17.5.6 Implementing the Reducer
  • 17.5.7 Preparing to Run the MapReduce Example
  • Copying the Script Files to the HDInsight Hadoop Cluster
  • Copying RomeoAndJuliet into the Hadoop File System
  • 17.5.8 Running the MapReduce Job
  • Viewing the Word Counts
  • Deleting Your Cluster So You Do Not Incur Charges
  • Self Check for Section 17.5
  • 17.6 Spark
  • 17.6.1 Spark Overview
  • History
  • Architecture and Components
  • Providers
  • 17.6.2 Docker and the Jupyter Docker Stacks
  • Docker
  • Installing Docker
  • Jupyter Docker Stacks
  • Run Jupyter Docker Stack
  • Opening JupyterLab in Your Browser
  • Accessing the Docker Container’s Command Line
  • Stopping and Restarting a Docker Container
  • 17.6.3 Word Count with Spark
  • Loading the NLTK Stop Words
  • Configuring a SparkContext
  • Reading the Text File and Mapping It to Words
  • Removing the Stop Words
  • Counting Each Remaining Word
  • Locating Words with Counts Greater Than or Equal to 60
  • Sorting and Displaying the Results
  • 17.6.4 Spark Word Count on Microsoft Azure
  • Create an Apache Spark Cluster in HDInsight Using the Azure Portal
  • Install Libraries into a Cluster
  • Copying RomeoAndJuliet.txt to the HDInsight Cluster
  • Accessing Jupyter Notebooks in HDInsight
  • Uploading the RomeoAndJulietCounter.ipynb Notebook
  • Modifying the Notebook to Work with Azure
  • Self Check for Section 17.6
  • 17.7 Spark Streaming: Counting Twitter Hashtags Using the pyspark-notebook Docker Stack
  • 17.7.1 Streaming Tweets to a Socket
  • Executing the Script in the Docker Container
  • starttweetstream.py import Statements
  • Class TweetListener
  • Main Application
  • 17.7.2 Summarizing Tweet Hashtags; Introducing Spark SQL
  • Importing the Libraries
  • Utility Function to Get the SparkSession
  • Utility Function to Display a Barchart Based on a Spark DataFrame
  • Utility Function to Summarize the Top-20 Hashtags So Far
  • Getting the SparkContext
  • Getting the StreamingContext
  • Setting Up a Checkpoint for Maintaining State
  • Connecting to the Stream via a Socket
  • Tokenizing the Lines of Hashtags
  • Mapping the Hashtags to Tuples of Hashtag-Count Pairs
  • Totaling the Hashtag Counts So Far
  • Specifying the Method to Call for Every RDD
  • Starting the Spark Stream
  • Self Check for Section 17.7
  • 17.8 Internet of Things and Dashboards
  • 17.8.1 Publish and Subscribe
  • 17.8.2 Visualizing a PubNub Sample Live Stream with a Freeboard Dashboard
  • Signing up for Freeboard.io
  • Creating a New Dashboard
  • Adding a Data Source
  • Adding a Pane for the Humidity Sensor
  • Adding a Gauge to the Humidity Pane
  • Adding a Sparkline to the Humidity Pane
  • Completing the Dashboard
  • 17.8.3 Simulating an Internet-Connected Thermostat in Python
  • Installing Dweepy
  • Invoking the simulator.py Script
  • Sending Dweets
  • 17.8.4 Creating the Dashboard with Freeboard.io
  • 17.8.5 Creating a Python PubNub Subscriber
  • Message Format
  • Importing the Libraries
  • List and DataFrame Used for Storing Company Names and Prices
  • Class SensorSubscriberCallback
  • Function Update
  • Configuring the Figure
  • Configuring the FuncAnimation and Displaying the Window
  • Configuring the PubNub Client
  • Subscribing to the Channel
  • Ensuring the Figure Remains on the Screen
  • Self Check for Section 17.8
  • 17.9 Wrap-Up
  • Exercises
  • SQL and RDBMS Exercises
  • NoSQL Database Exercises
  • Hadoop Exercises
  • Spark Exercises
  • IoT and Pub/Sub Exercises
  • Platform Exercises
  • Other Exercises
  • Index
  • Symbols
  • Numerics
  • A
  • B
  • C
  • D
  • E
  • F
  • G
  • H
  • I
  • J
  • K
  • L
  • M
  • N
  • O
  • P
  • Q
  • R
  • S
  • T
  • U
  • V
  • W
  • X
  • Y
  • Z

Additional information

Select product

E-book rental for 180 days
