This book shows business and data analysts how to use BigQuery effectively, avoid common pitfalls, and execute sophisticated queries against large, complex data sets. The authors share tips and recipes for running complex queries, and they show how to write code that communicates with the BigQuery API.
The authors demonstrate best practices and techniques against an extended real-world example: a web application that collects sensor data from mobile devices and displays a dashboard visualizing the data in real time. Along the way, they use examples to demonstrate streaming ingestion, transformation via Hadoop in Google Compute Engine, AppEngine Datastore integration, and using GViz with Tableau to generate charts of query results.
The authors will not just cover the mechanics of using BigQuery; they will also cover the architecture of the underlying Dremel query engine: understanding how a query will execute is a key to getting good results from BigQuery. The book describes how Dremel works, and pairs it with concrete query examples showing how to work around limitations in the architecture.
The query samples will be in BigQuery’s variant of SQL, and the web application examples will be in Python, the most popular language for analytics. Where the Java analogue of a Python sample would differ significantly, a Java sample will be given as well. All code and data sets will be available on the book's companion website.
Google BigQuery Analytics
by Tigani, Jordan; Naidu, Siddartha
Author Biography
The authors are founding members of the BigQuery team and have helped build and run the service. Jordan Tigani is an active participant in the BigQuery StackOverflow virtual community. Siddartha Naidu has extensive experience helping customers integrate with BigQuery.
Table of Contents
Part I BigQuery Fundamentals
Chapter 1 The Story of Big Data at Google
Big Data Stack 1.0
Big Data Stack 2.0 (and Beyond)
Open Source Stack
Google Cloud Platform
Cloud Processing
Cloud Storage
Cloud Analytics
Problem Statement
What Is Big Data?
Why Big Data?
Why Do You Need New Ways to Process Big Data?
How Can You Read a Terabyte in a Second?
What about MapReduce?
How Can You Ask Questions of Your Big Data and Quickly Get Answers?
Summary
Chapter 2 BigQuery Fundamentals
What Is BigQuery?
SQL Queries over Big Data
Cloud Storage System
Distributed Cloud Computing
Analytics as a Service (AaaS?)
What BigQuery Isn’t
BigQuery Technology Stack
Google Cloud Platform
BigQuery Service History
BigQuery Sensors Application
Sensor Client Android App
BigQuery Sensors AppEngine App
Running Ad-Hoc Queries
Summary
Chapter 3 Getting Started with BigQuery
Creating a Project
Google APIs Console
Free Tier Limitations and Billing
Running Your First Query
Loading Data
Using the Command-Line Client
Install and Setup
Using the Client
Service Account Access
Setting Up Google Cloud Storage
Development Environment
Python Libraries
Java Libraries
Additional Tools
Summary
Chapter 4 Understanding the BigQuery Object Model
Projects
Project Names
Project Billing
Project Access Control
Projects and AppEngine
BigQuery Data
Naming in BigQuery
Schemas
Tables
Datasets
Jobs
Job Components
BigQuery Billing and Quotas
Storage Costs
Processing Costs
Query RPCs
TableData.insertAll() RPCs
Data Model for End-to-End Application
Project
Datasets
Tables
Summary
Part II Basic BigQuery
Chapter 5 Talking to the BigQuery API
Introduction to Google APIs
Authenticating API Access
RESTful Web Services for the SOAP-Less Masses
Discovering Google APIs
Common Operations
BigQuery REST Collections
Projects
Datasets
Tables
TableData
Jobs
BigQuery API Tour
Error Handling in BigQuery
Summary
Chapter 6 Loading Data
Bulk Loads
Moving Bytes
Destination Table
Data Formats
Errors
Limits and Quotas
Streaming Inserts
Summary
Chapter 7 Running Queries
BigQuery Query API
Query API Methods
Query API Features
Query Billing and Quotas
BigQuery Query Language
BigQuery SQL in Five Queries
Differences from Standard SQL
Summary
Chapter 8 Putting It Together
A Quick Tour
Mobile Client
Monitoring Service
Log Collection Service
Log Trampoline
Dashboard
Data Caching
Data Transformation
Web Client
Summary
Part III Advanced BigQuery
Chapter 9 Understanding Query Execution
Background
Storage Architecture
Colossus File System (CFS)
ColumnIO
Durability and Availability
Query Processing
Dremel Serving Trees
Architecture Comparisons
Relational Databases
MapReduce
Summary
Chapter 10 Advanced Queries
Advanced SQL
Subqueries
Combining Tables: Implicit UNION and JOIN
Analytic and Windowing Functions
BigQuery SQL Extensions
The EACH Keyword
Data Sampling
Repeated Fields
Query Errors
Result Too Large
Resources Exceeded
Recipes
Pivot
Cohort Analysis
Parallel Lists
Exact Count Distinct
Trailing Averages
Finding Concurrency
Summary
Chapter 11 Managing Data Stored in BigQuery
Query Caching
Result Caching
Table Snapshots
AppEngine Datastore Integration
Simple Kind
Mixing Types
Final Thoughts
Metatables and Table Sharding
Time Travel
Selecting Tables
Summary
Part IV BigQuery Applications
Chapter 12 External Data Processing
Getting Data Out of BigQuery
Extract Jobs
TableData.list()
AppEngine MapReduce
Sequential Solution
Basic AppEngine MapReduce
BigQuery Integration
Using BigQuery with Hadoop
Querying BigQuery from a Spreadsheet
BigQuery Queries in Google Spreadsheets (Apps Script)
BigQuery Queries in Microsoft Excel
Summary
Chapter 13 Using BigQuery from Third-Party Tools
BigQuery Adapters
Simba ODBC Connector
JDBC Connection Options
Client-Side Encryption with Encrypted BigQuery
Scientific Data Processing Tools in BigQuery
BigQuery from R
Python Pandas and BigQuery
Visualizing Data in BigQuery
Visualizing Your BigQuery Data with Tableau
Visualizing Your BigQuery Data with BIME
Other Data Visualization Options
Summary
Chapter 14 Querying Google Data Sources
Google Analytics
Setting Up BigQuery Access
Table Schema
Querying the Tables
Google AdSense
Table Structure
Leveraging BigQuery
Google Cloud Storage
Summary
Index
