From the course: Data Science on Google Cloud Platform: Exploratory Data Analytics

BigQuery

- [Instructor] Analysts are used to getting their data in a CSV, loading it into an Excel spreadsheet, and doing ad hoc analysis. Google BigQuery provides the GCP alternative for the same task. BigQuery is an enterprise data warehouse that can also be used as permanent storage for big data. It is serverless and easy to set up, load data into, query, and administer. It supports a SQL interface, which means familiarity and a small learning curve for an experienced data analyst. You can load data directly into BigQuery, or you can create an external table based on data in other sources, like Google Cloud Storage and Bigtable. You can create custom functions to make analytics easier for your specific business needs. BigQuery has excellent integrations with other GCP products, like Dataflow and Data Studio, so it can be easily integrated into a data pipeline.

What are the strengths of using BigQuery for exploratory data analytics? It is serverless, so an analyst can set it up and use it without a database administrator's help. It supports SQL, which makes it popular among data analysts and scientists. It is massively scalable and can hold very large quantities of data. It can draw on multiple data sources and run SQL-based queries against them.

What are its shortcomings? Well, BigQuery is native to GCP. It's not available on an enterprise platform or on AWS or Azure. It does not support any scripting or external programming, like procedures, to execute complex logic. There are no pre-canned reports. Queries can be saved and re-executed, but there are no graphics.

Typical applications for BigQuery include using it as a data warehouse and performing ad hoc queries. It can also be used as a searchable repository.
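To make the external-table and custom-function points above more concrete, here is a minimal sketch in Python that only builds the BigQuery SQL text. It is not taken from the course. The table name `sales.orders_ext`, the bucket path, the columns `region`, `revenue`, and `cost`, and the function name `margin` are all hypothetical placeholders, and the sketch assumes the CSV files have a header row and that BigQuery detects the schema.

```python
from typing import Sequence


def external_table_ddl(table: str, uris: Sequence[str],
                       fmt: str = "CSV", skip_leading_rows: int = 1) -> str:
    """Build DDL for an external table over files in Cloud Storage.

    The data stays in the bucket; BigQuery reads it at query time.
    """
    uri_list = ", ".join(f"'{u}'" for u in uris)
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table}`\n"
        f"OPTIONS (\n"
        f"  format = '{fmt}',\n"
        f"  uris = [{uri_list}],\n"
        f"  skip_leading_rows = {skip_leading_rows}\n"
        f")"
    )


def adhoc_query(table: str) -> str:
    """Build an ad hoc query that defines and uses a temporary SQL function.

    Column names (region, revenue, cost) are assumed for illustration.
    """
    return (
        "CREATE TEMP FUNCTION margin(revenue FLOAT64, cost FLOAT64)\n"
        "RETURNS FLOAT64\n"
        "AS (SAFE_DIVIDE(revenue - cost, revenue));\n"
        "\n"
        "SELECT region, AVG(margin(revenue, cost)) AS avg_margin\n"
        f"FROM `{table}`\n"
        "GROUP BY region\n"
        "ORDER BY avg_margin DESC"
    )


if __name__ == "__main__":
    # Hypothetical dataset, table, and bucket names.
    print(external_table_ddl("sales.orders_ext",
                             ["gs://example-bucket/orders/*.csv"]))
    print()
    print(adhoc_query("sales.orders_ext"))
```

The printed statements can be pasted into the BigQuery console. Alternatively, assuming the `google-cloud-bigquery` package is installed and credentials are configured, each string could be passed to `bigquery.Client().query(sql).result()`.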
