Data analysts love BigQuery because it allows them to process data without compromising on performance. With the ability to process hundreds of thousands of data rows, BigQuery allows its users to run queries using both Standard SQL and Legacy SQL, even though the former is the preferred SQL dialect for working with BigQuery.
There are many different functions to learn if you want to use BigQuery efficiently, but today we’re going over the EXCEPT operator. If you were looking for a way to exclude certain columns from your query results and therefore spend less time querying data, this clause can help you achieve just that. Let’s get started.
Your Agency’s Data in Google BigQuery
Tons of benefits come with using a SSOT like a BigQuery data warehouse, including high performance, seamless integration with your existing tech stack, stellar data security and more.
BigQuery is extremely capable of processing high volumes of data. It can query more than 35 billion rows of data within 10 seconds. Even if you have less data than this, your agency can still benefit from the smart integrations with other tools that BigQuery supports, such as Google Sheets, Looker or PowerBI.
When it comes to the BigQuery pricing structure, it is logical and affordable. Users are charged mainly based on the data they store and the queries they process. Considering that the Standard SQL language it supports makes code easier to understand and write, it’s worth the buck.
And, integrating BigQuery into your agency means you’ll have to spend less time doing manual spreadsheet work, so you can focus more on doing strategic tasks. BigQuery will enable your non-technical staff to easily use your agency’s data too, as it can be automated, allowing them to analyse and interpret data without having coding knowledge.
Google BigQuery EXCEPT Overview
When working with data in BigQuery, you might need to access specific data that meets given conditions, analyse it and extract insights. After all, the goal of using each query is to produce relevant results more efficiently.
Say, if you had a table with 20 columns, and you needed 15 of them, why spend your credits by returning the entire table? The EXCEPT operator is an ideal shortcut to avoid returning the columns you don’t need.
Imagine how effective this approach is with larger tables?! What if you had a table with 40 rows and you needed to extract 35 of them? Without the EXCEPT clause, you’d have to manually define the 35 columns you need, which is a lot of work.
But with the BigQuery EXCEPT operator, you just have to specify the columns you want to exclude and you’re done.
BigQuery SELECT * EXCEPT Syntax
The syntax of the BigQuery EXCEPT statement is simple and easy to remember. Here’s what it looks like:
SELECT * EXCEPT (column_1, column_2, …, column_n…) FROM `table.name`
Pro Tip: Remember to keep the Legacy SQL box in the settings unchecked because this statement is written in Standard SQL. Also, keep in mind to use backticks in quoting your table names and use a period instead of a colon as a separator. Stay aware of the differences between Legacy SQL and Standard BigQuery SQL. |
How to Write EXCEPT Queries in Google BigQuery
Writing EXCEPT queries is quite easy once you get familiar with the syntax. Let’s create an example table and test it on the spot. Our table has recorded customers’ projects for the year 2022.
Example 1: Using BigQuery EXCEPT to Exclude One Column
Now, let’s query the table and return all the data except for the Timeline column.
SELECT * EXCEPT (Timeline) FROM `acuto-bigquery.clients.clients_aug_2022`
Example 2: Using BigQuery EXCEPT to Exclude Multiple Columns
This is also a simple query to write. Let’s try to exclude two columns this time (Timeline and Delivered).
SELECT * EXCEPT (Timeline, Delivered) FROM `acuto-bigquery.clients.clients_aug_2022`
The statement works smoothly. Now, let’s try something more complicated.
Example 3: Using BigQuery EXCEPT DISTINCT to Exclude Results Present in Other Tables
We can combine EXCEPT with the DISTINCT statement and extract results that are not in the second results query. Basically, the intersection of results from both query results.
For this query, we’ll take another table with clients coming in through referrals for the year:
Now, let’s find which clients didn’t come through as referrals for this year by using EXCEPT DISTINCT.
SELECT * FROM `acuto-bigquery.clients.clients_aug_2022` EXCEPT DISTINCT SELECT * FROM `acuto-bigquery.clients.clients_2022_referrals`
Pro Tip: For the query to work, both tables must have the same number of columns with corresponding data types.) |
BigQuery Data Warehouse by Acuto
Setting up a BigQuery data warehouse can be a challenge if you want to manage and analyse all your agency data from one spot. BigQuery is otherwise a fantastic data warehouse that any agency could use to enhance their operations. But, what if you can’t implement it in-house?
Acuto is here to help you bridge the gap between all your data sources by consolidating them into a single BigQuery data warehouse. The best part is that this solution doesn’t break the bank, and you get to own it.
We’ll work with you from the planning phase through to implementation, so you can sit back and focus on what you do best. Contact us today to get started and unleash the power of your data.
Conclusion
Google BigQuery provides one of the most efficient data warehouse services online, but if you don’t know how to take full advantage of it from a technical perspective, you could be missing out on a lot. The data warehouse comes with its own query engine to explore and analyse data.
You can leverage simple queries like BigQuery EXCEPT to extract relevant information from your tables. By using EXCEPT and EXCEPT DISTINCT, you can decide which columns to exclude from the final results. This can be useful when producing reports and other analysis.
Agencies of any industry and size can rely on BigQuery to build a reliable data infrastructure. Yours is one step away from less manual labour too. It’s time to fully join the data revolution and store, process and automate data so that your time is spent on tasks that no machine can execute.