We're the top company on Modern Data Stack! Give us an upvote here.
bubbles svg

Cloud Data Viz and Analytics Health Check

Uncover the fitness of your Cloud Data Viz & Analytics

Get my free score

Astrato: Understanding the Full Outer Join

Astrato has added the option for users to view a Full Outer Join (FOJ), called ‘Show all values from *table name*’. This blog explains what a FOJ is and how it helps the user, and includes a useful example to follow along with.

First of all, do users understand what a FOJ is? Well, put simply, no, most users do not understand it. This is probably because of its name. It sounds so nerdy that a user would just ignore it!

That’s why we decided to call it “Show All.” Basically, a FOJ is a functionality that allows users to see all values in a query – so not only the “matching” values, but also the “non-matching” ones, too. Let’s explain it with an example.

Full Outer Joins Example

I have a table of Sales, looking like this:

The total amount of Sales is 1200. Please remember this number.

Let’s imagine that the user now wants to add the product name to the query. This can be found in the table Products:

Please notice that this data set is not perfect (as is often the case in the real world).

We notice three things:

  1. Not all the products that are in Sales appear in Products (PX, PY)
  2. Not all the products that are in Products appear in Sales (P4, P5, P6, P7)
  3. Some rows of Sales have a NULL product ID

 

The first scenario is a case of “bad data”. We call PX and PY “Orphan values”.

The second scenario, instead, is normal: We call P4 P5 P6 P7 “Unused values”

The third may or may not be OK, depending on the business process

You may wonder: why is the first bad, and the second normal? Well, it’s because the first table is an important table: Sales. We call this table a “fact table”: it represents a list of events, and it usually puts together two or more business entities, a date, and some number quantifying the event. 

The second is a “lookup” table (a.k.a. dimension table). It is a “list of something”, and not necessarily every value is relevant. A perfect example of a lookup table in the real world is the table Countries: It usually has around 200 rows, but Afghanistan, Algeria, and many other Countries do not appear in the table of Sales.

Now, what can we do with these two tables? If we want to see the product name and the sales amount, we need to join them.

Here is the join:

But something is missing! It now looks like we only sold an amount of 600, which is not correct. As you should remember, the correct total amount is 1200. This is happening because the query used an “INNER JOIN”.

Fortunately, in Astrato we always join by default with a “LEFT JOIN”.

Now our numbers are back in a good state: the total is 1200, as it should be. When the product is not found in the Products table (either because the key is an orphan or a NULL), we have a NULL value in all the columns that come from Products. 

The LEFT JOIN is ideal: the total is 1200, and the info from Products is shown when it is available, and NULL otherwise.

NOTE: The word “LEFT” may sound confusing. How can we have a concept of “left” and “right” in a text? Well, by definition, the “LEFT” table is the one that is mentioned first in the SQL statement. In this case, the SQL statement says “Sales JOIN Products, therefore Sales is the LEFT table, and Products is the RIGHT table.

Typically, this type of situation is well described by a Venn Diagram:

This representation is very intuitive. We call “INNER” the intersection, and “OUTER” the sections on the LEFT and on the RIGHT.

In SQL, we have four types of JOIN:

  • An INNER JOIN only shows the intersection – This is only good in the scenario of the total absence of orphan keys and NULL keys.
  • A LEFT JOIN shows the intersection, plus the values on the LEFT – This is GOOD
  • A RIGHT JOIN shows the intersection, plus the values on the RIGHT – This is usually not good, because the table mentioned first is usually the most important (a good practice in SQL is to mention the fact table first, and then join the dimension tables).
  • A FULL OUTER JOIN shows the intersection, plus the values on the LEFT and the values on the RIGHT. This is an interesting scenario. Thanks to this, users can have the correct figure (in this case 1200), and at the same time, they can put in evidence the unused values (in this case, the unsold products P4 P5 P6 P7). 

This is what a Full Outer Join looks like:

This result is correct, but in this format, it is not very helpful. It is much better like this:

In this representation, despite the bad quality of the data, the user has a good overview of the sales by product name, including information about the unsold products.

Now that we know what an FOJ is, let’s look at Astrato. What does Astrato add to the existing SQL? Astrato can produce this result, but without the need for the user to go to change the join type.Changing the join type has two downsides:

  • If the join layer (Data View, or its equivalent in another BI tool) is shared across users and dashboards, this change would have an impact on the existing dashboards created by other users
  • If not, this is in any case an advanced operation, which a business user would probably not be comfortable doing.

Currently, there is no BI tool on the market that allows building a FOJ without changing the join type. In Astrato, this is possible, and very easy!

Astrato default:

Here is the new Astrato feature, in the Property Panel:

Please notice that the option “Show all values from Sales” is already selected and greyed out: this is because, by the nature of this query, Astrato is already showing all values. But with Products, the choice is up to the user. By default, it is not selected. But if the user ticks the box “Show All values from Products”, this is the result:

Which produces this:

And it’s done! The unsold products are now visible.

Summing Up 

Through the new Full Outer Join functionality, users can see all the existing values of the selected columns, not just the “matching values”. This sets Astrato apart; currently, no other BI tool allows building a FOJ without changing the join type. Accessed by selecting the “Show All values from Products” box, FOJ display is simple and accessible. 

Experience Astrato’s FOJ functionality and intuitive BI interface by booking a demo with one of our experts!