QGIS: Count Unique Values In An Array (Field Calculator)

by Andrew McMorgan 57 views

Hey guys, ever found yourself wrestling with a QGIS project, trying to wrangle unique values and count them, all while keeping things neat and tidy? I know the feeling! It can be a real head-scratcher. Specifically, let's talk about how to count unique values within an array for a specific attribute, all using the field calculator in QGIS. No Python scripts needed (yet!), just some clever field calculator tricks. I'm going to walk you through it, so stick with me, and we'll get this sorted out together!

Setting the Stage: Your QGIS Project

First off, let’s get our ducks in a row. Imagine you're working with a layer of features in QGIS – maybe it's points, lines, or polygons, it doesn’t really matter for this specific task. You’ve got a bunch of these features (let’s say 1000 features), and each one has a specific attribute. Think of it like a parent key, and you want to understand how frequently certain values pop up for a related attribute. For example, maybe you're tracking customer orders and each feature represents an order. You have a “Parent Key” that would identify the customer, and you want to know how many unique items (another attribute) each customer ordered. Your task is to count how many unique items, for each customer.

The Data: A Quick Example

Let's pretend we have a simple dataset to visualize this. Think of a table with the following structure:

Parent Key (Customer ID) Item Ordered
101 Apple
101 Banana
101 Apple
102 Orange
102 Grape
103 Banana

In this tiny example, we want to know, for each Parent Key (Customer ID) the total number of items, and the number of unique items that each one has purchased. Customer 101 bought 2 unique items (Apple, Banana), customer 102 bought 2 unique items (Orange, Grape), and customer 103 only bought one unique item (Banana).

Understanding the Goal

The ultimate goal here is to create an array for each Parent Key. The array will contain the list of unique values associated with the Item Ordered attribute. At the same time, we need to count those unique values. This will help you see which customer is ordering which item and how many different items they are ordering in total. Sounds like a breeze, right? Let's dive into how we can do it with the field calculator!

Step-by-Step Guide: The Field Calculator Magic

Alright, let's roll up our sleeves and get our hands dirty with the field calculator. We will be using a combination of array functions and aggregate functions to achieve the desired result.

Step 1: Preparing Your Data

Make sure your data is set up with a parent key attribute (e.g., Customer ID) and an attribute that you want to analyze for unique values (e.g., Item Ordered). The goal is to aggregate the "Item Ordered" values for each "Customer ID".

Step 2: Accessing the Field Calculator

Open your QGIS project, select the layer you're working with in the Layers panel, and open the attribute table. Then, click on the field calculator. It's usually a small icon that looks like a calculator (or find it in the menu: Layer -> Field calculator).

Step 3: Creating the First Field: Aggregating Values

We need to create a new field that aggregates the values of the Item Ordered field, grouped by our Parent Key (Customer ID).

  1. Name Your Field: Give your new field a descriptive name, like unique_items_array.

  2. Choose Your Field Type: Select "Text (string)" for now, as we'll be dealing with strings representing the item names. We will modify the field type later, for the array.

  3. Use the Following Expression: In the expression box, use the array_agg() function combined with the filter function to get the unique items.

    array_agg( "Item Ordered", filter:="Customer ID" = attribute( $currentfeature, 'Customer ID' ))
    
    • array_agg( "Item Ordered" ): This part aggregates all the “Item Ordered” values, but we need to specify a criteria to separate the values per “Customer ID”.
    • filter:="Customer ID" = attribute( $currentfeature, 'Customer ID' ): This part is used to filter by the current Customer ID. It compares the “Customer ID” of the current feature ($currentfeature) with the "Customer ID" of the same feature. It means that, for each feature, we are gathering the “Item Ordered” values that matches the same “Customer ID” of that same feature.
  4. Click OK: The field calculator will process and populate your new field with an array for each feature. At this point, you'll likely have a text-based representation of each array (e.g., ['Apple', 'Banana', 'Apple']).

Step 4: Refine the field

In our field unique_items_array, we are storing an array. This array represents the "Item Ordered" aggregated for each "Customer ID". However, it's not a counted array. We need to create a new field to count the unique items.

  1. Create a new field: Click on the field calculator again.

  2. Name your field: Give your new field a descriptive name, like unique_items_count.

  3. Choose Your Field Type: Select "Integer" for this field.

  4. Use the Following Expression: In the expression box, use the array_distinct() and array_length() functions to count the unique items in each array.

    array_length(array_distinct(