Implementing “Get or Create” in YugabyteDB (or PostgreSQL)



This content originally appeared on DEV Community and was authored by Franck Pachot

In SQL, you can perform a “Get or Create” operation using a WITH clause, where each operation is a Common Table Expression (CTE). The following tweet and article inspired this idea, but I suggest a more efficient design.

  • When executing two statements conditionally, start with the most frequent scenario. Avoid inserting if it fails most of the time. Databases optimize successful operations over exceptions.
  • Avoid the ON CONFLICT construct, which relies on constraints and may have surprising transactional behavior. Common Table Expressions (CTEs) are safer with clean input and output states.

In this example, we have a USERS table with two keys: a natural key (“email”) and a surrogate key (“id”). I decided to use the surrogate key as the primary key. The natural key is defined with a constraint and serves as a secondary index.

create table users (
    id generated always as identity primary key,
    email varchar(255) unique not null
);

Here is my query, which I declare as a prepared statement to execute many times:

prepare get_or_create_user_email(text)
 as
with
-- the input set
val(email) as (
  values ($1)
),
-- the result of get (can be empty)
get as (
  select users.* from users natural join val
),
-- the result of insert (if get is empty)
ins as (
  insert into users(email) select email from val
  where not exists (select * from get)
  returning *
)
-- get or create
select * from get
union all
select * from ins
;

Common table expressions make it easy to understand:

  • val captures the input email.
  • get attempts to retrieve the user with the given email. The result is empty if not found.
  • ins creates the new user only if get is empty and returns the new row.
  • Union: Combine the get and ins results to return one user id.

Let’s execute the “create” scenario with explain analyze and the dist option that displays the number of read and write requests. (In PostgreSQL, you may look at buffers, wal to understand the execution cost).


yugabyte=# explain (analyze, dist, buffers)
execute get_or_create_user_email('john.doe@example.com')
;
                                                                                                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=4.20..4.25 rows=2 width=524) (actual time=8.482..8.487 rows=1 loops=1)
   CTE val
     ->  Result  (cost=0.00..0.01 rows=1 width=32) (actual time=0.001..0.002 rows=1 loops=1)
   CTE get
     ->  YB Batched Nested Loop Join  (cost=0.00..4.15 rows=1 width=524) (actual time=0.986..0.986 rows=0 loops=1)
           Join Filter: ((users.email)::text = val.email)
           ->  CTE Scan on val  (cost=0.00..0.02 rows=1 width=32) (actual time=0.004..0.004 rows=1 loops=1)
           ->  Index Scan using users_email_key on users  (cost=0.00..4.12 rows=1 width=524) (actual time=0.923..0.923 rows=0 loops=1)
                 Index Cond: ((email)::text = ANY (ARRAY[val.email, $2, $3, ..., $1024]))
                 Storage Index Read Requests: 1
                 Storage Index Read Execution Time: 0.849 ms
   CTE ins
     ->  Insert on users users_1  (cost=0.02..0.04 rows=1 width=524) (actual time=7.489..7.492 rows=1 loops=1)
           Storage Table Write Requests: 1
           Storage Index Write Requests: 1
           InitPlan 3 (returns $1026)
             ->  CTE Scan on get get_1  (cost=0.00..0.02 rows=1 width=0) (actual time=0.000..0.000 rows=0 loops=1)
           ->  Result  (cost=0.00..0.03 rows=1 width=524) (actual time=7.423..7.425 rows=1 loops=1)
                 One-Time Filter: (NOT $1026)
                 ->  CTE Scan on val val_1  (cost=0.00..0.02 rows=1 width=32) (actual time=0.001..0.002 rows=1 loops=1)
   ->  CTE Scan on get  (cost=0.00..0.02 rows=1 width=524) (actual time=0.988..0.988 rows=0 loops=1)
   ->  CTE Scan on ins  (cost=0.00..0.02 rows=1 width=524) (actual time=7.492..7.497 rows=1 loops=1)
 Planning Time: 7.889 ms
 Execution Time: 12.009 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 0.849 ms
 Storage Rows Scanned: 0
 Storage Write Requests: 2
 Catalog Read Requests: 12
 Catalog Read Execution Time: 10.036 ms
 Catalog Write Requests: 0
 Storage Flush Requests: 1
 Storage Flush Execution Time: 3.231 ms
 Storage Execution Time: 14.117 ms
 Peak Memory Usage: 568 kB

For the creation of a new email, a Read Request is first made to check if the email exists in the Secondary index. This is followed by two write requests to update the table and the secondary index. These two write requests are then batched into one Flush request to reduce the latency.


yugabyte=# explain (analyze, dist, buffers)
execute get_or_create_user_email('john.doe@example.com')
;

                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=4.20..4.25 rows=2 width=524) (actual time=1.420..1.433 rows=1 loops=1)
   CTE val
     ->  Result  (cost=0.00..0.01 rows=1 width=32) (actual time=0.001..0.001 rows=1 loops=1)
   CTE get
     ->  YB Batched Nested Loop Join  (cost=0.00..4.15 rows=1 width=524) (actual time=1.417..1.422 rows=1 loops=1)
           Join Filter: ((users.email)::text = val.email)
           ->  CTE Scan on val  (cost=0.00..0.02 rows=1 width=32) (actual time=0.004..0.004 rows=1 loops=1)
           ->  Index Scan using users_email_key on users  (cost=0.00..4.12 rows=1 width=524) (actual time=1.351..1.354 rows=1 loops=1)
                 Index Cond: ((email)::text = ANY (ARRAY[val.email, $2, $3, ..., $1024]))
                 Storage Table Read Requests: 1
                 Storage Table Read Execution Time: 0.542 ms
                 Storage Table Rows Scanned: 1
                 Storage Index Read Requests: 1
                 Storage Index Read Execution Time: 0.656 ms
                 Storage Index Rows Scanned: 1
   CTE ins
     ->  Insert on users users_1  (cost=0.02..0.04 rows=1 width=524) (actual time=0.004..0.004 rows=0 loops=1)
           InitPlan 3 (returns $1026)
             ->  CTE Scan on get get_1  (cost=0.00..0.02 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
           ->  Result  (cost=0.00..0.03 rows=1 width=524) (actual time=0.002..0.002 rows=0 loops=1)
                 One-Time Filter: (NOT $1026)
                 ->  CTE Scan on val val_1  (cost=0.00..0.02 rows=1 width=32) (never executed)
   ->  CTE Scan on get  (cost=0.00..0.02 rows=1 width=524) (actual time=1.419..1.425 rows=1 loops=1)
   ->  CTE Scan on ins  (cost=0.00..0.02 rows=1 width=524) (actual time=0.005..0.005 rows=0 loops=1)
 Planning Time: 0.856 ms
 Execution Time: 1.631 ms
 Storage Read Requests: 2
 Storage Read Execution Time: 1.198 ms
 Storage Rows Scanned: 2
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 1.198 ms
 Peak Memory Usage: 536 kB

With an existing email, two Read Requests are made to read the index entry and the remaining columns from the table.

I can include the “id” column to the secondary index to avoid this additional read request to the table:

yugabyte=# \d users;
                                    Table "public.users"
 Column |          Type          | Collation | Nullable |              Default
--------+------------------------+-----------+----------+-----------------------------------
 id     | bigint                 |           | not null | nextval('users_id_seq'::regclass)
 email  | character varying(255) |           | not null |
Indexes:
    "users_pkey" PRIMARY KEY, lsm (id HASH)
    "users_email_key" UNIQUE CONSTRAINT, lsm (email HASH)

yugabyte=# create unique index users_email_unique on users ( email asc ) include ( id );
CREATE INDEX

yugabyte=# alter table users add constraint users_email_unique unique using index users_email_unique;
ALTER TABLE

yugabyte=# alter table users drop constraint users_email_key;
ALTER TABLE

As I created the new index before dropping the existing constraint, my table was always protected from duplicates. However, my query doesn’t depend on the constraint, like ON CONFLICT would do, which makes it safer.

The “create” scenario is the same, as an additional column in the index entry has no overhead, but the “get” scenario is reduced to one read request thanks to the Index Only Scan:

yugabyte-# execute get_or_create_user_email('john.doe@example.com')
;
                                                                  QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=4.19..4.24 rows=2 width=524) (actual time=0.954..0.964 rows=1 loops=1)
   CTE val
     ->  Result  (cost=0.00..0.01 rows=1 width=32) (actual time=0.001..0.001 rows=1 loops=1)
   CTE get
     ->  YB Batched Nested Loop Join  (cost=0.00..4.14 rows=1 width=524) (actual time=0.952..0.955 rows=1 loops=1)
           Join Filter: ((users.email)::text = val.email)
           ->  CTE Scan on val  (cost=0.00..0.02 rows=1 width=32) (actual time=0.004..0.004 rows=1 loops=1)
           ->  Index Only Scan using users_email_unique on users  (cost=0.00..4.11 rows=1 width=524) (actual time=0.888..0.889 rows=1 loops=1)
                 Index Cond: (email = ANY (ARRAY[val.email, $2, $3, ..., $1024]))
                 Heap Fetches: 0
                 Storage Index Read Requests: 1
                 Storage Index Read Execution Time: 0.757 ms
                 Storage Index Rows Scanned: 1
   CTE ins
     ->  Insert on users users_1  (cost=0.02..0.04 rows=1 width=524) (actual time=0.003..0.003 rows=0 loops=1)
           InitPlan 3 (returns $1026)
             ->  CTE Scan on get get_1  (cost=0.00..0.02 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=1)
           ->  Result  (cost=0.00..0.03 rows=1 width=524) (actual time=0.002..0.002 rows=0 loops=1)
                 One-Time Filter: (NOT $1026)
                 ->  CTE Scan on val val_1  (cost=0.00..0.02 rows=1 width=32) (never executed)
   ->  CTE Scan on get  (cost=0.00..0.02 rows=1 width=524) (actual time=0.954..0.958 rows=1 loops=1)
   ->  CTE Scan on ins  (cost=0.00..0.02 rows=1 width=524) (actual time=0.004..0.004 rows=0 loops=1)
 Planning Time: 0.790 ms
 Execution Time: 1.171 ms
 Storage Read Requests: 1
 Storage Read Execution Time: 0.757 ms
 Storage Rows Scanned: 1
 Storage Write Requests: 0
 Catalog Read Requests: 0
 Catalog Write Requests: 0
 Storage Flush Requests: 0
 Storage Execution Time: 0.757 ms
 Peak Memory Usage: 536 kB

Consider consolidating this type of data logic into a single statement using the WITH clause and Common Table Expressions (CTE) in a declarative manner, rather than using multiple roundtrips or a block of procedural code. Each CTE’s input is the database state at the beginning of the statement or the output of another CTE in the same statement. The dependency is determining the order of execution. Start with the most frequent scenario. Remember, the database isn’t a black box: you can review the execution plan to understand the exact time complexity.

Improving Your SQL Indexing: How to Effectively Order Columns | Yugabyte

Improve your SQL indexing by effectively ordering columns – follow the detailed instructions in this new ‘How to’ guide.

favicon yugabyte.com


This content originally appeared on DEV Community and was authored by Franck Pachot