Sepertinya saya tidak butuh mongoDb

Sepertinya saya tidak butuh mongoDb
Photo by Caspar Camille Rubin / Unsplash

Sering ada candaan, "Ngapain MongoDB, JSON/JSONB di Postgre cukup!"

Oke, mari kita coba!

Di PostgreSQL, ada tipe kolom json dan jsonb yang berfungsi untuk menerima dan menyimpan data dalam format JSON atau array JSON.

Perbedaan yang cukup kentara antara keduanya adalah:

  • JSON: Menyimpan data persis seperti yang diterima, tanpa ada perubahan. Tipe ini cocok digunakan jika ingin menyimpan data mentah yang datang dari request/response, seperti log, tanpa perlu normalisasi atau perubahan format.
  • JSONB: Sebelum disimpan, JSONB akan melakukan normalisasi terhadap data. Key dalam objek akan di-reorganisasi, whitespace akan dihapus, dan data akan disimpan dalam format biner.

Contoh di tabel user_logs untuk menyimpan informasi perubahan data.

 user_id |                              logs
---------+----------------------------------------------------------------
     102 | {                                                             +
         |   "after": {"email": "[email protected]", "name": "Alice B."},+
         |   "before": {"name": "Alice", "email": "[email protected]"}     +
         | }
      15 | {                                                             +
         |   "before": {"address": "No. 123"},                           +
         |   "after": {"address": "No. 123B"}                            +
         | }

Contoh di atas memperlihatkan data yang disimpan sebagaimana data yang diterima, sedangkan untuk jsonb, saat saya menyimpan data seperti ini

INSERT INTO user_config  
            (user_id,  
             config)  
VALUES      (5,  
             '{ "theme": "dark", "language": "id", "status": "free_user" }');

tetap menghasilkan

user_id |                             config
---------+-----------------------------------------------------------------
       1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
       2 | {"theme": "dark", "status": "premium_user", "language": "en"}
       3 | {"theme": "dark", "status": "admin", "language": "id"}
       4 | {"theme": "dark", "status": "premium_user", "language": "id"}
       5 | {"theme": "solarized", "status": "free_user", "language": "id"}
(5 rows)

Operator di JSON/JSONB

Saya belum banyak mencoba tapi setidaknya saya mencoba untuk hal-hal sederhana dulu

  • saya mengeluarkan informasi json menjadi bentuk field di sql
  • saya mencoba filter informasi dari json field

Menampilkan nilai dari json

Untuk mengambil data dari kolom json atau jsonb, ada dua operator utama yang bisa digunakan:

  • -> digunakan untuk mengambil value dalam bentuk JSON.
  • ->> digunakan untuk mengambil value dalam bentuk teks (string).
example=# select user_id, logs -> 'before' ->> 'email' as email_before,                                                   logs -> 'after' ->> 'email' as after_email                                                                                from user_logs;
 user_id |   email_before   |    after_email
---------+------------------+-------------------
       1 | [email protected]     | [email protected]
       2 | [email protected]   | [email protected]
       3 |                  |
       4 |                  |
       5 | [email protected] | [email protected]
       6 |                  |
       7 | [email protected]     | [email protected]
       8 |                  |
       9 | [email protected]   | [email protected]
      10 | [email protected]    | [email protected]
      11 | [email protected]    | [email protected]
      12 | [email protected]    | [email protected]
      13 |                  |
      14 | [email protected]   | [email protected]
      15 |                  |
     102 | [email protected]  | [email protected]
(16 rows)

example=#

Operator yang sama juga bisa digunakan untuk pencarian, misal pastikan cek yang benar-benar berbeda.

example=# select user_id, logs -> 'before' ->> 'email' as email_before,                                                   logs -> 'after' ->> 'email' as after_email                                                                                from user_logs where logs -> 'after' ->> 'email' <> logs -> 'before' ->> 'email';
 user_id |  email_before   |    after_email
---------+-----------------+-------------------
       1 | [email protected]    | [email protected]
       7 | [email protected]    | [email protected]
       9 | [email protected]  | [email protected]
      11 | [email protected]   | [email protected]
      12 | [email protected]   | [email protected]
      14 | [email protected]  | [email protected]
     102 | [email protected] | [email protected]
(7 rows)

atau pencarian seperti pada umumnya

example=# select user_id, logs -> 'before' ->> 'email' as email_before,                                                   logs -> 'after' ->> 'email' as after_email                                                                                from user_logs where logs -> 'after' ->> 'email' = '[email protected]';
 user_id | email_before | after_email
---------+--------------+--------------
       1 | [email protected] | [email protected]
(1 row)

example=#

JSONB
Untuk jsonb tidak beda jauh dengan json, tapi ada beberapa operator tambahan yang bisa dipakai, yaitu:

  • @> (contains)
  • ? (exist)
 user_id |                             config
---------+-----------------------------------------------------------------
       1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
       2 | {"theme": "dark", "status": "premium_user", "language": "en"}
       3 | {"theme": "dark", "status": "admin", "language": "id"}
       4 | {"theme": "dark", "status": "premium_user", "language": "id"}
       5 | {"theme": "solarized", "status": "free_user", "language": "id"}

kita bisa gunakan cara serupa dengan json dengan operator ->>

example=# select * from user_config where config ->> 'theme' = 'dark' limit 5;
 user_id |                            config
---------+---------------------------------------------------------------
       1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
       2 | {"theme": "dark", "status": "premium_user", "language": "en"}
       3 | {"theme": "dark", "status": "admin", "language": "id"}
       4 | {"theme": "dark", "status": "premium_user", "language": "id"}
       7 | {"theme": "dark", "status": "free_user", "language": "id"}
(5 rows)

atau bisa langsung dengan contains (@>)

example-# logs -> 'after' ->> 'email' as after_email
example=# select * from user_config where config @> '{"theme":"dark"}' limit 5;
 user_id |                            config
---------+---------------------------------------------------------------
       1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
       2 | {"theme": "dark", "status": "premium_user", "language": "en"}
       3 | {"theme": "dark", "status": "admin", "language": "id"}
       4 | {"theme": "dark", "status": "premium_user", "language": "id"}
       7 | {"theme": "dark", "status": "free_user", "language": "id"}

Bisa juga memastikan apakah sebuah key exist di json yang ada dengan operator ?

example=# select * from user_config where config ? 'notification' limit 5;
 user_id | config
---------+--------
(0 rows)

example=# select * from user_config where config ? 'language' limit 5;
 user_id |                             config
---------+-----------------------------------------------------------------
       1 | {"theme": "dark", "status": "premium_user", "language": "jp"}
       2 | {"theme": "dark", "status": "premium_user", "language": "en"}
       3 | {"theme": "dark", "status": "admin", "language": "id"}
       4 | {"theme": "dark", "status": "premium_user", "language": "id"}
       5 | {"theme": "solarized", "status": "free_user", "language": "id"}
(5 rows)

Dan yang paling menarik dari jsonb adalah: field-nya bisa di-index.

Di tabel user_config, saya memasukan 100k+ data dummy. Sekarang mari kita cek performa query pencarian tanpa index:

explain analyze select user_id, config  from user_config where config @> '{"theme":"dark"}';

Seq Scan on user_config  
(cost=0.00..2574.50 rows=35199 width=68)  
(actual time=0.027..46.364 rows=34805 loops=1)
Filter: (config @> '{"theme": "dark"}'::jsonb)
Rows Removed by Filter: 69395
Execution Time: 48.419 ms

Waktu yang dibutuhkan 48.419ms karena harus scan keseluruhan tabel.

Lalu kita coba tambahkan index

CREATE INDEX idx_user_config_config_jsonb ON user_config USING gin (config);

kita ulangi

explain analyze select user_id, config  from user_config where config @> '{"theme":"dark"}';
Bitmap Heap Scan on user_config  
(cost=263.71..1975.70 rows=35199 width=68)  
(actual time=12.631..32.547 rows=34805 loops=1)
   Recheck Cond: (config @> '{"theme": "dark"}'::jsonb)
   Heap Blocks: exact=1272
   ->  Bitmap Index Scan on idx_user_config_config_jsonb  
      (cost=0.00..254.91 rows=35199 width=0)  
      (actual time=12.393..12.394 rows=34805 loops=1)
Execution Time: 34.529 ms

Dengan penambahan index di kolom config, waktu pencarian turun jadi 34.529ms, jauh lebih optimal dibanding sebelumnya yang harus full scan.

Sebenarnya kalau saya baca dari dokumentasi PostgreSQL, masih banyak fitur lain dari json dan jsonb yang bisa dieksplor, mulai dari fungsi-fungsi lanjutan sampai kombinasi query yang lebih kompleks. Tapi ya, mari dicoba di lain kesempatan.

Read more