Django ORM dan SQL Query

Dari sekian banyak fitur yang dimiliki oleh django, django orm menurut saya adalah salah satu fitur yang paling powerfull, selain cukup mudah digunakan django orm juga tidak sekadar "query builder" tapi sudah lengkap dengan migration, hanya saja dengan kemudahan yang ada, mudah pula dalam membuat kesalahan. Dalam tulisan ini saya merangkum pengalaman saya dalam menggunakan django orm.

Model Migration

Djang ORm menyediakan fitur migration yang membuat kita mudah dalam membtuk struktur tabel, contoh jika kita ingin membuat tabel seperti berikut

Di django kita cukup membuat class model seperti ini

class Author(models.Model):
	name = models.Charfield(max_length=255)
	...

    def __str__(self):
        return self.name
        
class Genre(models.Model):
    name = models.CharField(max_length=255)

    def __str__(self):
        return self.name
class Book(moodels.Model):
	title = models.Charfield(max_length=255)
	author = models.Foreignkey(Author)
    genres = models.ManyToManyField("Genre")

    def __str__(self):
        return self.tile
	...

Dengan kode diatas saat menjalankan perintah makemigrations dan migrate django akan menjalankan SQL DDL (Data Definition Language) berupa Create Table

Oke cukupi dulu terkait migration dan DDL, mari lanjut ke issue yang cukup sering terjadi.

Select Query

Dengan contoh model di atas kita akan coba menarik semua buku yang dipunya, dengan menggunakan sql kita bisa melakukan perintah ini

select title from book

Dengan Django ORM yang kita lakukan

Book.objects.all()

mari lihat contoh kode menampilkan semua judul buku

def get_data():
	books = Book.objects.all()
	for b in books:
		print(b.title)

Saat kita debug kode di atas diterjemahkan menjadi

SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
  FROM "books_book"

Execution time: 0.000343s [Database: default]
Book Number 1
Book Number 2
Book Number 3
Book Number 4
Book Number 5
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   1    |   0    |   0    |   0    |     0      |   1   |   0.00   |
+--------+--------+--------+--------+------------+-------+----------+
Di django saat membuat model akan diterjemahkan menjadi <app_name>_<model_name>, sehingga contoh di atas menjadi books_book
Debug di atas menggunakan django-extension dan django query counter

Berapun data yang kita punya jumlah query akan tetap 1.

N+1 Problem

Mari ubah kodenya sedikit, selain mengambil judul buku saya juga ingin mendapatkan author dari buku tersebut, sehingga kodenya menjadi seperti berikut

def get_data():
	books = Book.objects.all()
	for b in books:
		print(f"{b.title} - {b.author.name}")

Kalau dijalankan hasilnya akan menjadi seperti berikut

SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
  FROM "books_book"

Execution time: 0.002014s [Database: default]
SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"
 WHERE "books_author"."id" = 1
 LIMIT 21

Execution time: 0.000223s [Database: default]
Book Number 1 - John Doe
SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"
 WHERE "books_author"."id" = 1
 LIMIT 21

Execution time: 0.000114s [Database: default]
Book Number 2 - John Doe
SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"
 WHERE "books_author"."id" = 1
 LIMIT 21

Execution time: 0.000308s [Database: default]
Book Number 3 - John Doe
SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"
 WHERE "books_author"."id" = 1
 LIMIT 21

Execution time: 0.000123s [Database: default]
Book Number 4 - John Doe
SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"
 WHERE "books_author"."id" = 1
 LIMIT 21

Execution time: 0.000106s [Database: default]
Book Number 5 - John Doe
Duplicate queries:
5: SELECT "books_author"."id",
       "books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = %s
LIMIT 21

+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   6    |   0    |   0    |   0    |     5      |   6   |   0.10   |
+--------+--------+--------+--------+------------+-------+----------+

Total query yang asalnya hanya 1 melonjak menjadi 6, kenapa menjadi seperti berikut, karena saat kita mengambil data "author.name" itu tidak ada di tabel books_book dan harus mengambil dari tabel books_author, sehinga django akan melakukan untuk setiap author dari buku yang didapat

Pertama mengambil semua buku

SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
  FROM "books_book"

lalu dari setiap buku dia akan query lagi untuk mendapatkan nama dari authornya

 SELECT "books_author"."id",
       "books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = %s

di mana %s di sana itu adalah author_id dari tabel books_book

Jika berpikir "hanya 6 query" saja, mari buat simulasi 10 author dan masing-masing author memilii 3-10 judul buku, berapa jumlah query yang dihasilkan

+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   81   |   0    |   0    |   0    |     80     |  81   |   0.73   |
+--------+--------+--------+--------+------------+-------+----------+

Perbandungan durasi yang dihasilkan:

0.0s
0.10s
0.73s

Idealnya untuk mendapatkan informasi itu jika menggunakan sql langsung adalah dengan cara

SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id",
"books_author"."id",
"books_author"."name"
FROM "books_book"
	LEFT JOIN "books_author"
ON ("books_book"."author_id" = "books_author"."id")

Issue N+1 bukanlah masalah baru oleh karena itu Django sudah ada built in solution. Ubah kodenya menjadi seperti ini

books = Book.objects.all().select_related("author")
for b in books:
	print(f"{b.title} - {b.author.name}")

Setelah dijalankan, hasilnya

SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id",
       "books_author"."id",
       "books_author"."name"
  FROM "books_book"
 INNER JOIN "books_author"
    ON ("books_book"."author_id" = "books_author"."id")

Execution time: 0.000411s [Database: default]
Book Number 3 - Author 1
Book Number 4 - Author 1
Book Number 5 - Author 2
...
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   1    |   0    |   0    |   0    |     0      |   1   |   0.02   |
+--------+--------+--------+--------+------------+-------+----------+

Saya menggunakan contoh dari data yang banyak dari 81 query menjadi 1 query, dari  0.73s menjadi 0.02s

Sekaran mari ubah kembali, selain ingin mendapatkan authornya saya ingin mendapatkan informasi genre yang dimiliki oleh buku ini, ubah kode menjadi seperti ini

def get_data():
	books = Book.objects.all().select_related("author")
	for b in books:
	print(f"{b.title} - {b.author.name}")
	for g in b.genres.all():
		print(g.name)

Hasilnya

SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id",
       "books_author"."id",
       "books_author"."name"
  FROM "books_book"
 INNER JOIN "books_author"
    ON ("books_book"."author_id" = "books_author"."id")

Execution time: 0.000313s [Database: default]
Book Number 1 - John Doe
SELECT "books_genre"."id",
       "books_genre"."name"
  FROM "books_genre"
 INNER JOIN "books_book_genres"
    ON ("books_genre"."id" = "books_book_genres"."genre_id")
 WHERE "books_book_genres"."book_id" = 1

Execution time: 0.000309s [Database: default]
Genre 1
Book Number 2 - John Doe
SELECT "books_genre"."id",
       "books_genre"."name"
  FROM "books_genre"
 INNER JOIN "books_book_genres"
    ON ("books_genre"."id" = "books_book_genres"."genre_id")
 WHERE "books_book_genres"."book_id" = 2

Execution time: 0.000147s [Database: default]
Genre 1
Book Number 3 - John Doe
SELECT "books_genre"."id",
       "books_genre"."name"
  FROM "books_genre"
 INNER JOIN "books_book_genres"
    ON ("books_genre"."id" = "books_book_genres"."genre_id")
 WHERE "books_book_genres"."book_id" = 3

Execution time: 0.000136s [Database: default]
Genre 1
Book Number 4 - John Doe
SELECT "books_genre"."id",
       "books_genre"."name"
  FROM "books_genre"
 INNER JOIN "books_book_genres"
    ON ("books_genre"."id" = "books_book_genres"."genre_id")
 WHERE "books_book_genres"."book_id" = 4

Execution time: 0.000215s [Database: default]
Genre 1
Book Number 5 - John Doe
SELECT "books_genre"."id",
       "books_genre"."name"
  FROM "books_genre"
 INNER JOIN "books_book_genres"
    ON ("books_genre"."id" = "books_book_genres"."genre_id")
 WHERE "books_book_genres"."book_id" = 5

Execution time: 0.000380s [Database: default]
Genre 1
Duplicate queries:
5: SELECT "books_genre"."id",
       "books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres" ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" = %s

+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   6    |   0    |   0    |   0    |     5      |   6   |   0.05   |
+--------+--------+--------+--------+------------+-------+----------+

Untuk yang tipe datanya seperti ini kita bisa menggunakan prefetch_related, ubah kembali kodenya menjadi seperti berikut

def get_data():
	books = Book.objects.all().select_related("author").prefetch_related("genres")
	for b in books:
	print(f"{b.title} - {b.author.name}")
		for g in b.genres.all():
			print(g.name)

hasilnya

SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id",
       "books_author"."id",
       "books_author"."name"
  FROM "books_book"
 INNER JOIN "books_author"
    ON ("books_book"."author_id" = "books_author"."id")

Execution time: 0.000451s [Database: default]
SELECT ("books_book_genres"."book_id") AS "_prefetch_related_val_book_id",
       "books_genre"."id",
       "books_genre"."name"
  FROM "books_genre"
 INNER JOIN "books_book_genres"
    ON ("books_genre"."id" = "books_book_genres"."genre_id")
 WHERE "books_book_genres"."book_id" IN (1, 2, 3, 4, 5)

Execution time: 0.000296s [Database: default]
Book Number 1 - John Doe
Genre 1
Book Number 2 - John Doe
Genre 1
Book Number 3 - John Doe
Genre 1
Book Number 4 - John Doe
Genre 1
Book Number 5 - John Doe
Genre 1
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   2    |   0    |   0    |   0    |     0      |   2   |   0.03   |
+--------+--------+--------+--------+------------+-------+----------+

Untuk prefetch berbeda dengan select_related, dia menjalankan dua query, pertama dia get all data dari books lalu melakukan pencarian degan where in di tabel pivot.

Untuk kapan pakai select_related atau prefetch_related jawabannya selama punya foreign key, yang membedakan:
select_related itu jika kita ingin mendapatkan 1 data dari model relation
prefetch_related itu jika kita mungkin mendapatkan lebih dari 1 data dari model relation

Contoh di atas cukup mudah karena dari books untuk mendapatkan author itu cukup mengambil dari informasi field author, bagaimana jika dibalik, saya punya author dan saya ingin mendaptkan buku-buku dari author tersebut

def author():
	author = Author.objects.all()
	for a in author:
	print(a.name)
	for b in a.book.all():
		print(b.title)
saya bisa memanggil langsung a.book karena saya menambahkan "related_name" di model book
author = models.ForeignKey("Author", on_delete=models.CASCADE, related_name="book")

Saat kita jalankan hasilnya

SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"

Execution time: 0.000291s [Database: default]
Author 1
SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
  FROM "books_book"
 WHERE "books_book"."author_id" = 1

Execution time: 0.000238s [Database: default]
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Author 2
...

Execution time: 0.000166s [Database: default]
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Author 10
SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
  FROM "books_book"
 WHERE "books_book"."author_id" = 10

Execution time: 0.000241s [Database: default]
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Duplicate queries:
10: SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
FROM "books_book"
WHERE "books_book"."author_id" = %s

+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   11   |   0    |   0    |   0    |     10     |  11   |   0.08   |
+--------+--------+--------+--------+------------+-------+----------+

Fungsi prefetch_related bisa digunakan juga dengan related_name

author = Author.objects.all().prefetch_related("book")

Hasilnya

SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"

Execution time: 0.000231s [Database: default]
SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
  FROM "books_book"
 WHERE "books_book"."author_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

Execution time: 0.000173s [Database: default]
Author 1
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10

Author 10
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   2    |   0    |   0    |   0    |     0      |   2   |   0.02   |
+--------+--------+--------+--------+------------+-------+----------+

Another Prefetch

Jika ingin membuat query prefetch-related dengan sebuah kondisi kita bisa memanfaatkan Prefetch, contohnya menjadi seperti berikut

from django.db.models import Prefetch
author = Author.objects.all().prefetch_related(
	Prefetch(
		"book",
		queryset=Book.objects.filter(id__lte=10),
		to_attr="books",
		)
	)
	for a in author:
	print(a.name)
	for b in a.books:
		print(b.title)

Hasilnya

SELECT "books_author"."id",
       "books_author"."name"
  FROM "books_author"

Execution time: 0.000481s [Database: default]
SELECT "books_book"."id",
       "books_book"."title",
       "books_book"."author_id"
  FROM "books_book"
 WHERE ("books_book"."id" <= 10 AND "books_book"."author_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10))

Execution time: 0.000301s [Database: default]
Author 1
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Author 2
Book Number 3
Book Number 4
Author 3
Author 4
Author 5
Author 6
Author 7
Author 8
Author 9
Author 10
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
|   2    |   0    |   0    |   0    |     0      |   2   |   0.02   |
+--------+--------+--------+--------+------------+-------+----------+

Raw Query

Bisa dilihat kode ORM dikit-dikit menjadi banyak abstraksi lainnya, dari sekadar .all(), lalu ada select_related() dan juga prefetch_related() ditambah ada lagi Prefetch() dan ada juga nanti  annotate() terus nanti ada Subquery() dan lain sebagainya.
Bagi saya itu sudah waktunya kembali ke Raw Query, karena kemudahan yang diberikan sudah tidak berlaku jadi lebih baik saya menulis raw query saja.