Django ORM dan SQL Query
Dari sekian banyak fitur yang dimiliki oleh django, django orm menurut saya adalah salah satu fitur yang paling powerfull, selain cukup mudah digunakan django orm juga tidak sekadar "query builder" tapi sudah lengkap dengan migration, hanya saja dengan kemudahan yang ada, mudah pula dalam membuat kesalahan. Dalam tulisan ini saya merangkum pengalaman saya dalam menggunakan django orm.
Model Migration
Djang ORm menyediakan fitur migration yang membuat kita mudah dalam membtuk struktur tabel, contoh jika kita ingin membuat tabel seperti berikut
Di django kita cukup membuat class model seperti ini
class Author(models.Model):
name = models.Charfield(max_length=255)
...
def __str__(self):
return self.name
class Genre(models.Model):
name = models.CharField(max_length=255)
def __str__(self):
return self.name
class Book(moodels.Model):
title = models.Charfield(max_length=255)
author = models.Foreignkey(Author)
genres = models.ManyToManyField("Genre")
def __str__(self):
return self.tile
...
Dengan kode diatas saat menjalankan perintah makemigrations
dan migrate
django akan menjalankan SQL DDL (Data Definition Language) berupa Create Table
Oke cukupi dulu terkait migration dan DDL, mari lanjut ke issue yang cukup sering terjadi.
Select Query
Dengan contoh model di atas kita akan coba menarik semua buku yang dipunya, dengan menggunakan sql kita bisa melakukan perintah ini
select title from book
Dengan Django ORM yang kita lakukan
Book.objects.all()
mari lihat contoh kode menampilkan semua judul buku
def get_data():
books = Book.objects.all()
for b in books:
print(b.title)
Saat kita debug kode di atas diterjemahkan menjadi
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
Execution time: 0.000343s [Database: default]
Book Number 1
Book Number 2
Book Number 3
Book Number 4
Book Number 5
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 1 | 0 | 0 | 0 | 0 | 1 | 0.00 |
+--------+--------+--------+--------+------------+-------+----------+
Di django saat membuat model akan diterjemahkan menjadi <app_name>_<model_name>, sehingga contoh di atas menjadi books_book
Debug di atas menggunakan django-extension dan django query counter
Berapun data yang kita punya jumlah query akan tetap 1.
N+1 Problem
Mari ubah kodenya sedikit, selain mengambil judul buku saya juga ingin mendapatkan author dari buku tersebut, sehingga kodenya menjadi seperti berikut
def get_data():
books = Book.objects.all()
for b in books:
print(f"{b.title} - {b.author.name}")
Kalau dijalankan hasilnya akan menjadi seperti berikut
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
Execution time: 0.002014s [Database: default]
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = 1
LIMIT 21
Execution time: 0.000223s [Database: default]
Book Number 1 - John Doe
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = 1
LIMIT 21
Execution time: 0.000114s [Database: default]
Book Number 2 - John Doe
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = 1
LIMIT 21
Execution time: 0.000308s [Database: default]
Book Number 3 - John Doe
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = 1
LIMIT 21
Execution time: 0.000123s [Database: default]
Book Number 4 - John Doe
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = 1
LIMIT 21
Execution time: 0.000106s [Database: default]
Book Number 5 - John Doe
Duplicate queries:
5: SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = %s
LIMIT 21
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 6 | 0 | 0 | 0 | 5 | 6 | 0.10 |
+--------+--------+--------+--------+------------+-------+----------+
Total query yang asalnya hanya 1 melonjak menjadi 6, kenapa menjadi seperti berikut, karena saat kita mengambil data "author.name" itu tidak ada di tabel books_book
dan harus mengambil dari tabel books_author
, sehinga django akan melakukan untuk setiap author dari buku yang didapat
Pertama mengambil semua buku
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
lalu dari setiap buku dia akan query lagi untuk mendapatkan nama dari authornya
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
WHERE "books_author"."id" = %s
di mana %s
di sana itu adalah author_id dari tabel books_book
Jika berpikir "hanya 6 query" saja, mari buat simulasi 10 author dan masing-masing author memilii 3-10 judul buku, berapa jumlah query yang dihasilkan
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 81 | 0 | 0 | 0 | 80 | 81 | 0.73 |
+--------+--------+--------+--------+------------+-------+----------+
Perbandungan durasi yang dihasilkan:
0.0s
0.10s
0.73s
Idealnya untuk mendapatkan informasi itu jika menggunakan sql langsung adalah dengan cara
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id",
"books_author"."id",
"books_author"."name"
FROM "books_book"
LEFT JOIN "books_author"
ON ("books_book"."author_id" = "books_author"."id")
Select Related & Prefetch Related
Issue N+1 bukanlah masalah baru oleh karena itu Django sudah ada built in solution. Ubah kodenya menjadi seperti ini
books = Book.objects.all().select_related("author")
for b in books:
print(f"{b.title} - {b.author.name}")
Setelah dijalankan, hasilnya
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id",
"books_author"."id",
"books_author"."name"
FROM "books_book"
INNER JOIN "books_author"
ON ("books_book"."author_id" = "books_author"."id")
Execution time: 0.000411s [Database: default]
Book Number 3 - Author 1
Book Number 4 - Author 1
Book Number 5 - Author 2
...
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 1 | 0 | 0 | 0 | 0 | 1 | 0.02 |
+--------+--------+--------+--------+------------+-------+----------+
Saya menggunakan contoh dari data yang banyak dari 81 query menjadi 1 query, dari 0.73s menjadi 0.02s
Sekaran mari ubah kembali, selain ingin mendapatkan authornya saya ingin mendapatkan informasi genre yang dimiliki oleh buku ini, ubah kode menjadi seperti ini
def get_data():
books = Book.objects.all().select_related("author")
for b in books:
print(f"{b.title} - {b.author.name}")
for g in b.genres.all():
print(g.name)
Hasilnya
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id",
"books_author"."id",
"books_author"."name"
FROM "books_book"
INNER JOIN "books_author"
ON ("books_book"."author_id" = "books_author"."id")
Execution time: 0.000313s [Database: default]
Book Number 1 - John Doe
SELECT "books_genre"."id",
"books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres"
ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" = 1
Execution time: 0.000309s [Database: default]
Genre 1
Book Number 2 - John Doe
SELECT "books_genre"."id",
"books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres"
ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" = 2
Execution time: 0.000147s [Database: default]
Genre 1
Book Number 3 - John Doe
SELECT "books_genre"."id",
"books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres"
ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" = 3
Execution time: 0.000136s [Database: default]
Genre 1
Book Number 4 - John Doe
SELECT "books_genre"."id",
"books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres"
ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" = 4
Execution time: 0.000215s [Database: default]
Genre 1
Book Number 5 - John Doe
SELECT "books_genre"."id",
"books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres"
ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" = 5
Execution time: 0.000380s [Database: default]
Genre 1
Duplicate queries:
5: SELECT "books_genre"."id",
"books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres" ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" = %s
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 6 | 0 | 0 | 0 | 5 | 6 | 0.05 |
+--------+--------+--------+--------+------------+-------+----------+
Untuk yang tipe datanya seperti ini kita bisa menggunakan prefetch_related
, ubah kembali kodenya menjadi seperti berikut
def get_data():
books = Book.objects.all().select_related("author").prefetch_related("genres")
for b in books:
print(f"{b.title} - {b.author.name}")
for g in b.genres.all():
print(g.name)
hasilnya
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id",
"books_author"."id",
"books_author"."name"
FROM "books_book"
INNER JOIN "books_author"
ON ("books_book"."author_id" = "books_author"."id")
Execution time: 0.000451s [Database: default]
SELECT ("books_book_genres"."book_id") AS "_prefetch_related_val_book_id",
"books_genre"."id",
"books_genre"."name"
FROM "books_genre"
INNER JOIN "books_book_genres"
ON ("books_genre"."id" = "books_book_genres"."genre_id")
WHERE "books_book_genres"."book_id" IN (1, 2, 3, 4, 5)
Execution time: 0.000296s [Database: default]
Book Number 1 - John Doe
Genre 1
Book Number 2 - John Doe
Genre 1
Book Number 3 - John Doe
Genre 1
Book Number 4 - John Doe
Genre 1
Book Number 5 - John Doe
Genre 1
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 2 | 0 | 0 | 0 | 0 | 2 | 0.03 |
+--------+--------+--------+--------+------------+-------+----------+
Untuk prefetch berbeda dengan select_related, dia menjalankan dua query, pertama dia get all data dari books
lalu melakukan pencarian degan where in di tabel pivot.
Untuk kapan pakai select_related
atau prefetch_related
jawabannya selama punya foreign key, yang membedakan:select_related
itu jika kita ingin mendapatkan 1 data dari model relationprefetch_related
itu jika kita mungkin mendapatkan lebih dari 1 data dari model relation
Contoh di atas cukup mudah karena dari books
untuk mendapatkan author
itu cukup mengambil dari informasi field author, bagaimana jika dibalik, saya punya author dan saya ingin mendaptkan buku-buku dari author tersebut
def author():
author = Author.objects.all()
for a in author:
print(a.name)
for b in a.book.all():
print(b.title)
saya bisa memanggil langsung a.book karena saya menambahkan "related_name" di model book
author = models.ForeignKey("Author", on_delete=models.CASCADE, related_name="book")
Saat kita jalankan hasilnya
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
Execution time: 0.000291s [Database: default]
Author 1
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
WHERE "books_book"."author_id" = 1
Execution time: 0.000238s [Database: default]
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Author 2
...
Execution time: 0.000166s [Database: default]
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Author 10
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
WHERE "books_book"."author_id" = 10
Execution time: 0.000241s [Database: default]
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Duplicate queries:
10: SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
WHERE "books_book"."author_id" = %s
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 11 | 0 | 0 | 0 | 10 | 11 | 0.08 |
+--------+--------+--------+--------+------------+-------+----------+
Fungsi prefetch_related
bisa digunakan juga dengan related_name
author = Author.objects.all().prefetch_related("book")
Hasilnya
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
Execution time: 0.000231s [Database: default]
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
WHERE "books_book"."author_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Execution time: 0.000173s [Database: default]
Author 1
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Author 10
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 2 | 0 | 0 | 0 | 0 | 2 | 0.02 |
+--------+--------+--------+--------+------------+-------+----------+
Another Prefetch
Jika ingin membuat query prefetch-related dengan sebuah kondisi kita bisa memanfaatkan Prefetch, contohnya menjadi seperti berikut
from django.db.models import Prefetch
author = Author.objects.all().prefetch_related(
Prefetch(
"book",
queryset=Book.objects.filter(id__lte=10),
to_attr="books",
)
)
for a in author:
print(a.name)
for b in a.books:
print(b.title)
Hasilnya
SELECT "books_author"."id",
"books_author"."name"
FROM "books_author"
Execution time: 0.000481s [Database: default]
SELECT "books_book"."id",
"books_book"."title",
"books_book"."author_id"
FROM "books_book"
WHERE ("books_book"."id" <= 10 AND "books_book"."author_id" IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
Execution time: 0.000301s [Database: default]
Author 1
Book Number 3
Book Number 4
Book Number 5
Book Number 6
Book Number 7
Book Number 8
Book Number 9
Book Number 10
Author 2
Book Number 3
Book Number 4
Author 3
Author 4
Author 5
Author 6
Author 7
Author 8
Author 9
Author 10
+--------+--------+--------+--------+------------+-------+----------+
| Select | Insert | Update | Delete | Duplicates | Total | Duration |
+--------+--------+--------+--------+------------+-------+----------+
| 2 | 0 | 0 | 0 | 0 | 2 | 0.02 |
+--------+--------+--------+--------+------------+-------+----------+
Raw Query
Bisa dilihat kode ORM dikit-dikit menjadi banyak abstraksi lainnya, dari sekadar .all()
, lalu ada select_related()
dan juga prefetch_related()
ditambah ada lagi Prefetch()
dan ada juga nanti annotate()
terus nanti ada Subquery()
dan lain sebagainya.
Bagi saya itu sudah waktunya kembali ke Raw Query, karena kemudahan yang diberikan sudah tidak berlaku jadi lebih baik saya menulis raw query saja.