JJ's Mostly Adequate Summary of the Django Meetup: When *Not* To Use the ORM & Goodbye REST: Building GraphQL APIs with Django

The Django meetup was at Prezi. They have a great space. They are big Django users.

Goodbye REST: Building APIs with Django and GraphQL

Jaden Windle, @jaydenwindle, lead engineer at Jetpack.

https://github.com/jaydenwindle/goodbye-rest-talk

They moved from Django REST Framework to GraphQL.

It sounds like a small app.

They're using Django, React, and React Native.

I think he said they used Reason and moved away from it, but I could be wrong.

They had to do a rebuild anyway, so they used GraphQL.

He said not to switch to GraphQL just because it's cool, and in general, avoid hype-driven development.

GraphQL is a query language, spec, and collection of tools, designed to operate over a single endpoint via HTTP, optimzing for perf and flexibility.

Key features:

  • Query for only the data you need.
  • Easily query for multiple resources in a single request.
  • Great front end tooling for handling caching, loading / error states, and updates.

(I wonder if he's talking about Apollo, not just GraphQL. I found out later, that they are using Apollo.)

GraphQL Schema:

  • Types
  • Queries
  • Mutations
  • *Subscriptions

Graphene is the goto framework for GraphQL in Django.

He showed an example app.

You create a type and connect it to a model. It's like a serializer in DRF.

(He's using VS Code.)

It knows how to understand relationships between types based on the relationships in the models.

He showed the query syntax.

He showed how Graphene connects to the Django model. You're returning raw Django model objects, and it takes care of serialization.

There's a really nice UI where you can type in a query, and it queries the server. It has autocomplete. I can't tell if this is from Apollo, Graphene, or some other GraphQL tool.

You only pass what you need across the wire.

When you do a mutation, you can also specify what data it should give back to you.

There is some Mutation class that you subclass.

The code looks a lot like DRF.

Subscriptions aren't fully implemented in Graphene. His company is working on implementing and open sourcing something. There are a bunch of other possible real-time options--http://graphql-python/graphql-ws is one.

There's a way to do file uploads.

Important: There's this thing called graphene-django-extras. There's even something to connect to DRF automatically.

Pros:

  • Dramatically improves front end DX
  • Flexible types allow for quick iteration
  • Always up-to-date documentation
  • Only send needed data over the wire

Cons:

  • Graphene isn't as mature as some other GraphQL implementations (for instance, in JS and Ruby)
  • Logging is different when using a single GraphQL endpoint
  • REST is currently better at server-side caching (E-Tag, etc.)

Graphene 3 is coming.

In the Q&A, they said they do use Apollo.

They're not yet at a scale where they have to worry about performance.

He's not entirely sure whether it's prone to the N+1 queries problem, but there are GitHub issues related to that.

You can do raw ORM or SQL queries if you need to. Otherwise, he's not sure what it's doing behind the scenes.

You can add permissions to the models. There's also a way to tie into Django's auth model.

Their API isn't a public API. It's only for use by their own client.

The ORM, and When Not To Use It

Christophe Pettus from PostgreSQL Experts.

He thinks the ORM is great.

The first ORM he wrote was written before a lot of the audience was born.

Sometimes, writing code for the ORM is hard.

Database agnosticism isn't as important as you think it is. For instance, you don't make the paint on your house color-agnostic.

Libraries have to be DB agnostic. Your app probably doesn't need to be.

Reasons you might want to avoid the query language:

  • Queries that generate sub-optimal SQL
  • Queries more easily expressed in SQL
  • SQL features not available via the ORM

Django's ORM's SQL is much better than it used to be.

Don't use __in with very large lists. 100 is about the longest list you should use.

Avoid very deep joins.

It's not hard to chain together a bunch of stuff that ends up generating SQL that's horrible.

The query generator doesn't do a separate optimization pass that makes the query better.

It's better to express .filter().exclude().filter() in SQL.

There are so many powerful functions and operators in PostgreSQL!

SomeModel.objects.raw() is your friend!

You can write raw SQL, and yet still have it integrate with Django models.

You can even get stuff back from the database that isn't in the model definition.

There's some WITH RECURSIVE thing in PostgreSQL that would be prohibitively hard to do with the Django ORM. It's not really recursive--it's iterative.

You can also do queries without using the model framework.

The model framework is very powerful, but it's not cheap.

Interesting: The data has to be converted into Python data and then again into model data. If you're just going to serialize it into JSON, why create the model objects? You can even create the JSON directly in the database and hand it back directly with PostgreSQL. But make sure the database driver doesn't convert the JSON back to Python ;) Return it as raw text.

There are also tables that Django can't treat as model tables. For instance, there are logging tables that lack a primary key. Sometimes, you have weird tables with non-Django-able primary keys.

The ORM is great, though. For instance, it's great for basic CRUD.

Interfaces that require building queries in steps are better done with SQL--for instance, an advanced search function.

Summary:

  • Don't be afraid to step outside the ORM.
  • SQL isn't a bug. It's a feature. It's code like everything else.
  • Do use the ORM for operations that it makes easier.
  • Don't hesitate to use the full power of SQL.

Q&A:

Whether you put your SQL in model methods or managers is a matter of taste. Having all the code for a particular model in one place (i.e. the model or manager) is useful.

Write tests. Use CI.

Use parameter substitution in order to avoid SQL injection attacks. Remember, all external input is hostile. You can't use parameter substitution if the table name is dynamic--just be careful with what data you allow.

If you're using PostGIS, write your own queries for sure. It's really easy to shoot yourself in the foot with the GIS package.

 

 

Comments

Giacomo said…
Thank you for sharing this, JJ. It was useful.