Table of Content
Having used the Python Sqlalchemy library for so many years now, on over ten projects and tutorial guides, I have noticed a common error pattern with beginners to SQLAlchemy.
These are pitfalls that I also find myself falling for at times.
In this post, you will find some pitfalls you may likely encounter when working with the Python SQLAlchemy library.
What are the Pitfalls you Might Face with Python SQLAlchemy?
Here are the ten most common pitfalls you may likely encounter when working on a SQLAlchemy project:
1. Not knowing when to use one()
, first()
, and scalar()
I have noticed that most Python devs working with SQLAlchemy hardly use the scalar()
function.
However, you see codes written with the first()
function and expect to return a scalar value rather than a Keyed Tuple.
Other times, when the one()
function is used, the user expects to catch an exception using the NoResultException
exception class.
These three functions have unique use cases in SQLAlcemy.
-
How to use the
one()
function in SQLAlchemyThe
one()
function in SQLAchemy executes a SQL query with an intention to strictly return a single row as a Python tuple.Hence, it raises the exception
MultipleResultsFound
if the number of rows returned is more than one.It also returns the
NoResultFound
exception when none is returned.The
one()
function in SQLAchemy should be used when we want to select a single row and we assume there is exactly one row to be found. -
How to use the
first()
function in SQLAlchemyThe SQLAlchemy
first()
function executes a SQL query withLIMIT 1
and returns the result row as a Python tuple.It returns
None
if nothing is returned.The SQLAlchemy
first()
function should be used when we want a single row but we want to be sure if there's exactly one row. -
How to use the
scalar()
function in SQLAlchemyThe SQLAlchemy
scalar()
function executes the SQL query and returns the value of the first column of the matched row.The SQLAlchemy
scalar()
function should be used when we want to know the maximum result returned.For example, we want just the maximum of a single column.
If multiple values are returned, it raises the exception
MultipleResultsFound
.If no rows were found, it returns
None
.
Read more: How to use FastAPI with Django ORM and Admin
2. Trying to access related classes through relationships in queries
Take a look at the code block below:
What is wrong with this query?
Well, if it isn't obvious at first, you are not wrong.
It does make sense, from a syntactic perspective, for this code to work by automatically adding a join statement to Post.parent
in the query.
However, that's not so in SQLAlchemy.
Here, Post.parent
is a relationship, and as a result, it does not have a name
attribute.
To fix the query, SQLAlchemy requires you add an alias and join the statement:
3. Not Reading the Docs
Most tutorials written on SQLAlchemy are not updated to the latest changes, while others use incorrect methodologies or fall into these pitfalls we mentioned.
I am a huge advocate of reading the official docs of any new tech you use, be it a language, framework, library, package, or plugin.
I would say that is common sense. However, I do not completely blame anyone for skipping the SQLAlchemy docs in particular.
Not only could it be organized better, but also the grammar jargon used to explain even the simplest terms really need a simplification face-lift to accommodate beginners.
With the official documentation, it's easy to miss the tutorials and get lost trying to find a solution or a direction of use.
4. Defining or Modifying the Model Classes with the Intention of Creating or Modifying the Database Tables
One misunderstood concept that you may encounter when using SQLAlchemy for the first time is the belief that SQLAlchemy can automatically modify the database tables to match the models by modifying the model classes.
This misunderstanding stems from not having knowledge of how the models and the database work with each other in Python using an Object-relational mapping library (ORM).
To create and make modifications to the database, the database schema needs to be explicitly altered.
This process is called database migrations.
In the Django framework, this is done when you run:
In SQLAlchemy, database migrations can be done in two ways:
- Manually using graphical or command-line database tools like psql, mysql, phpMyAdmin, etc. to create, alter, or drop the tables.
- Using a Python library like Alembic to perform migrations from the command line.
Read more: How to Create User Model Using FastAPI and Django
5. Not Knowing the Difference Between scoped_session and sessionmaker
This is something most newcomers to SQLAlchemy will encounter, especially if they have little to no experience in multithreading.
What is a session in SQLAlchemy?
A session is a container of instances of mapped classes. At the most common use case, you can leverage it to construct queries dynamically and add, delete, and update entities in a database.
Let's break this down.
The sqlalchemy.orm.session.Session
class is the center of attention here and is all you need. The other two are merely helper functions.
What is sessionmaker
in SQLAlchemy?
A sessionmaker
is a session factory generator. If you are familiar with the Factory design pattern, this will not be a big deal to you.
In simple terms, it is a function that returns a function, which in turn returns a new Session
every time it is called.
The sessionmaker
function is a convenience function for instantiating Session
objects with customized parameters, but they cannot be used as proxies – you have to call one to get a session first.
What is scoped_session
in SQLAlchemy?
A scoped_session
is a thread-local manager and proxy of sessions.
The scoped_session
function creates a thread-local proxy/manager of Session instances. This allows you to gain access to the scoped_session
object as a Session
instance.
The scoped_session
is aware of which thread is asking for a session. Hence, it returns the same session for the same thread. This process is done under the hood using Python's thread-local objects.
It makes it easy to write multithreaded applications since you don't have to worry about sessions displaying unexpected behavior such as accidentally crossing threads.
Another place where most devs fail at this is to call the scoped_session
to get a Session
instance.
While that may seem to work, It is redundant. Instead, you can do this:
Read more: Build a Text Paraphraser Using Python with Pegasus Transformer for NLP
6. Intending to get the ID of an object by committing the session so it can be used to insert related objects
Well, I made this mistake multiple times. Hell, I was frustrated when it didn't work as I wanted.
All SQLAlchemy applications implement related models in one way or the other.
Take for instance:
You need to create a Car
model and add different brands of cars to it. That's a related model right there!
So how would you accomplish this task?
Well, you might think in this sense, like I used to:
- Create an object of the
Car
- Add the
Car
object to the session - Flush the session
- Create an object of the car
Brand
with acar_id
from the previously flushedCar
object.
- Add the car
Brand
to the session - Commit the session
This works actually, but with SQLAlchemy, some steps are unnecessary.
Here is how you would add items to related models with SQLAlchemy:
- Create an object of the
Car
- Create an object of the car
Brand
- Add the object of the
Car
to the session - Commit the session
What follows next under the hood is that:
When the session is flushed, the Car
row is first inserted into the database.
Afterward, the object of the car Brand
is discovered and also inserted into the database through the help of a relationship cascade.
A typical Relational Database Management System (RDBMS) like MySQL and Postgres, cannot differentiate both.
7. Testing on LocalHost with SQLite instead of an RDBMS
This is very simple, straightforward, and has less to do with SQLAlchemy.
Always carry out tests in a local environment using tools you would use in a production environment.
This is one way to avoid the “It works on my computer” saga.
Read more: What Are Lambda Functions in Python and How to Use them
8. Creating Constructors Explicitly
Another pitfall most devs fail to recognize may be directly blamed on the SQLAlchemy documentation.
Having combed the documentation multiple times, I've only ever found a single mention of this feature.
So, what is this issue?
Many developers who use SQLAlchemy add explicit constructors to all their classes.
What they don't know, rather, what they won't know, is that the declarative metaclass from SQLAlchemy provides a default constructor for model classes.
This makes it redundant to write constructors explicitly.
What you should do from this point is write all new codes, or if you have the time, refactor the old ones to skip the constructor and instantiate classes like Vehicle(make=‘Mercedes', color=‘red') rather than creating a constructor for all your fields as arguments.
9. Calling metadata.create_all() without import model modules
If you use SQLAlchemy, it means you are familiar with the fact that metadata.create_all()
creates all your tables.
However, to do this effectively, you should import all the modules that are used to declare tables.
This might look like a no-brainer, but some people seem to expect SQLAlchemy to scan the project structure for model classes!
I mean, how else would SQLAlchemy know what the tables are?
At the point above, we looked at how to avoid redundancy in the metaclass machinery since SQLAlchemy puts declarative classes in the metadata.
Knowing this might get you carried away in the beginning so much so that you expect everything to “just work” by itself.
But, that is far from the truth.
You need to import all your model modules so that the metadata
can be filled in.
Only then will the call to metadata.create_all()
do anything useful.
Read more: Everything to Know About Cohesion and Coupling with Examples
10. Querying Specific Column
Anyone coming from a Flask-SQLAlchemy background must have been exposed to the use of query properties that makes querying easier.
The trouble starts when they want to work with core SQLAlchemy and they need to query a specific column but cannot.
Not to mention how it is not obvious from the example code how one would make aggregates or other expressions using said query property.
Well, the solution is simple. Use the session directly like so:
Wrap Off
In this post, we looked into some pitfalls you may likely encounter when working with the Python SQLAlchemy library.
These are pitfalls that I also find myself falling for at times.
Hopefully, you won't make these mistakes anymore.
If you learned from this post, please share and subscribe to our newsletter to have awesome posts like these delivered to your inbox once or twice a week.