10 Pitfalls you Might Face with Python SQLAlchemy

Learn the pitfalls you may likely encounter when working with the Python SQLAlchemy

Picture of Nsikak Imoh, author of Macsika Blog
10 Pitfalls you Might Face with Python SQLAlchemy written on a white background.
10 Pitfalls you Might Face with Python SQLAlchemy written on a white background.

Table of Content

Having used the Python Sqlalchemy library for so many years now, on over ten projects and tutorial guides, I have noticed a common error pattern with beginners to SQLAlchemy.

These are pitfalls that I also find myself falling for at times.

In this post, you will find some pitfalls you may likely encounter when working with the Python SQLAlchemy library.

What are the Pitfalls you Might Face with Python SQLAlchemy?

Here are the ten most common pitfalls you may likely encounter when working on a SQLAlchemy project:

1. Not knowing when to use one(), first(), and scalar()

I have noticed that most Python devs working with SQLAlchemy hardly use the scalar() function.

However, you see codes written with the first() function and expect to return a scalar value rather than a Keyed Tuple.

Other times, when the one() function is used, the user expects to catch an exception using the NoResultException exception class.

These three functions have unique use cases in SQLAlcemy.

  • How to use the one() function in SQLAlchemy

    The one() function in SQLAchemy executes a SQL query with an intention to strictly return a single row as a Python tuple.

    Hence, it raises the exception MultipleResultsFound if the number of rows returned is more than one.

    It also returns the NoResultFound exception when none is returned.

    The one() function in SQLAchemy should be used when we want to select a single row and we assume there is exactly one row to be found.

  • How to use the first() function in SQLAlchemy

    The SQLAlchemy first() function executes a SQL query with LIMIT 1 and returns the result row as a Python tuple.

    It returns None if nothing is returned.

    The SQLAlchemy first() function should be used when we want a single row but we want to be sure if there's exactly one row.

  • How to use the scalar() function in SQLAlchemy

    The SQLAlchemy scalar() function executes the SQL query and returns the value of the first column of the matched row.

    The SQLAlchemy scalar() function should be used when we want to know the maximum result returned.

    For example, we want just the maximum of a single column.

    If multiple values are returned, it raises the exception MultipleResultsFound.

    If no rows were found, it returns None.

Read more: How to use FastAPI with Django ORM and Admin

Take a look at the code block below:

c = Post.query.join(Post.parent).filter(Post.name=='Python SQLAlchemuy Tutorial', Post.parent.name=='Python').first()
Highlighted code sample.

What is wrong with this query?

Well, if it isn't obvious at first, you are not wrong.

It does make sense, from a syntactic perspective, for this code to work by automatically adding a join statement to Post.parent in the query.

However, that's not so in SQLAlchemy.

Here, Post.parent is a relationship, and as a result, it does not have a name attribute.

To fix the query, SQLAlchemy requires you add an alias and join the statement:

parent = aliased(Category)
c = Post.query.join(parent, Post.parent).\
    filter(Post.name=='Python SQLAlchemuy Tutorial', Post.parent.name=='Python').first()
Highlighted code sample.

3. Not Reading the Docs

Most tutorials written on SQLAlchemy are not updated to the latest changes, while others use incorrect methodologies or fall into these pitfalls we mentioned.

I am a huge advocate of reading the official docs of any new tech you use, be it a language, framework, library, package, or plugin.

I would say that is common sense. However, I do not completely blame anyone for skipping the SQLAlchemy docs in particular.

Not only could it be organized better, but also the grammar jargon used to explain even the simplest terms really need a simplification face-lift to accommodate beginners.

With the official documentation, it's easy to miss the tutorials and get lost trying to find a solution or a direction of use.

4. Defining or Modifying the Model Classes with the Intention of Creating or Modifying the Database Tables

One misunderstood concept that you may encounter when using SQLAlchemy for the first time is the belief that SQLAlchemy can automatically modify the database tables to match the models by modifying the model classes.

This misunderstanding stems from not having knowledge of how the models and the database work with each other in Python using an Object-relational mapping library (ORM).

To create and make modifications to the database, the database schema needs to be explicitly altered.

This process is called database migrations.

In the Django framework, this is done when you run:

python manage.py makemigrations
python manage.py migrate
Highlighted code sample.

In SQLAlchemy, database migrations can be done in two ways:

  1. Manually using graphical or command-line database tools like psql, mysql, phpMyAdmin, etc. to create, alter, or drop the tables.
  2. Using a Python library like Alembic to perform migrations from the command line.

Read more: How to Create User Model Using FastAPI and Django

5. Not Knowing the Difference Between scoped_session and sessionmaker

This is something most newcomers to SQLAlchemy will encounter, especially if they have little to no experience in multithreading.

What is a session in SQLAlchemy?

A session is a container of instances of mapped classes. At the most common use case, you can leverage it to construct queries dynamically and add, delete, and update entities in a database.

Let's break this down.

The sqlalchemy.orm.session.Session class is the center of attention here and is all you need. The other two are merely helper functions.

What is sessionmaker in SQLAlchemy?

A sessionmaker is a session factory generator. If you are familiar with the Factory design pattern, this will not be a big deal to you.

In simple terms, it is a function that returns a function, which in turn returns a new Session every time it is called.

The sessionmaker function is a convenience function for instantiating Session objects with customized parameters, but they cannot be used as proxies – you have to call one to get a session first.

What is scoped_session in SQLAlchemy?

A scoped_session is a thread-local manager and proxy of sessions.

The scoped_session function creates a thread-local proxy/manager of Session instances. This allows you to gain access to the scoped_session object as a Session instance.

The scoped_session is aware of which thread is asking for a session. Hence, it returns the same session for the same thread. This process is done under the hood using Python's thread-local objects.

It makes it easy to write multithreaded applications since you don't have to worry about sessions displaying unexpected behavior such as accidentally crossing threads.

Another place where most devs fail at this is to call the scoped_session to get a Session instance.

While that may seem to work, It is redundant. Instead, you can do this:

DBSession = scoped_session(sessionmaker(engine))
vehicles = DBSession.query(Vehicle).all()
Highlighted code sample.

Read more: Build a Text Paraphraser Using Python with Pegasus Transformer for NLP

Well, I made this mistake multiple times. Hell, I was frustrated when it didn't work as I wanted.

All SQLAlchemy applications implement related models in one way or the other.

Take for instance:

You need to create a Car model and add different brands of cars to it. That's a related model right there!

So how would you accomplish this task?

Well, you might think in this sense, like I used to:

  • Create an object of the Car
  • Add the Car object to the session
  • Flush the session
  • Create an object of the car Brand with a car_id from the previously flushed Car object.
  1. Add the car Brand to the session
  2. Commit the session

This works actually, but with SQLAlchemy, some steps are unnecessary.

Here is how you would add items to related models with SQLAlchemy:

  • Create an object of the Car
  • Create an object of the car Brand
  • Add the object of the Car to the session
  • Commit the session

What follows next under the hood is that:

When the session is flushed, the Car row is first inserted into the database.

Afterward, the object of the car Brand is discovered and also inserted into the database through the help of a relationship cascade.

A typical Relational Database Management System (RDBMS) like MySQL and Postgres, cannot differentiate both.

7. Testing on LocalHost with SQLite instead of an RDBMS

This is very simple, straightforward, and has less to do with SQLAlchemy.

Always carry out tests in a local environment using tools you would use in a production environment.

This is one way to avoid the “It works on my computer” saga.

Read more: What Are Lambda Functions in Python and How to Use them

8. Creating Constructors Explicitly

Another pitfall most devs fail to recognize may be directly blamed on the SQLAlchemy documentation.

Having combed the documentation multiple times, I've only ever found a single mention of this feature.

So, what is this issue?

Many developers who use SQLAlchemy add explicit constructors to all their classes.

What they don't know, rather, what they won't know, is that the declarative metaclass from SQLAlchemy provides a default constructor for model classes.

This makes it redundant to write constructors explicitly.

What you should do from this point is write all new codes, or if you have the time, refactor the old ones to skip the constructor and instantiate classes like Vehicle(make=‘Mercedes', color=‘red') rather than creating a constructor for all your fields as arguments.

9. Calling metadata.create_all() without import model modules

If you use SQLAlchemy, it means you are familiar with the fact that metadata.create_all() creates all your tables.

However, to do this effectively, you should import all the modules that are used to declare tables.

This might look like a no-brainer, but some people seem to expect SQLAlchemy to scan the project structure for model classes!

I mean, how else would SQLAlchemy know what the tables are?

At the point above, we looked at how to avoid redundancy in the metaclass machinery since SQLAlchemy puts declarative classes in the metadata.

Knowing this might get you carried away in the beginning so much so that you expect everything to “just work” by itself.

But, that is far from the truth.

You need to import all your model modules so that the metadata can be filled in.

Only then will the call to metadata.create_all() do anything useful.

Read more: Everything to Know About Cohesion and Coupling with Examples

10. Querying Specific Column

Anyone coming from a Flask-SQLAlchemy background must have been exposed to the use of query properties that makes querying easier.

The trouble starts when they want to work with core SQLAlchemy and they need to query a specific column but cannot.

Not to mention how it is not obvious from the example code how one would make aggregates or other expressions using said query property.

Well, the solution is simple. Use the session directly like so:

Highlighted code sample.

Wrap Off

In this post, we looked into some pitfalls you may likely encounter when working with the Python SQLAlchemy library.

These are pitfalls that I also find myself falling for at times.

Hopefully, you won't make these mistakes anymore.

If you learned from this post, please share and subscribe to our newsletter to have awesome posts like these delivered to your inbox once or twice a week.

Connect with me.

Need an engineer on your team to grease an idea, build a great product, grow a business or just sip tea and share a laugh?