Threaded comments using Common Table Expressions (CTE) for a MySQL Flask blog or CMS

8 February 2020 Updated 16 February 2020 by Peter
post main image
https://unsplash.com/@di_an_h

Now that I have blog posts, pages and a contact form, I decided to implement the comments for the blog posts and pages. Not just flat comments but threaded comments, also called nested comments. Some months ago I read about this and I really liked Miguel Grinberg's article: Implementing User Comments with SQLAlchemy.

As often Miguel starts with defining the problem and some hard core theory and explaining very clear the Adjacency List and Nested List approaches. Then he came up with his own solution and showed how he implemented this. I tried it and it worked perfectly. One of the comments below his article was suggesting to use Common Table Extensions or CTE. At the time of the article MySQL introduced CTE. It was already available in PostgreSQL.

Because my site is running on a server managed by ISPConfig I must use MySQL. I know that I can install PostgreSQL but the nice thing of ISPConfig is that I can manage MySQL using the administrator. Besides that, I have been using MySQL for very long time and it never let me down. I mainly do front-end development and this requires fast and simple queries.

Comments with Common Table Expressions (CTEs)

I looked into a threaded comments solution with CTE, 'WITH RECURSIVE', and it indeed becomes a bit more easy. When using a recursive query we let MySQL iterate over the comments using the comment.id and comment.parent_id.

The path is constructed by concatenating the comment.id. Afterwards, the result is sorted by the path. MySQL does not have an array data type like PostgreSQL, meaning that we cannot create a path by adding comment ids to an array but that we must concatenate strings, representing the comment ids. I tried to convert the comment.id to a string and then zero-pad it using CONVERT() and LPAD() but this did not seem to work.

I did not see any other option then storing the comment.id also as a zero-padded string into another field zpstring_id in the same record. The number of characters in this string must be enough to cover the maximum number of comments you expect for your comment system in its lifetime. I chose a number of 8 which means I can handle one hundred million (99.999.999) comments.

A (bad) side effect of using a table column is that the width of the zpstring_id column must be as large as the largest number of concatenated zpstring_id values. If we allow a maximum level, or depth, of 10 then the zpstring_id column size must be at least 8 * 10 = 80 characters. We add some more to allow for a separation character which makes reading more easy.

Comment model and the recursive query

The Comment class:

class Comment(Base):
    comment_path_level_width = 6

    __tablename__ = 'comment'

    id = Column(Integer, primary_key=True)
    created_on = Column(DateTime, server_default=func.now(), index=True)

    parent_id = Column(Integer, ForeignKey('comment.id'))

    author = Column(String(64))
    text = Column(Text())

    zpstring_id = Column(String(200), server_default='', index=True)
    thread_created_on = Column(DateTime, index=True)

    content_item_id = Column(Integer, ForeignKey('content_item.id'))

    replies = relationship(
        'Comment', 
        backref=backref('parent', 
        remote_side=[id]),
        lazy='dynamic')

The column thread_created_on is the timestamp for all comments in a thread. We use it when we want to sort by newest first, see also the Miguel's article. The column content_item_id is the id of a blog post or the id of a page. The MySQL query to select the comments:

WITH RECURSIVE tree_path (id, thread_created_on, parent_id, text, level, path) AS
(
  SELECT id, thread_created_on, parent_id, text as text, 0 as level, zpstring_id as path
    FROM comment
    WHERE
          content_item_id = :content_item_id
      AND parent_id IS NULL
  UNION ALL
  SELECT t.id, t.thread_created_on, t.parent_id, t.text, tp.level + 1 AS level, CONCAT(tp.path, '/', t.zpstring_id)
    FROM tree_path AS tp JOIN comment AS t
      ON tp.id = t.parent_id
)
SELECT * FROM tree_path
ORDER BY path; 

Inserting comments and replies

When inserting a comment we use two commits. In the first commit we save the comment, then we use this id, convert it to a string, zero-pad it and store it in zpstring_id. Finally, we commit again. The parent_id is NULL in this case.

    comment = Comment(
        text = text,
        author = author, 
        content_item_id = content_item.id,
    )
    db.session.add(comment)
    db.session.commit()
    # we got the id, now set zpstring_id
    comment.zpstring_id = str(comment.id).zfill(8)
    # set thread timestamp
    comment.thread_created_on = comment.created_on
    db.session.commit()

Inserting replies is slightly different because we must add the parent_id. Also, we get the thread_created_on value from the parent! What I do is before inserting a reply is to get the parent record. This is a good idea anyway and an extra check if the submitted parent_id is valid.

    comment = Comment(
        parent = parent,
        text = text, 
        author = author, 
        content_item_id = content_item.id,
        # add thread timestamp
        thread_created_on = parent.thread_created_on
    )
    db.session.add(comment)
    db.session.commit()
    # we got the id, now set zpstring_id
    comment.zpstring_id = str(comment.id).zfill(comment_path_width)
    db.session.commit()

Of course we can combine these two functions into one but for clarity I show them both.

Time for action

Let's insert some comments. You should be able to copy and paste the statements when using the MySQL command line:

# clear comments
SET FOREIGN_KEY_CHECKS=0;
delete from comment;
SET FOREIGN_KEY_CHECKS=1;

# level 0 comment
INSERT INTO comment (text, content_item_id) VALUES ('first level 0 text', 34);
SET @level_0_comment_id = (SELECT LAST_INSERT_ID());
SET @thread_timestamp = (SELECT created_on FROM comment WHERE id = @level_0_comment_id);
UPDATE comment SET zpstring_id = LPAD(@level_0_comment_id, 8, '0'), thread_created_on = @thread_timestamp WHERE id = @level_0_comment_id;

# reply: parent = first level 0 comment
INSERT INTO comment (parent_id, thread_created_on, text, content_item_id) VALUES (@level_0_comment_id, @thread_timestamp, 'reply to: first level 0 text', 34);
SET @level_1_comment_id = (SELECT  LAST_INSERT_ID());
UPDATE comment SET zpstring_id = LPAD(@level_1_comment_id, 8, '0') WHERE id = @level_1_comment_id;

# reply: parent = first level 1 comment
INSERT INTO comment (parent_id, thread_created_on, text, content_item_id) VALUES (@level_1_comment_id, @thread_timestamp, 'reply to: reply to: first level 0 text', 34);
SET @level_2_comment_id = (SELECT  LAST_INSERT_ID());
UPDATE comment SET zpstring_id = LPAD(@level_2_comment_id, 8, '0') WHERE id = @level_2_comment_id;

# reply: parent = first level 1 comment
INSERT INTO comment (parent_id, thread_created_on, text, content_item_id) VALUES (@level_1_comment_id, @thread_timestamp, '2e reply to: reply to: first level 0 text', 34);
SET @level_2_comment_id = (SELECT  LAST_INSERT_ID());
UPDATE comment SET zpstring_id = LPAD(@level_2_comment_id, 8, '0') WHERE id = @level_2_comment_id;

Now wait a moment. Why wait? Because I want a separation of minimal one second between the two level-0 comments. We can also add a MySQL timestamp with the Fractional Seconds but this is out of the scope of this post. To add a second thread copy-paste the following:

# a second level 0 comment
INSERT INTO comment (text, content_item_id) VALUES ('second level 0 text', 34);
SET @level_0_comment_id = (SELECT LAST_INSERT_ID());
SET @thread_timestamp = (SELECT created_on FROM comment WHERE id = @level_0_comment_id);
UPDATE comment SET zpstring_id = LPAD(@level_0_comment_id, 8, '0'), thread_created_on = @thread_timestamp WHERE id = @level_0_comment_id;

# reply: parent second level 0 comment
INSERT INTO comment (parent_id, thread_created_on, text, content_item_id) VALUES (@level_0_comment_id, @thread_timestamp, 'reply to: second level 0 text', 34);
SET @level_1_comment_id = (SELECT  LAST_INSERT_ID());
UPDATE comment SET zpstring_id = LPAD(@level_1_comment_id, 8, '0') WHERE id = @level_1_comment_id;

Now lets run the recursive query:

WITH RECURSIVE tree_path (id, thread_created_on, parent_id, text, level, path) AS
(
  SELECT id, thread_created_on, parent_id, text as text, 0 as level, zpstring_id as path
    FROM comment
    WHERE
          content_item_id = 34
      AND parent_id IS NULL
  UNION ALL
  SELECT t.id, t.thread_created_on, t.parent_id, t.text, tp.level + 1 AS level, CONCAT(tp.path, '/', t.zpstring_id)
    FROM tree_path AS tp JOIN comment AS t
      ON tp.id = t.parent_id
)
SELECT * FROM tree_path
ORDER BY path; 

This should give you the following result:

+------+---------------------+-----------+-------------------------------------------+-------+----------------------------+
| id   | thread_created_on   | parent_id | text                                      | level | path                       |
+------+---------------------+-----------+-------------------------------------------+-------+----------------------------+
|  110 | 2020-02-08 20:49:19 |      NULL | first level 0 text                        |     0 | 00000110                   |
|  111 | 2020-02-08 20:49:19 |       110 | reply to: first level 0 text              |     1 | 00000110/00000111          |
|  112 | 2020-02-08 20:49:19 |       111 | reply to: reply to: first level 0 text    |     2 | 00000110/00000111/00000112 |
|  113 | 2020-02-08 20:49:19 |       111 | 2e reply to: reply to: first level 0 text |     2 | 00000110/00000111/00000113 |
|  114 | 2020-02-08 20:49:38 |      NULL | second level 0 text                       |     0 | 00000114                   |
|  115 | 2020-02-08 20:49:38 |       114 | reply to: second level 0 text             |     1 | 00000114/00000115          |
+------+---------------------+-----------+-------------------------------------------+-------+----------------------------+

The order is 'oldest first'. If we want to sort by 'newest first' we change the ORDER BY clause:

WITH RECURSIVE tree_path (id, thread_created_on, parent_id, text, level, path) AS
(
  SELECT id, thread_created_on, parent_id, text as text, 0 as level, zpstring_id as path
    FROM comment
    WHERE
          content_item_id = 34
      AND parent_id IS NULL
  UNION ALL
  SELECT t.id, t.thread_created_on, t.parent_id, t.text, tp.level + 1 AS level, CONCAT(tp.path, '/', t.zpstring_id)
    FROM tree_path AS tp JOIN comment AS t
      ON tp.id = t.parent_id
)
SELECT * FROM tree_path
ORDER BY thread_created_on DESC, path; 

Comparing both solutions

The solution of Miguel is not really that much different from the CTE solution. He implements the path in another way and the level as well. Note that you can implement the level very easy by adding to a comment reply: level = parent.level + 1. Both solutions require a double submit because there is no array field type in MySQL (?).

What about Flask, SQLALchemy and Bootstrap 4? You may wonder what the above has to do with Flask? Well not really that much. This website is build with Flask and SQLAlchemy, without the Flask-SQLAlchemy extension, see the Comment class.

What about SQLAlchemy? I am not sure if the CTE query can be transformed into a pure SQLAlchemy query. MySQL developers state that they do not want to implement non-SQL complient queries so I must look into this. The above queries can be executed in SQLAlchemy as 'raw' queries in a way like:

    db.session.execute(text(sql), {
        'content_item_id': self.content_item_id, 
    })

And what about Bootstrap 4? We can use the grid system to indent the level of the comments:

    {% if comment_level == 0 %}
        <div class="col-12 mb-1">
    {% elif comment_level == 1 %}
        <div class="col-11 offset-1 mb-1">
    {% elif comment_level == 2 %}
        <div class="col-10 offset-2 mb-1">
    {% elif comment_level == 3 %}
        <div class="col-9 offset-3 mb-1">
    {% elif comment_level == 4 %}
        <div class="col-8 offset-4 mb-1">
    {% else %}
        <div class="col-7 offset-5 mb-1">
    {% endif %}

Summary

The above is a first implementation of threaded comments using CTE for this website. MySQL may not be the perfect database to implement the CTE 'WITH RECURSIVE' query, but it is very much used with many websites so we have to live with its limitations.

Getting the comments is just a small part of implementing comments for a website. There are so much more items to be addressed like status of a comment, deleted, hidden, (un)moderated, voting. And we may allow comments, require logged in. And there also is email, send a mail when someone replies, send emails for moderation. May be one day I will write a part 2 of this post.

Links / credits

Adjacency List Model vs Nested Set Model for MySQL hierarchical data?
https://stackoverflow.com/questions/31641504/adjacency-list-model-vs-nested-set-model-for-mysql-hierarchical-data

Adjacency list vs. nested sets: PostgreSQL
https://explainextended.com/2009/09/24/adjacency-list-vs-nested-sets-postgresql/

Cannot use ROW_NUMBER() in recursive block of CTE
https://bugs.mysql.com/bug.php?id=96538

Creating Threaded Comments With PHP And Postgresql Recursive Query
https://phpro.org/tutorials/Creating-Threaded-Comments-With-PHP-And-Postgresql-Recursive-Query.html

How do I create nested categories in a Database?
https://stackoverflow.com/questions/926175/how-do-i-create-nested-categories-in-a-database

Implementing User Comments with SQLAlchemy
https://blog.miguelgrinberg.com/post/implementing-user-comments-with-sqlalchemy

Is there any array data type in MySQL like in PostgreSQL?
https://stackoverflow.com/questions/5541175/is-there-any-array-data-type-in-mysql-like-in-postgresql

Managing Hierarchical Data in MySQL
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/

Storing and retrieving tree structures in relational databases using Python
https://medium.com/@spybugg/storing-and-retrieving-tree-structures-in-relational-databases-using-python-django-7480f40c24b

Add comment

Comments (37)

Add reply

avatar

first comment

avatar
Anonymous visitor (not logged in) 1 month ago Anonymous visitor (not logged in)

reply to first comment

avatar
user39164362 1 month ago Anonymous visitor (not logged in)

asaqwer

avatar

xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxx xdxxxxxxxxxxxxxxxxxxxxxxxxxxxx xdxxxxxxxxxxxxxxxxxxxxxxx

avatar
user39164362 1 month ago Anonymous visitor (not logged in)

vdcdsa

avatar

sacacdsa

avatar
user39164362 1 month ago Anonymous visitor (not logged in)

IIIIIIIIIIIIIIIIIIIIIIIIIII

avatar
user39164362 1 month ago Anonymous visitor (not logged in)

isit the same

avatar

not same remember

avatar
user39164362 1 month ago Anonymous visitor (not logged in)

asadfsagafsd

avatar
Anonymous visitor (not logged in) 1 month ago Anonymous visitor (not logged in)

ssdagasdsg

avatar
user39164362 1 month ago

aszadcdsda

avatar
user39164362 1 month ago

Erste Nachricht

avatar
user39164362 1 month ago

asxxxxx

avatar
user39164362 1 month ago

leveltje NUL

avatar

reply to levetje NULLLL

avatar
user39164362 1 month ago

Gimme LOLOL

avatar

LOL reply

avatar
user39164362 1 month ago

bla die bla

avatar
user39164362 1 month ago

blibla

avatar

aaaaaaaa

avatar

leffeltjuhnul

avatar

Is this working?

avatar
Anonymous visitor (not logged in) 1 month ago Anonymous visitor (not logged in)

Lets reply

avatar

? ?? ????? ?????, ??? ??? ?? ???????.

------
<a href=https://tel-number.ru/our-services/1823-korea-direct-number>????? ????? ????? ????????</a> | https://tel-number.ru/

avatar

I apologise, but, in my opinion, you commit an error.


-----
<a href=https://www.anal4us.com/latest-updates>https://www.anal4us.com/latest-updates</a> | https://www.anal4us.com

avatar

Excuse, that I can not participate now in discussion - it is very occupied. I will return - I will necessarily express the opinion on this question.


-----
<a href=https://www.analibiza.com/videos>https://www.analibiza.com/videos</a> | https://www.analibiza.com

avatar

http://mewkid.net/when-is-xaxlop/ - Amoxicillin 500 Mg Dosage <a href="http://mewkid.net/when-is-xaxlop/">Amoxicillin Online</a> scj.frdm.peterspython.com.sjg.sx http://mewkid.net/when-is-xaxlop/

avatar

http://mewkid.net/when-is-xaxlop/ - Amoxicillin On Line <a href="http://mewkid.net/when-is-xaxlop/">Amoxicillin Online</a> iqf.ejgj.peterspython.com.vus.sy http://mewkid.net/when-is-xaxlop/

avatar

http://mewkid.net/when-is-xaxlop/ - Dosage For Amoxicillin 500mg <a href="http://mewkid.net/when-is-xaxlop/">Amoxicillin</a> mep.gznv.peterspython.com.yaj.fo http://mewkid.net/when-is-xaxlop/

avatar

http://mewkid.net/when-is-xaxlop/ - Dosage For Amoxicillin 500mg <a href="http://mewkid.net/when-is-xaxlop/">Amoxicillin</a> smu.ulag.peterspython.com.ryc.ho http://mewkid.net/when-is-xaxlop/

avatar

http://mewkid.net/when-is-xaxlop/ - Amoxicillin <a href="http://mewkid.net/when-is-xaxlop/">Amoxicillin 500 Mg</a> hcd.cumj.peterspython.com.epd.sm http://mewkid.net/when-is-xaxlop/

avatar

http://mewkid.net/when-is-xaxlop/ - Amoxicillin 500mg Capsules <a href="http://mewkid.net/when-is-xaxlop/">Amoxicillin Without Prescription</a> llb.egyu.peterspython.com.uso.lt http://mewkid.net/when-is-xaxlop/

avatar

Потоковые комментарии с использованием блога Common Table Expressions (CTE) для блога MySQL Flask или CMS

avatar
Anonymous visitor (not logged in) 5 days ago Anonymous visitor (not logged in)

Большинство просмотренных:

avatar

Если арестанту предложат на выбор — прожить год на свободе, а потом умереть, или гнить в тюрьме, как, по-твоему, он должен поступить?


-----
<a href=https://virtual-local-numbers.com/countries/32-usa.html>usa number</a> | https://virtual-local-numbers.com

avatar

На приглашение посетить торжественный вегетарианский обед Шоу ответил отказом, объяснив:


-----
<a href=https://european-sailing.com/rent-yachts-greece>crewed yacht charter greece</a> | https://european-sailing.com/