EDIT: With thanks to u/code@lemmy.mayes.io, I have a solution for cleaning up the one table that is growing ridiculously in size and that is the activity table. A TRUNCATE activity;
command cleaned that right up. A word of caution is in order because the Lemmy instance must be shutdown completely and only the postgres container running if you’re using docker.
As a new Lemmy admin, I think I really need to learn PostgreSQL administration because I need a way to keep my database from ballooning in size and there are no management tools built yet for it. Would someone be so kind as to recommend a good website for learning PostgreSQL? This would also help me maintain my mastodon instance.
learn PostgreSQL administration
Definitely helpful, but administration only goes so far with the Lemmy database. Take a look at [this post](https://beehaw.org/post/980868 and let me know if it answers any of your questions; if you have more feel free to ask, or ping me on matrix @penguincoder:hive.beehaw.org
I’ll ping @Penguincoder@beehaw.org since they would know a lot more about this.
Well thank you kindly!
Look at vacumn command. Also dbeaver is a teally good gui tool. Right now there is one table thats the main cause (i think its called activity but dont quote me on that there was a post about it)
That table is mainly for debugging as it logs every action in activitypub your instance does. I truncate it once a week and vacumn it (shrinks the disk space used). Caution. You must shut down lemmy before doing so. I run containers so i stop all except postgres and clean it up.
looks like the devs are on it.
After that commit my database dump went from around 1.1G to 260M
You are exactly correct about that one table. It is like 90% of the space. When I ran the vacuum I did not shutdown Lemmy. Now I know.
Really ugly, but really works. Connect to psql and run:
WITH RECURSIVE pg_inherit(inhrelid, inhparent) AS (select inhrelid, inhparent FROM pg_inherits UNION SELECT child.inhrelid, parent.inhparent FROM pg_inherit child, pg_inherits parent WHERE child.inhparent = parent.inhrelid), pg_inherit_short AS (SELECT * FROM pg_inherit WHERE inhparent NOT IN (SELECT inhrelid FROM pg_inherit)) SELECT table_schema , TABLE_NAME , row_estimate , pg_size_pretty(total_bytes) AS total , pg_size_pretty(index_bytes) AS INDEX , pg_size_pretty(toast_bytes) AS toast , pg_size_pretty(table_bytes) AS TABLE , total_bytes::float8 / sum(total_bytes) OVER () AS total_size_share FROM ( SELECT *, total_bytes-index_bytes-COALESCE(toast_bytes,0) AS table_bytes FROM ( SELECT c.oid , nspname AS table_schema , relname AS TABLE_NAME , SUM(c.reltuples) OVER (partition BY parent) AS row_estimate , SUM(pg_total_relation_size(c.oid)) OVER (partition BY parent) AS total_bytes , SUM(pg_indexes_size(c.oid)) OVER (partition BY parent) AS index_bytes , SUM(pg_total_relation_size(reltoastrelid)) OVER (partition BY parent) AS toast_bytes , parent FROM ( SELECT pg_class.oid , reltuples , relname , relnamespace , pg_class.reltoastrelid , COALESCE(inhparent, pg_class.oid) parent FROM pg_class LEFT JOIN pg_inherit_short ON inhrelid = oid WHERE relkind IN ('r', 'p') ) c LEFT JOIN pg_namespace n ON n.oid = c.relnamespace ) a WHERE oid = parent ) a ORDER BY total_bytes DESC LIMIT 3;
That will show the top 3 database tables sizes. I bet you number one is
activity
.
You really don’t want to do a vacuum full on the live DB, it will lock everything. While useful and necessary I think, you have got to plan down time for it. Or your site is going to be inaccessible anyway.
Yeah, that worked magically!
Why are you concerned with your database ballooning?
What do you think you’ll be able to do to the database to prevent this?
What do you want to do?