Excellent, hopefully that's included in 0.19.3, which came out overnight... will upgrade shortly.
Ok, there's more to this than I first thought. It seems there's a back-end task set to run at a set time every day; if the instance is restarting at that time, the task doesn't run... this task updates the instances table to show remote instances as "seen" by AZ. With the memory leaks in 0.19.1, the instance has been restarting while this task runs... leading to this situation.
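For the curious, here's the failure mode in miniature. Lemmy's real scheduler is written in Rust and the task name below is made up, but the idea is the same: a purely in-process scheduler has no catch-up, so a restart that lands on the scheduled time silently skips that day's run.

```python
# Rough illustration only: Lemmy's actual scheduler is Rust, and
# mark_remote_instances_seen() is a made-up stand-in for the real
# task that updates the instances table. The point: an in-process
# scheduler has no catch-up, so a restart during the sleep below
# means that day's run simply never happens.
import datetime
import time

RUN_AT = datetime.time(hour=2, minute=0)  # hypothetical daily run time

def mark_remote_instances_seen():
    # Stand-in for the real "mark instances as seen" task.
    print("updating instance table...")

def scheduler_loop():
    while True:
        now = datetime.datetime.now()
        target = datetime.datetime.combine(now.date(), RUN_AT)
        if target <= now:
            target += datetime.timedelta(days=1)
        # If the process restarts while sleeping here, the pending
        # run is lost entirely -- nothing re-runs it on startup.
        time.sleep((target - now).total_seconds())
        mark_remote_instances_seen()
```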
I've updated the server restart cronjob so it doesn't run around the time this task fires... and I've again manually updated the DB to flag all known instances as alive rather than dead.
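If anyone needs to do the same, the manual fix boils down to one UPDATE against the Postgres DB. A sketch of it in Python, assuming the `instance` table's `updated` column is what the liveness check reads — verify against your own schema before running anything like this:

```python
# One-off fix: mark every known instance as recently seen so the
# federation sender stops treating them as dead. The table/column
# names here are assumptions based on Lemmy 0.19's schema; check
# yours first.
import psycopg2

conn = psycopg2.connect("dbname=lemmy user=lemmy host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("UPDATE instance SET updated = now()")
    print(f"flagged {cur.rowcount} instances as alive")
conn.close()
```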
Will keep an eye on it some more...
For anyone curious, two of the bugs that are related to this:
https://github.com/LemmyNet/lemmy/issues/4288
https://github.com/LemmyNet/lemmy/issues/4039
Ok, something is busted with the Lemmy API endpoint that shows current federation state. It is currently showing nearly all remote instances as dead.
But "dead" instances are still successfully receiving content from AZ and sending content back to us.
Seems to have sorted it for the most part... not sure what caused it, will do some more digging.
Ok, for some reason practically all instances were flagged as "dead" in the database. I've manually set them all to be requeued... the server is now getting smashed as it attempts to update the ~4000 instances I've told it are no longer dead. See how this goes...
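For reference, "requeued" here just means clearing the per-instance failure state so the sender retries straight away. Roughly like the below, with the usual caveat that the table/column names are my reading of Lemmy 0.19's federation queue, not gospel:

```python
# Requeue sketch: reset each instance's failure backoff so the
# federation sender retries it immediately. Table/column names
# are assumptions drawn from Lemmy 0.19's federation queue
# state; verify against your own database before running.
import psycopg2

conn = psycopg2.connect("dbname=lemmy user=lemmy host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("""
        UPDATE federation_queue_state
           SET fail_count = 0,
               last_retry = now() - interval '1 day'
    """)
    print(f"requeued {cur.rowcount} instances")
conn.close()
```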
☹️
I'll see what I can see...
You've got a way with words... are you a writer? :)
Prior to restarting the Lemmy service, this showed over 2k instances as "lagging". Shortly after the restart, it's dropped down to single digits.
I'll leave the hourly restart going for now; it should help with federation issues, and with the memory leaks too.
waves from aussie.zone
Testing :)
Cool, we're on 0.19.3 now.