ticoombs

joined 2 years ago
[–] ticoombs@reddthat.com 3 points 11 months ago

๐Ÿ˜ Thank you for being here!

[–] ticoombs@reddthat.com 2 points 11 months ago

❤️ The heart emoji doesn't do it justice! Thank you!

[–] ticoombs@reddthat.com 4 points 11 months ago

🤩 Always great to hear! Thanks!

[–] ticoombs@reddthat.com 1 points 11 months ago (2 children)

The long loads are because of huge images/content, believe it or not 😂. I too can't wait to finally see some nice, fast Reddthat.

[–] ticoombs@reddthat.com 5 points 11 months ago (2 children)

Whoa! Thanks for even considering donating. I won't hold you to it if you happen to donate less later 😉

After testing Ko-fi, we still end up with the same fees compared with OpenCollective, as it uses PayPal instead of Stripe. So in the end it's better to go via OpenCollective, as it's a lot more transparent: it shows all the donations and will allow me to show all of our bills, etc.

Thanks!

[–] ticoombs@reddthat.com 36 points 11 months ago* (last edited 11 months ago) (1 children)

🤷

[–] ticoombs@reddthat.com 1 points 1 year ago

Ah, I see you too enjoy the Debian approach.

[–] ticoombs@reddthat.com 2 points 1 year ago (1 children)

This should be fixed, as we rolled out 0.19.5 today.

[–] ticoombs@reddthat.com 2 points 1 year ago (1 children)

Oh, I was wrong; after further reading, this looks to be a lot better than what I was thinking.

I must have been thinking of another methodology for attempting privacy over a dataset.

[–] ticoombs@reddthat.com 4 points 1 year ago (3 children)

Before I start reading, if this has anything to do with differential privacy, I'm going to be disappointed.

[–] ticoombs@reddthat.com 3 points 1 year ago (1 children)

Yes, you can upload images to Reddthat when posting, commenting, etc.

We have a CDN in front and aggressively cache all of the images.

Third-party images are fetched to generate a thumbnail and to cache the image. The problem with this is that Catbox can be slow at times, and if that happens when you post, the thumbnails can't be generated.
Some clients also only open the direct links instead of showing the cached image, resulting in images not loading, or taking forever to load.

If you are posting on Reddthat, I'm happy for people to use the features Reddthat provides. So upload here, if you so wish.

Just remember that the images you upload are linked to your account.

[–] ticoombs@reddthat.com 1 points 1 year ago

Ooooh I love the temp change idea! I'll have to find a new supplier for that, and think of a cool idea for it! 😉

35
submitted 1 year ago* (last edited 1 year ago) by ticoombs@reddthat.com to c/reddthat@reddthat.com
 

April is here and we're still enjoying our little corner of the lemmy-verse! This post is quite late for our April announcement, as my chocolate coma has only now subsided.

Donations

Due to some huge donations on our Ko-Fi recently, we decided to migrate our database from our previous big server to a dedicated database server. The idea behind this move was to let the database respond faster with a faster CPU: 3.5GHz rather than the 3.0GHz we had. This, as we now know, did not pan out as expected. After that fell through, we have now migrated everything to 1 huge VPS at a completely different hosting company (OVH).

Since the last update, I put the donations towards setting up an EU proxy to filter out downvotes & spam, as a way to respond faster and allow us to catch up. We also purchased the new VPS from OVH (which came out of the Ko-Fi money), and ran the test of the separate database server at our previous hosting company.

Our donations (as of the 7th of April):

  • Ko-Fi: $280.00
  • OpenCollective: $691.55 (of that, $54.86 is pending payment)

Threads Federation

Straight off the bat, I'd like to say thank you to those who voiced their opinions on the Threads federation post. While more people were opposed to federation (some of whom have since deleted their accounts or moved communities because of the uncertainty), I left the thread pinned for over a week, as I wanted to make sure everyone could respond and we could reach a general consensus. Many people brought great points forward, and we have decided to block Threads. The reasoning behind blocking them boils down to:

  • Enforced one-way communication, allowing Threads users to post in our communities without being able to respond to comments
  • A known lack of moderation, which would allow for abuse

These two factors alone make it a simple decision on my part. If comments on a post could make it back to a Threads user, then I probably would not explicitly block them. We are an open-first instance and I still stand by that. But when you have a history of abusive users, a lack of moderation, and you actively ensure your users cannot conduct a conversation (which by definition should be two-way), that is enough to tip the scales in my book.

Decision: We will block Threads.net starting immediately.

Overview of what we've been tackling over the past 4 weeks

In the past month we've:

  • Re-configured our database to ensure it could accept activities as fast as possible (twice!)
  • Attempted to move all Lemmy apps to a separate server to allow the database full use of our resources
  • Purchased an absurd number of vCPUs for a month to give everything plenty of CPU to play with
  • Set up HAProxy with a Lua script in Amsterdam to filter out all 'bad' requests
  • Worked with the LW Infra team to help test the Amsterdam proxy
  • Rebuilt custom Docker containers for added logging
  • Optimised our nginx proxies
  • Investigated the relationship between network latency and response times
  • Figured out the maximum ~3 r/s activity rate, notified the Lemmy Admin Matrix channel, created bug reports, etc.
  • Migrated our 3 servers from one hosting company to 1 big server at another company (see this post)

This has been a wild ride, and I want to say thanks to everyone who's stuck with us, reached out with ideas, or sent me a donation with a beautiful message.

The 500 mile problem (Why it's happening for LemmyWorld & Reddthat)

There are a few causes of this (and of why it affects Reddthat and a very small number of other instances), but the main one is network latency. The round trip between Australia (where Reddthat is hosted) and Europe/the Americas is approximately 200-300ms. That means the current 'maximum' number of requests that a Lemmy instance can generate towards us is approximately 3 requests per second. This assumes a few things, such as responding instantly, but it is a good guideline.

Fortunately for the lemmy-verse, most of the instances that everyone can join are centralised in the EU/US, where latency is very low. The network distance between Paris and Amsterdam is about 10ms. This means instances in close proximity can generate an order of magnitude more activities before they start lagging behind: it goes from 3 r/s to 100 r/s.

  • Servers in EU<->EU can generate between 50-100r/s without lagging
  • Servers in EU<->US can generate between 10-12r/s without lagging
  • Servers in EU<->AU can generate between 2-3r/s without lagging

Already we have a practical maximum of 100r/s that an instance can generate before everyone on planet earth lags behind.
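
To make the arithmetic concrete, here's a rough sketch (my own illustration, not Lemmy's code) of how a send-one-then-wait-for-the-acknowledgement federation loop is bounded by round-trip time; the EU<->US latency is my assumption, picked to land near the 10-12 r/s figure above:

```python
# Rough illustration of why sequential federation is latency-bound.
# RTT figures approximate the values quoted above.

ROUND_TRIP_SECONDS = {
    "EU <-> EU": 0.010,   # ~10 ms
    "EU <-> US": 0.090,   # assumed ~90 ms
    "EU <-> AU": 0.300,   # ~300 ms
}

def max_sequential_rate(rtt: float, processing: float = 0.0) -> float:
    """Each activity must be acknowledged before the next is sent,
    so throughput can never exceed 1 / (round trip + processing time)."""
    return 1.0 / (rtt + processing)

for route, rtt in ROUND_TRIP_SECONDS.items():
    print(f"{route}: at best ~{max_sequential_rate(rtt):.0f} activities/second")
```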

Currently (as of writing), Lemmy needs to process every activity sequentially. This ensures consistency between votes, edits, comments, etc. Allowing activities to be received out-of-order brings a huge amount of complexity, which everyone is trying to solve so that all of us (Reddthat et al.) no longer start lagging. There is a huge discussion on Lemmy's git issue tracker if you wish to see how it is progressing.

As I said previously, this assumes we respond in near real-time with no processing time. In reality, no one does, and there is a heap of reasons for that. The biggest culprit of blocking in the activity threads I have found (and I could be wrong) is/was the metadata enrichment when new posts are added, which gives you a nice title, subtitle and image for every link you add. Recent logs show it blocks the adding of post activities for anywhere between 1 and 10+ seconds! Since 0.19.4-beta.2 (which we are running as of this post) this no longer happens, so new posts will no longer have a 5-10s wait time. You might not have the image displayed immediately when a link is submitted, but it will still be enriched within a few seconds. Unfortunately, this is only 1 piece of the puzzle and does not solve the issue. Over the previous 24 hours, ~90% of all received activities were related to votes. Posts are in the single-digit percentages, a rounding error.
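
To illustrate the difference, here's a sketch of the idea in Python (not Lemmy's actual Rust implementation; the helper names are made up):

```python
import threading
import time

def fetch_link_metadata(url: str) -> dict:
    # Stand-in for the real metadata/thumbnail fetch; the logs mentioned above
    # showed this taking anywhere from 1 to 10+ seconds for slow hosts.
    time.sleep(5)
    return {"title": "example", "thumbnail": None}

def save_post(activity: dict, metadata: dict | None) -> None:
    print("post saved", "with" if metadata else "without", "metadata")

def receive_post_old(activity: dict) -> None:
    # Pre-0.19.4 style (simplified): the worker waits for the metadata before
    # acknowledging, so every later activity queues behind this fetch.
    metadata = fetch_link_metadata(activity["url"])
    save_post(activity, metadata)

def receive_post_new(activity: dict) -> None:
    # 0.19.4-style (simplified): save and acknowledge immediately, then enrich
    # the post with its title/thumbnail in the background a moment later.
    save_post(activity, metadata=None)
    threading.Thread(
        target=lambda: save_post(activity, fetch_link_metadata(activity["url"]))
    ).start()

receive_post_new({"url": "https://example.com/article"})
print("acknowledged instantly; enrichment finishes in the background")
```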

This heading is in reference to the 500 miles email.
Requests here mean Lemmy "Activities", which are likes, posts, edits, comments, etc.

So ... are we okay now?

It is a boring answer, but we wait, and enjoy what the rest of the fediverse has to offer. This (now) only affects votes from LemmyWorld to Reddthat. All communities on Reddthat are successfully federating to all external parties, so your comments and votes are being seen throughout the fediverse. There are still plenty of users in the fediverse who enjoy the content we create, who interact with us, and who are pleasant human beings. This only affects votes because of our forced-federation crawler, which automatically syncs all LW posts and comments. We've been "up-to-date" for over 2 weeks now.

It is unfortunate that we are the ones most affected. It's always a lot more fun being on the outside looking in, thinking about what others are dealing with and theorising solutions. Knowing that my fellow Lemmy.nz and aussie.zone admins were affected by these issues really cemented the network latency theory and was a brilliant light-bulb moment. I've had some hard nights recently trying to manage my life while looking into the problems affecting Reddthat. If I had been dealing with these issues in isolation I'm not sure I would have come to these conclusions, so thank you to our amazing Admin Team!

New versions mean new features (Local Communities & Videos)

As we've updated to a beta version of 0.19.4 to get the metadata patches, we've already found bugs (or regressions) in Lemmy, which you will notice if you use Jerboa as a client. Unfortunately, rolling back isn't advisable, so we'll try to get the issues resolved so Jerboa can work.
We also now have the ability to create or change any community to be "Local Only".

With the migration comes support for video uploads, limited to under 20MB and 10,000 frames (~6 minutes)! If you want to share video links, I suggest tagging them with [Video], as some clients don't always show videos correctly.

Thoughts

Every day I strive to learn new things, and it has certainly been a learning experience! I started Reddthat knowing enough about alternative technologies, but nearly nothing about Rust or Postgres. 😅
We've found a possibly crucial bug in the foundations of Lemmy which hinders federation, found workarounds, and learned that not all VPS providers are the same. I explained the issues in the hosting migration post. I also learnt a lot about Postgres and tested the Postgres v15 to v16 upgrade process so everyone who uses the lemmy-ansible repository can benefit.
I'm looking forward to a relaxing April compared to the hectic March, but I foresee some issues relating to the 0.19.4 release, which is meant to be released in the next week or so. 🤷

Cheers,

Tiff

PS. Since Lemmy version 0.19 you can block an instance yourself, without requiring us to defederate, by going to your Profile, clicking Blocks, and entering the instance you wish to block.

Fun Graphs:

Instance Response Times:

Data Transfers:

 

The PoC thickens

 

UEFI IRC, the perfect companion to asking why your Linux boot partition no longer exists #joke

 

Basically, I'm sick of these network problems, and I'm sure you are too. We'll be migrating everything (pictrs, frontends & backends, database & webservers) to 1 single server at OVH.

First it was a CPU issue, so we worked around that by ensuring pictrs was on another server and that we had just enough CPU to keep us all okay. Everything was fine until the spammers attacked. Then we couldn't process the activities fast enough, and now we can't catch up.

We were having constant network dropouts/lag spikes where all the network connections got "pooled", with a CPU steal of 15%. So we bought more vCPUs and threw resources at the problem. Problem temporarily fixed, but we still had our "NVMe" VPS, which housed our database and Lemmy applications, showing an IOWait of 10-20% half the time. Unbeknown to me, it was not IO related, but network related.

So we moved the database off to another server, but unfortunately that caused another issue (the unintended side effects of cheap hosting?). Now we had 1 main server accepting all network traffic, which then had to contact the NVMe DB server and the pict-rs server as well, and then send all that information back to the users. This was part of the network problem.
Adding backend & frontend Lemmy containers to the pict-rs server helped alleviate this, and that is the setup you are seeing at the time of this post. Now a good 50% of the required database and web traffic is split across two servers, which means our servers are not completely saturated with requests.

On top of the recent nonsense, it looks like we are limited to 100Mb/s; that's roughly 12MB/s. So downloading a 20MB video via pictrs currently requires the following flow (in this example):

  • User requests the image via Cloudflare
  • (it's not already cached, so it gets requested from our servers)
  • Cloudflare proxies the request to our server (app1).
  • Our app1 server connects to the pictrs server.
  • Our app1 server downloads the file from pictrs at a maximum of 100Mb/s.
  • At the same time, the app1 server is uploading the file via Cloudflare to you at a maximum of 100Mb/s.
  • During this time our connection is completely saturated and no other network queries can be handled.

This is, of course, an example of the network issue I found we had after moving to the multi-server system (see the rough numbers below); it is not a problem when you have everything on one beefy server.
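
Some back-of-the-envelope numbers for the flow above, using the quoted 100Mb/s cap and the 20MB video limit (a rough sketch, ignoring protocol overheads):

```python
LINK_MBPS = 100                  # the ~100 Mb/s cap, i.e. ~12.5 MB/s
LINK_MB_PER_S = LINK_MBPS / 8

def transfer_seconds(file_mb: float) -> float:
    """Time to move one file across the capped link in a single direction."""
    return file_mb / LINK_MB_PER_S

video_mb = 20  # the new video upload limit mentioned in the April post
pull = transfer_seconds(video_mb)   # app1 pulling the file from pict-rs
print(f"A {video_mb} MB video occupies the link for ~{pull:.1f} s per direction,")
print("and app1 is pushing the same bytes out to Cloudflare at the same time,")
print("so for that window the connection has little room for anything else.")
```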


Those are the broad strokes of the problems.

Thus we are completely ripping everything out and migrating to a HUGE OVH box. I say huge in capital letters because the OVH server is $108/m and has 8 vCPUs, 32GB RAM, & 160GB of NVMe. That amount of RAM allows the whole database to fit into memory. If this doesn't help, then I'd be at a loss as to what will.
Currently (assuming we kept paying for the standalone Postgres server) our monthly costs would have been around $90/m ($60/m (main) + $9/m (pictrs) + $22/m (db)).

Migration plan:

The biggest downtime will be the database migration: to ensure consistency we need to take the database offline, which is just simpler.

DB:

  • stop everything
  • start postgres
  • take a backup (20-25 mins)
  • send that backup to the new server (5-6 mins, limited to 12MB/s)
  • restore (10-15 mins)

pictrs

  • syncing the file store across to the new server

app(s)

  • regular deployment

This is the same process I recently did here, so I have the steps already cemented in my brain. As you can see, taking the backup ends up taking longer than restoring it. That's because, after testing the restore process on our OVH box, we were nowhere near any IO/CPU limits and it was, to my amazement, seriously fast. Now we'll have heaps of room to grow, with a stable donation goal for the next 12 months.
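
For anyone curious, the database step boils down to a dump/ship/restore. Here's a minimal sketch of that flow (my own illustration: the database name, paths and hostname are placeholders, and the exact commands we run may differ):

```python
import subprocess

DB_NAME = "lemmy"                      # placeholder database name
DUMP_PATH = "/tmp/lemmy.pgdump"        # placeholder path
NEW_HOST = "new-db.example.internal"   # placeholder for the new OVH box

def run(cmd: list[str]) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Stop the Lemmy containers first so no writes land mid-backup (not shown).
# 2. Take a custom-format dump (the ~20-25 minute step above).
run(["pg_dump", "--format=custom", "--file", DUMP_PATH, DB_NAME])

# 3. Ship the dump to the new server (the ~5-6 minute step at ~12 MB/s).
run(["rsync", "--partial", DUMP_PATH, f"{NEW_HOST}:{DUMP_PATH}"])

# 4. Restore with a few parallel jobs (the ~10-15 minute step).
run(["ssh", NEW_HOST, "pg_restore", "--jobs=4", "--dbname", DB_NAME, DUMP_PATH])
```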

See you on the other side.

Tiff

17
Forcing federation (reddthat.com)
submitted 1 year ago* (last edited 1 year ago) by ticoombs@reddthat.com to c/reddthat@reddthat.com
 

In the past 24-48 hours we've done a lot of work as you may have noticed, but you might be wondering why there are posts and comments with 0 upvotes.

Normally this would be because it is from a mastodon-like fediverse instance, but for now it's because of us and is on purpose.

Lemmy has an API where we can tell it to resolve content. This forces one of the most expensive parts of federation (for Reddthat), the resolving of post metadata, to no longer be blocking. So when new posts get federated, we can answer quicker, saying "we already know about that, tell us something we don't know"!
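
For the curious, the call looks roughly like this (a sketch assuming Lemmy's /api/v3/resolve_object endpoint and 0.19-style bearer auth; the URLs and token below are placeholders):

```python
import requests  # third-party: pip install requests

INSTANCE = "https://reddthat.com"
REMOTE_POST = "https://lemmy.world/post/123"   # placeholder remote URL
JWT = "your-login-token-here"                  # placeholder auth token

# Asking our own instance to resolve the remote object makes it fetch the post
# (and do the expensive metadata work) out-of-band, so when the activity later
# arrives via federation we can answer "already known" almost instantly.
resp = requests.get(
    f"{INSTANCE}/api/v3/resolve_object",
    params={"q": REMOTE_POST},
    headers={"Authorization": f"Bearer {JWT}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```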

This means the current activity graphs no longer tell the whole story, as we are going out-of-band. So while the graphs say we are behind, in reality we are closer than ever (for posts and comments).

Shout out to the amazing @example@reddthat.com who thought up this crazy idea.

TL;DR: new posts are coming in hot and fast, and we are up-to-date with LW now for both posts and comments!

 

There was a period of an hour where we migrated our database to its own server in the same data centre.

This server has faster CPUs than our current one (3.5GHz vs 3.0GHz), and if all goes well we will migrate the extra CPUs we bought over to the DB server.

The outage was prolonged due to a PEBKAC issue: I mistyped an IP in our new Postgres authentication config, so I kept getting a "permission error".

The decision to migrate the DB to its own server was always going to happen, but it was pushed forward by the recent donations we got from Guest on our Ko-Fi page! Thank you again!

If you see any issues relating to ungodly load times please feel free to let us know!

Cheers,
Tiff

 

Our server rebooted and unfortunately I had not saved our new firewall rules. This blocked all Lemmy services from accessing our databases.

This has now been fixed up! Sorry to everyone for the outage!

 

Edited: this post is now the Lemmy.World federation issue post.

We are now ready to upgrade to Postgres 16. When this post is 30 mins old, the maintenance will start.


Update: pinned a comment in this thread explaining my complete investigation and ideas on how the Lemmy app will & could move forward.

 

Recently Beehaw was hung out to dry, with Open Collective dissolving the Open Collective Foundation, which was their fiscal host. Basically, the company that was holding onto all of Beehaw's money will no longer accept donations and will close. You can read more in their alarmingly titled post: EMERGENCY ANNOUNCEMENT: Open Collective Foundation is dissolving, and Beehaw needs your help
While this does not affect us, it certainly could have been us. I hope the Beehaw admins are doing okay and manage to get their money back. From what I've seen it should be easy to zero-out your collective, but "closure" and "no longer taking donations" make for a stressful time.

Again, Reddthat isn't affected by this, but it does point out that we are currently at the mercy of one company (person?) winning the Powerball or just running away with the donations.

Fees

Upon investigating our financials, we are actually getting stung on fees quite often. Forgive me if I go on a rant here; skip below for the donation links if you don't want to read about the ins and outs.

OpenCollective uses Stripe, which takes 3% + $0.30 per transaction. That's pretty much an industry standard: 3% + $0.30. Occasionally it's 3% + $0.25, depending on the payment provider Stripe uses internally. What I didn't realise is that Open Collective also pre-fills the donation form with a platform "tip", which defaults to 15.00%. So in reality, when I have set the donation to be $10, you end up paying $11.50. $1.50 goes to Open Collective, and then Stripe takes $0.48 in transaction fees. Then, when I get reimbursed for the server payments, Stripe takes another payment fee, because our Hosted Collective also has to pay a transfer fee. (Between you and me, I'm not sure why that is when we are both in Australia... and inter-bank transactions are free.)

So of that $11.50 out of your pocket, $1.50 goes to Open Collective, Stripe takes $0.48, and then at the end of the month I lose $0.56 per expense! We have 11 donators and 3 expenses per month, which works out to be another $0.15 per donator. So at the end of the day, that $11.50 becomes $9.37. A total difference of $2.13 per donator per month.

As I was being completely transparent, I broke these down into 3 different transactions: the server, the extra RAM, and our object storage. Clearly I could save $1.12/m by bundling all the transactions, but that is not ideal. In the past 3 months we have paid $26.83 in payment transaction fees.
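
To put concrete numbers on it, here's the same chain of fees as a quick worked example (it simply reproduces the figures quoted above for a single $10/month donation, using the 11 donators and 3 expenses mentioned):

```python
donation      = 10.00                       # what the donation tier is set to
platform_tip  = 0.15 * donation             # Open Collective's default 15% "tip"
charged       = donation + platform_tip     # $11.50 actually leaves your pocket

stripe_fee    = 0.48                        # payment fee observed on that charge
payout_fee    = 0.56                        # transfer fee per reimbursed expense
expenses      = 3                           # server, RAM, object storage
donators      = 11
payout_share  = payout_fee * expenses / donators   # roughly $0.15 per donator

received = charged - platform_tip - stripe_fee - payout_share
print(f"charged: ${charged:.2f}, actually received: ${received:.2f}")
print(f"difference: ${charged - received:.2f} per donator per month")
```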

After learning this information, anyone who has recurring donations set up should check whether they would like to continue giving 15% to Open Collective, or pass that extra 15% straight to Reddthat instead!

So, to help with this, we are going to start diversifying and promoting alternatives to Open Collective, as well as attempting to publish a dedicated page which I can keep updated, maybe something like /donate. For the moment we will update our sidebar and main funding page.

Donation Links

From now on we will be using the following services.

I'm looking into Liberapay as well, but it looks like I will need to set up Stripe myself; if I can do that in a safe way (for me and you) then I'll add that too.

Next Steps

From now on I'll be bundling all expenses for the month into one "Expense" on Open Collective to minimise fees, as that is where most of the current funding is. I'll also do my best to publish a quarterly budget report with expenditures and a sum of whatever OpenCollective/Ko-Fi/Crypto/etc. funds we have.

Thank you all for sticking around!

Tiff
