RagingHungryPanda

joined 2 years ago
[–] [email protected] 1 points 1 month ago* (last edited 1 month ago) (2 children)

For the life of me, I can't see where to add a tag or a label. I checked in 3 different UIs, including the main one.

[–] [email protected] 3 points 1 month ago

I've wondered whether there should be Lemmy, Pixelfed, and maybe Mastodon instances for local cities.

[–] [email protected] 2 points 1 month ago

I've been saving all of these today. Thanks a bunch!

[–] [email protected] 24 points 1 month ago (4 children)

I wish we had 5 minute headways haha.

[–] [email protected] 1 points 1 month ago

Thanks for giving it a good read-through! If you can get onto NVMe SSDs, you may find some of your problems just go away. The difference could be insane.

I was reading something recently about databases designed for business applications vs. ones designed for reporting, and one difference was the on-disk layout: rows stored together vs. columns stored together.

[–] [email protected] 1 points 1 month ago
[–] [email protected] 1 points 1 month ago (2 children)

That was a bit of a hasty write-up, so there are probably some issues with it, but that's the gist.

[–] [email protected] 1 points 1 month ago (5 children)

Yes? Maybe, depending on what you mean.

Let's say you're doing a job that involves reading 1M records or so. Pagination means you grab N records at a time, say 1,000, across multiple queries rather than pulling everything in one go.

Reading your post again to try and get context, it looks like you're identifying duplicates as part of a job.

I don't know what you're using to determine a duplicate, whether it's structural or not, but since you're running on HDDs, it might be faster to get that information into RAM, then do the work in batches and update in batches. That also lets you do things like write to the DB while doing CPU processing.

BTW, your hard disks are going to be your bottleneck unless you're reaching out over the internet, so your best bet is to move that data onto an NVMe SSD. That'll blow any other suggestion I have out of the water.

BUT! There are ways to help things out. I don't know what language you're working in; I'm a dotnet dev, so I can answer some things from that perspective.

A couple of things you may want to do, especially if there's other traffic on this server:

  • use WITH (NOLOCK) so that you're not blocking other reads and writes on the tables you're looking at
  • use pagination, either with windowing or OFFSET/FETCH (Skip/Take in LINQ), to grab only a certain number of records at a time (sketched below)
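Something like this is what I mean. A rough sketch with Microsoft.Data.SqlClient; dbo.Records, the Name/Size columns, the ordering on Id, YourType, and the connection string are all stand-ins for whatever your schema actually looks like:

using Microsoft.Data.SqlClient;

// Placeholder row type; swap in whatever columns you actually dedupe on.
public sealed class YourType(string name, long size)
{
    public string Name { get; } = name;
    public long Size { get; } = size;
}

public static class Db
{
    // Stand-in connection string; use whatever yours actually is.
    private const string ConnectionString = "Server=...;Database=...;Integrated Security=true;";

    public static async Task<List<YourType>> ReadBatchFromDb(int offset, int limit)
    {
        // NOLOCK allows dirty reads but won't block other traffic on the table.
        const string sql = @"
            SELECT Name, Size
            FROM dbo.Records WITH (NOLOCK)
            ORDER BY Id
            OFFSET @offset ROWS FETCH NEXT @limit ROWS ONLY;";

        var batch = new List<YourType>(limit);

        await using var conn = new SqlConnection(ConnectionString);
        await conn.OpenAsync();

        await using var cmd = new SqlCommand(sql, conn);
        cmd.Parameters.AddWithValue("@offset", offset);
        cmd.Parameters.AddWithValue("@limit", limit);

        await using var reader = await cmd.ExecuteReaderAsync();
        while (await reader.ReadAsync())
            batch.Add(new YourType(reader.GetString(0), reader.GetInt64(1)));

        return batch;
    }
}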

Use a HashSet (this works out of the box if you have record types, since they get value equality) or some other property-based notion of equality. Dictionary and HashSet can both take an equality comparer.
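If your type isn't a record, a property-based comparer is only a few lines. Same placeholder type and properties as the sketch above:

sealed class YourTypeEqualityComparer : IEqualityComparer<YourType>
{
    // Two rows count as duplicates when these (placeholder) properties match.
    public bool Equals(YourType? x, YourType? y) =>
        ReferenceEquals(x, y)
        || (x is not null && y is not null && x.Name == y.Name && x.Size == y.Size);

    public int GetHashCode(YourType obj) => HashCode.Combine(obj.Name, obj.Size);
}

// usage: var seen = new HashSet<YourType>(new YourTypeEqualityComparer());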

So, what you can do is asynchronously read from the disk into memory and start some kind of processing job. If that job doesn't also require the disk, you can do another read while you're processing. Don't do a write and a read at the same time since you're on HDDs.

This might look something like:

var offset = 0;
const int limit = 1000;

var readTask = ReadBatchFromDb(offset, limit);
var result = await readTask;

// If you only care about equality and not the data after use, you can just store the hash codes.
var seen = new HashSet<YourType>(new YourTypeEqualityComparer());

while (result.Count > 0)
{
    offset = Advance(offset);
    readTask = ReadBatchFromDb(offset, limit); // start the next read batch while we do CPU work

    var dataToWork = result.Where(r => !seen.Contains(r)).ToList(); // don't rework any objects we've already seen
    seen.UnionWith(result);

    var dataToWrite = DoYourThing(dataToWork);

    // Don't write while reading: wait for the in-flight read to finish first.
    result = await readTask;

    await WriteToDb(dataToWrite); // there's a lost optimization here: no CPU work happens during this write
}



// Let's say you can set up a read/write job queue to keep the disk busy.
abstract class IoJob
{
    public sealed class ReadJob(int offset, int limit) : IoJob
    {
        public int Offset { get; } = offset;
        public int Limit { get; } = limit;
        public Task<List<YourType>>? ReadTask { get; set; }
    }

    public sealed class WriteJob(List<YourType> data) : IoJob
    {
        public List<YourType> Data { get; } = data;
        public Task? WriteTask { get; set; }
    }
}

// Starts the job's IO if it hasn't been started yet, then returns the job.
IoJob ExecuteJob(IoJob job)
{
    switch (job)
    {
        case IoJob.ReadJob rj:
            rj.ReadTask ??= ReadBatchFromDb(rj.Offset, rj.Limit); // assigns the read to the job
            break;
        case IoJob.WriteJob wj:
            wj.WriteTask ??= WriteToDb(wj.Data); // the write job carries its own task
            break;
    }
    return job;
}

var seen = new HashSet<YourType>(new YourTypeEqualityComparer());
var jobs = new Queue<IoJob>();

jobs.Enqueue(new IoJob.ReadJob(offset, limit));
jobs.Enqueue(new IoJob.ReadJob(Advance(offset), limit)); // get the second read ready to start

while (jobs.TryDequeue(out var job))
{
    ExecuteJob(job);

    // Kick off the next job so the disk stays busy while we do CPU work.
    if (jobs.TryPeek(out var next)) ExecuteJob(next);

    if (job is IoJob.ReadJob rj)
    {
        var batch = await rj.ReadTask!;
        if (batch.Count == 0) continue;

        jobs.Enqueue(new IoJob.ReadJob(Advance(rj.Offset), rj.Limit));

        var dataToWork = batch.Where(r => !seen.Contains(r)).ToList(); // don't rework any objects
        seen.UnionWith(batch);

        var dataToWrite = DoYourThing(dataToWork);
        jobs.Enqueue(new IoJob.WriteJob(dataToWrite));
    }
    else if (job is IoJob.WriteJob wj)
    {
        await wj.WriteTask!;
    }
}

[–] [email protected] 29 points 1 month ago (1 children)
[–] [email protected] 7 points 1 month ago* (last edited 1 month ago) (1 children)

Superhero movies kinda do that, but then they have the villain kill a bunch of people on the train for no reason, to remind us that they must be stopped. Usually this happens right around the time you're like, "Well, they have a point." haha

I think this was the video that talked about it: https://www.youtube.com/watch?v=LpitmEnaYeU

Superheroes usually manage to roll back the various apocalypses but rarely use their powers to build a better world. The villains are the ones constantly dreaming up big audacious schemes to transform the universe.

[–] [email protected] 2 points 1 month ago

I've got IDrive backups at 5 TB for like $5 a month or something.

[–] [email protected] 27 points 1 month ago

And they PAID to be there!

4
submitted 3 months ago* (last edited 3 months ago) by [email protected] to c/[email protected]
 

I've been getting into self hosting, the fediverse, and federated blogging. I contacted freaking nomads and they suggested that I write about my experiences, so here it is! I hope you enjoy.

Comments aren't fully federated from the blog site, so I'm using mastodon as well.

5
submitted 3 months ago* (last edited 3 months ago) by [email protected] to c/[email protected]
 

I've been getting into self hosting, the fediverse, and federated blogging. I contacted freaking nomads and they suggested that I write about my experiences, so here it is! I hope you enjoy.

Comments aren't fully federated from the blog site, so I'm using mastodon as well.

1
submitted 3 months ago* (last edited 3 months ago) by [email protected] to c/[email protected]
 

I've been getting into self hosting, the fediverse, and federated blogging. I contacted freaking nomads and they suggested that I write about my experiences, so here it is! I hope you enjoy.

Comments aren't fully federated from the blog site, so I'm using mastodon as well.

 

Starting at midnight Thursday night through midnight Friday night, we will be joining with people across the country and beyond to demonstrate our collective outrage over the hostile takeover of our government by unelected billionaires and by those who put profits before people.  For one day, this Friday, we pledge not to buy anything from any major online or in-person retailers, and we pledge to refrain from using credit cards.  We recommend staying away from Facebook, Instagram, and “X.”   

 

This action began as a protest against those corporations who abandoned diversity, equity, and inclusion programs to placate a white supremacist administration.  Those corporations include Target, Citi Bank, Google, and Disney.  It quickly expanded into a “Buy Nothing Day,” with particular recognition of the role of finance capital.  The concept of Economic Blackout 2/28 has quickly spread on social media, propelled by activists, faith communities, students, and rank-and-file workers everywhere.  The movement goes beyond our borders. In Canada, consumers will target USA-based companies to protest Trump’s tariffs, and Mexicans will participate in the Latino Freeze Movement to protest US anti-immigrant and anti-DEI policies.

 

Please participate in this action! It is a simple act that we all can accomplish and that can quickly add up to a collective impact. 

Sign our pledge today!

 

In resistance,

National Board, CPUSA

 

I'm trying my hand at federated blogging! Here's a bit on some things that I got rid of and some things that I added while traveling as a nomad.

66
submitted 4 months ago* (last edited 4 months ago) by [email protected] to c/[email protected]
 

I'm starting to get into self-hosting and am looking at self-hosted blog solutions. It looks like WriteFreely is the main fediverse blogging platform, with Plume a distant second, though I don't see it used much.

But that got me thinking it'd be good to follow federated blogs and have some long-form reading in my feed, like back when RSS was the main way of doing things.

But how do I actually find bloggers? It looks like WriteFreely can federate with Mastodon, but there doesn't seem to be a central place to discover federated blogs the way there is with Lemmy or Mastodon. Is that correct? Where can I go (other than Medium) to find blogs and bloggers in the fediverse?

 

I previously posted about an issue where the nginx container for the Collabora application logs a GET to /robots.txt every 10 seconds. I tried modifying the files in the container, but they were reset on restart. I also tried to run the container with --log-driver=none, but was unsuccessful. Despite being a software dev, I'm new to the homelab world and TrueNAS.

I solved it by modifying the running container and then committing those changes to the image. The change I made was to set access_log off; in the nginx config. I did it in the server block because I don't really care about those logs for this app, but it could be done at the location level.
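For reference, the end result in nginx.conf looks roughly like this (everything else in the server block is whatever the Collabora image already ships with):

server {
    # ...everything the image already has here stays as-is...
    access_log off;   # the one line I added

    # alternative, if you only want to silence the robots.txt hits:
    # location = /robots.txt { access_log off; }
}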

Here's how I did it (the reference SO post I used: https://stackoverflow.com/a/74515438):

What I did was shell into the container:

  • sudo docker exec -it ix-collabora-nginx-1 bash
  • apt update && apt install vim
  • vi /etc/nginx/nginx.conf and add access_log off;
    • if you're not familiar with vim, arrow-key to the line you want, then press 'a' to start inserting text after the cursor. Make your change, then Esc, :wq!. You need the ! because the file is read-only
  • apt remove vim
  • exit
  • sudo docker commit <container id> (commit takes the container, not the image)
  • sudo docker restart ix-collabora-nginx-1
 

I'm running TrueNAS SCALE with Docker images for Nextcloud and Collabora. Under Collabora, the nginx container is logging a GET to /robots.txt about every second, and I'm having a hard time filtering this out because it looks like the nginx conf files get replaced on every restart. I also tried mounting my own version of nginx.conf, but the changes weren't reflected.

 

These are my AllBirds after 1 year of travel. I've been looking to repair the soles, but it doesn't seem that easy. I want shoes that ventilate well and are good for a lot of walking.

These started showing wear after 3 months of just city walking.

Any recommendations? I'm posting here because there isn't much activity in the shoe communities.
