harsh3466

joined 2 years ago
[–] [email protected] 4 points 7 months ago* (last edited 7 months ago) (12 children)

Nicely done! Happy I could help.

There's a million ways to do it and none are "right", so I wouldn't call yours sloppy at all. I'm still learning and have lots of slop in my own expressions. 🤣

I'll turn around and ask you a question if you don't mind. That last bit you used, I kind of understand what it's doing, but not fully. I'm getting that it's excluding https, but if you could explain the syntax I'd really appreciate it!

This is the bit:

/https/ ! s|%20|-|g

Edit: removed a redundancy
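For anyone following along, here is a minimal sketch of how the `/https/ ! s|%20|-|g` bit quoted above behaves, assuming GNU sed. The two sample lines are made up for illustration. In sed, a pattern such as `/https/` placed before a command is an address that selects lines containing that text, and `!` inverts the selection, so the substitution runs only on lines that do not contain "https".

```shell
# Hypothetical sample lines: one external link, one local link.
sample='[Site](https://example.com/a%20b)
[Section](#my%20section)'

# /https/ selects lines containing "https"; ! inverts that selection,
# so s|%20|-|g only runs on lines WITHOUT "https".
result=$(printf '%s\n' "$sample" | sed '/https/ ! s|%20|-|g')

printf '%s\n' "$result"
# [Site](https://example.com/a%20b)
# [Section](#my-section)
```

Note that the address works per line, so a line holding both an https link and a local link would be skipped entirely.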

[–] [email protected] 3 points 7 months ago* (last edited 7 months ago) (1 children)

Feel free to ping me if you want some help! I'd say I'm intermediate with regex, and I'm happy to help where I can.

Regarding the other file, you could pretty easily modify the command I gave you to adapt to the example you gave. There's two approaches you could take.

This is focused on the first regex in the command. The second should work unmodified on the other files if they follow the same pattern.

Here's the original chunk:

s|]\(#.+\)|\L&|

In the new example given, the # is preceded by readme.md. The easy modification is just to insert readme\.md before the # in the expression, adding the \ before the . to escape the metacharacter and match the actual period character, like so:

s|]\(readme\.md#.+\)|\L&|

However, if you have other files that have similar, but different patterns, like (faq.md#%20link%20text), and so on, you can make the expression more universal by using the .* metacharacter sequence. This is similar to the .+ metacharacter sequence, with one difference. The + indicates one or more times, while the * indicates zero or more times. So by using .* before the # you can likely use this on all the files if they follow the two pattern examples you gave.

If that will work, this would be the expression:

s|]\(.*#.+\)|\L&|

What this expression does is:

Find a closing bracket followed by an opening parenthesis followed by any sequence of characters (including no characters at all) until finding a pound/hash symbol, then one or more characters until finding a closing parenthesis, and convert that entire matched string to lowercase.
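A quick way to check the `.*` version is to run it without -i on a few sample lines first. This is only a sketch, assuming GNU sed (the `\L` case conversion is a GNU extension), and the sample links are invented to mirror the readme.md and faq.md patterns mentioned here.

```shell
# Hypothetical sample links covering the patterns discussed.
sample='[Intro](readme.md#Getting%20Started)
[FAQ](faq.md#Link%20Text)
[Top](#Top%20Of%20Page)'

# No -i here: sed reads stdin and prints the result, leaving files alone.
result=$(printf '%s\n' "$sample" | sed -r 's|]\(.*#.+\)|\L&|')

printf '%s\n' "$result"
# [Intro](readme.md#getting%20started)
# [FAQ](faq.md#link%20text)
# [Top](#top%20of%20page)
```

One thing to watch: because `.*` matches anything before the #, an https link that happens to contain a # fragment would also be lowercased by this version.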

And with that modified expression, this would be the full command:

sed -ri 's|]\(.*#.+\)|\L&|; s|%20|-|g' /path/to/somefile

Edit: grammar

Edit 2: added the full modified command.

[–] [email protected] 8 points 7 months ago* (last edited 7 months ago) (17 children)

Okay, here's the command and a breakdown. I broke down every part of the command, not because I think you are dumb, but because reading these can be complicated and confusing. Additionally, detailed breakdowns like these have helped me in the past.

The command:

sed -ri 's|]\(#.+\)|\L&|; s|%20|-|g' /path/to/somefile

The breakdown:

sed - calls sed

-r - allows for the use of extended regular expressions

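-i - edit the file given as an argument at the end of the command in place (note: when the flags are combined, the i must come after the r. In GNU sed, -i takes an optional backup suffix, so -ir is read as "edit in place and keep a backup with the suffix r", and extended regular expressions are not enabled)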
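The flag-order point can be seen with a throwaway file. This is a sketch that assumes GNU sed (BSD/macOS sed handles -i differently); the temp directory and demo.txt are hypothetical names used only for the demonstration.

```shell
# Work in a scratch directory so nothing real is touched.
tmpdir=$(mktemp -d)

# -ri: extended regex is on, file is edited in place, no backup is kept.
printf 'aaa\n' > "$tmpdir/demo.txt"
sed -ri 's|a+|b|' "$tmpdir/demo.txt"
after_ri=$(cat "$tmpdir/demo.txt")      # b

# -ir: the r is taken as a backup suffix, so extended regex is NOT enabled.
# Without it, + is a literal plus sign and the pattern matches nothing here.
printf 'aaa\n' > "$tmpdir/demo.txt"
sed -ir 's|a+|b|' "$tmpdir/demo.txt"
after_ir=$(cat "$tmpdir/demo.txt")      # aaa

# The stray backup file demo.txtr is the giveaway.
if [ -e "$tmpdir/demo.txtr" ]; then backup_made=yes; else backup_made=no; fi

printf '%s %s %s\n' "$after_ri" "$after_ir" "$backup_made"
rm -rf "$tmpdir"
```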

Now the regex piece by piece. This command has two substitution expressions to break down the goals into manageable chunks.

Expression one is to convert the markdown links to lowercase. That expression is:

's|]\(#.+\)|\L&|;

The goal of this expression is to find markdown links, and to ignore https links. In your post you indicate the markdown links all start with a # symbol, so we don't have to explicitly ignore the https as much as we just have to match all links starting with #. Here's the breakdown:

' - begins the entire expression set. If you had to match the ' character in your expression you would begin the expression set with " instead of '.

s| - invoking find and replace (substitution). Note, I'm using the | as a separator instead of the / for easier readability. In sed, you can use just about any separator you want in your syntax.

]\(# - This is how we find the link we want to work on. In markdown, every link is preceded by ]( to indicate a closing of the link text and the opening of the actual url. In the expression, the ( is preceded by a \ because it is a special regex character. So \( tells sed to find an actual opening parenthesis character. Finally the # will be the first character of the markdown links we want to convert to lowercase, as indicated by your example. The inclusion of the # ensures no https links will be caught up in the processing.

.+ - this bit has two parts, . and +. These are two special regex characters. The . tells sed to find any character at all and the + tells it to find the preceding character one or more times. In the case of .+, it's telling sed to find one or more of any character. You might think this will eat ALL of the text in the document and make it all lowercase, but it will not because of the next part of the regex.

\) - this tells sed to find a closing parenthesis. Like the opening parenthesis, it is a special regex character and needs to be escaped with the backslash to tell sed to find an actual closing parenthesis character. This is what stops the command from converting the entire document to lowercase, because when you combine the previous bit with this bit like so .+\), you're telling sed to find one or more of any character UNTIL you find a closing parenthesis. One caveat: .+ is greedy, so if a line contains more than one closing parenthesis, the match runs to the last one on that line.

| - This tells sed we're done looking for text to match. The next bits are about how to modify/replace that text

\L - This tells sed to convert the given text to all lowercase

& - This is the given text to modify. In this case the & is a special metacharacter that tells sed to modify the entire pattern matched in the matching portion of the expression. So when the & is preceded by the \L, this tells sed "take everything that was matched in the pattern matching expression and convert it to lowercase."

; - this tells sed that this is the end of the first expression, and that more are coming.

So all together, what this first expression does is: Find a closing bracket followed by an opening parenthesis followed by a pound/hash symbol followed by one or more of any character until finding a closing parenthesis. Then convert that entire chunk of text to lowercase. Because symbols don't have case you can just convert the entire matched pattern to lowercase. If there were specific parts that had to be kept case sensitive, then you'd have to match and modify more precisely.
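Here is a small dry run of that first expression, assuming GNU sed and some invented sample lines. The last part uses a tighter pattern, `[^)]+` instead of `.+`, which is my own variation and not part of the command above; it is only there to show what changes when two links share a line.

```shell
# Hypothetical input: one local link and one https link on separate lines.
sample='[Install Guide](#Install%20Guide)
[Docs](https://Example.com/Docs)'
result=$(printf '%s\n' "$sample" | sed -r 's|]\(#.+\)|\L&|')
printf '%s\n' "$result"
# [Install Guide](#install%20guide)
# [Docs](https://Example.com/Docs)

# Two links on ONE line: .+ is greedy and runs to the last ) on the line,
# so the text after the first link gets lowercased as well.
one_line='[A](#One%20Two) then [B](https://Example.com/X)'
greedy=$(printf '%s\n' "$one_line" | sed -r 's|]\(#.+\)|\L&|')
printf '%s\n' "$greedy"
# [A](#one%20two) then [b](https://example.com/x)

# Variation (an assumption, not the original command): [^)]+ stops at the
# first closing parenthesis, and g handles several local links per line.
tight=$(printf '%s\n' "$one_line" | sed -r 's|]\(#[^)]+\)|\L&|g')
printf '%s\n' "$tight"
# [A](#one%20two) then [B](https://Example.com/X)
```

If every link in your files sits on its own line, the original expression is enough and the variation is unnecessary.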

The next expression is pretty easy, UNLESS any of your https links also include the string %20:

If no https links contain the %20 string, then this will do the trick:

s|%20|-|g'

s| - again opens the expression telling sed we're looking to substitute/modify text

%20 - tells sed to find exactly the character sequence %20

| - ends the pattern matching portion of the expression

- - tells sed to replace the matched pattern with the exact character -

| - tells sed that's the end of the modification instructions

g - tells sed to do this globally throughout the document. In other words, to find all occurrences of the string %20 and replace them with the string -

' - tells sed that is the end of the expression(s) to be evaluated.

So all together, what this expression does is: Within the given document, find every occurrence of a percent symbol followed by the number two followed by the number zero and replace them with the dash character.
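To see why the https caveat matters, here is a sketch of the second expression run on two made-up lines. Nothing here is specific to GNU sed.

```shell
# Hypothetical input: a local link and an https link that both contain %20.
sample='[Top](#top%20of%20page)
[Site](https://example.com/a%20b)'

# g replaces every %20 on every line, with no regard for the link type.
result=$(printf '%s\n' "$sample" | sed 's|%20|-|g')

printf '%s\n' "$result"
# [Top](#top-of-page)
# [Site](https://example.com/a-b)
```

The second output line shows the https URL being rewritten too, which would break it if the real address needs the %20.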

/path/to/somefile - tells sed what file to work on.

Part of using regex is understanding the contents of your own text, and with the information and examples given, this should work. However, if the markdown links have different formatting patterns, or as mentioned any of the https links have the %20 string in them, or other text in the document might falsely match, then you'd have to provide more information to get a more nuanced regex to match.

Edit: clarified the use of the & metacharacter.

Edit 2: clarified that the + metacharacter indicates finding the preceding character (or character set) one or more times.

[–] [email protected] 4 points 7 months ago (19 children)

I've got a sed regex that should work, just writing up a breakdown of the whole command so anyone interested can follow what it does. Will post in a bit.

[–] [email protected] 2 points 7 months ago

You more elegantly said what I came to say.

[–] [email protected] 10 points 7 months ago

Came to say something similar. Like I give a fuck that OpenAI's model/tech/whatever was "stolen" by Deepseek. Fuck that piece of shit Sam Altman.

[–] [email protected] 1 points 7 months ago* (last edited 7 months ago)

Hahaha. Indeed! Even funnier with that meme that was posted across a bunch of Linux communities yesterday.

Edit: The meme

[–] [email protected] 27 points 7 months ago (1 children)

If anyone actually thought that piece of shit trump was going to do anything about grocery prices, they are fucking fools.

And I fully understand that this letter is just theater. Stupid, shitty, ineffective theater.

Fuck this whole fucking piece of shit grift called a government.

[–] [email protected] 1 points 7 months ago

I've read the same, but in setting up this laptop I wanted to give it a go. So far the solution provided by @[email protected] is working. It's coming up with the vpn routing intact now that I adjusted networkd.conf.

If I hadn't found a solution I was prepared to go your route, though I didn't even think of setting it up so closing the lid would shut it down. That's great!

[–] [email protected] 2 points 7 months ago

Agree. I mean, fuck CVS, and managers can be shit, but it's not the lone person working in the store's fault that this shit system exists. I'd feel terrible fucking with them, since CVS cares as much about them as they do about their customers, which is to say, not at all.

[–] [email protected] 1 points 7 months ago

I'd be happy if people would just use Signal, but no, it's goddam SMS, Snapchat, Instagram DMs, nothing with any kind of privacy or security. Just fucking awful.

I spun up a private matrix server for my immediate family, and I have gotten a few friends on it, but the vast majority just don't care and it's pretty frustrating
