Feel free to ping me if you want some help! I'd say I'm intermediate with regex, and I'm happy to help where I can.
Regarding the other file, you could pretty easily modify the command I gave you to fit the example you gave. There are two approaches you could take.
This is focused on the first regex in the command. The second should work unmodified on the other files if they follow the same pattern.
Here's the original chunk:
s|]\(#.+\)|\L&|
In the new example given, the # is preceded by readme.md. The easy modification is just to insert readme\.md before the # in the expression; the backslash before the period escapes the metacharacter so it matches an actual period character, like so:
s|]\(readme\.md#.+\)|\L&|
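If you want to try that one out before touching a real file, here's a minimal sketch that pipes a single line through it. The sample line is made up for illustration, and this assumes GNU sed, since \L is a GNU extension (-E is the same as -r there).

```shell
# Hypothetical sample line; the link text and anchor are invented for this test.
line='See [Setup](readme.md#Getting%20Started) for details.'

# Lowercase everything from the closing bracket through the closing parenthesis.
result=$(printf '%s\n' "$line" | sed -E 's|]\(readme\.md#.+\)|\L&|')
printf '%s\n' "$result"
```

Only the matched part is lowercased, so the link text inside the square brackets keeps its capitalization.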
However, if you have other files with similar but different patterns, like (faq.md#%20link%20text), and so on, you can make the expression more universal by using the .* metacharacter sequence. This is similar to the .+ metacharacter sequence, with one difference: the + means one or more times, while the * means zero or more times. So by using .* before the # you can likely use this on all the files if they follow the two pattern examples you gave.
If that will work, this would be the expression:
s|]\(.*#.+\)|\L&|
What this expression does is:
Find a closing bracket followed by an opening parenthesis, followed by any sequence of characters (including no characters at all) up to a pound/hash symbol, then one or more characters up to a closing parenthesis, and convert that entire matched string to lowercase.
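Here's a rough side-by-side of .* versus .+ in front of the #, in case seeing it run helps. The two sample lines are made up to mirror the two link shapes, and GNU sed is assumed again.

```shell
# Hypothetical sample lines: one bare anchor, one anchor prefixed by a filename.
input='[Top](#Some%20Heading)
[FAQ](faq.md#%20Link%20Text)'

# .* allows zero characters before the #, so both lines match.
generic=$(printf '%s\n' "$input" | sed -E 's|]\(.*#.+\)|\L&|')

# .+ requires at least one character before the #, so the bare anchor is skipped.
plus=$(printf '%s\n' "$input" | sed -E 's|]\(.+#.+\)|\L&|')

printf 'with .*:\n%s\n\nwith .+:\n%s\n' "$generic" "$plus"
```

With .* both links come out lowercased; with .+ the first line is left as it was, which is why the * version is the one that covers both patterns.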
And with that modified expression, this would be the full command:
sed -ri 's|]\(.*#.+\)|\L&|; s|%20|-|g' /path/to/somefile
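Since -i edits the file in place, it may be worth a dry run on a throwaway copy first. This is just a sketch: the temp file and its two lines are made up for illustration, it uses the .* version of the first expression, and it assumes GNU sed (BSD sed wants an argument after -i).

```shell
# Hypothetical scratch file standing in for /path/to/somefile.
tmpfile=$(mktemp)
printf '%s\n' '[Top](#Some%20Heading)' '[FAQ](faq.md#%20Link%20Text)' > "$tmpfile"

# First expression lowercases the link target; second turns every %20 into a hyphen.
sed -ri 's|]\(.*#.+\)|\L&|; s|%20|-|g' "$tmpfile"

edited=$(cat "$tmpfile")
printf '%s\n' "$edited"
rm -f "$tmpfile"
```

If the output looks right, the same command can be pointed at the real file.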
Edit: grammar
Edit 2: added the full modified command.
Nicely done! Happy I could help.
There's a million ways to do it and none are "right", so I wouldn't call yours sloppy at all. I'm still learning and have lots of slop in my own expressions. 🤣
I'll turn around and ask you a question if you don't mind. That last bit you used, I kind of understand what it's doing, but not fully. I'm getting that it's excluding https, but if you could explain the syntax I'd really appreciate it!
This is the bit:
/https/ ! s|%20|-|g
Edit: removed a redundancy