sed and Multi-Line Search and Replace

I’ve been experimenting with getting regular expression patterns to match over multiple lines using sed. For example, one might want to change

<p>previous text</p>
<h2>
<a href="http://some-link.com">A title here</a>
</h2>
<p>following text</p>

to

<p>previous text</p>
No title here
<p>following text</p>

sed cycles through the input one line at a time, so the most obvious way to match a pattern that extends over several lines is to concatenate all the lines into what is called sed’s “hold space,” then look for the pattern in that (long) string. That’s what I do in the following lines:

#!/bin/sh
sed -n '
# if the first line copy the pattern to the hold buffer
1h
# if not the first line then append the pattern to the hold buffer
1!H
# if the last line then ...
$ {
        # copy from the hold to the pattern buffer
        g
        # do the search and replace (the / in the closing tag is escaped because / is the delimiter)
        s/<h2.*<\/h2>/No title here/g
        # print
        p
}
' sample.php > sample-edited.php

A more compact version:


sed -n '1h;1!H;${;g;s/<h2.*<\/h2>/No title here/g;p;}' sample.php > sample-edited.php
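To see the compact version in action, here is a minimal sketch that recreates the sample input from the top of the post in a temporary directory and runs the one-liner against it. The `workdir` and `result` names are only for illustration, and the slash in the closing tag is escaped because `/` is also the `s` delimiter.

```shell
#!/bin/sh
# Build the sample input from the example above in a temporary directory.
workdir=$(mktemp -d)
cat > "$workdir/sample.php" <<'EOF'
<p>previous text</p>
<h2>
<a href="http://some-link.com">A title here</a>
</h2>
<p>following text</p>
EOF

# Collect every line in the hold space, then substitute once on the last line.
result=$(sed -n '1h;1!H;${;g;s/<h2.*<\/h2>/No title here/g;p;}' "$workdir/sample.php")
printf '%s\n' "$result"
```

This should print the three lines shown as the desired result above: the two paragraphs with "No title here" between them.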
 

As far as I can tell, that’s the most efficient way to match general multi-line patterns. I initially thought it might be more efficient not to keep the complete input in the hold buffer, so I modified the algorithm to print out the string whenever a match is found:


#!/bin/sh
sed -n '1h
1!{
        # if the sought-after regex is not found, append the pattern space to hold space
        /<h2.*<\/h2>/ !H
        # copy hold space into pattern space
        g
        # if the regex is found, then...
        /<h2.*<\/h2>/ {
                # the regular expression
                s/<h2.*<\/h2>/No title here/g
                # print
                p
                # read the next line into the pattern space
                n
                # copy the pattern space into the hold space
                h
        }
        # copy pattern buffer into hold buffer
        h
}
# if the last line then print
$p
' sample.php > sample-edited.php
 

In the last example, sed concatenates lines only until it finds a match, and then it prints the line (after substituting the text). Then, it starts again to concatenate the following lines.

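However, that approach is usually massively inefficient: the regex is re-run against the whole accumulated buffer every time a line is appended, so the matching work grows roughly quadratically with the number of lines between matches. Unless a sed guru can point out a better way, I’m going to continue using the first approach.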
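One trade-off of the first approach is worth seeing concretely: because `.*` is greedy, a file with two headings gets collapsed from the first `<h2` to the last `</h2>`. The sketch below shows that, and contrasts it with a range address plus the `c` (change) command, which never buffers the file. The range version assumes the opening and closing tags sit on separate lines and that every `<h2` has a matching `</h2>`. The file name and variable names are made up for the example.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cat > "$workdir/two-titles.php" <<'EOF'
<h2>
first
</h2>
<p>keep me</p>
<h2>
second
</h2>
EOF

# Hold-space approach: the greedy .* makes one match span both blocks,
# so the paragraph between them is swallowed as well.
slurped=$(sed -n '1h;1!H;${;g;s/<h2.*<\/h2>/No title here/g;p;}' "$workdir/two-titles.php")

# Range address: every run of lines from one containing <h2 through the next
# one containing </h2> is replaced by a single line of text.
ranged=$(sed '/<h2/,/<\/h2>/c\
No title here' "$workdir/two-titles.php")

printf 'hold space:\n%s\n\nrange:\n%s\n' "$slurped" "$ranged"
```

The range form only works when the block boundaries fall on line boundaries, so it is not a general multi-line regex. For general patterns the hold-space approach is still the one that applies.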

I’ve put the following script, which I call “sedml,” for sed multi-line, in my bash path.

#!/bin/sh
# usage: sedml FILE SED_COMMAND [OUTPUT_FILE]
if [ "$#" -lt 2 ]
then
        echo "usage: $0 file sed-command [output-file]" >&2
        exit 1
fi

# change the input file if no 3rd argument
if [ -z "$3" ]
then
        outputfile="$1"
else
        outputfile="$3"
fi
# scratch file; the process ID keeps it distinct from any output file name
tmpfile="$1.tmp.$$"
sed -n '
# if the first line copy the pattern to the hold buffer
1h
# if not the first line then append the pattern to the hold buffer
1!H
# if the last line then ...
$ {
        # copy from the hold to the pattern buffer
        g
        # do the search and replace
        '"$2"'
        # print
        p
}
' "$1" > "$tmpfile" &&
mv -f "$tmpfile" "$outputfile"
 

So I can replace multi-line patterns in multiple files like so:

 grep -rl '<h2' * | while read -r i; do sedml "$i" "s/<h2.*<\/h2>/No title here/g" "$i.tmp"; done
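A variation that skips the wrapper script is to hand all matching files to a single sed invocation. This is only a sketch, and it assumes GNU tools: `grep -Z` and `xargs -0` pass file names safely even if they contain spaces, `xargs -r` skips the run when nothing matches, and GNU `sed -i` treats each file separately, so `1` and `$` apply per file. The directory and file names are invented for the demonstration.

```shell
#!/bin/sh
# Set up a throwaway directory with one file that has a heading and one that does not.
dir=$(mktemp -d)
printf '<p>a</p>\n<h2>\nTitle\n</h2>\n<p>b</p>\n' > "$dir/one file.php"
printf '<p>no heading</p>\n' > "$dir/two.php"

# Edit every file that mentions <h2 in place, keeping the original as FILE.bak.
grep -rlZ '<h2' "$dir" |
    xargs -0 -r sed -i.bak -n '1h;1!H;${;g;s/<h2.*<\/h2>/No title here/g;p;}'

cat "$dir/one file.php"
```

Only the file containing `<h2` is touched, and its original content is kept alongside it with a `.bak` suffix.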

39 Comments

  1. Thank you for this. You are a philosopher and a poet.

    -David

  2. Hi, filosofo

    Maybe I misunderstood something, but I created the following bash script:

    #!/bin/bash
    SRCH="a\nb"
    file="test.txt"
    sed -i.bak -n '
    # if the first line copy the pattern to the hold buffer
    1h
    # if not the first line then append the pattern to the hold buffer
    1!H
    # if the last line then ...
    $ {
    # copy from the hold to the pattern buffer
    g
    # do the search and replace
    '"$SRCH"'
    # print
    p
    }
    ' $file;

    …and it cannot find the pattern in test.txt:

    a
    b

    any tips?

  4. >> # do the search and replace
    >> '"$SRCH"'

    If you want to search and replace, do something like this:
    s/'"$SRCH"'/xxx/

  5. hi all,

    I’ve the following issue.
    some virus injecting php and html files with some malicious code,
    it’s something like:

    <iframe ……..
    ………….
    ………

    until i can stop this virus, i want to automate some script to match and delete this pattern or replace it with an empty string,

    i tried your example but it seems to have something wrong,
    could you help me please with this issue ?

    thanks in advance.

  6. $ things i notice:
    1. (i’m not sure about this since i didn’t try your sc). it looks like there is a problem in the invocation line:
    grep -rl '<h2' * | while read i; do
    sedml $i "s/<h2.*</h2>/No title here/g" $i.tmp
    done

    - in the arg for the sed ‘s’ cmd, isn’t it true that the / in </h2> will be taken by sed as the ‘s’ delimiter? yes, it can be simply changed to another char. but what bothers me (maybe it’s just me since i barely know regex) is that the sc user must be a bit familiar with regex & moreover with sed ‘s’ cmd

    - also, the ‘s’ cmd regex will go across multiple <h2> blocks since .* matches the longest pattern

    2. as u said: for each input file, the hold buf contains all the complete input, meaning more text more mem consumed, worse for other app, more danger for sed to be killed

    3. (i’m assuming you’re concern about efficiency since u gave 2nd try to the problem for it)

    - for each file found by ‘grep’ containing at least 1 of the searched html tag (<h2>), it brings up a subsh which in turn brings up sed, meaning a pair of subsh & sed is started & ended for each input file.

    - it execs grep in this sh while it also starts a subsh to perform ‘while read i…’. inside the while loop, the subsh will starts another subsh (due to #!/bin/sh), then the last created subsh starts sed, & repetition begins

    - yes, how process is treated after it exits (particularly whether the core image is still in the mem / not) is kernel-dependent. but 1 thing for sure is that single invocation of an app to deal with multiple input files will always be more efficient than a single invocation of it for each input file, even for *nix where its philosophy of small tools requires low rsc cost for process creation

    $ the idea basically:
    sed -n '
    /<h2/b proc
    p; b

    :proc
    n
    s|.*</h2>.*|no title here|p
    t
    b proc
    ' INPUT_FILE…

    $ some minimum enhancements i can think of:
    1. for the searched html tag:
    - it’s variable
    - user doesn’t need to know about regex
    - the tag can be specific by specifying its attribute

    > ex:
    input:
    - only replace that particular h1, other h1 with diff title including ones that don’t have title isn’t affected. also that the input need not be full. the previous input can be like in the html code
    - since all paired html tag looks like: start mark <TAG… and end mark </TAG…, user need only input the start mark, this also eliminate forcing user to know about regex

    2. the replacement text can be any char including ones that are special in sed's eye: & \N, the user don't need to know about it

    3. for sed to be invoked only 1 once for multiple input files while retaining the ability to save the original file / editing in place & without manual tmp file, there is only 1 way: use -i[SUFFIX]

    $ here's the sc, my apology for its ugliness, i don't know much about *nix:

    #! /bin/bash

    usage(){
    echo "usage: $0 [i][b SUFFIX] TAG TEXT [FILES...]"
    exit 1
    }

    while getopts ":ib:" OPT; do
    case $OPT in
    i) OPT_I=-i
    unset SUFFIX
    ;;
    b) OPT_I=-i
    SUFFIX="$OPTARG"
    ;;
    *) usage
    ;;
    esac
    done

    shift $(($OPTIND-1))

    # only TAG & TEXT must be present in invocation line
    if(($# < 2)); then
    usage
    fi

    START=$1
    # strip the trailing >, so that no need to give all tag’s attrs
    if [ "${START:${#START}-1}" == ">" ]; then
    START=${START:0:${#START}-1}
    fi

    # is there syntax / tool to do this cleanly? for substr based on char
    END=${START%% *} #get only up to the 1st space
    END=${END:0:1}/${END:1}

    TEXT=$2
    # reformat & and \ to be edible for sed 's' cmd
    TEXT="${TEXT//\\/\\\\}"
    TEXT="${TEXT/&/\&}"
    #TEXT="$(echo "$TEXT" | sed -r 's/\\/\\\\/g;')"

    shift 2

    # preserve *nix philosophy of 'small tool' by expecting input of files from stdin & defaultly sending output to stdout
    if(($#==0)); then
    while read F; do
    FILES="${FILES} $F"
    done
    else
    FILES=$*
    fi

    sed -n ${OPT_I}$SUFFIX '
    /'"$START"'/b proc
    p
    b

    :proc
    n
    s|.*'"$END"'.*|'"$TEXT"'|p
    # sc goes to eof if matching closing tag is found
    t
    b proc
    ' $FILES

    $ i’m sure there’re lots more than can be done to it without breaking unix philosophy, such as its ability to change a tag at a particular nested lv, but i won’t know since i’m not a web dev

  7. $ gee, i was testing my last posted sc, bugs came out:
    - it doesn’t ignore spacing within <> (fixed)
    - it also modifies more specific tags if given a more general one of that tag, ex: given <h1> it also modifies <h1 …>

    $ ‘f’ opt is added, if ‘f’ used, it performs fixed match for the given tag (<h1> only matches <h1>), while if no ‘f’ <h1> matches <h1 …>

    $ it still ignores tag nesting (gonna work on this one)

    #! /bin/bash

    set -o xtrace

    usage(){
    echo "usage: $0 [i][b SUFFIX][f] TAG TEXT [FILES...]"
    exit 1
    }

    FIXED=0

    while getopts ":ib:f" OPT; do
    case $OPT in
    i) OPT_I=-i
    unset SUFFIX
    ;;
    b) OPT_I=-i
    SUFFIX="$OPTARG"
    ;;
    f)
    FIXED=1
    ;;
    *) usage
    ;;
    esac
    done

    shift $(($OPTIND-1))

    # only TAG & TEXT must be present in invocation line
    if(($# < 2)); then
    usage
    fi

    # there must be no leading / trailing space around the tag given in cmd-line
    START=$1

    if [ "${START:0:1}" == "<" ]; then
    START=${START:1}
    fi
    if [ "${START:${#START}-1}" == ">" ]; then
    START=${START:0:${#START}-1}
    fi

    # is there syntax / tool to do this cleanly? for substr based on char
    END=/${START%% *} #get only up to the 1st space

    TEXT=$2
    # reformat & and \ to be edible for sed 's' cmd
    TEXT="${TEXT//\\/\\\\}"
    TEXT="${TEXT/&/\&}"

    shift 2

    # preserve *nix philosophy of 'small tool' by expecting input of files from stdin & defaultly sending output to stdout
    if(($#==0)); then
    while read F; do
    FILES="${FILES} $F"
    done
    else
    FILES=$*
    fi

    if(($FIXED==0)); then
    START=$START'.*'
    else
    START=$START' *'
    fi

    echo "$START"

    sed -n ${OPT_I}$SUFFIX '
    /[ *'"$START"']/b proc
    p;b

    :proc
    n
    s||'"$TEXT"'|p
    t
    b proc
    ' $FILES

    $ btw, i was actually found this page when googled “multiline sed”. i was learning sed & got stuck with gnu ext of multi-line M for ‘s’ cmd / pattern matching. gnu doc only defines it, i need examples, anyone?

  8. $ sorry, the sed part didn’t come right

    sed -n ${OPT_I}$SUFFIX '
    /< *'"$START"'>/b proc
    p
    b

    :proc
    n
    s|< *'"$END"' *>|'"$TEXT"'|p
    t
    b proc
    ' $FILES

  9. When trying to replicate this example by running the command suggested:
    sed -n '1h;1!H;${;g;s/<h2.*</h2>/No title here/g;p;}' sample.php

    I get the following error:
    sed: -e expression #1, char 26: unknown option to `s'

  10. Never mind. I figured it out. Although I can’t show the final code because I can’t figure out how to escape all the chars. Basically, you have to escape the forward slash in the closing h2 tag with a backslash.

  11. the line

    s/<h2.*</h2>/No title here/g

    doesn’t the forward slash in the closing h2 tag confuse sed since you also use forward slash as your separator

  12. This works nicely, but if the text file has another </h2> it will not work correctly..

    as it will detect the second and not the first one..

    how can I fix this?

  13. How to do in reverse way? Print only the content within the tags?
    Print only the content within the table tags:
    @table@ … @/table@
    Where @ is ><
    And ignore the rest.

  14. Thanks! I found this incredibly helpful. In my case, the second approach works better than the first, though, because sed uses the “longest match” rule. I needed to replace several multiline sections, not just one. So the first approach (which replaced ALL sections with the new pattern) didn’t work for me. The second “less efficient” approach did exactly what I wanted.

    Thanks again!

  15. Thank you, your example works as expected. However, after viewing your code I came up with a simpler version:


    sed -e '/<h2/,/<\/h2>/{s/<h2.*/No text here/p;d}' sample.php

  16. A variation of your sample I am using:


    sed -n -e '/<filter>/,/<\/filter>/ {
    /<criteriaItems>/,/<\/criteriaItems>/ {
    /<criteriaItems>/h
    /<criteriaItems>/!H
    /<\/criteriaItems>/ {
    g
    s/ *\n *//g
    p
    }
    }
    }
    ' "$i" > "$i".crit

    In this example, I am removing line wraps so I can easily do a grep on the contents between a beginning and ending XML statement.

  17. Your examples will fail, take care of / and \/!
    Change
    s/<h2.*</h2>/No title here/g
    to
    s/<h2.*<\/h2>/No title here/g
    or
    s@<h2.*</h2>@No title here@g

    Andreas

  18. Thank you very much!

  19. I’ve bookmarked this because I found it interesting. I would be very interested to hear more news on this. Thanks!

  20. HELP

    I’m trying, without success, to delete a huge block of text between html tags (recursively in almost 1000 files)

    the text to be deleted starts at the first line after the ‘body’ tag

    till the last line before the ‘div id=”page_content”‘

    (without the ‘ and with > and <)
    anyone can tell me how to do it?

    thanks in advance

    O: -)

17 Trackbacks

  • [...] if I wanted to wipe everything above that and substitute some include script? I’d use sed, [...]

  • [...] Replacing text across multiple lines with sed [...]

  • [...] Hi, I found an example how to do multiline search and replace. I try to make this working to my needs but with no success [...]

  • SED behavior on May 14, 2010 at 8:46 am

    [...] for e.g. text="Title Linux kernel UUID=asdlkkjaljdhakjsdhakjsd blabla init blabla " (I was inspired here) – but I found problems. I had to replace new line characters on end of lines. So in real it is not [...]

  • [...] sed -n ' # if the first line copy the pattern to the hold buffer 1h # if not the first line then append the pattern to the hold buffer 1!H # if the last line then … $ { # copy from the hold to the pattern buffer g # do the search and replace s/([[:alnum:]_]+):[[:blank:]]?function[[:blank:]]?1?(([[:alnum:]_[:blank:],]*)) {(([^}])*)}/1: (function 1(2) {3})/g # print p } ' file reference: http://austinmatzko.com/2008/04/26/s…h-and-replace/ [...]

  • [...] This post was mentioned on Twitter by Thiago Leite, Thiago Leite. Thiago Leite said: http://austinmatzko.com/2008/04/26/sed-multi-line-search-and-replace/ [...]

  • Refining Linux on December 19, 2010 at 6:01 pm

    #20: Multi-line sed search and replace…

    This article is part of the 2010 Advent calendar series “24 Short Linux Hints”. This series focuses on little (and sometimes longer) tricks, tips and hints to solve common problems and really improve your workflow. sed is built to process strings (eith…

  • [...] This posting from austin matzko ‘s blog looked really useful, but ultimately The Grymoire awed me with its comprehensiveness and clarity, in this case, for all things sed. [...]

  • [...] sed command tells to put a space and remove the line break. using the multiline search and replace method The second just gets rid of the leading white space at the beginning of line) [...]

  • [...] upper case in the snippet above. Line 24 of the script, which I gratefully copied and adapted from Austin Matzko’s blog, is the sed way to find and replace text patterns over two consecutive lines. Running the script [...]

  • tweak linux by masterofthewind - Pearltrees on January 13, 2012 at 5:10 pm

    [...] mv -f $ 1 .tmp $outputfile ; sed and Multi-Line Search and Replace – Austin Matzko's Blog [...]

  • Bash/CLI by dchirila - Pearltrees on January 13, 2012 at 5:13 pm

    [...] sed and Multi-Line Search and Replace – Austin Matzko's Blog [...]

  • [...] with an empty string and otherwise leave the file alone. The simplest way I've found so far (using sed multi-line search and replace) is [...]

  • [...] a way to do this with sed? Keep in mind the file is very large (over 50 million lines). I found this blog with a multi-line search I could try, but if I’m understanding it correctly, this would [...]

  • [...] it operates one line at a time. The only decent technique I’ve ever seen to do this is this one, which involves storing the entire file in sed’s hold buffer and then operating on it all at [...]