March 31st, 2006

PHP – parse a string between two strings

This is a handy little function to extract the string that sits between two specified pieces of text. It could be used to parse XML text, bbCode, or any other delimited code/text for that matter.


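function get_string_between($string, $start, $end){
	// Pad with one space so a match at the very start lands at index 1, not 0
	$string = " ".$string;
	$ini = strpos($string,$start);
	// After padding, a real match is at index 1 or later, so 0 here can only be false (not found)
	if ($ini == 0) return "";
	$ini += strlen($start);
	$end_pos = strpos($string,$end,$ini);
	// No closing delimiter after the start: return "" instead of a stray substring
	if ($end_pos === false) return "";
	return substr($string,$ini,$end_pos - $ini);
}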

$fullstring = "this is my [tag]dog[/tag]";
$parsed = get_string_between($fullstring, "[tag]", "[/tag]");

echo $parsed; // (result = dog)


44 Responses to ' PHP – parse a string between two strings '


  1. Bashar said,

    on September 20th, 2006 at 7:51 am

    Thanks for this fine piece of work, I was exactly searching for this.

  2. bcb206 said,

    on November 13th, 2006 at 8:03 pm

    This is a life saver!

  3. Slick said,

    on December 15th, 2006 at 9:05 am

    This is a nice little function that I'm using to parse data from emails.

  4. Alex said,

    on December 30th, 2006 at 9:12 am

    It's a nice function, thanks!

  5. thanks said,

    on January 19th, 2007 at 1:16 pm

    solved it perfectly. thanks.

  6. Justin Cook said,

    on January 19th, 2007 at 1:29 pm

    Glad it helped. You can always consider the $2 donation as mentioned above ;)

  7. Brian G said,

    on January 27th, 2007 at 8:04 pm

    Great script…though, what if the string contains multiple instances of the same tag…a bold for example?

  8. Justin Cook said,

    on January 28th, 2007 at 11:12 am

    It will just parse the first instance. To handle that case, add a fourth, optional parameter (int) that sets the character position where the search starts.
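
One way the extra parameter could look, as a minimal sketch rather than the author's own code: `get_string_between_from` and `$offset` are hypothetical names introduced here, and the strict `=== false` checks replace the padding trick used in the post.

```php
<?php
// Hypothetical variant: $offset is an assumed extra parameter, not part of the original function.
function get_string_between_from($string, $start, $end, $offset = 0) {
    $ini = strpos($string, $start, $offset);
    if ($ini === false) return "";            // no opening delimiter at or after $offset
    $ini += strlen($start);
    $stop = strpos($string, $end, $ini);
    if ($stop === false) return "";           // no closing delimiter after the opening one
    return substr($string, $ini, $stop - $ini);
}

$text = "<b>one</b> and <b>two</b>";
$first = get_string_between_from($text, "<b>", "</b>");
// Resume the search just past the first closing tag
$next = strpos($text, "</b>") + strlen("</b>");
$second = get_string_between_from($text, "<b>", "</b>", $next);
echo $first . ", " . $second; // one, two
```

The caller has to work out where to resume, which is why a loop that tracks the position itself is usually more convenient for collecting every match.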

  9. Haris N H said,

    on February 9th, 2007 at 12:50 am

    Great script. It helps me a lot.

  10. George Arkouzis said,

    on May 25th, 2007 at 12:16 pm

    Hi, I was wondering how to parse this string: "dog+cat+mouse+fish"
    so that the result would be
    "dogcatmousefish"
    Thanks in advance.

  11. Justin Cook said,

    on May 25th, 2007 at 12:25 pm

    I think you'd have to use eregi for that

  12. peter said,

    on May 26th, 2007 at 10:00 am

    str_replace("+","","dog+cat+mouse+fish") = "dogcatmousefish"

  13. Pimm said,

    on July 14th, 2007 at 8:46 am

    Life saver.

  14. sidsevensix said,

    on November 9th, 2007 at 10:11 pm

    whoop! tanks!

  15. Shane10101 said,

    on June 7th, 2008 at 11:45 am

    I haven't tried this yet, but it looks like this function would break without the leading-space padding if $start appears at the very beginning of $string: strpos($string,$start) would return 0, and the loose comparison $ini == 0 cannot tell that apart from false (not found), so the function would return "".

    Perhaps using the === operator would solve this:

    if ($ini === false) return ""; would not trigger when strpos($string,$start) returns 0 (as in "position zero"), but would trigger when strpos($string,$start) returns false (no match).

    Does this make sense? (I'll need to give it a try to know for sure.)

    I hope this helps.

    Shane1010
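
A quick way to check how strpos results behave under loose and strict comparison, offered as an illustrative sketch (the variable names are made up for this example):

```php
<?php
$at_start = strpos("[tag]dog[/tag]", "[tag]"); // int 0: found at position zero
$missing  = strpos("no tags here", "[tag]");   // bool false: not found

var_dump($at_start == 0);      // true
var_dump($missing == 0);       // true  (loose comparison cannot tell the two apart)
var_dump($at_start === false); // false
var_dump($missing === false);  // true  (strict comparison can)
```

This is why the usual test for "not found" is a strict comparison against false.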

  16. akum said,

    on July 17th, 2008 at 10:53 am

    super .) this is exactly what i need. thanks

  17. Matt R said,

    on September 20th, 2008 at 5:18 pm

    Awesome–just what I was looking for, thanks mucho!

  18. scvinodkumar said,

    on September 23rd, 2008 at 3:04 am

    Hi,

    suppose if i want to find the string between &

    how can i do?

  19. Vivek said,

    on November 1st, 2008 at 3:31 pm

    This helped a lot, I was trying to parse some web page and this helped thanks !

  20. Luc said,

    on November 5th, 2008 at 7:29 am

    I've written some code to get all the strings between two tags in a given string.

    I tested it a little, so take it as it is… it works perfect for my requirement.


    function get_all_strings_between($string,$start,$end)
    {
        //Returns an array of all values which are between two tags in a set of data
        $strings = array();
        $startPos = 0;
        while ($startPos < strlen($string)) {
            $matched = get_string_between(substr($string,$startPos),$start,$end);
            if ($matched === null) break;
            //$matched[0] is relative to the substring; resume after the closing tag
            $startPos += $matched[0] + strlen($end);
            array_push($strings,$matched[1]);
        }
        return $strings;
    }

    function get_string_between($string, $start, $end){
        //Returns array(position of the closing tag, text between the tags), or null if there is no match
        $ini = strpos($string,$start);
        if ($ini === false) return null;
        $ini += strlen($start);
        $endPos = strpos($string,$end,$ini);
        if ($endPos === false) return null;
        return array($endPos,substr($string,$ini,$endPos - $ini));
    }

  21. clonejo said,

    on November 29th, 2008 at 3:34 pm

    This function should be available in PHP without including.

    Good job!

  22. roni said,

    on December 2nd, 2008 at 11:13 pm

    FREAKIN' GENIUS!
    Just had to say that. :~)

  23. dbemowsk said,

    on January 10th, 2009 at 1:09 am

    It does not appear that this will work with a string such as this:

    [tag]dog[tag]pet[/tag][/tag]

    $parsed = get_string_between($fullstring, "[tag]", "[/tag]");

    will return "dog[tag]pet" and not "dog[tag]pet[/tag]" as one should expect.

    Also, your use of "if ($ini == 0)…" to evaluate the strpos is incorrect. I am assuming that is why you start by adding a space to the beginning of your string in the beginning of your function. The correct way to do this would be:

    //REMOVE THIS… $string = " ".$string;
    $ini = strpos($string,$start);
    //CHANGE THIS… if ($ini == 0) return "";
    if ($ini === false) return "";

    Here is a revised version of this function that handles embedded tags.

    <?php

    function get_string_between($string, $start, $end){
        //Calculate the length of the start and end tags
        $lenStart = strlen($start);
        $lenEnd = strlen($end);
        $startTag = strpos($string, $start);
        //If there is no initial match to the $start string, return an empty string
        if ($startTag === false) return "";
        //Calculate the start tag position and the first end tag position after it
        $strStart = $startTag + $lenStart;
        $strEnd = strpos($string, $end, $strStart);
        //If there is no end tag after the start tag, return an empty string
        if ($strEnd === false) return "";
        //Set a counter for the tags
        $tagCount = 0;
        //Use $test to see if there is another $start string after the first, but before the $strEnd position
        $test = strpos($string, $start, $strStart);
        //Use this while loop to check if there are other matching tags
        while($test !== false && $strEnd > $test) {
            $tagCount++;
            $next = $test + $lenStart;
            $test = strpos($string, $start, $next);
        }
        //For each extra start tag, move on to the next end tag
        for($i = 0; $i < $tagCount; $i++) {
            $strEnd = strpos($string, $end, $strEnd + $lenEnd);
            //Unbalanced tags: give up rather than return a stray substring
            if ($strEnd === false) return "";
        }
        return substr($string, $strStart, ($strEnd - $strStart));
    }

    $string = "I say, [tag]this is my [tag] animal [tag] cat [tag] tabby [/tag][/tag][/tag][/tag]";
    //Use a while loop to parse until there are no matching tags left
    while(($parsed = get_string_between($string, "[tag]", "[/tag]")) != "") {
        $string = $parsed;
        echo "\"".$parsed."\"<br>";
    }
    echo "DONE";

    ?>

    Hope this helps some of you…
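
As far as I can tell, counting the start tags that appear before the first end tag works for purely nested input like the example, but it can stop too early when a nested tag is followed by a sibling, as in "[tag]a[tag]b[/tag]c[tag]d[/tag]e[/tag]". An alternative, sketched here under the assumption that tags are well formed, is to track nesting depth while walking the string. `get_balanced_between` is a hypothetical name, not part of the comment above.

```php
<?php
// Hypothetical helper (name and approach are assumptions, not the commenter's code):
// walk the string, tracking nesting depth, and stop at the closing tag that
// brings the depth back to zero.
function get_balanced_between($string, $start, $end) {
    if ($start === "" || $end === "") return ""; // empty delimiters would never advance
    $open = strpos($string, $start);
    if ($open === false) return "";
    $contentStart = $open + strlen($start);
    $pos = $contentStart;
    $depth = 1;
    while (true) {
        $nextEnd = strpos($string, $end, $pos);
        if ($nextEnd === false) return "";       // unbalanced: no matching close
        $nextStart = strpos($string, $start, $pos);
        if ($nextStart !== false && $nextStart < $nextEnd) {
            $depth++;                             // another opening tag comes first
            $pos = $nextStart + strlen($start);
        } else {
            $depth--;                             // a closing tag comes first
            if ($depth === 0) {
                return substr($string, $contentStart, $nextEnd - $contentStart);
            }
            $pos = $nextEnd + strlen($end);
        }
    }
}

echo get_balanced_between("[tag]dog[tag]pet[/tag][/tag]", "[tag]", "[/tag]");
// dog[tag]pet[/tag]
```

For anything more irregular than simple paired tags, a real parser is likely a better fit than string searching.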

  24. Dave said,

    on January 20th, 2009 at 3:27 pm

    A small modification so you don't have to pad the string:

    function getBetween($string, $start, $end) {
        $ini = strpos($string,$start);
        if ($ini === false) return "";
        $ini += strlen($start);
        $len = strpos($string,$end,$ini) - $ini;
        return substr($string,$ini,$len);
    }


  25. Bassel Safadi said,

    on February 14th, 2009 at 1:41 pm

    Dear all,
    I used this get_string_between function for a long time,
    then I found Luc's function (get_all_strings_between), which was a big deal for me.
    But after testing get_all_strings_between against 4740 matches I realized something very important:

    Getting 4740 matches using Luc's function took 1.1274 seconds (that's huge),

    but using another very simple regexp function to get all the strings took only 0.0504 seconds.

    The new function is:

    function get_string_between($string, $start, $end){
    preg_match_all( "/$start(.*)$end/U", $string, $match );
    return $match[1];
    }

    And yes, it is just like that.

    But you must remember to escape the start and end strings. For example:

    if you want to get all strings between (+( and )+)
    you have to write:

    get_all_strings_between($text, "\(\+\(", "\)\+\)");

    instead of:

    get_all_strings_between($text, "(+(", ")+)");

    and you will save more than a second when processing very big text.

    Hope that was useful.
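
The manual escaping can be avoided with PHP's built-in preg_quote. The following is a minimal sketch, not the commenter's function: the name `get_all_strings_between_quoted` is made up here, and it uses a lazy `.*?` plus the `s` modifier in place of `/U`.

```php
<?php
// Hypothetical helper: the name and the '/' delimiter choice are assumptions.
function get_all_strings_between_quoted($string, $start, $end) {
    // preg_quote escapes regex metacharacters; its second argument also
    // escapes the pattern delimiter ('/') if it appears in $start or $end
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    preg_match_all($pattern, $string, $match);
    return $match[1]; // array of captured strings, empty if nothing matched
}

print_r(get_all_strings_between_quoted("a (+(one)+) b (+(two)+)", "(+(", ")+)"));
// Array ( [0] => one [1] => two )
```

Because the delimiter is escaped too, closing tags that contain a slash can be passed in unchanged.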


  26. Bassel Safadi said,

    on February 14th, 2009 at 1:42 pm

    Sorry, the new function is:

    function get_all_strings_between($string, $start, $end){
    preg_match_all( "/$start(.*)$end/U", $string, $match );
    return $match[1];
    }

    I forgot the _all_ in the function name ;-)

  27. shitij said,

    on April 22nd, 2009 at 4:40 am

    Thanks, it's a real life saver

  28. tobi_girst said,

    on May 22nd, 2009 at 3:59 am

    Thank you very much!


  29. on June 13th, 2009 at 7:29 pm

    [...] looking in the internet for a good function to strip a string, I found this blog where this function is [...]

  30. VIJESH said,

    on June 15th, 2009 at 11:38 pm

    Thank you very much for this code.!!

  31. PERSIA said,

    on June 24th, 2009 at 1:01 am

    Hi !
    It's PERFECT !!!
    Thank YOU !!

  32. Joel Webb said,

    on June 28th, 2009 at 7:29 am

    Exactly what I was looking for. Simple but very effective.

    Thanks.


  33. on July 25th, 2009 at 2:27 am

    [...] looking in the Internet for a good function to strip a string, I found this blog where this function is implemented. [code brush="php"] function get_string_between($string, $start, [...]


  34. on August 3rd, 2009 at 11:47 pm

    Nice Man… It Worked

  35. Mike said,

    on September 19th, 2009 at 1:52 pm

    Thank you very much for sharing. I expect to use this over and over and over and over and over and over…

    Looking forward to viewing the rest of your site for more gems like this.

  36. Preethi said,

    on October 23rd, 2009 at 4:55 am

    Hello people!

    This code works great! thanks

    However, when I try to find the strings between "[[" and "|", only the last match comes back wrong.

    For example:

    Say $string = " B comes [[afterA|just an example]] and C comes [[afterB|just an example]]";

    $start = "[[";
    $end = "|";

    In this case, these are the strings I am supposed to get:

    1) afterA
    2) afterB

    But what I get is:
    1) afterA
    2) just an example]]";

    Can someone please help me identify the problem?
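
The comment does not say which version of the function produced that output, so the cause cannot be pinned down from here. As a point of comparison, here is a minimal sketch of a loop that tracks its own search position. `collect_between` is a hypothetical name, not something from the post.

```php
<?php
// Hypothetical helper; the name is an assumption for this sketch.
function collect_between($string, $start, $end) {
    $found = array();
    if ($start === "" || $end === "") return $found; // empty delimiters would never advance
    $offset = 0;
    while (($ini = strpos($string, $start, $offset)) !== false) {
        $ini += strlen($start);
        $stop = strpos($string, $end, $ini);
        if ($stop === false) break;          // no closing delimiter left
        $found[] = substr($string, $ini, $stop - $ini);
        $offset = $stop + strlen($end);      // continue after this match
    }
    return $found;
}

$string = " B comes [[afterA|just an example]] and C comes [[afterB|just an example]]";
print_r(collect_between($string, "[[", "|"));
// Array ( [0] => afterA [1] => afterB )
```

The loop ends as soon as no further "[[" is found, so the trailing "just an example]]" is never returned as a match.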

  37. St Tosin said,

    on October 24th, 2009 at 1:29 pm

    You have no idea. This is just what the doctor ordered. works well in filtering and parsing data out of an xml response. u b d man

  38. Deepak said,

    on October 29th, 2009 at 2:57 pm

    Great piece of code. Exactly what I needed.

  39. D.Eitner said,

    on November 21st, 2009 at 4:13 pm

    I'm working with Bassel Safadi's function from February 14, 2009 but having problems when there are additional (different) HTML tags between my start and end tags.

    I am extracting title and meta description elements out of a file. To use this function I create my own custom tags ( and ) and place these around my other tags like so:

    This is my site

    When I run the function on this, it fails to return the stuff in between, and I believe it's because the stuff in between contains slashes as part of the other tags, which I want to keep. Is there some way to escape embedded slashes within the (.*) bit? Many thanks.

  40. D.Eitner said,

    on November 21st, 2009 at 4:16 pm

    Oops, can't use angle brackets. Let me try that again.

    My custom tags are [desctag] and [/desctag] (replace [ and ] with angle brackets).

    Where it reads "This is my site" it should be:

    [desctag]
    [title]This is my site[/title]
    [meta name="description" content="some info about my site"]
    [/desctag]

  41. D.Eitner said,

    on November 21st, 2009 at 8:32 pm

    I had some troubles using Bassel Safadi's function from February 14, 2009 but I realized the problem was newline characters. I had my own custom HTML-like tags surrounding a block of standard HTML tags which I wanted to pass as a variable into a template. If the tags were on separate lines, the function failed. If they are all on one line, it works, though I do have to use:

    return $match[1][0];

    to get the correct output. Anyone know how to make this work when the enclosed block of text/tags contains newlines?

  42. D.eitner said,

    on November 23rd, 2009 at 8:00 pm

    A friend pointed me to a regexp help site which offered this gem. Change the capital U in the following to a small s.

    preg_match_all( "/$start(.*)$end/U", $string, $match );

    Then there can be multiple lines of text or code between my custom tags. Now my input and output look "normal" and I can extract any chunk of text data I like from my content file.

  43. Benedikt said,

    on February 13th, 2010 at 10:44 am

    I have the same problem as D.eitner but using

    preg_match_all( "/$start(.*)$end/s", $string, $match );

    like he suggested does not work for me. I only get one result, which includes everything from the first $start to the last $end.

  44. Benedikt said,

    on February 14th, 2010 at 6:14 am

    Solution for my problem:
    preg_match_all( "/$start(.*)$end/Us", $string, $match );

    As I learned today, the so-called modifiers (/U, /s, etc.) can be combined rather than used only one at a time, and that is the solution. /s lets the dot match newline characters, so a match can span several lines, and /U makes the quantifier ungreedy, so the parser returns several short matches instead of the single longest one.
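
To see the effect of each modifier, here is a small sketch that runs the same pattern three ways. The sample text and variable names are made up for illustration, and the delimiters are escaped by hand.

```php
<?php
$text = "[t]one\nline[/t] and [t]two[/t]";

preg_match_all('/\[t\](.*)\[\/t\]/s',  $text, $greedy); // /s only
preg_match_all('/\[t\](.*)\[\/t\]/U',  $text, $nodot);  // /U only
preg_match_all('/\[t\](.*)\[\/t\]/Us', $text, $both);   // /U and /s together

print_r($greedy[1]); // one element running from the first [t] to the last [/t]
print_r($nodot[1]);  // only "two": the dot does not cross the newline
print_r($both[1]);   // "one\nline" and "two"
```

With /s alone the match is greedy, which reproduces the single oversized result described in comment 43.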
