March 31st, 2006
PHP – parse a string between two strings
This is a handy little function to extract the string between two specified pieces of text. It could be used to parse XML text, bbCode, or any other delimited code/text for that matter.
function get_string_between($string, $start, $end){
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
$fullstring = "this is my [tag]dog[/tag]";
$parsed = get_string_between($fullstring, "[tag]", "[/tag]");
echo $parsed; // (result = dog)


on September 20th, 2006 at 7:51 am
Thanks for this fine piece of work, I was exactly searching for this.
on November 13th, 2006 at 8:03 pm
This is a life saver!
on December 15th, 2006 at 9:05 am
This is a nice little function that I'm using to parse data from emails.
on December 30th, 2006 at 9:12 am
It's a nice function, thanks!
on January 19th, 2007 at 1:16 pm
solved it perfectly. thanks.
on January 19th, 2007 at 1:29 pm
Glad it helped. You can always consider the $2 donation as mentioned above
on January 27th, 2007 at 8:04 pm
Great script…though, what if the string contains multiple instances of the same tag…a bold for example?
on January 28th, 2007 at 11:12 am
It will only parse the first instance. To handle that case, add an extra, optional parameter (int) giving the character position at which the search should start.
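A minimal sketch of that suggestion, assuming an optional $offset parameter is added after $end. The function name get_string_between_from and the parameter name are hypothetical, not part of the original post. It also checks for false explicitly instead of padding the string with a space:

```php
<?php
// Sketch only: get_string_between_from() and $offset are illustrative names.
function get_string_between_from($string, $start, $end, $offset = 0) {
    $ini = strpos($string, $start, $offset);
    if ($ini === false) return "";      // start delimiter not found
    $ini += strlen($start);             // move past the start delimiter
    $endPos = strpos($string, $end, $ini);
    if ($endPos === false) return "";   // end delimiter not found
    return substr($string, $ini, $endPos - $ini);
}

$text = "[b]one[/b] and [b]two[/b]";
echo get_string_between_from($text, "[b]", "[/b]"), "\n";     // one
echo get_string_between_from($text, "[b]", "[/b]", 1), "\n";  // two
```

Passing an offset just past the previous match is what lets a caller walk through repeated tags one at a time.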
on February 9th, 2007 at 12:50 am
Great script. It helps me a lot.
on May 25th, 2007 at 12:16 pm
Hi, I was wondering how to parse this string: "dog+cat+mouse+fish"
so that the result would be
"dogcatmousefish"
Thanks in advance.
on May 25th, 2007 at 12:25 pm
I think you'd have to use eregi for that
on May 26th, 2007 at 10:00 am
str_replace("+","","dog+cat+mouse+fish") = "dogcatmousefish"
on July 14th, 2007 at 8:46 am
Life saver.
on November 9th, 2007 at 10:11 pm
whoop! tanks!
on June 7th, 2008 at 11:45 am
I haven't tried this yet, but it looks like this function wouldn't work if $start appears at the very beginning of $string: strpos($string,$start) would return "0", which would make the function react as though strpos($string,$start) were "false" (ie, boolean 0), & thus the function would return "".
Perhaps using the === operator would solve this:
if ($ini === false) return ""; would not return early if strpos($string,$start) returned "0" (as in "position zero"), but would return "" if strpos($string,$start) returned false (ie, boolean zero).
Does this make sense? (I'll need to give it a try to know for sure.)
I hope this helps.
Shane1010
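A short illustration of the difference being discussed, using only strpos() and the two comparison operators. The sample string and variable names are made up for the example:

```php
<?php
$haystack = "[tag]dog[/tag]";

$pos     = strpos($haystack, "[tag]"); // int(0): found at the very beginning
$missing = strpos($haystack, "[x]");   // bool(false): not found at all

var_dump($pos == 0);          // bool(true)
var_dump($missing == 0);      // bool(true)  - loose comparison cannot tell the two apart
var_dump($pos === false);     // bool(false)
var_dump($missing === false); // bool(true)  - strict comparison can
```

The original function sidesteps this by prepending a space, so a real match can never land on position 0. Comparing against false with === makes that padding unnecessary.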
on July 17th, 2008 at 10:53 am
super .) this is exactly what i need. thanks
on September 20th, 2008 at 5:18 pm
Awesome–just what I was looking for, thanks mucho!
on September 23rd, 2008 at 3:04 am
Hi,
suppose if i want to find the string between &
how can i do?
on November 1st, 2008 at 3:31 pm
This helped a lot, I was trying to parse some web page and this helped thanks !
on November 5th, 2008 at 7:29 am
I've written some code to get every string between two tags in a given string.
I only tested it a little, so take it as it is… it works perfectly for my requirement.
function get_all_strings_between($string,$start,$end)
{
//Returns an array of all values which are between two tags in a set of data
$strings = array();
$startPos = 0;
$i = 0;
//echo strlen($string)."\n";
while($startPos < strlen($string) && $matched = get_string_between(substr($string,$startPos),$start,$end))
{
if ($matched == null || $matched[1] == null || $matched[1] == '') break;
$startPos = $matched[0]+$startPos+1;
array_push($strings,$matched[1]);
$i++;
}
return $strings;
}
function get_string_between($string, $start, $end){
//$string = " ".$string;
$ini = strpos($string,$start);
if ($ini === false) return null; // 0 is a valid match position, only false means "not found"
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return array($ini+$len,substr($string,$ini,$len));
}
on November 29th, 2008 at 3:34 pm
This function should be available in PHP without including.
Good job!
on December 2nd, 2008 at 11:13 pm
FREAKIN' GENIUS!
Just had to say that. :~)
on January 10th, 2009 at 1:09 am
It does not appear that this will work with a string such as this:
[tag]dog[tag]pet[/tag][/tag]
$parsed = get_string_between($fullstring, "[tag]", "[/tag]");
will return "dog[tag]pet" and not "dog[tag]pet[/tag]" as one should expect.
Also, your use of "if ($ini == 0)…" to evaluate the strpos result is incorrect. I assume that is why you prepend a space to the string at the start of your function. The correct way to do this would be:
//REMOVE THIS… $string = " ".$string;
$ini = strpos($string,$start);
//CHANGE THIS… if ($ini == 0) return "";
if ($ini === false) return "";
Here is a revised version of this function that handles embedded tags.
<?php
function get_string_between($string, $start, $end){
//Calculate the length of the start and end tags
$lenStart = strlen($start);
$lenEnd = strlen($end);
$startTag = strpos($string, $start);
///If there is no initial match to the $start string, return an empty string
if ($startTag === false) return "";
//Calculate the start tag position and the first end tag position
$strStart = $startTag + $lenStart;
$strEnd = strpos($string, $end);
//Set a counter for the tags
$tagCount = 0;
//Use $test to see if there is another $start string after the first, but before the $strEnd position
$test = strpos($string, $start, $strStart);
//Use this while loop to check if there are other matching tags
while($test !== false && $strEnd > $test) {
$tagCount ++;
$next = $test + $lenStart;
$test = strpos($string, $start, $next);
}
//If there is more than one tag, calculate the new end tag position
if ($tagCount) {
for($i = 0; $i < $tagCount; $i++) {
$strEnd = strpos($string, $end, $strEnd + $lenEnd);
}
}
return substr($string, $strStart, ($strEnd - $strStart));
}
$string = "I say, [tag]this is my [tag] animal [tag] cat [tag] tabby [/tag][/tag][/tag][/tag]";
//Use a while loop to parse until there are no matching tags left
while(($parsed = get_string_between($string, "[tag]", "[/tag]")) != "") {
$string = $parsed;
echo "\"".$parsed."\"<br>";
}
echo "DONE";
?>
Hope this helps some of you…
on January 20th, 2009 at 3:27 pm
A small modification so you don't have to pad the string:
function getBetween($string, $start, $end) {
$ini = strpos($string,$start);
if ($ini === false) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
on February 14th, 2009 at 1:41 pm
Dear all,
I have used this get_string_between function for a long time,
and then I found Luc's function (get_all_strings_between), which was a big help.
But after testing get_all_strings_between against 4740 matches I realized something very important:
getting 4740 matches using Luc's function took 1.1274 seconds (that's huge),
but a very simple regexp function that gets all the strings took only 0.0504 seconds.
The new function is:
function get_string_between($string, $start, $end){
preg_match_all( "/$start(.*)$end/U", $string, $match );
return $match[1];
}
And yes, just like that.
But you must remember to escape the start and end strings, e.g.
if you want to get all strings between (+( and )+)
you have to call:
get_all_strings_between($text, "\(\+\(", "\)\+\)");
instead of:
get_all_strings_between($text, "(+(", ")+)");
and you will save more than one second when processing very big text.
Hope that was useful.
on February 14th, 2009 at 1:42 pm
Sorry the new function is:
function get_all_strings_between($string, $start, $end){
preg_match_all( "/$start(.*)$end/U", $string, $match );
return $match[1];
}
I forgot the _all_ in the function name.
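One way to avoid escaping the delimiters by hand is to let preg_quote() do it inside the function. This is only a sketch built on the regexp version above. The name get_all_strings_between_quoted is hypothetical:

```php
<?php
// Sketch only: the function name is illustrative, not from the thread.
function get_all_strings_between_quoted($string, $start, $end) {
    // preg_quote() escapes regex metacharacters; the second argument
    // also escapes the "/" used as the pattern delimiter.
    $pattern = '/' . preg_quote($start, '/') . '(.*)' . preg_quote($end, '/') . '/U';
    preg_match_all($pattern, $string, $match);
    return $match[1];
}

print_r(get_all_strings_between_quoted("a (+(one)+) b (+(two)+)", "(+(", ")+)"));
// Array ( [0] => one [1] => two )
```

With this variant the caller passes "(+(" and ")+)" literally, at the cost of no longer being able to pass a regular expression as a delimiter.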
on April 22nd, 2009 at 4:40 am
Thanks, it's a real life saver.
on May 22nd, 2009 at 3:59 am
Thank you very much!
on June 13th, 2009 at 7:29 pm
[...] looking in the internet for a good function to strip a string, I found this blog where this function is [...]
on June 15th, 2009 at 11:38 pm
Thank you very much for this code.!!
on June 24th, 2009 at 1:01 am
Hi !
It's PERFECT !!!
Thank YOU !!
on June 28th, 2009 at 7:29 am
Exactly what I was looking for. Simple but very effective.
Thanks.
on August 3rd, 2009 at 11:47 pm
Nice Man… It Worked
on September 19th, 2009 at 1:52 pm
Thank you very much for sharing. I expect to use this over and over and over and over and over and over…
Looking forward to viewing the rest of your site for more gems like this.
on October 23rd, 2009 at 4:55 am
Hello people!
This code works great! thanks
However, when I try to find the strings between "[[" and "|", the last match does not come back properly.
For eg :
Say $string = " B comes [[afterA|just an example]] and C comes [[afterB|just an example]]";
$start = "[["
$end = "|"
In this case , following is the string pattern i am supposed to get :
1)afterA
2)afterB
But what i get is
1)afterA
2)just an example]]";
Can someone please help me identify the problem?
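The comment does not say which variant of the function was used, so the cause cannot be pinned down from here. As a point of comparison, a plain strpos() loop that resumes the search just after each end delimiter gives the expected two results for this input. The name collect_between is hypothetical:

```php
<?php
// Sketch only: collect_between() is an illustrative name.
function collect_between($string, $start, $end) {
    $results = array();
    $offset = 0;
    while (($ini = strpos($string, $start, $offset)) !== false) {
        $ini += strlen($start);                 // first character after $start
        $endPos = strpos($string, $end, $ini);
        if ($endPos === false) break;           // no closing delimiter left
        $results[] = substr($string, $ini, $endPos - $ini);
        $offset = $endPos + strlen($end);       // continue after this match
    }
    return $results;
}

$string = " B comes [[afterA|just an example]] and C comes [[afterB|just an example]]";
print_r(collect_between($string, "[[", "|"));
// Array ( [0] => afterA [1] => afterB )
```

Note that "[[" and "|" are both regex metacharacters, so the regexp variant posted earlier would need them escaped before it could be used for this input.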
on October 24th, 2009 at 1:29 pm
You have no idea. This is just what the doctor ordered. works well in filtering and parsing data out of an xml response. u b d man
on October 29th, 2009 at 2:57 pm
Great piece of code. Exactly what I needed.
on November 21st, 2009 at 4:13 pm
I'm working with Bassel Safadi's function from February 14, 2009 but having problems when there are additional (different) HTML tags between my start and end tags.
I am extracting title and meta description elements out of a file. To use this function I create my own custom tags ( and ) and place these around my other tags like so:
This is my site
When I run the function on this, it fails to return the stuff in between, and I believe it's because the stuff in between contains slashes as part of the other tags, which I want to keep. Is there some way to escape embedded slashes within the (.*) bit? Many thanks.
on November 21st, 2009 at 4:16 pm
Oops, can't use angle brackets. Let me try that again.
My custom tags are [desctag] and [/desctag] (replace [ and ] with angle brackets).
Where it reads "This is my site" it should be:
[desctag]
[title]This is my site[/title]
[meta name="description" content="some info about my site"]
[/desctag]
on November 21st, 2009 at 8:32 pm
I had some troubles using Bassel Safadi's function from February 14, 2009 but I realized the problem was newline characters. I had my own custom HTML-like tags surrounding a block of standard HTML tags which I wanted to pass as a variable into a template. If the tags were on separate lines, the function failed. If they are all on one line, it works, though I do have to use:
return $match[1][0];
to get the correct output. Anyone know how to make this work when the enclosed block of text/tags contains newlines?
on November 23rd, 2009 at 8:00 pm
A friend pointed me to a regexp help site which offered this gem. Change the capital U in the following to a small s.
preg_match_all( "/$start(.*)$end/U", $string, $match );
Then there can be multiple lines of text or code between my custom tags. Now my input and output look "normal" and I can extract any chunk of text data I like from my content file.
on December 13th, 2009 at 3:41 pm
Thank you sooo much! This was a huge life saver for me. Here I was trying to use preg_match and preg_replace…which worked, but this is entirely way more reliable.
Thanks again!
on January 16th, 2010 at 11:44 am
thanks a lot…..i was searching for this kind of code………
on February 13th, 2010 at 10:44 am
I have the same problem as D.eitner but using
preg_match_all( "/$start(.*)$end/s", $string, $match );
like he suggested does not work for me. I only get one result, which includes everything from the first $start to the last $end.
on February 14th, 2010 at 6:14 am
Solution for my problem:
preg_match_all( "/$start(.*)$end/Us", $string, $match );
As I learned today, the so-called modifiers (/U, /s, etc.) can be combined rather than used only one at a time, and that is the solution. /s lets the dot also match newline characters, so a match can span several lines, and /U makes the quantifiers ungreedy, so the parser returns several short matches instead of the single longest one.
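To see the effect of each modifier side by side, here is a small comparison on a made-up input with a hypothetical custom tag. It uses only preg_match_all() and the /s and /U modifiers discussed above:

```php
<?php
// Sample input: the first block spans two lines, the second does not.
$text = "<x>one\nline</x> and <x>two</x>";

preg_match_all('/<x>(.*)<\/x>/',   $text, $plain);  // greedy, "." stops at newlines
preg_match_all('/<x>(.*)<\/x>/s',  $text, $dotall); // greedy, "." also matches newlines
preg_match_all('/<x>(.*)<\/x>/Us', $text, $both);   // ungreedy and dot-all

print_r($plain[1]);  // only "two": the multi-line block is missed
print_r($dotall[1]); // one long match from the first <x> to the last </x>
print_r($both[1]);   // "one\nline" and "two"
```

Writing (.*?) instead of (.*) has the same ungreedy effect for that one quantifier without the /U modifier.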
on June 3rd, 2010 at 1:23 pm
Fantastic! This was exactly what I was looking for. Thanks!
on June 4th, 2010 at 7:09 pm
function string_between($string, $start, $end){
preg_match($end,$string,$match);
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$match[0],$ini) - $ini;
return $start.substr($string,$ini,$len).$match[0];
}
on October 6th, 2010 at 9:43 am
found this option as well:
http://www.webhostingtalk.com/archive/index.php/t-562850.html
<?php
# Single-character limiters (assumed; the +1/-1 arithmetic in step 3 relies on this)
$start_limiter = '<';
$end_limiter = '>';
$haystack = "Hello <there> world!";
# Step 1. Find the start limiter's position
$start_pos = strpos($haystack,$start_limiter);
if ($start_pos === FALSE)
{
die("Starting limiter ".$start_limiter." not found in ".$haystack);
}
# Step 2. Find the ending limiter's position, searching from the start position
$end_pos = strpos($haystack,$end_limiter,$start_pos);
if ($end_pos === FALSE)
{
die("Ending limiter ".$end_limiter." not found in ".$haystack);
}
# Step 3. Extract the string between the starting position and ending position
# Our starting point is the position of the start limiter. To find the length we take
# the position of our end limiter and subtract the position of the start limiter,
# thus giving us the length of our needle.
# We must add 1 to the start position, since it includes our limiter, and we must subtract 1 from the end position
$needle = substr($haystack, $start_pos+1, ($end_pos-1)-$start_pos);
echo "Found $needle between $start_limiter and $end_limiter in $haystack";
?>
When you run this code, you'll get this:
Found there between < and > in Hello <there> world!
You can package that into a function. Since you are just doing simple searches, it's better to avoid the overhead of regular expressions. Use regular expressions when your needle is complex, for example to find all phone numbers in a paragraph.
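A sketch of what that packaging might look like. The function name find_between is hypothetical, and two choices here differ from the script above: it returns null instead of calling die(), and it skips the start limiter with strlen() so that limiters longer than one character also work:

```php
<?php
// Sketch only: find_between() is an illustrative name, not from the linked thread.
function find_between($haystack, $start_limiter, $end_limiter) {
    $start_pos = strpos($haystack, $start_limiter);
    if ($start_pos === false) return null;      // start limiter missing
    $start_pos += strlen($start_limiter);       // skip the whole start limiter
    $end_pos = strpos($haystack, $end_limiter, $start_pos);
    if ($end_pos === false) return null;        // end limiter missing
    return substr($haystack, $start_pos, $end_pos - $start_pos);
}

var_dump(find_between("Hello [there] world!", "[", "]")); // string(5) "there"
var_dump(find_between("Hello world!", "[", "]"));         // NULL
```

Returning null lets the caller tell "no match" apart from an empty string between two adjacent limiters.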
on November 8th, 2010 at 7:15 am
There is a slight bug with this function which I had problems with when I was looping through arrays and looking for content between two spans.
For some reason, despite having content between $start and $end, NULL results were being randomly returned. I managed to fix this by replacing
if ($ini == 0) return ""; with if ($ini === false) return null;
Here is the solution:
function get_string_between($string, $start, $end) {
$ini = strpos($string,$start);
if ($ini === false) return null;
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return array($ini+$len,substr($string,$ini,$len));
}
on December 6th, 2010 at 3:24 am
Thanks… it worked awesome for me… I like that it worked for the same exact text, so I used ^^ for both the start and stop, and it worked out fine.
Thank you so so so so much.
on February 17th, 2011 at 12:31 am
Thank you so much! I have tried doing this myself but just got a headache, and your function is the only one that works for multiple things in one string that I've found. Thank you!!
on April 20th, 2011 at 4:26 pm
Hey Justin
Wanted to compliment you on your header graphic ..Nice!
I wouldn't include the "complete waste of time" alternate text though.
You never know who is coming here and how they will take it.
Only my opinion.
Here at your site looking at some PHP code. Thanks for the help.
Dan
on April 26th, 2011 at 7:54 am
This is an optimized version that also works without the $end argument (it then returns the content that follows $start):
/// BEGIN CODE /////////////////////////////////////
function GetBetween($content,$start,$end = ""){
$r = explode($start, $content);
if (isset($r[1])){
if (!empty($end)){
$r = explode($end, $r[1]);
return $r[0];
}else{
return $r[1];
}
}
return "";
}
/// END CODE ///////////////////////////////////////
Regards,
Mat
on June 22nd, 2011 at 8:10 am
This is great for tag parsing; thanks.
on July 27th, 2011 at 8:36 am
I could kiss you on the lips. (not really, but THANKS!!!)
on August 5th, 2011 at 8:23 am
Thanks. this works like charm for me.
on September 8th, 2011 at 5:14 am
Simply Super BBBBBBBB
on October 7th, 2011 at 1:23 am
You rock!! Thank you, thank u, thanku!!
on December 3rd, 2011 at 9:49 am
I was writing a php based search engine script for my site and wanted to:
1. fetch the content of a page
2. find an occurrence of the searched string
3. store the url and some text around the found string
4. get all links on that page and
5. loop through these links and repeat steps 1-4.
I won't write the whole script, only my extended version of the default function that Justin has put up. So here it is:
- added a while loop which cycles through the whole string (could be a whole HTML page)
- added a $limit parameter to control how many occurrences I want to fetch
- added an extra parameter $distinct which lets me skip strings that have already been stored in the results array, which is nice for duplicate URLs
NOTE: I am sure this function could be improved, so don't hesitate to do so
To test it copy this code:
function get_string_between($string, $start, $end, $limit, $distinct=true)
{
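$result=array();
$num_of_occurences=0;
$new_start=0; // search offset; must be initialised before the first strpos() call
while(($ini = strpos($string,$start, $new_start))!==FALSE)
{
$ini += strlen($start);
$new_start = strpos($string,$end,$ini);
if($new_start===FALSE) break; // no closing delimiter left, stop searching
$len=$new_start - $ini;
$new_start+= strlen($end);
$found_string=substr($string,$ini,$len);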
if($distinct)
{
if(!in_array($found_string, $result))
{
$result[]=$found_string;
$num_of_occurences++;
}
}
else
{
$result[]=$found_string;
$num_of_occurences++;
}
if($limit==$num_of_occurences && $limit!=0)
break;
}
return $result;
}
$content = file_get_contents('http://www.selectif.si/');
$result = get_string_between($content, "href=\"", "\"", 0, true);
echo "<pre>";
print_r($result);
echo "</pre>";