//// extract links ////
function extract_links($text) {
    // Match every <a ... href=...>...</a>: group 1 is the href, group 2 the anchor text.
    preg_match_all('/<\s*a[^<>]*?href=[\'"]?([^\s<>\'"]*)[\'"]?[^<>]*>(.*?)<\/a>/si',
        $text,
        $match_array,
        PREG_SET_ORDER);
    $return = array();
    foreach ($match_array as $serp) {
        $href = $serp[1];
        $anchortext = $serp[2];
        // Keep links containing "http:" (https: is not matched) and skip
        // cache, Google, YouTube and Wikipedia URLs.
        if (preg_match('/http:/i', $href) &&
            !preg_match('/cache/i', $href) &&
            !preg_match('/google\.com/i', $href) &&
            !preg_match('/youtube\.com/i', $href) &&
            !preg_match('/wikipedia\.org/i', $href) &&
            $href[0] != '/') {
            $return[] = array($href, $anchortext);
        }
    }

    return $return;
}
/////////////////////////

Posted: 01/04/2011
Code, SEO
James
Tags: extract links, function, get, grab, html, php, scraping, webpage

Two PHP functions for scraping content and extracting links