To get the response, we will use the file_get_contents function for simplicity, although other options, such as cURL, are possible.
$response = file_get_contents($request);
if ($response === false) { die('The request fails.'); }
$unserialized = unserialize($response);
$display = $unserialized['ResultSet'];
$display1 = $display['Result'];
$totalbacklinksnotunique = $display['totalResultsAvailable'];
The principles behind the statements above are discussed here:
http://www.devshed.com/c/a/PHP/Using-the-Yahoo-Site-Explorer-Inbound-Links-API/
http://www.devshed.com/c/a/PHP/Getting-Data-from-Yahoo-Site-Explorer-Inbound-Links-API-using-PHP/
Because the response is serialized, unserialize() converts it back into a PHP array. The "totalResultsAvailable" value is the total number of backlinks pointing to either the entire domain or a specific URL, depending on the value set by "options" in the API request URL.
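To make the array structure concrete, here is a minimal sketch that runs the same statements against a made-up payload. The sample data is hypothetical and only includes the keys used in this article (a real response carries more fields), and the allowed_classes option (PHP 7 and later) is an added precaution for remote data that the original statements do not use.

```php
<?php
// Hypothetical sample payload, invented for illustration only.
$sample = serialize(array(
    'ResultSet' => array(
        'totalResultsAvailable' => 2,
        'Result' => array(
            array('Url' => 'http://www.domainurl.com/test1.htm'),
            array('Url' => 'https://example.co.uk/page.html'),
        ),
    ),
));

// allowed_classes => false stops unserialize() from instantiating objects
// found in the data (an assumption added here, requires PHP 7+).
$unserialized = unserialize($sample, array('allowed_classes' => false));
if ($unserialized === false || !isset($unserialized['ResultSet'])) {
    die('Unexpected response format.');
}

$display = $unserialized['ResultSet'];
$display1 = $display['Result'];
$totalbacklinksnotunique = $display['totalResultsAvailable'];

echo $totalbacklinksnotunique, "\n";   // prints 2
echo $display1[0]['Url'], "\n";        // prints http://www.domainurl.com/test1.htm
```

Each entry of $display1 is itself an array, which is why the loop later reads $display1[$x]['Url'].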
Loop the results and extract the domain name only
The strategy is to create an empty array that will hold the domain names collected in the loop and to initialize the loop counter to 0. The size of the result array is also needed, because it is used to terminate the loop:
$domainarray = array();
$x = 0;
$count = sizeof($display1);
And the entire loop (explained in the PHP comments):
while ($x < $count) {
    // Store the URL from the unserialized PHP array in the $myurl variable
    $myurl = $display1[$x]['Url'];

    // After storing the variable, do string manipulation to get the domain name only.
    // Example: http://www.domainurl.com/test1.htm will now become domainurl.com
    // This is a 9-step process.

    // Reset per-URL values so nothing carries over from the previous iteration
    $backlinkprotocol = '';
    $offsetcount = 0;

    if (substr($myurl, 0, 7) == "http://") {
        // Step 1: http version
        $backlinkprotocol = "http://";
    }
    if (substr($myurl, 0, 8) == "https://") {
        // Step 2: https version
        $backlinkprotocol = "https://";
    }

    // Step 3: Assign offset for each protocol
    if ($backlinkprotocol == "http://") { $offsetcount = 7; }
    if ($backlinkprotocol == "https://") { $offsetcount = 8; }

    // Step 4: Determine 1st occurrence of a slash after the protocol
    $positiontrailingslash = strpos($myurl, '/', $offsetcount);
    if ($positiontrailingslash === false) {
        // No path at all (e.g. http://domainurl.com): use the end of the string instead
        $positiontrailingslash = strlen($myurl);
    }
    $actualposition = $positiontrailingslash - $offsetcount;

    // Step 5: Extract domain portion for test of non-www version
    $domainportion = substr($myurl, $offsetcount, $actualposition);

    // Step 6: Count the number of dots; if only one dot is found it is the non-www version
    $dotcount = substr_count($domainportion, '.');
    // Default: covers the one-dot case and any host the steps below do not handle
    $domain = $domainportion;

    if (($dotcount == 2) || ($dotcount == 3)) {
        // Step 7: This is NOT a non-www version, find position of 1st dot
        $firstdotposition = strpos($myurl, '.');
        $firstdotrealoffset = $firstdotposition + 1;

        // Step 8: Find position of 2nd dot; an adjustment for TLDs like .co.uk is included
        $seconddotposition = strpos($myurl, '.', $firstdotrealoffset);
        $differenceofdots = $seconddotposition - $firstdotposition - 1;

        // Two dots with a 2-letter label between them (domainurl.co.uk) is kept whole;
        // every other case drops the first label (www.domainurl.com, www.domainurl.co.uk)
        if (!(($dotcount == 2) && ($differenceofdots == 2))) {
            $firstdotoffset = $firstdotposition + 1;
            $trailingoffset = $positiontrailingslash - $firstdotposition - 1;
            $domain = substr($myurl, $firstdotoffset, $trailingoffset);
        }
    }

    // Step 9: Finally, after the string manipulation, store only the domain name in the array
    $domainarray[] = $domain;

    // After storing the value, increment the loop counter by 1.
    $x++;
}
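As a point of comparison, the same extraction can be sketched with PHP's built-in parse_url function instead of manual offsets. This is a different technique from the 9-step process above, not a drop-in replacement. The helper name extract_host and the sample URLs are hypothetical, and the sketch only strips a leading "www.", so a host such as blog.domainurl.com is kept as-is, whereas the loop above would drop its first label.

```php
<?php
// Hypothetical helper, not part of the original script.
function extract_host($url)
{
    // parse_url() returns null when the host is missing and false on a badly malformed URL
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null;
    }
    $host = strtolower($host);
    // Strip one leading "www." only; other subdomains are left untouched
    if (strpos($host, 'www.') === 0) {
        $host = substr($host, 4);
    }
    return $host;
}

// Sample URLs invented for illustration
$urls = array(
    'http://www.domainurl.com/test1.htm',
    'https://example.co.uk/page.html',
    'http://domainurl.com',
);

$domainarray = array();
foreach ($urls as $url) {
    $host = extract_host($url);
    if ($host !== null) {
        $domainarray[] = $host;
    }
}

print_r($domainarray); // domainurl.com, example.co.uk, domainurl.com
```

Letting parse_url locate the host avoids the manual protocol and slash handling, at the cost of not attempting the .co.uk style adjustment made in Step 8.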