Managing Secure Protocol in Apache-Based Websites using PHP

When trying to maintain a secure protocol on an Apache-based website, you can expect to deal with certain issues, especially if you’re also trying to rank well in the search engines. This article provides you with some solutions for two of the more difficult problems: duplicate content and 301 redirects.

The secure protocol (https://) is highly important for security, particularly for commercial websites. Even if you are new to Internet technology, you have probably met the two most common protocols used by web browsers. They are:

  • HTTP (Hypertext Transfer Protocol) is the protocol that governs communication between the client (most likely the browser) and the server (where the website contents are located).

  • HTTPS (Hypertext Transfer Protocol Secure) governs the same client-server communication, but encrypts the data as it travels across the network in the form of packets. Someone who intercepts the packets in transit cannot read them, because the contents are encrypted.

HTTP is common and easy to manage; a lot of website templates will be able to run without problems. But things become a bit complicated when running HTTPS in a website.

Apache-based websites are common on the Internet today, occupying a higher market share than IIS (Microsoft Servers). The primary reason is simple: Apache is an open source web server, which means it is free to use.

PHP is the most common server side scripting language; it is commonly used with Apache websites. And in modern online business technology, https is becoming more and more popular due to increasing security risks. Thus, it is necessary to manage secure protocols in Apache websites.

This article provides tips and solutions to help any web developer effectively manage the two most difficult problems in maintaining the secure protocol side of any website. These are:

  • Duplicate content

  • 301 redirection from the non-https (http) version to the https version

The primary objective is to not cause any issues pertaining to search engine optimization. All recommended scripts in this article will use PHP, as it runs smoothly with an Apache website.

{mospagebreak title=First Potential Problem: Duplicate Content!}

Once your security certificate has been issued and https has been configured correctly on your server, the first thing you need to avoid is duplicate content on the https protocol. A lot of websites make this mistake, and Google ends up indexing both protocols (http and https). Google treats this as a serious duplicate content issue.

If both the http and https versions of your web site can be indexed, you will see serious ranking issues stemming from the duplicate content. These issues arise because Google splits up the link authority between the http and https protocol instead of concentrating it in the http protocol, which is preferable to rank in Google. This weakening of the internal link structure affects the site’s search engine optimization efforts.

There are basically two ways to correct this. The first method is the most effective for start-up websites (where Google has not yet indexed the https version). The second method applies if Google has already indexed the https version alongside the canonical http version, and the https version has earned some Google PageRank.

FIRST SOLUTION: In all https (secure) pages, place a meta noindex tag.

So what is a meta noindex tag? The most recommended format is:

<META NAME="ROBOTS" CONTENT="NOINDEX">

According to Google, when Googlebot sees this tag on a page, it will still crawl the page as well as its links, but it will not index the page or place it in the search results. This should avoid issues with duplicate content.

So what is the PHP script for the first solution?


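<?php
// Print a robots meta tag that depends on the protocol of the current request
if (isset($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) == 'on')
{
    // Secure (https) request: keep this copy out of the index
    echo '<META NAME="ROBOTS" CONTENT="NOINDEX">' . "\n";
}
else
{
    // Plain http request: allow indexing
    echo '<META NAME="ROBOTS" CONTENT="INDEX, FOLLOW">';
}
?>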


This script will check to see if the protocol running is https. If it is a secure protocol, it will return the meta noindex tag; otherwise, the page will be indexable by Googlebot.

To use this script, insert it somewhere in the <head> </head> section of your website template (the one used by all pages, like header.php in WordPress). To find out if this is working according to plan, open any URL on your website which is not https and view the source code of that page. You should see this:

<META NAME="ROBOTS" CONTENT="INDEX, FOLLOW">

Otherwise, if it is a secure protocol, it should have this instead:

<META NAME="ROBOTS" CONTENT="NOINDEX">
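The script above treats a request as secure only when $_SERVER['HTTPS'] equals "on". The PHP manual describes this variable more loosely: it is set to a non-empty value for https requests, and IIS with ISAPI reports "off" for plain http. The following is a minimal sketch of the same noindex logic under that looser reading. The function names is_secure_request() and robots_meta_tag() are hypothetical helpers introduced here for illustration, not part of the original script.

```php
<?php
// Hypothetical helpers, shown as a sketch rather than a definitive implementation.

/**
 * Returns true when the request described by $server arrived over https.
 * $server is expected to have the same shape as PHP's $_SERVER array.
 */
function is_secure_request(array $server): bool
{
    // Any non-empty value other than "off" is treated as a secure request.
    return !empty($server['HTTPS'])
        && strtolower((string) $server['HTTPS']) !== 'off';
}

/**
 * Returns the robots meta tag to print in the <head> section for this request.
 */
function robots_meta_tag(array $server): string
{
    return is_secure_request($server)
        ? '<META NAME="ROBOTS" CONTENT="NOINDEX">'
        : '<META NAME="ROBOTS" CONTENT="INDEX, FOLLOW">';
}

// Usage inside the <head> section of a shared template:
// echo robots_meta_tag($_SERVER), "\n";
```

Passing the server array in as a parameter, instead of reading $_SERVER inside the function, makes it possible to check both branches without switching protocols in the browser.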

{mospagebreak title=Second Duplicate Content Solution}

SECOND SOLUTION: In all https (secure) pages, place a link rel canonical tag in the <head> section of the main website templates, pointing to the equivalent http version of each URL.

If the secure protocol has already earned some Google PageRank, using the meta noindex tag is not the best solution. Instead, use the link rel canonical tag:

<link rel="canonical" href="http://www.thisisyourwebsite.biz/" />

To use this tag, place it somewhere in the <head> section of your website template. This is how it works: when Googlebot visits the https version of a URL, the server still returns the https page, but the <head> section of its source code contains this tag: <link rel="canonical" href="http://www.thisisthehttpversion.biz/" /> Google then treats the http version, not the https version, as the one to index.

It acts like a 301 redirect, but the URL in the address bar does not change. In this situation, Google will award any PageRank or other URL properties to the canonical http version. Therefore, even if the https version is indexable, Google will only display the http version in its search results. And if the https version has previously earned some Google PageRank, it will now be transferred to the http version, which is the canonical version.

A sample PHP script that will execute this job is:


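<?php
// Host name plus the requested path and query string
$URL = $_SERVER["SERVER_NAME"] . $_SERVER["REQUEST_URI"];
if (isset($_SERVER['HTTPS']) && strtolower($_SERVER['HTTPS']) == 'on')
{
    // On https pages, point search engines at the equivalent http URL.
    // htmlspecialchars() escapes the request URI before it is printed into the attribute.
    echo '<link rel="canonical" href="http://' . htmlspecialchars($URL, ENT_QUOTES) . '" />';
}
?>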


If your website is using osCommerce or other popular software packages, there is a more recommended PHP script to cover canonical issues – not only the secure vs insecure issues, but also the non-www and www issues.

In my article on using the link rel="canonical" tag to solve for canonical issues in Apache/PHP powered websites, I recommended this PHP script:

<?php
//place this script between the <head> and </head> section of your header.php or related dynamic website template
//such as index.php, product_info.php in the osCommerce templates
//this script is applicable when the CANONICAL PROTOCOL IS HTTP AND USING WWW VERSION.
//this script is NOT APPLICABLE to a subdomain of a main domain.
//Example: if your canonical version is www.mysite.com, you should NOT be using the script in any of its subdomains.

//First step: eliminate any session IDs in the URL
$requestedurl = $_SERVER["REQUEST_URI"];

//Define array of most common open source session IDs
$id = array('osCsid', 'zenid', 'PHPSESSID');

//Default: no session ID, so keep the whole request URI
$position = strlen($requestedurl);
foreach ($id as $sessionid)
{
    //case-insensitive search; the first session ID in the list that matches wins
    $found = stripos($requestedurl, $sessionid);
    if ($found !== false)
    {
        //URL is session ID based: cut just before the "?" or "&" that precedes the ID
        $position = max(0, $found - 1);
        break;
    }
}

//trim any session ID from the URL
$cleanrequest = substr($requestedurl, 0, $position);

//set protocol to http:// since this is the canonical protocol
$protocol = 'http://';

//check if the server name starts with www.
if (preg_match("/^www\./i", $_SERVER["SERVER_NAME"]))
{
    //the URL is using the www version
    //build the complete canonical URL without any session ID
    $canonical = $protocol . $_SERVER["SERVER_NAME"] . $cleanrequest;
}
else
{
    //prepend www. to the server name and build the canonical www version
    $URL = 'www.' . $_SERVER["SERVER_NAME"];
    $canonical = $protocol . $URL . $cleanrequest;
}

//Final step: print the link rel canonical element, escaping the URL for the attribute
echo '<link rel="canonical" href="' . htmlspecialchars($canonical, ENT_QUOTES) . '" />';
?>


This script is only applicable if the canonical version is HTTP (not HTTPS), and uses the www version of the site. A sample canonical URL could be:

http://www.thisisasampleurl.com
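The script above cuts the request URI at the first session ID it finds, so any parameters that follow the session ID are dropped as well. As an alternative, here is a minimal sketch that removes only the session ID parameters and keeps the rest of the query string. The function name canonical_url() and the host names in the usage note are assumptions made for illustration. Note also that parse_str() and http_build_query() may re-encode parameter names and values, so the output is not always character-for-character identical to the original query string.

```php
<?php
/**
 * Builds an http://www. canonical URL from a host name and a request URI,
 * removing session ID parameters. Hypothetical helper, not the article's script.
 */
function canonical_url(
    string $host,
    string $requestUri,
    array $sessionKeys = ['osCsid', 'zenid', 'PHPSESSID']
): string {
    // Split the request URI into its path and query string.
    $path  = (string) parse_url($requestUri, PHP_URL_PATH);
    $query = (string) parse_url($requestUri, PHP_URL_QUERY);
    if ($path === '') {
        $path = '/';
    }

    // Drop session ID parameters, comparing names case-insensitively.
    parse_str($query, $params);
    $lowerKeys = array_map('strtolower', $sessionKeys);
    foreach (array_keys($params) as $key) {
        if (in_array(strtolower((string) $key), $lowerKeys, true)) {
            unset($params[$key]);
        }
    }

    // The canonical host always uses the www version.
    if (stripos($host, 'www.') !== 0) {
        $host = 'www.' . $host;
    }

    $url = 'http://' . $host . $path;
    if (count($params) > 0) {
        $url .= '?' . http_build_query($params);
    }
    return $url;
}

// Usage inside the <head> section of a shared template:
// $canonical = canonical_url($_SERVER['SERVER_NAME'], $_SERVER['REQUEST_URI']);
// echo '<link rel="canonical" href="' . htmlspecialchars($canonical, ENT_QUOTES) . '" />';
```

For example, calling canonical_url('mysite.com', '/index.php?zenid=1&page=2') keeps the page parameter and removes only zenid. Like the script above, this sketch assumes that the canonical version is http with www, so it should not be used on a subdomain.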

{mospagebreak title=Second Potential Problem: 301 Redirection from non-secure to secure protocol}

There are times when you really do need to redirect the http version of a URL to its https version, so that any transaction between the browser and the server that happens on that URL will be encrypted.

A good example would be a form that submits sensitive data such as passwords, social security numbers or credit card information.

So how do you go about converting the non-secure protocol to a secure protocol? Using PHP you can force a specific URL to 301 redirect to its equivalent https version.

See an example below:


<?php
//Rebuild the full URL of the current request
function url() {
    $urlofthepage = 'http';
    //isset() avoids a notice on plain http requests, where HTTPS may be undefined
    if (isset($_SERVER["HTTPS"]) && $_SERVER["HTTPS"] == "on") {$urlofthepage .= "s";}
    $urlofthepage .= "://";
    if ($_SERVER["SERVER_PORT"] != "80") {
        $urlofthepage .= $_SERVER["SERVER_NAME"].":".$_SERVER["SERVER_PORT"].$_SERVER["REQUEST_URI"];
    } else {
        $urlofthepage .= $_SERVER["SERVER_NAME"].$_SERVER["REQUEST_URI"];
    }
    return $urlofthepage;
}

$urlofthepage = url();

if ($urlofthepage == "http://www.thisistheurlthatyouneedtoredirecttohttpsversion.com/securecontactexampleform.php")
{
    // Permanent redirection
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: https://www.thisistheurlthatyouneedtoredirecttohttpsversion.com/securecontactexampleform.php");
    exit();
}
?>


Place this code at the very top of the template, before any HTML or other output is sent. If anything has already been output, PHP can no longer send the redirect header; the page will not redirect, and PHP will report a "headers already sent" warning instead.

Note that the sample script above redirects only one http URL to its equivalent https version. You can add any number of redirections, which is useful if, for example, you have a lot of secure form URLs in the domain that all use the same template.

In this case, just add another if statement right below the first redirection. See the example below:

<?php
//Rebuild the full URL of the current request
function url() {
    $urlofthepage = 'http';
    //isset() avoids a notice on plain http requests, where HTTPS may be undefined
    if (isset($_SERVER["HTTPS"]) && $_SERVER["HTTPS"] == "on") {$urlofthepage .= "s";}
    $urlofthepage .= "://";
    if ($_SERVER["SERVER_PORT"] != "80") {
        $urlofthepage .= $_SERVER["SERVER_NAME"].":".$_SERVER["SERVER_PORT"].$_SERVER["REQUEST_URI"];
    } else {
        $urlofthepage .= $_SERVER["SERVER_NAME"].$_SERVER["REQUEST_URI"];
    }
    return $urlofthepage;
}

$urlofthepage = url();

if ($urlofthepage == "http://www.thisistheurlthatyouneedtoredirecttohttpsversion.com/securecontactexampleform.php")
{
    // Permanent redirection
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: https://www.thisistheurlthatyouneedtoredirecttohttpsversion.com/securecontactexampleform.php");
    exit();
}

//this is the second URL to be redirected to its https version
if ($urlofthepage == "http://www.thisisthesecondurlthatyouneedtoredirecttohttpsversion.com/securecontactexampleform.php")
{
    // Permanent redirection
    header("HTTP/1.1 301 Moved Permanently");
    header("Location: https://www.thisisthesecondurlthatyouneedtoredirecttohttpsversion.com/securecontactexampleform.php");
    exit();
}
?>
