Developing a User Personalization System with PHP and Cookies
By: Duncan Lamb
    1999-09-20

    Table of Contents:
  • Developing a User Personalization System with PHP and Cookies
  • Grabbing Headlines
  • User Login
  • Reading from Cookies
  • Conclusion

    Developing a User Personalization System with PHP and Cookies - Grabbing Headlines


    (Page 2 of 5 )

    To grab headlines from some popular news sites, we'll use Perl, undoubtedly the workhorse when it comes to searching through text (or HTML) files. The script is easily adaptable for use on other sites, as you'll see later on.

    We grab the headlines by fetching the page they reside on, then parsing the HTML for a pattern that marks the beginning of a headline (normally a font size or color declaration). The script then saves the text between that match and a second match that marks the end of the headline.

    Probably the easiest example to use is the set of headlines from Slashdot. Slashdot encourages this sort of thing (within reason) by providing a page with the essential information describing each article. You can view this file here: http://slashdot.org/slashdot.xml

    Here's the whole script:


    #!/usr/bin/perl

    $pagename="Slashdot";
    $newsurl ="www.slashdot.org/slashdot.xml";
    $homeurl="http://slashdot.org/";
    $file="slashdot.lnk";
    $before = "<title>";
    $after = "<";
    $webdog = "story"; #don't search till this is found

    #$' is post match, $` is prematch.

    @lines = `perl webget.pl -q $newsurl`;

    #First line: make it proper home page URL
    $headlines[0] = "<a href=\"$homeurl\" class=\"newstable\"><b>$pagename</b><font size=1>\n";
    $found = 0;
    $count = 0;
    foreach $line (@lines) {
        if ($line =~ /$webdog/i) { $found = 1;}
        if (($found) and ($line =~ /$before/i)) {
            $_ = $'; #grabs everything after match
            /$after/i;
            $headline = $`;
            push (@headlines,("<br>".$headline."\n")); #make all font changes, colors, etc. on this line
            $count++;
            last if ($count == 5);
        }
    }
    push (@headlines,"</font></a>");
    #print @headlines;
    open (FILE, ">$file") or die "Cannot write $file: $!";
    foreach $headline(@headlines){
        print FILE $headline;
    }
    close FILE;

    This script builds a very small file containing just the headlines, which fits nicely into a table cell. Let's step through the script a chunk at a time to better understand what is going on.

    Lines 3-9 define the variables we'll use to grab the headlines. $before and $after hold strings of text which appear before and after each headline. The text between these two matches on the same line will be grabbed as the headline. $webdog is a variable put in to make searches a little faster: the real search doesn't start until this tag is reached.

    Line 13 runs the webget script through backticks (with the "quiet" modifier) and puts its output in an array, @lines. I chose webget.pl because it is a single script which requires no external libraries, and is freely available. Similar results can be achieved using the LWP library and its "GET" function. At this point, the entire HTML file of the remote news page is in the @lines array, one line per element, and we can begin to manipulate it as we wish.
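
    As a rough sketch of the LWP alternative mentioned above, the snippet below uses the get function from LWP::Simple in place of the backtick call to webget.pl. The helper names fetch_lines and content_to_lines are made up for this illustration, and LWP::Simple has to be installed separately since it is not part of core Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split fetched content into lines, keeping the
# trailing newlines, so the result resembles what backticks return.
sub content_to_lines {
    my ($content) = @_;
    return split /^/m, $content;
}

# Hypothetical helper: fetch a URL with LWP::Simple and return its lines.
# The module is loaded lazily so the rest of the file works without it.
sub fetch_lines {
    my ($url) = @_;
    require LWP::Simple;
    my $content = LWP::Simple::get($url);   # undef on failure
    die "Could not fetch $url\n" unless defined $content;
    return content_to_lines($content);
}

# Usage (needs network access and LWP::Simple installed):
# my @lines = fetch_lines("http://slashdot.org/slashdot.xml");
```

    Unlike the backtick call, this version stops with an error message when the fetch fails, rather than silently continuing with an empty @lines array.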

    Line 16 sets array element $headlines[0] to the opening link, the title of the page, font sizes, etc. On Lines 17-18, $found is a flag that says whether the search for headlines can begin, while $count keeps track of how many headlines have been found.

    Line 19 begins the loop that searches line by line through the page for headlines. Each line is checked for a match against our $webdog variable (Line 20). If there is no match, the next line is checked. When a match is finally found, the $found flag is set to 1.

    Once the $found flag is set, the script looks for the $before text in each line (Line 21). If the text matches, we grab all the text after that match with the $' variable (also called $POSTMATCH when the English module is loaded). The two special variables used here are very useful, and allow the script to grab a string we don't have a fixed pattern for (like constantly changing headlines). Here they are:

    $' holds the text after the last successful match.
    $` holds the text before the last successful match.

    Using these together allows us to grab a headline by knowing the HTML tags on either side. On Lines 22-24 we place the text after the first match into the default variable ($_), match the tag after the headline, then take the headline (which is everything before that last match) and put it in the $headline variable. Whew!
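
    To see the two variables working together outside the full script, here is a minimal sketch. The helper name between and the sample line are invented for the example. It also checks that the second match succeeded, which the script above does not do; a failed match leaves $` and $' at their previous values, so the check avoids picking up stale text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text between the first match of
# $before and the next match of $after, or undef if either is missing.
sub between {
    my ($line, $before, $after) = @_;
    return unless $line =~ /$before/i;
    my $rest = $';            # everything after the $before match
    return unless $rest =~ /$after/i;
    return $`;                # everything before the $after match
}

# Made-up sample line in the style of the feed
my $sample = "<title>Perl Turns Heads</title>\n";
my $headline = between($sample, "<title>", "<");
print "$headline\n";          # prints "Perl Turns Heads"
```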

    Now that the headline is in an easily handled variable, we just push it onto the @headlines array, prefixed with a <br> for readability (Line 25).

    Lines 26 and 27 limit the number of headlines we grab. Once the limit is reached (in this case 5), the script breaks out of the loop. If the limit is never reached, the loop ends after all the lines in the file have been examined.
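
    The flag-then-limit structure of the loop can be tried on in-memory data with a sketch like the one below. The function grab_headlines and its sample input are hypothetical, and it swaps the $' and $` pair for a single regex with a capture group, so it is a variation on the technique rather than the script's own code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect up to $limit headlines from @$lines,
# ignoring everything before the first line that matches $gate.
sub grab_headlines {
    my ($lines, $gate, $before, $after, $limit) = @_;
    my @found;
    my $open = 0;                       # becomes true once $gate is seen
    foreach my $line (@$lines) {
        $open = 1 if $line =~ /$gate/i;
        next unless $open;
        next unless $line =~ /$before(.*?)$after/i;
        push @found, $1;
        last if @found == $limit;       # stop early once the limit is hit
    }
    return @found;
}

# Made-up input: the first <title> comes before any "story" line
my @sample = (
    "<title>Site Name</title>\n",
    "<story>\n",
    "<title>First</title>\n",
    "<story>\n",
    "<title>Second</title>\n",
    "<story>\n",
    "<title>Third</title>\n",
);
my @heads = grab_headlines(\@sample, "story", "<title>", "<", 2);
print join(", ", @heads), "\n";         # prints "First, Second"
```

    The site-name title is skipped because it appears before the gate pattern, and "Third" is never reached because the loop exits at the limit.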

    Lines 30 onward are housekeeping. The open font and link tags are closed on Line 30. Line 31 is commented out for normal operation, but is very useful when debugging your match choices at the command line. After that, we print the @headlines array to the file we specified in $file at the beginning of the script.

    Note that this is a very simple script: the whole block of headlines links straight to the site's home page. It would be fairly easy to give each story its own URL, or to add even more features. And it's easy to customize for other sites by changing the variables at the top of the script.

    The script writes its output to a file for two reasons. First, the content will be used many times. Second, site owners usually don't mind a query every hour or so, but fetching the page on every visit could be a bit much if your site gets a lot of traffic. So the best thing to do with this script, and with others you customize to get news from other sites, is to run them from a cron job. Once you have a few scripts, each regularly polling a site, you will have several small text files being regenerated with fresh headlines. To keep everything organized, put all of your headline-grabbing scripts into a subdirectory called "news".

    In this example, we will take some headlines from a couple of sites that people may visit often, then make a form to collect logins and preferences, and store it all in a database. Every time a user returns, a script reads their cookie, retrieves their preferences, and builds a page showing what they want to see. We'll start out with a smattering of Perl to help us automate collecting those headlines.

    First, let's create the table in MySQL, in a database named "project":


    CREATE TABLE users (
        login char(16) NOT NULL,
        password char(10) NOT NULL,
        lastlogin date DEFAULT '0000-00-00' NOT NULL,
        news1 char(20),
        news2 char(20),
        news3 char(20),
        PRIMARY KEY (login)
    );

    © 2003-2008 by Developer Shed. All rights reserved.