Scraping Putlocker.bz Links to Movies

Maybe I was a bit of a bitch, promising some free scraping classes and then lunching it out, so here’s one while you wait for my next posts. Essentially, it takes a movie title, finds the matching page on putlocker.bz and scrapes all of the ‘alternative movie links’ from it.

Disclaimer: I in no way endorse the illegal watching or downloading of movies. Go buy the fucking DVD or watch it on Netflix or whatever…

It’s a very much simplified version of a part of something I was working on for a client recently.

<?php

	class Putlocker {
	
		public function searchMovieLinks($title) {
			
			$query = urlencode(htmlspecialchars_decode($title, ENT_QUOTES));	// URL-encode the decoded title (spaces become +)
			
			$searchPage = $this->curlGet('http://putlocker.bz/search/search.php?q=' . $query);
			
			$searchPageXPath = $this->returnXPathObject($searchPage);
			
			$searchPageLinks = $searchPageXPath->query('//div[@class="content-box"]/table[last()]/*/*/a/@href');
			$searchPageNames = $searchPageXPath->query('//div[@class="content-box"]/table[last()]/*/*/div/a');
			
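			// Decode and normalise the title once, then look for an exact (case-insensitive) name match
			$wanted = trim(strtolower(htmlspecialchars_decode($title, ENT_QUOTES)));
			$moviePageLink = null;
			
			for ($i = 0; $i < $searchPageLinks->length; $i++) {	// '<', not '<=': item(length) is null
				$nameNode = $searchPageNames->item($i);
				if ($nameNode !== null && trim(strtolower($nameNode->nodeValue)) == $wanted) {
					$moviePageLink = $searchPageLinks->item($i)->nodeValue;
				}
			}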
			
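			// No exact title match on the search page, so there is nothing to scrape
			if (empty($moviePageLink)) {
				return array();
			}
			
			$moviePage = $this->curlGet($moviePageLink);
			$moviePageXPath = $this->returnXPathObject($moviePage);
			$moviePageLinks = $moviePageXPath->query('//td[@class="entry"]/a/@href');
			
			$movieLinks = array();
			// Skip the first two hrefs; only the links after them are collected
			for ($i = 2; $i < $moviePageLinks->length; $i++) {
				$movieLinks[] = $moviePageLinks->item($i)->nodeValue;
			}
			
			return $movieLinks;
			
		}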
        
		// Method to return XPath object
		public function returnXPathObject($item) {
			$xmlPageDom = new DOMDocument();	// Instantiating a new DOMDocument object
			@$xmlPageDom->loadHTML($item);	// Loading the HTML from downloaded page
			$xmlPageXPath = new DOMXPath($xmlPageDom);	// Instantiating new XPath DOM object
			return $xmlPageXPath;	// Returning XPath object
		}	
		
		// Method for making a GET request using cURL
		public function curlGet($url) {
			$ch = curl_init();	// Initialising cURL session
			// Setting cURL options
			curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);	// Returning transfer as a string
			curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);	// Follow Location: headers
			curl_setopt($ch, CURLOPT_URL, $url);	// Setting URL
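			$results = curl_exec($ch);	// Executing cURL session
			if ($results === false) {	// curl_exec() returns FALSE on failure
				$error = curl_error($ch);
				curl_close($ch);
				throw new Exception('cURL request failed: ' . $error);
			}
			curl_close($ch);	// Closing cURL session
			return $results;	// Return the results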
		}
			
	}

Save it as putlocker.php.

To use it:

<?php

include('putlocker.php');

$putlocker = new Putlocker();

$title = 'whatever movie you want';

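$links = array();	// Defined up front so the check below is safe even if an exception is thrown

try {
	$links = $putlocker->searchMovieLinks($title);
} catch (Exception $e) {
	// Add your error handling class in here...
}

if (!empty($links)) {
	print_r($links);
}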

?>

…and you’ll get an array of the ‘alternative movie links’ from putlocker.bz.

See, I told you it was pointless posting uncommented code (well, where the only comments are from previously posted methods) without any reference or instructions to the new methods…now you can scrape a few links, but you probably don’t know how you did it.
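If you do want a rough idea of what the XPath part is doing, here is a minimal sketch that runs without touching the network. The HTML string is made up for illustration (it only assumes link cells marked up as td class="entry", as in the movie page query above), and the example.com URLs are placeholders, not real results.

```php
<?php

// Hypothetical markup standing in for a downloaded movie page (not copied from the real site)
$html = '<html><body><table>'
      . '<tr><td class="entry"><a href="http://example.com/one">One</a></td></tr>'
      . '<tr><td class="entry"><a href="http://example.com/two">Two</a></td></tr>'
      . '<tr><td class="other"><a href="http://example.com/skip">Skip</a></td></tr>'
      . '</table></body></html>';

// Same steps as returnXPathObject(): parse the HTML, then wrap it in an XPath object
$dom = new DOMDocument();
@$dom->loadHTML($html);	// @ silences warnings about sloppy markup
$xpath = new DOMXPath($dom);

// Select the href attribute of every <a> sitting directly inside a td with class "entry"
$hrefNodes = $xpath->query('//td[@class="entry"]/a/@href');

// Each result is an attribute node; nodeValue holds the URL as a string
$links = array();
foreach ($hrefNodes as $node) {
	$links[] = $node->nodeValue;
}

print_r($links);
```

Only the two hrefs inside the "entry" cells end up in $links; the link in the "other" cell is ignored. The class does the same thing against the real pages, except that it also skips the first two matches before collecting the rest.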
