Hotfile link extractor
A mate of mine needed a link extractor for his hotfile.com account, so that he could drop in a bunch of download page links and get back the premium links to the actual files, all in a single page or file. From that page or file he can load them into his download program. These are all legit links btw, no illegal shit.
Anyways I made a bot that starts with a form where you dump your download URLs. Then it goes to the hotfile site, logs in with the given user credentials, runs through each link it was given, and extracts the premium download URL of the file. Then it either prints the links to the page OR saves them into files.txt in the web folder and forces a download in the browser.
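The extraction step at the heart of that flow can be sketched on its own. This is only a sketch: it swaps in PHP's built-in DOMDocument and DOMXPath for SimpleHTMLDOM so it runs without extra libraries, and it assumes (like the script further down) that the premium link is the first link inside the first h3 on the page. The function names and the example.com URL are made up for illustration.

```php
<?php
// Hypothetical helper, not part of the original bot.
// Returns the href of the first <a> inside the first <h3>, or null if there is none.
function extractFileUrl($html) {
    if (trim($html) === '') {
        return null; // nothing to parse
    }
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings quietly
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // All links under the first <h3>, in document order
    $links = $xpath->query('(//h3)[1]//a');
    if ($links === false || $links->length === 0) {
        return null;
    }
    $href = $links->item(0)->getAttribute('href');
    return $href === '' ? null : $href;
}

// Same filename trick as the bot: everything after the last slash
function fileNameFromUrl($url) {
    return substr($url, strrpos($url, '/') + 1);
}

// Made-up page standing in for a hotfile download page
$page = '<html><body><h3><a href="http://example.com/get/123/movie.avi">Click here to download</a></h3></body></html>';
$url = extractFileUrl($page);
echo $url, "\n";
echo fileNameFromUrl($url), "\n";
```

Returning null when there is no h3 link lets the caller treat that page as a failure instead of crashing on a missing element.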
The manual way is to go to every hotfile.com download link, click the download button to start the download, and repeat for every download URL you have. With this you can throw 50 hotfile.com download links in the form, hit go, and it will give you all the premium download links at once. It probably breaks the hotfile terms of service, so you could put your account in jeopardy (not my fault), but I don’t use the script, I just solved a problem.
Here’s the code for it if anyone wants to use it. Would be cool if you let me know if it was useful as well.
Please note you will also need SimpleHTMLDOM and Sean Huber's Curl wrapper. The script includes them as lib_html.php and lib_curl.php.
<?php
/**************************************************************************
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*
* There is one condition on the use of this software and that is that
* you accept that you are not part of or party to any DMCA agents or
* companies, nor do you represent any anti-piracy groups or industry
* distributors.
*
****************************************************************************
hotfilebot.php VERSION 1.5
This bot will take the URLs for a download link from hotfile.com and extract
the URLs of the actual file in each URL. This is so you can run it on a
personal web server, and automatically extract every file URL from every
download page without having to go through them 1 by 1. You need to have
a hotfile account for this to work and you need to fill out your details
below.
Please let me know if you find this bot useful :)
Copyright (c) 2009 Robert McLeod <hamstar[[@]]telescum.co.nz>
http://www.hamstar.co.nz/2009/09/24/hotfile-link-extractor/
*/
// Disable errors
ini_set('display_errors',0);
// Define your own username and password for hotfile.com
// And also the filename and extension you want to save the urls to
define('HOTFILE_USER', 'your_username');
define('HOTFILE_PASS', 'your_password');
define('OUTPUT_FILE_NAME','files');
define('OUTPUT_FILE_EXT','.txt');
define('SLEEP_TIME',100); // Time to sleep between requests in milliseconds
// Don't touch these
define('TO_FILE', isset($_POST['tofile'])); // true only when the "file" submit button was used
define('OUTPUT_FILE', OUTPUT_FILE_NAME . OUTPUT_FILE_EXT);
// Error messages to use when iterating through the failuresArray
$errors[1] = 'the file was removed from hotfile.';
$errors[2] = 'the HTTP status code was not 200.';
// Function to check if its a file output
function tofile() {
return TO_FILE; // Should return a true or false
}
// Force a download of the file
if(!empty($_GET['getfile'])) {
// Turn off output compression for IE
if(ini_get('zlib.output_compression')) { ini_set('zlib.output_compression', 'Off'); }
// Send the contents of the file to the browser as a download
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . OUTPUT_FILE . '"');
header('Content-Transfer-Encoding: Binary');
header('Content-Length: '.filesize(OUTPUT_FILE));
readfile(OUTPUT_FILE);
exit;
}
?>
<html>
<head>
<title>Hotfile FileURL Fetcher</title>
</head>
<body>
<?php
// Check that there is a filelist incoming
if(!empty($_POST['urls'])) {
// Include CURL and HTML
include 'lib_curl.php';
include 'lib_html.php';
// Set username and pass
$user = HOTFILE_USER;
$pass = HOTFILE_PASS;
// Set login and post data
$loginUrl = 'http://hotfile.com/login.php';
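// URL-encode the credentials so characters like & or = don't break the request
$postData = 'user=' . urlencode($user) . '&pass=' . urlencode($pass);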
$authString = '<b>' . HOTFILE_USER . '</b> | <a href="/logout.php';
// Start curl
$c = new Curl;
$c->user_agent = $_SERVER['HTTP_USER_AGENT'];
// Check if already logged in
$r = $c->get('http://hotfile.com');
$html = $r->body;
$count = 0;
while(!strstr($html,$authString)) {
// For some reason we couldn't log in (after 2 tries)
if($count == 2) {
die('Couldn\'t log in for some reason');
}
// Try logging in with the user details
usleep(SLEEP_TIME * 1000); // usleep() takes microseconds, SLEEP_TIME is in milliseconds
$r = $c->post($loginUrl,$postData);
$html = $r->body;
// Increment the counter
++$count;
} # end login while loop
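// Split up the urls by newline character, dropping blank lines
$urls = array_filter(array_map('trim', explode("\n", $_POST['urls'])));
// Start with empty result lists so the loops further down always get an array
$filesArray = array();
$failuresArray = array();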
// Print messages
echo '<div style="font-family: arial; margin: 0px auto; text-align: center;">';
echo '<p>OK, here are your '.count($urls).' file URLs</p>';
// Run through each URL
foreach($urls as $u) {
// Trim whitespace characters so we don't
// get a 400 bad request error
$u = trim($u);
// Unset HTML first
unset($html,$c,$r,$h,$fn,$fileurl);
// Get the html of the current URL
$c = new Curl;
$c->user_agent = $_SERVER['HTTP_USER_AGENT'];
$r = $c->get($u);
$h = $r->headers;
// If the status code is 200 we are good to go
if($h['Status-Code'] == 200) {
// Parse the HTML into a DOM object
$html = str_get_html($r->body);
// Check that the file has not been removed
if(!strstr($r->body,'This file is either removed due to copyright claim or is deleted by the uploader.')) {
// Get the href (protip: it's in h3)
$fileurl = $html->find('h3',0)->find('a',0)->href;
// Get the filename
$fn = substr($fileurl,strrpos($fileurl,'/')+1);
// Put the url and filename into the files array
$filesArray[] = array('fn' => $fn, 'url' => $fileurl);
} else {
// Notify removal into failures array
$failuresArray[] = array('url' => $u, 'errno' => 1);
} # end file removal check
} else {
// Notify status into failures array
$failuresArray[] = array('url' => $u, 'errno' => 2);
} # end status code check
// Don't hammer the server
usleep(SLEEP_TIME * 1000); // usleep() takes microseconds, SLEEP_TIME is in milliseconds
} # end URL foreach loop
// Wipe the contents of our file cos this is a fresh load
if(tofile()) { file_put_contents(OUTPUT_FILE, ''); }
// Run through the files array
foreach($filesArray as $f) {
if(tofile()) {
// Add the current URL to the file
file_put_contents(OUTPUT_FILE, $f['url']."\r\n", FILE_APPEND);
} else {
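// Else print it to the screen (escaped, since it comes from a remote page)
echo '<p><a href="' . htmlspecialchars($f['url'], ENT_QUOTES) . '">' . htmlspecialchars($f['fn'], ENT_QUOTES) . '</a></p>';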
}
}
// Print the file download link
if(tofile()) {
echo '<p>Here is your file of <a href="'. OUTPUT_FILE . '">files</a></p>';
echo '<meta http-equiv="refresh" content="1; url=' . $_SERVER['SCRIPT_NAME'] . '?getfile=true"/>';
}
// Print the failures no matter what
foreach($failuresArray as $f) {
// Print the failure (the URL is user input, so escape it)
echo "<p><span style='color: red;'>Failure</span>: " . htmlspecialchars($f['url'], ENT_QUOTES) . " because <span style='color: red;'>{$errors[$f['errno']]}</span></p>";
}
// Print the redo button
echo '<p><button onclick="location.href=\''.$_SERVER['SCRIPT_NAME'].'\';">KK Gimme some more!</button></p>';
echo '</div>';
} else {
?>
<div style="font-family: arial; margin: 0px auto; text-align: center;">
<form action="<?=$_SERVER['SCRIPT_NAME'];?>" method="post">
<p>Give me some download URLs and I will munch them up and spit out file URLs</p>
<p><textarea name="urls" style="width: 1000px; height: 400px;"></textarea></p>
<p style="font-size: small;">* make sure you put a newline between each file</p>
<p><input type="submit" value="Gimme file URLs in a webpage!"/> |
<input type="submit" name="tofile" value="Gimme the file URLs in a file!"/></p>
</form>
</div>
<?php
} # end post urls check
?>
</body>
</html>
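The script splits the textarea on "\n" and trims each line. A slightly stricter way to prepare the list could look like the sketch below. The function name cleanUrlList and the rule of keeping only hotfile.com hosts are my own assumptions for illustration, not something the bot above does.

```php
<?php
// Hypothetical helper, not part of the original bot.
// Turns raw textarea input into a clean list of hotfile.com URLs.
function cleanUrlList($raw) {
    $clean = array();
    // \R matches \n, \r\n and \r, so Windows line endings are handled too
    foreach (preg_split('/\R/', $raw) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $host = parse_url($line, PHP_URL_HOST);
        // Keep hotfile.com and its subdomains only (assumed rule)
        if (is_string($host) && preg_match('/(^|\.)hotfile\.com$/i', $host)) {
            $clean[$line] = true; // array keys de-duplicate for free
        }
    }
    return array_keys($clean);
}

// Made-up input: a duplicate, a blank line, stray spaces and a foreign host
$raw = "http://hotfile.com/dl/1/abc/a.rar.html\r\n\r\n"
     . "  http://hotfile.com/dl/2/def/b.rar.html \n"
     . "http://example.com/x\n"
     . "http://hotfile.com/dl/1/abc/a.rar.html";
print_r(cleanUrlList($raw));
```

Filtering up front means the count shown to the user matches the number of pages actually fetched, and no request is wasted on an empty line.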
The next step is to try to get it to integrate with the download managers wget and axel. That could be tricky, since it basically needs a PHP backend that starts the downloads and monitors the processes.
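As a rough starting point for the wget side, the list file the bot already writes could be handed to wget directly. The sketch below only builds the command string and does not run it. The function name buildWgetCommand and the /tmp/downloads directory are made up, and it assumes the premium links download without extra cookies and that the list uses plain newlines (the bot writes "\r\n", which may need normalizing first).

```php
<?php
// Hypothetical helper, not part of the original bot.
// Builds (but does not run) a wget command for a list file of URLs.
function buildWgetCommand($listFile, $targetDir) {
    return 'wget -c'                           // -c: resume partial downloads
        . ' -i ' . escapeshellarg($listFile)   // -i: read URLs from a file
        . ' -P ' . escapeshellarg($targetDir); // -P: directory to save into
}

echo buildWgetCommand('files.txt', '/tmp/downloads'), "\n";
```

Passing every argument through escapeshellarg() matters here, because anything that ends up in a shell command from a web form is a command injection risk.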